- 12 Mar, 2014 2 commits
-
-
Committed by sdong
Summary: Temporary fix for the log buffer flushing: flush the buffer inside the lock. This keeps trunk clean until we find a permanent fix. Test Plan: make all check Reviewers: haobo, igor Reviewed By: igor CC: ljin, leveldb, yhchiang Differential Revision: https://reviews.facebook.net/D16791
-
Committed by Igor Canadi
Summary: Having code after SignalAll has already caused 2 bugs. Let's make sure this doesn't happen again. Test Plan: no test Reviewers: sdong, dhruba, haobo Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D16785
-
- 11 Mar, 2014 5 commits
-
-
Committed by Igor Canadi
Summary: as title Test Plan: fixed the build failure http://ci-builds.fb.com/job/rocksdb_build/987/console Reviewers: haobo Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D16743
-
Committed by Haobo Xu
Summary: as title. also made info log output of file deletion a bit more descriptive. Test Plan: make check; db_bench and look at LOG output Reviewers: igor Reviewed By: igor CC: leveldb Differential Revision: https://reviews.facebook.net/D16731
-
Committed by Lei Jin
Summary: (1) Fix SanitizeOptions() to also check HashLinkList. The current dynamic case just happens to work because the 2 classes have the same layout. (2) Do not delete the SliceTransform object in the HashSkipListFactory and HashLinkListFactory destructors. Reason: SanitizeOptions() enforces prefix_extractor and SliceTransform to be the same object when Hash**Factory is used. This makes the behavior strange: when Hash**Factory is used, prefix_extractor will be released by RocksDB. If another memtable factory is used, prefix_extractor should be released by the user. Test Plan: db_bench && make asan_check Reviewers: haobo, igor, sdong Reviewed By: igor CC: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D16587
-
Committed by Haobo Xu
Summary: KSVObsolete is no longer nullptr and needs to be checked explicitly. Also did some minor code cleanup and added a stat counter to track superversion cleanups incurred in the foreground. Test Plan: make check Reviewers: ljin Reviewed By: ljin CC: leveldb Differential Revision: https://reviews.facebook.net/D16701
-
Committed by Haobo Xu
Summary: Moved LogBuffer class to an internal header. Removed some unnecessary indirection. Enabled log buffer for BackgroundCallFlush. Forced log buffer flush right after Unlock to improve time ordering of info log. Test Plan: make check; db_bench compare LOG output Reviewers: sdong Reviewed By: sdong CC: leveldb, igor Differential Revision: https://reviews.facebook.net/D16707
-
- 08 Mar, 2014 1 commit
-
-
Committed by Lei Jin
Summary: Add a check at the end of GetImpl to release SuperVersion if it becomes obsolete. Also do Scrape() inside InstallSuperVersion so it happens more frequently. Test Plan: make all check; running asan_check now Reviewers: igor, haobo, sdong, dhruba Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D16641
-
- 07 Mar, 2014 1 commit
-
-
Committed by Igor Canadi
Summary: Not deleting the local SV caused a crash test issue: http://ci-builds.fb.com/job/rocksdb_asan_crash_test/83/console Test Plan: ran unit tests Reviewers: ljin Reviewed By: ljin CC: leveldb Differential Revision: https://reviews.facebook.net/D16635
-
- 06 Mar, 2014 1 commit
-
-
Committed by sdong
Summary: While the background thread is picking compactions, it writes out multiple info_logs, especially for universal compaction. This introduces a chance of waiting on log writes while holding the mutex, which is bad. To remove this risk, write all those info logs to a buffer and flush it after releasing the mutex. Test Plan: make all check; check the log lines while running some tests that trigger compactions. Reviewers: haobo, igor, dhruba Reviewed By: dhruba CC: i.am.jin.lei, dhruba, yhchiang, leveldb, nkg- Differential Revision: https://reviews.facebook.net/D16515
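The buffering idea described in this summary can be sketched roughly as follows. This is a minimal illustration, not the RocksDB LogBuffer class: InfoLogSink, BufferedLog and PickCompactionJob are hypothetical names, and the sink only stands in for a slow info log writer.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an info log; Write() represents slow I/O.
struct InfoLogSink {
  std::vector<std::string> lines;
  void Write(const std::string& line) { lines.push_back(line); }
};

// Collects messages in memory so that no I/O happens while a mutex is held.
class BufferedLog {
 public:
  explicit BufferedLog(InfoLogSink* sink) : sink_(sink) {}

  // Cheap: only appends to an in-memory vector.
  void Add(std::string msg) { pending_.push_back(std::move(msg)); }

  // Intended to be called only after the caller has released its mutex.
  void Flush() {
    for (const auto& m : pending_) sink_->Write(m);
    pending_.clear();
  }

  std::size_t pending() const { return pending_.size(); }

 private:
  InfoLogSink* sink_;
  std::vector<std::string> pending_;
};

// Simulates a background job that picks a compaction under the DB mutex.
inline void PickCompactionJob(std::mutex& db_mutex, InfoLogSink* sink) {
  BufferedLog buf(sink);
  {
    std::lock_guard<std::mutex> guard(db_mutex);
    buf.Add("picking compaction");  // buffered, no I/O under the mutex
    buf.Add("picked 3 files");
  }  // mutex released here
  buf.Flush();  // the slow writes happen outside the critical section
}
```

The trade-off is that buffered lines reach the log later than lines written directly, so their relative ordering in the log can shift.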
-
- 05 Mar, 2014 1 commit
-
-
Committed by sdong
Summary: Two changes: 1. DeletionState is only constructed when cleaning up is needed 2. Fix the deletion state construction bug. A change was made in a previous patch: https://reviews.facebook.net/rROCKSDB774ed89c2405ee058086b099cbc8b29e243739cc#71a34e2e However, it somehow got lost when merging. Test Plan: make all check Reviewers: kailiu, haobo, igor Reviewed By: igor CC: igor, dhruba, i.am.jin.lei, yhchiang, leveldb Differential Revision: https://reviews.facebook.net/D16233
-
- 01 Mar, 2014 2 commits
-
-
Committed by Igor Canadi
Summary: This diff does two things: (1) Log::Reader does not report a corruption when the last record in a log or manifest file is truncated (meaning that the log writer died in the middle of the write). Inherited the code from LevelDB: https://code.google.com/p/leveldb/source/detail?r=269fc6ca9416129248db5ca57050cd5d39d177c8# (2) Turn off mmap writes for all writes to log and manifest files. (2) is necessary because if we use mmap writes, the last record is not truncated, but is actually filled with zeros, making the checksum fail. It is hard to recover from a failing checksum. Test Plan: Added unit tests from LevelDB. Actually recovered a "corrupted" MANIFEST file. Reviewers: dhruba, haobo Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D16119
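A rough sketch of the reader behavior in (1), using a toy record layout rather than the real RocksDB log format. The layout (length byte, checksum byte, payload), the checksum, and all names here are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

enum class ReadResult { kOk, kEof, kCorruption };

// Toy checksum (sum of payload bytes modulo 256); illustration only.
inline unsigned char ToyChecksum(const std::string& payload) {
  unsigned int sum = 0;
  for (unsigned char c : payload) sum += c;
  return static_cast<unsigned char>(sum & 0xFF);
}

// Assumed toy layout: [length byte][checksum byte][payload], payload < 256 bytes.
inline std::string EncodeRecord(const std::string& payload) {
  std::string out;
  out.push_back(static_cast<char>(payload.size()));
  out.push_back(static_cast<char>(ToyChecksum(payload)));
  out += payload;
  return out;
}

class ToyLogReader {
 public:
  explicit ToyLogReader(std::string data) : data_(std::move(data)) {}

  ReadResult ReadRecord(std::string* out) {
    const std::size_t remaining = data_.size() - pos_;
    if (remaining == 0) return ReadResult::kEof;
    if (remaining < 2) {  // truncated header: the writer died mid-write
      pos_ = data_.size();
      return ReadResult::kEof;
    }
    const std::size_t len = static_cast<unsigned char>(data_[pos_]);
    const unsigned char expected = static_cast<unsigned char>(data_[pos_ + 1]);
    if (remaining - 2 < len) {  // truncated payload: treat as end of log
      pos_ = data_.size();
      return ReadResult::kEof;
    }
    std::string payload = data_.substr(pos_ + 2, len);
    if (ToyChecksum(payload) != expected) return ReadResult::kCorruption;
    pos_ += 2 + len;
    *out = std::move(payload);
    return ReadResult::kOk;
  }

 private:
  std::string data_;
  std::size_t pos_ = 0;
};
```

A complete record with a bad checksum is still reported as corruption; only a short final record is treated as a clean end of file.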
-
Committed by Yueh-Hsuan Chiang
Summary: Add an optional input parameter ReadOptions to DB::GetUpdateSince(), which allows the verification of checksums to be disabled by setting ReadOptions::verify_checksums to false. Test Plan: Tests are done off-line and will not be included in the regular unit test. Reviewers: igor Reviewed By: igor CC: leveldb, xjin, dhruba Differential Revision: https://reviews.facebook.net/D16305
-
- 28 Feb, 2014 1 commit
-
-
Committed by Lei Jin
Summary: as title Test Plan: asan_check will post results later Reviewers: haobo, igor, dhruba, sdong Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D16257
-
- 26 Feb, 2014 1 commit
-
-
Committed by Igor Canadi
Summary: This will also help with avoiding the deadlock. If a flush failed and we're waiting for a memtable to be flushed, we should schedule a new flush and hope the new one succeeds. If paranoid_checks = false, Wait() will still hang on ENOSPC, but at least it will automatically continue when the space frees up. Current behavior both hangs and deadlocks. Also, I renamed some 'compaction' to 'flush'. 'compaction' was the LevelDB way of saying things. Test Plan: make check Reviewers: dhruba, haobo, ljin Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D16281
-
- 25 Feb, 2014 1 commit
-
-
Committed by Igor Canadi
Summary: More info here: https://github.com/facebook/rocksdb/issues/89 If flush fails because of ENOSPC, we have a deadlock problem. This is a quick fix that will continue the normal operation when user deletes the file and frees up the space on the device. We need to address the issue more broadly with bg_error_ cleanup. Test Plan: make check Reviewers: dhruba, haobo, ljin Reviewed By: ljin CC: leveldb Differential Revision: https://reviews.facebook.net/D16275
-
- 20 Feb, 2014 1 commit
-
-
Committed by sdong
Summary: Currently, the first transaction log file ignores bytes_per_sync and other storage-related options, which is inconsistent. Fix it. Test Plan: make all check. See the options set in GDB. Reviewers: haobo, kailiu Reviewed By: haobo CC: igor, ljin, yhchiang, leveldb, dhruba Differential Revision: https://reviews.facebook.net/D16215
-
- 14 Feb, 2014 1 commit
-
-
Committed by kailiu
Summary: Provide a public API for users to access the table properties for each SSTable. Test Plan: Added unit tests to check the function's correctness under different conditions. Reviewers: haobo, dhruba, sdong Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D16083
-
- 13 Feb, 2014 4 commits
-
-
Committed by Siying Dong
Summary: 1. Add some more implementation-aware tests for PlainTable 2. Move from a hard-coded one index per 16 rows in one prefix to a configurable number. Also, make hash table ratio = 0 mean binary search only. Also fixes some divide-by-zero risks. 3. Explicitly support total order (only use binary search) 4. Some code cleanup. Test Plan: make all check Reviewers: haobo, kailiu Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D16023
-
Committed by Igor Canadi
Summary: Added a bit more information to compaction context, requested by internal team at FB. Test Plan: Modified CompactionFilter test to make sure is_manual_compaction is properly set. Reviewers: haobo Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D16095
-
Committed by Lei Jin
Summary: Clean up IOErrors so that it only indicates errors talking to device. Test Plan: make all check Reviewers: igor, haobo, dhruba, emayanke Reviewed By: igor CC: leveldb Differential Revision: https://reviews.facebook.net/D15831
-
Committed by Lei Jin
Summary: This covers existing table files before DB open happens and avoids contention on table cache Test Plan: db_test Reviewers: haobo, sdong, igor, dhruba Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D16089
-
- 11 Feb, 2014 1 commit
-
-
Committed by Kai Liu
-
- 04 Feb, 2014 2 commits
-
-
Committed by Lei Jin
Summary: Use super_version inside NewIterator to avoid calling Ref() on each component separately under the mutex. The newly added bench shows NewIterator QPS increases from 515K to 719K. No meaningful improvement for multiget, I guess due to its relatively small cost compared to the 90-key fetch in the test. Test Plan: unit test and db_bench Reviewers: igor, sdong Reviewed By: igor CC: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D15609
-
Committed by Siying Dong
Summary: In PlainTable, use a single byte to represent the 8 internal bytes of a key if seqID = 0 and the entry is of value type (which should be common for bottommost files). This saves 7 bytes in the uncompressed case. Test Plan: make all check Reviewers: haobo, dhruba, kailiu Reviewed By: haobo CC: igor, leveldb Differential Revision: https://reviews.facebook.net/D15489
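One way to picture this encoding, as a sketch under stated assumptions: the 8 internal bytes are taken to be a little-endian packing of (sequence << 8 | type), and the one-byte marker value 0xFF and the type tag 0x1 are hypothetical choices for this example. The marker only works if it is never a legal type tag.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

constexpr unsigned char kToyTypeValue = 0x1;      // assumed tag for "value" entries
constexpr unsigned char kToySeqZeroValue = 0xFF;  // hypothetical one-byte marker

// Encodes the internal suffix of a key: 1 byte in the common case, else 8 bytes.
inline std::string EncodeKeySuffix(std::uint64_t seq, unsigned char type) {
  if (seq == 0 && type == kToyTypeValue) {
    return std::string(1, static_cast<char>(kToySeqZeroValue));
  }
  const std::uint64_t packed = (seq << 8) | type;  // seq assumed to fit in 56 bits
  std::string out;
  for (int i = 0; i < 8; ++i) {
    out.push_back(static_cast<char>((packed >> (8 * i)) & 0xFF));
  }
  return out;
}

// Decodes a suffix produced by EncodeKeySuffix; reports how many bytes it used.
inline bool DecodeKeySuffix(const std::string& in, std::uint64_t* seq,
                            unsigned char* type, std::size_t* consumed) {
  if (in.empty()) return false;
  if (static_cast<unsigned char>(in[0]) == kToySeqZeroValue) {
    *seq = 0;
    *type = kToyTypeValue;
    *consumed = 1;
    return true;
  }
  if (in.size() < 8) return false;
  std::uint64_t packed = 0;
  for (int i = 0; i < 8; ++i) {
    packed |= static_cast<std::uint64_t>(static_cast<unsigned char>(in[i])) << (8 * i);
  }
  *type = static_cast<unsigned char>(packed & 0xFF);
  *seq = packed >> 8;
  *consumed = 8;
  return true;
}
```

Because the type tag is the first byte of the little-endian packing, the decoder can tell the two forms apart by looking at a single byte.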
-
- 03 Feb, 2014 1 commit
-
-
Committed by kailiu
Summary: Addressed all the issues in https://reviews.facebook.net/D15447. Now most table-related modules are hidden from user land. Test Plan: make check Reviewers: sdong, haobo, dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D15525
-
- 01 Feb, 2014 2 commits
-
-
Committed by Igor Canadi
Summary: VersionSet::next_file_number_ is always assumed to be strictly greater than VersionSet::log_number_. In our new recovery code, we artificially set log_number_ to be (log_number + 1), so that once we flush, we don't recover from the same log file again (this is important because of merge operator non-idempotence). When we set VersionSet::log_number_ to (log_number + 1), we also have to mark that file number used, such that next_file_number_ is increased to a legal level. Otherwise, VersionSet might assert.
This has not been a problem so far because here's what happens:
1. Assume next_file_number is 5, we're recovering log_number 10
2. In DBImpl::Recover() we call MarkFileNumberUsed with 10. This will set VersionSet::next_file_number_ to 11.
3. If there are some updates, we will call WriteTable0ForRecovery(), which will use file number 11 as a new table file and advance VersionSet::next_file_number_ to 12.
4. When we LogAndApply() with log_number 11, assertion is true: assert(11 <= 12);
However, this was a lucky occurrence. Even though this diff doesn't cause a bug, I think the issue is important to fix.
Test Plan: In column families I have different recovery logic and this code path asserted. When adding MarkFileNumberUsed(log_number + 1) assert is gone.
Reviewers: dhruba, kailiu
Reviewed By: kailiu
CC: leveldb
Differential Revision: https://reviews.facebook.net/D15783
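The file-number bookkeeping walked through above can be modeled with a small toy. ToyVersionSet and RecoverOneLog are hypothetical names, and the invariant is taken to be the strict form stated in the summary's first sentence (log_number < next_file_number); this is not the real VersionSet code.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of the two counters discussed in the summary.
struct ToyVersionSet {
  std::uint64_t next_file_number = 2;
  std::uint64_t log_number = 0;

  // Ensures next_file_number is past n, like the MarkFileNumberUsed step.
  void MarkFileNumberUsed(std::uint64_t n) {
    if (next_file_number <= n) next_file_number = n + 1;
  }

  // Hands out a fresh file number, e.g. for a table written during recovery.
  std::uint64_t NewFileNumber() { return next_file_number++; }

  // Assumed invariant: log_number is strictly below next_file_number.
  bool InvariantHolds() const { return log_number < next_file_number; }
};

// Recovers one log and reports whether the invariant still holds afterwards.
inline bool RecoverOneLog(ToyVersionSet* vs, std::uint64_t log,
                          bool wrote_table, bool apply_fix) {
  vs->MarkFileNumberUsed(log);
  if (wrote_table) vs->NewFileNumber();            // the "lucky" path uses up a number
  if (apply_fix) vs->MarkFileNumberUsed(log + 1);  // the fix from this diff
  vs->log_number = log + 1;
  return vs->InvariantHolds();
}
```

In this model the invariant survives without the fix only when a table file happens to be written during recovery, which matches the "lucky occurrence" described in the summary.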
-
Committed by Siying Dong
Summary: I didn't figure out the reason why the feature of zeroing out earlier sequence ID is disabled in universal compaction. I do see bottommost_level is set correctly. It should simply work if we remove the constraint of universal compaction. Test Plan: make all check Reviewers: haobo, dhruba, kailiu, igor Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D15423
-
- 30 Jan, 2014 2 commits
-
-
Committed by Igor Canadi
Summary: In DBImpl we keep track of some statistics internally and expose them via GetProperty(). This diff encapsulates all the internal statistics into a class InternalStatisics. Most of it is copy/paste. Apart from cleaning up db_impl.cc, this diff is also necessary for Column families, since every column family should have its own CompactionStats, MakeRoomForWrite-stall stats, etc. It's much easier to keep track of it in every column family if it's nicely encapsulated in its own class. Test Plan: make check Reviewers: dhruba, kailiu, haobo, sdong, emayanke Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D15273
-
Committed by Lei Jin
Summary: as title Test Plan: unit test Reviewers: haobo, igor, sdong, kailiu, dhruba Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D15435
-
- 28 Jan, 2014 3 commits
-
-
Committed by Mark Callaghan
Summary: The new columns are msComp and msStall that provide average time per compaction and stall for that level in milliseconds.
Level Files Size(MB) Score Time(sec) Read(MB) Write(MB) Rn(MB) Rnp1(MB) Wnew(MB) RW-Amplify Read(MB/s) Write(MB/s) Rn Rnp1 Wnp1 NewW Count msComp msStall Ln-stall Stall-cnt
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    0     8       15   1.5         2        0        30      0        0       30        0.0        0.0        15.5  0    0    0    0    16    112     0.2      1.3      7568
    1     8       16   1.6         1       26        26     15       11       16        3.5       17.6        18.1  8    6   13    7     3    362     0.0      0.0         0
    2     1        2   0.0         0        0         2      0        0        2        0.0        0.0        18.4  0    0    0    0     1     50     0.0      0.0         0
Test Plan: run db_bench
Reviewers: haobo
Reviewed By: haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D15345
-
Committed by Igor Canadi
Summary: @dhruba, I'm not sure where we need to sync the directory. I implemented the function in Env() and added the dir sync just after we close the newly created file in the builder. Should I also add FsyncDir() to new files that get created by a compaction? Test Plan: Confirmed that FsyncDir is returning Status::OK() Reviewers: dhruba, haobo Reviewed By: dhruba CC: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D14751
-
Committed by Igor Canadi
Summary: There is no reason to have functions NeedCompaction(), MaxCompactionScore() and MaxCompactionScoreLevel() in VersionSet, since they don't access any data in VersionSet. Test Plan: make check Reviewers: kailiu, haobo, sdong Reviewed By: kailiu CC: leveldb Differential Revision: https://reviews.facebook.net/D15333
-
- 25 Jan, 2014 2 commits
-
-
Committed by Igor Canadi
Immutable tailing iterator doesn't set CleanupState::mem, so we don't have to unref it.
-
Committed by Igor Canadi
Summary: MemTableListVersion is to MemTableList what Version is to VersionSet. I took almost the same ideas to develop MemTableListVersion. The reason is to have copying std::list done in background, while flushing, rather than in foreground (MultiGet() and NewIterator()) under a mutex! Also, whenever we copied MemTableList, we also copied some MemTableList metadata (flush_requested_, commit_in_progress_, etc.), which was wasteful. This diff avoids std::list copy under a mutex in both MultiGet() and NewIterator(). I created a small database with some number of immutable memtables, and creating 100,000 iterators in a single thread (!) decreased from {188739, 215703, 198028} to {154352, 164035, 159817}. A lot of the savings come from code under a mutex, so we should see much higher savings with multiple threads. Creating a new iterator is very important to the LogDevice team. I also think this diff will make SuperVersion obsolete for performance reasons. I will try it in the next diff. SuperVersion gave us huge savings on the Get() code path, but I think that most of the savings came from copying MemTableList under a mutex. If we had MemTableListVersion, we would never need to copy the entire object (like we still do in NewIterator() and MultiGet()). Test Plan: `make check` works. I will also do `make valgrind_check` before commit Reviewers: dhruba, haobo, kailiu, sdong, emayanke, tnovak Reviewed By: kailiu CC: leveldb Differential Revision: https://reviews.facebook.net/D15255
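The versioning idea can be illustrated with a small copy-on-write sketch. ToyMemTableList and ToyVersion are hypothetical names, memtables are reduced to integer ids, and a single writer is assumed; the real MemTableListVersion is reference-counted differently.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

using ToyVersion = std::vector<int>;  // immutable list of memtable ids

class ToyMemTableList {
 public:
  ToyMemTableList() : current_(std::make_shared<ToyVersion>()) {}

  // Foreground path (think NewIterator/MultiGet): only a pointer copy under the mutex.
  std::shared_ptr<const ToyVersion> Current() {
    std::lock_guard<std::mutex> guard(mu_);
    return current_;
  }

  // Background path (think flush): the list copy happens outside the mutex,
  // then the new version is installed with a pointer swap.
  // Assumes a single writer at a time.
  void Add(int memtable_id) {
    auto next = std::make_shared<ToyVersion>(*Current());
    next->push_back(memtable_id);
    std::lock_guard<std::mutex> guard(mu_);
    current_ = std::move(next);
  }

 private:
  std::mutex mu_;
  std::shared_ptr<const ToyVersion> current_;
};
```

Readers holding an older version keep a consistent view for as long as they hold the pointer, while new readers see the freshly installed list.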
-
- 24 Jan, 2014 2 commits
-
-
Committed by Lei Jin
Summary: as title Test Plan: make all check. What other tests should I cover? Reviewers: igor, haobo CC: Differential Revision: https://reviews.facebook.net/D15339
-
Committed by Tomislav Novak
Summary: This diff implements a special type of iterator that doesn't create a snapshot (can be used to read newly inserted data) and is optimized for doing sequential reads. TailingIterator uses current superversion number to determine whether to invalidate its internal iterators. If the version hasn't changed, it can often avoid doing expensive seeks over immutable structures (sst files and immutable memtables). Test Plan: * new unit tests * running LD with this patch Reviewers: igor, dhruba, haobo, sdong, kailiu Reviewed By: sdong CC: leveldb, lovro, march Differential Revision: https://reviews.facebook.net/D15285
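The invalidation check described here might look roughly like the following. ToyDB and ToyTailingIterator are hypothetical, and the toy bumps the version number on every insert, which is a simplification; the point is only that the cached state is rebuilt when the number changes and reused otherwise.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

// Toy database: a sorted key set plus a version counter (simplified).
struct ToyDB {
  std::set<int> keys;
  std::uint64_t superversion_number = 0;
  void Insert(int k) {
    keys.insert(k);
    ++superversion_number;
  }
};

class ToyTailingIterator {
 public:
  explicit ToyTailingIterator(const ToyDB* db) : db_(db) {}

  // Finds the first key >= target, seeing data inserted after construction.
  bool Seek(int target, int* found) {
    RefreshIfStale();
    auto it = std::lower_bound(cache_.begin(), cache_.end(), target);
    if (it == cache_.end()) return false;
    *found = *it;
    return true;
  }

  int rebuilds() const { return rebuilds_; }

 private:
  // Rebuilds the internal state only if the version number moved.
  void RefreshIfStale() {
    if (built_ && seen_version_ == db_->superversion_number) return;
    cache_.assign(db_->keys.begin(), db_->keys.end());
    seen_version_ = db_->superversion_number;
    built_ = true;
    ++rebuilds_;
  }

  const ToyDB* db_;
  std::vector<int> cache_;
  std::uint64_t seen_version_ = 0;
  bool built_ = false;
  int rebuilds_ = 0;
};
```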
-
- 23 Jan, 2014 1 commit
-
-
Committed by Igor Canadi
Summary: This diff does two things:
* Rethinks how we call Recover() with the read_only option. Before, we called it with a pointer to the memtable where we'd like to apply those changes. This memtable is set in db_impl_readonly.cc and it's actually DBImpl::mem_. Why don't we just apply updates to mem_ right away? It seems more intuitive.
* Changes when we apply updates to the manifest. Before, the process was to recover all the logs, flush them to sst files and then do one giant commit that atomically adds all recovered sst files and sets the next log number. This works well enough, but causes some small troubles for my column family approach, since I can't have one VersionEdit apply to more than a single column family[1]. The change here is to commit the files recovered from logs right away.
Here is the state of the world before the change:
1. Recover log 5, add new sst files to edit
2. Recover log 7, add new sst files to edit
3. Recover log 8, add new sst files to edit
4. Commit all added sst files to manifest and mark log files 5, 7 and 8 as recovered (via SetLogNumber(9) function)
After the change, we'll do:
1. Recover log 5, commit the new sst files and set log 5 as recovered
2. Recover log 7, commit the new sst files and set log 7 as recovered
3. Recover log 8, commit the new sst files and set log 8 as recovered
The added (small) benefit is that if we fail after (2), the new recovery will only have to recover log 8. In the previous case, we'd have to restart the recovery from the beginning. The bigger benefit will be to enable easier integration of multiple column families in the Recovery code path.
[1] I'm happy to discuss this decision, but I believe this is the cleanest way to go. It also makes backward compatibility much easier. We don't have a requirement of adding multiple column families atomically.
Test Plan: make check
Reviewers: dhruba, haobo, kailiu, sdong
Reviewed By: kailiu
CC: leveldb
Differential Revision: https://reviews.facebook.net/D15237
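The per-log commit scheme after the change can be sketched with a toy manifest. ToyManifest and RecoverIncrementally are hypothetical names, the sst file names are made up, and the fail_after parameter only simulates a crash for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Toy manifest: the committed sst files and the first log not yet recovered.
struct ToyManifest {
  std::uint64_t log_number = 0;
  std::vector<std::string> sst_files;
};

// Replays logs one at a time and commits after each one.
// fail_after >= 0 simulates a crash after that many logs were committed.
// Returns how many logs were replayed in this run.
inline int RecoverIncrementally(ToyManifest* m,
                                const std::vector<std::uint64_t>& logs,
                                int fail_after) {
  int replayed = 0;
  for (std::uint64_t log : logs) {
    if (log < m->log_number) continue;  // already recovered in an earlier run
    if (fail_after >= 0 && replayed == fail_after) return replayed;
    // Commit the files from this log and mark the log as recovered together.
    m->sst_files.push_back("sst_from_log_" + std::to_string(log));
    m->log_number = log + 1;
    ++replayed;
  }
  return replayed;
}
```

With logs 5, 7 and 8 and a simulated failure after two commits, a second run replays only log 8, which is the small benefit the summary mentions.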
-
- 18 Jan, 2014 2 commits
-
-
Committed by Mark Callaghan
Summary: This moves the use of versions_ to before the mutex is unlocked to avoid a possible race. Test Plan: make check Reviewers: haobo, dhruba Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D15279
-
Committed by Igor Canadi
Summary: I'm separating code-cleanup part of https://reviews.facebook.net/D14517. This will make D14517 easier to understand and this diff easier to review. Test Plan: make check Reviewers: haobo, kailiu, sdong, dhruba, tnovak Reviewed By: tnovak CC: leveldb Differential Revision: https://reviews.facebook.net/D15099
-