1. 25 Mar, 2014 1 commit
  2. 22 Mar, 2014 1 commit
    • S
      Fix data corruption by LogBuffer · 83ab62e2
      sdong committed
      Summary: LogBuffer::AddLogToBuffer() uses vsnprintf() incorrectly, which can cause a buffer overflow when a log line is too long. Fix it.
      
      Test Plan: Add a unit test to cover most of LogBuffer's logic.
      
      Reviewers: igor, haobo, dhruba
      
      Reviewed By: igor
      
      CC: ljin, yhchiang, leveldb
      
      Differential Revision: https://reviews.facebook.net/D17103
      83ab62e2
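The commit message does not show the corrected code, so the following is only a minimal sketch of the general pitfall, not RocksDB's actual fix. It assumes a hypothetical helper named `FormatLogLine`. The key point is that `vsnprintf()` returns the length the full output would have had, so using that return value directly as the number of bytes written can run past the end of the buffer when the line is truncated.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>

// Hypothetical helper (not RocksDB's LogBuffer API): formats into buf and
// returns the number of characters actually stored, excluding the '\0'.
std::size_t FormatLogLine(char* buf, std::size_t buf_size,
                          const char* format, ...) {
  assert(buf != nullptr && buf_size > 0);
  va_list ap;
  va_start(ap, format);
  // vsnprintf() returns the length the complete output WOULD have had,
  // which can exceed buf_size when the output is truncated.
  int needed = std::vsnprintf(buf, buf_size, format, ap);
  va_end(ap);
  if (needed < 0) {  // formatting error: store an empty string
    buf[0] = '\0';
    return 0;
  }
  std::size_t written = static_cast<std::size_t>(needed);
  if (written >= buf_size) {
    written = buf_size - 1;  // truncated: clamp to what actually fits
  }
  return written;
}
```

A caller that advances a write cursor by the returned value then stays inside the buffer even for overlong lines.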
  3. 21 Mar, 2014 6 commits
  4. 20 Mar, 2014 7 commits
    • S
      Add a unit test to verify compaction filter context · 752ec46c
      sdong committed
      Summary: Add unit tests to make sure CompactionFilterContext::is_manual_compaction_ and CompactionFilterContext::is_full_compaction_ are set correctly.
      
      Test Plan: run the new tests.
      
      Reviewers: haobo, igor, dhruba, yhchiang, ljin
      
      Reviewed By: haobo
      
      CC: nkg-, leveldb
      
      Differential Revision: https://reviews.facebook.net/D17067
      752ec46c
    • I
      ComputeCompactionScore in CompactionPicker · fcd5c5e8
      Igor Canadi committed
      Summary:
      As it turns out, we need the call to ComputeCompactionScore (previously: Finalize) in CompactionPicker.
      
      The issue caused a deadlock in db_stress: http://ci-builds.fb.com/job/rocksdb_crashtest/290/console
      
      The last two lines before a deadlock were:
      2014/03/18-22:43:41.481029 7facafbee700 (Original Log Time 2014/03/18-22:43:41.480989) Compaction nothing to do
      2014/03/18-22:43:41.481041 7faccf7fc700 wait for fewer level0 files...
      
      "Compaction nothing to do" and other thread waiting for fewer level0 files. Hm hm.
      
      I moved the pre-sorting to SaveTo, which should fix both the original and the new issue.
      
      Test Plan: make check for now, will run db_stress in jenkins
      
      Reviewers: dhruba, haobo, sdong
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D17037
      fcd5c5e8
    • K
      Fix two bugs in talbe format · 69f6cf43
      Kai Liu committed
      Summary:
      Previous code had two bugs:
      
      * didn't initialize the table_magic_number_ explicitly -- as a
        result a random junk number is stored for table_magic_number_, making
        HasInitializedMagicNumber() always return true.
      * the if condition in set_table_magic_number() is incorrect, and the return value is not checked.
        I replaced the if-else with a stronger requirement enforced by assert().
      
      Test Plan:
      Previously, sst_dump failed to work.
      After the fix, things are back to normal.
      
      Reviewers: yhchiang
      
      CC: haobo, sdong, igor, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D17055
      69f6cf43
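The two bugs above can be illustrated with a small sketch. This is not the RocksDB source: the class name `FooterSketch`, the sentinel `kInvalidMagicNumber`, and its value 0 are assumptions made for illustration; only `table_magic_number_`, `HasInitializedMagicNumber()` and `set_table_magic_number()` come from the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the table-format state described above.
class FooterSketch {
 public:
  // Sentinel meaning "not set yet" (an assumption of this sketch).
  static constexpr std::uint64_t kInvalidMagicNumber = 0;

  bool HasInitializedMagicNumber() const {
    return table_magic_number_ != kInvalidMagicNumber;
  }

  // Setting the number twice is treated as a programming error and is
  // enforced by assert() instead of an if-else whose result can be ignored.
  void set_table_magic_number(std::uint64_t magic_number) {
    assert(!HasInitializedMagicNumber());
    table_magic_number_ = magic_number;
  }

  std::uint64_t table_magic_number() const { return table_magic_number_; }

 private:
  // Explicit initialization: without it the member holds indeterminate
  // junk, and HasInitializedMagicNumber() would almost always return true.
  std::uint64_t table_magic_number_ = kInvalidMagicNumber;
};
```

With the explicit initializer, a freshly constructed object reliably reports that no magic number has been set.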
    • I
      Don't compact with zero input files · e493f2f5
      Igor Canadi committed
      Summary:
      We have an issue with an internal service trying to run compaction with zero input files:
      2014/02/07-02:26:58.386531 7f79117ec700 Compaction start summary: Base version 1420 Base level 3, seek compaction:0, inputs:[ϛ~^Qy^?],[]
      2014/02/07-02:26:58.386539 7f79117ec700 Compacted 0@3 + 0@4 files => 0 bytes
      
      There are two issues:
      * inputsummary is printing out junk
      * it's constantly retrying (since I guess madeProgress is true), so it prints out a lot of data in the LOG file (40GB in one day).
      
      I read through the Level compaction picker and added some failure condition if input[0] is empty. I think PickCompaction() should not return compaction with zero input files with this change. I'm not confident enough to add an assertion though :)
      
      Test Plan: make check
      
      Reviewers: dhruba, haobo, sdong, kailiu
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D16005
      e493f2f5
    • I
      add tags to gitignore · 1ad0c2f9
      Igor Canadi committed
      1ad0c2f9
    • I
      Fix compile issue in Mac OS · 22507aff
      Igor Canadi committed
      Summary:
      Compile issues are:
      * Unused variable env_
      * Unused fallocate_with_keep_size_
      
      Test Plan: compiles
      
      Reviewers: dhruba, haobo, sdong
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D17043
      22507aff
    • L
      avoid shared_ptr assignment in Version::Get() · 6dc940d4
      Lei Jin committed
      Summary:
      This is a 500ns operation while the whole Get() call takes only a few
      microseconds!
      
      Test Plan: ran db_bench, for a DB with 50M keys, QPS jumps from 5.2M/s to 7.2M/s
      
      Reviewers: haobo, igor, dhruba
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D17007
      6dc940d4
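The commit message does not say which object was being assigned in Version::Get(), so the sketch below only illustrates the general idea with a hypothetical `OperatorSketch` type. Copying a `std::shared_ptr` updates its reference count (typically with atomic operations in multi-threaded programs), while borrowing the raw pointer from an owner that is known to outlive the call does not.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical shared object; the name is an assumption of this sketch.
struct OperatorSketch {
  std::string name;
};

// Copies the shared_ptr: the reference count is incremented on entry and
// decremented again when `local` goes out of scope.
std::size_t NameLengthByCopy(const std::shared_ptr<OperatorSketch>& op) {
  std::shared_ptr<OperatorSketch> local = op;
  return local->name.size();
}

// Borrows the raw pointer: no reference-count traffic. This is only safe
// while the caller keeps `op` alive for the duration of the call.
std::size_t NameLengthByBorrow(const std::shared_ptr<OperatorSketch>& op) {
  const OperatorSketch* raw = op.get();
  return raw->name.size();
}
```

Both functions return the same result; the difference is only the per-call bookkeeping on a hot path.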
  5. 19 Mar, 2014 5 commits
    • S
      Add a DB property to indicate number of background errors encountered · 71e6a342
      sdong committed
      Summary: Add a property that reports the number of background errors encountered, to help users build their monitoring.
      
      Test Plan: Add a unit test. make all check
      
      Reviewers: haobo, igor, dhruba
      
      Reviewed By: igor
      
      CC: ljin, nkg-, yhchiang, leveldb
      
      Differential Revision: https://reviews.facebook.net/D16959
      71e6a342
    • K
      Several easy-to-add properties related to compaction and flushes · 1ec72b37
      Kai Liu committed
      Summary: To partly address the request @nkg- raised, add three easy-to-add properties to compactions and flushes.
      
      Test Plan: run unit tests and add a new unit test to cover new properties.
      
      Reviewers: haobo, dhruba
      
      Reviewed By: dhruba
      
      CC: nkg-, leveldb
      
      Differential Revision: https://reviews.facebook.net/D13677
      1ec72b37
    • I
      Don't Finalize in CompactionPicker · 758fa8c3
      Igor Canadi committed
      Summary:
      Finalize re-sorts (read: mutates) the files_ in Version* and it is called by CompactionPicker during normal runtime. At the same time, this same Version* lives in the SuperVersion* and is accessed without the mutex in GetImpl() code path.
      
      Mutating the files_ in one thread and reading the same files_ in another thread is a bad idea. It caused this issue: http://ci-builds.fb.com/job/rocksdb_crashtest/285/console
      
      Long-term, we need to be more careful with method contracts and clearly document what state can be mutated when. Now that we are much faster because we don't lock in GetImpl(), we keep running into data races that were not a problem before when we were slower. db_stress has been very helpful in detecting those.
      
      Short-term, I removed Finalize() from CompactionPicker.
      
      Note: I believe this is an issue in current 2.7 version running in production.
      
      Test Plan:
      make check
      Will also run db_stress to see if issue is gone
      
      Reviewers: sdong, ljin, dhruba, haobo
      
      Reviewed By: sdong
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D16983
      758fa8c3
    • L
      disable the log_number check in Recover() · 63cef900
      Lei Jin committed
      Summary:
      There is a chance that an old MANIFEST is corrupted in 2.7 but just not noticed.
      This check would fail them. Change it to log instead of returning a
      Corruption status.
      
      Test Plan: make
      
      Reviewers: haobo, igor
      
      Reviewed By: igor
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D16923
      63cef900
    • Y
      Fixed a typo in INSTALL.md · 7624f43e
      Yueh-Hsuan Chiang committed
      Summary: Replace "RocskDB" by "RocksDB" in INSTALL.md
      
      Test Plan: No code change.
      
      Reviewers: ljin, igor
      
      Reviewed By: ljin
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D16977
      7624f43e
  6. 18 Mar, 2014 8 commits
    • I
      Optimize fallocation · f26cb0f0
      Igor Canadi committed
      Summary:
      Based on my recent findings (posted in our internal group), if we use fallocate without the KEEP_SIZE flag, we get superior performance of fdatasync() in append-only workloads.
      
      This diff gives users an option not to use the KEEP_SIZE flag, improving their sync performance by up to 2x-3x.
      
      At one point we also just called posix_fallocate instead of fallocate, which isn't very fast: http://code.woboq.org/userspace/glibc/sysdeps/posix/posix_fallocate.c.html (tl;dr it manually writes out zero bytes to allocate storage). This diff also fixes that, by first calling fallocate and then posix_fallocate if fallocate is not supported.
      
      Test Plan: make check
      
      Reviewers: dhruba, sdong, haobo, ljin
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D16761
      f26cb0f0
    • I
      Fix race condition in manifest roll · ae25742a
      Igor Canadi committed
      Summary:
      When the manifest is getting rolled the following happens:
      1) manifest_file_number_ is assigned to a new manifest number (even though the old one is still current)
      2) mutex is unlocked
      3) SetCurrentFile() creates temporary file manifest_file_number_.dbtmp
      4) SetCurrentFile() renames manifest_file_number_.dbtmp to CURRENT
      5) mutex is locked
      
      If FindObsoleteFiles happens between (3) and (4) it will:
      1) Delete manifest_file_number_.dbtmp (because it's not in pending_outputs_)
      2) Delete old manifest (because the manifest_file_number_ already points to a new one)
      
      I introduce the concept of prev_manifest_file_number_ that will avoid the race condition.
      
      However, we should discuss the future of MANIFEST file rolling. We found some race conditions with it last week and who knows how many more are there. Nobody is using it in production because we don't trust the implementation. Should we even support it?
      
      Test Plan: make check
      
      Reviewers: ljin, dhruba, haobo, sdong
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D16929
      ae25742a
    • I
      Check starts_with(prefix) in MultiPrefixIterate · 5601bc46
      Igor Canadi committed
      Summary: We switched to prefix_seek method of seeking. This means that anytime we check Valid(), we also need to check starts_with(prefix)
      
      Test Plan: ran db_stress
      
      Reviewers: ljin
      
      Reviewed By: ljin
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D16953
      5601bc46
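The rule in the summary above (whenever Valid() is checked, also check starts_with(prefix)) can be sketched without any RocksDB types. The example below uses a `std::map` as a stand-in for the database and two hypothetical helpers, `StartsWith` and `CountKeysWithPrefix`; it is an illustration of the pattern, not the db_stress code.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Stand-in for the starts_with(prefix) check named in the commit message.
bool StartsWith(const std::string& key, const std::string& prefix) {
  return key.compare(0, prefix.size(), prefix) == 0;
}

// Seeks to the first key >= prefix and walks forward. The loop condition
// pairs the "iterator is valid" check (it != db.end()) with the prefix
// check, because a valid position may already belong to the next prefix.
std::size_t CountKeysWithPrefix(const std::map<std::string, std::string>& db,
                                const std::string& prefix) {
  std::size_t count = 0;
  for (auto it = db.lower_bound(prefix);
       it != db.end() && StartsWith(it->first, prefix); ++it) {
    ++count;
  }
  return count;
}
```

Dropping the `StartsWith` half of the condition would make the loop run on into keys of later prefixes, which is the kind of mistake the commit guards against.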
    • I
      keep_log_files option in BackupableDB · 9caeff51
      Igor Canadi committed
      Summary:
      Added an option to BackupableDB implementation that allows users to persist in-memory databases. When the restore happens with keep_log_files = true, it will
      *) Not delete existing log files in wal_dir
      *) Move log files from archive directory to wal_dir, so that DB can replay them if necessary
      
      Test Plan: Added a unit test
      
      Reviewers: dhruba, ljin
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D16941
      9caeff51
    • Y
      Correct the logic of MemTable::ShouldFlushNow(). · a5fafd4f
      Yueh-Hsuan Chiang committed
      Summary:
      Memtable will now be forced to flush if one of the following
      conditions is met:
      1. Already allocated more than write_buffer_size + 60% arena block size.
         (the overflowing condition)
      2. Unable to safely allocate one more arena block without hitting the
         overflowing condition AND the unused allocated memory < 25% arena
         block size.
      
      Test Plan: make all check
      
      Reviewers: sdong, haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D16893
      a5fafd4f
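The two flush conditions listed above can be written as a small standalone function. This is a sketch based only on the commit message: the function name, the parameter names, and the exact handling of boundary values are assumptions, and the real MemTable reads these quantities from its own members.

```cpp
#include <cassert>
#include <cstddef>

// Sketch of the flush decision. All sizes are in bytes.
//   allocated: memory already allocated by the memtable
//   unused:    allocated memory that is not yet used
bool ShouldFlushNowSketch(std::size_t allocated, std::size_t unused,
                          std::size_t write_buffer_size,
                          std::size_t arena_block_size) {
  const double kAllowOverAllocationRatio = 0.6;
  const double limit = static_cast<double>(write_buffer_size) +
                       static_cast<double>(arena_block_size) *
                           kAllowOverAllocationRatio;

  // Condition 1: already past write_buffer_size + 60% of an arena block.
  if (static_cast<double>(allocated) > limit) {
    return true;
  }
  // One more arena block still fits without overflowing: keep writing.
  if (static_cast<double>(allocated) + arena_block_size <= limit) {
    return false;
  }
  // Condition 2: no room for another block, and less than 25% of an
  // arena block is left unused.
  return unused < arena_block_size / 4;
}
```

For example, with a 1000-byte write buffer and 100-byte arena blocks the limit is 1060 bytes, so a memtable with 1000 bytes allocated flushes only once its unused space drops below 25 bytes.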
    • I
      No prefix iterator in db_stress · 9b8a2b52
      Igor Canadi committed
      Summary: We're trying to deprecate prefix iterators, so no need to test them in db_stress
      
      Test Plan: ran it
      
      Reviewers: ljin
      
      Reviewed By: ljin
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D16917
      9b8a2b52
    • S
      Fix a bug that Prev() can hang. · c61c9830
      sdong committed
      Summary: Prev() can currently hang when a key has more than max_skipped internal entries and all of them are newer than the sequence ID being sought. Add unit tests to confirm the bug and fix it.
      
      Test Plan: make all check
      
      Reviewers: igor, haobo
      
      Reviewed By: igor
      
      CC: ljin, yhchiang, leveldb
      
      Differential Revision: https://reviews.facebook.net/D16899
      c61c9830
    • I
      Don't care about signed/unsigned compare · f9d05302
      Igor Canadi committed
      Summary:
      We need to stop these:
      https://github.com/facebook/rocksdb/pull/99
      https://github.com/facebook/rocksdb/pull/83
      
      Test Plan: no
      
      Reviewers: dhruba, haobo, sdong, ljin, yhchiang
      
      Reviewed By: ljin
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D16905
      f9d05302
  7. 17 Mar, 2014 1 commit
  8. 16 Mar, 2014 1 commit
  9. 15 Mar, 2014 9 commits
    • L
      journal log_number correctly in MANIFEST · 453ec52c
      Lei Jin committed
      Summary:
      Here is how it can cause a problem:
      There is one memtable flush and one compaction. Both call LogAndApply(). If both edits are applied in the same batch, with the flush edit first and the compaction edit following, LogAndApplyHelper() will assign the compaction edit the current VersionSet's log number (which should be smaller than the log number from the flush edit). This causes the log_numbers in MANIFEST to not be monotonically increasing, which violates the assumption Recover() makes. What is more, after committing to the MANIFEST file, log_number_ in VersionSet is updated to the log_number from the last edit, which is the compaction one. It ends up not updating the log_number.
      
      Test Plan:
      make whitebox_crash_test
      got another assertion about iter->valid(), not sure if that is related
      to this.
      
      Reviewers: igor, haobo
      
      Reviewed By: igor
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D16875
      453ec52c
    • C
      Breaking line · f234dfd8
      Caio SBA committed
      f234dfd8
    • C
      Make it compile on Debian/GCC 4.7 · b9c78d2d
      Caio SBA committed
      b9c78d2d
    • I
      Merge pull request #97 from agchou/patch-1 · 5948a663
      Igor Canadi committed
      Fix copyright year
      5948a663
    • I
      Missing includes · 2bad3cb0
      Igor Canadi committed
      2bad3cb0
    • I
      unterminated conditional directive · 56dce9bf
      Igor Canadi committed
      56dce9bf
    • I
      Fix another Mac OS warning · f74659ac
      Igor Canadi committed
      f74659ac
    • I
      Fix HashSkipList and HashLinkedList SIGSEGV · 3c75cc15
      Igor Canadi committed
      Summary:
      Original Summary:
      Yesterday, @ljin and I were debugging various db_stress issues. We suspected one of them happens when we concurrently call NewIterator without prefix_seek on HashSkipList. This test demonstrates it.
      
      Update:
      Arena is not thread-safe!! When creating a new full iterator, we *have* to create a new arena, otherwise we're doomed.
      
      Test Plan: SIGSEGV and assertion-throwing test now works!
      
      Reviewers: ljin, haobo, sdong
      
      Reviewed By: sdong
      
      CC: leveldb, ljin
      
      Differential Revision: https://reviews.facebook.net/D16857
      3c75cc15
    • I
      Fix warning on Mac OS · 6c72079d
      Igor Canadi committed
      6c72079d
  10. 14 Mar, 2014 1 commit
    • S
      Fix extra compaction tasks scheduled after D16767 in some cases · 5aa81f04
      sdong committed
      Summary:
      With D16767, there is a case where compaction tasks are scheduled infinitely:
      (1) no flush thread is configured and there is more than one compaction thread
      (2) a flush is being run by one compaction thread
      (3) the SST files are in a state where versions_->current()->NeedsCompaction() generates a false positive (returns true when there is actually no work to be done)
      In that case, an infinite loop is formed.
      
      This patch would fix it.
      
      Test Plan: make all check
      
      Reviewers: haobo, igor, ljin
      
      Reviewed By: igor
      
      CC: dhruba, yhchiang, leveldb
      
      Differential Revision: https://reviews.facebook.net/D16863
      5aa81f04