1. 08 Apr, 2014 6 commits
  2. 05 Apr, 2014 2 commits
  3. 04 Apr, 2014 1 commit
  4. 03 Apr, 2014 3 commits
    • H
      [RocksDB] Fix a race condition in GetSortedWalFiles · 48bc0c6a
      Haobo Xu committed
      Summary: This patch fixes a race condition where a log file is moved to the archive directory in the middle of GetSortedWalFiles. Without the fix, the log file would be missing from the result, which leads to a gap in the transaction log iterator. A test utility, SyncPoint, is added to help reproduce the race condition.
      
      Test Plan: TransactionLogIteratorRace; make check
      
      Reviewers: dhruba, ljin
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D17121
      48bc0c6a
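
The idea of forcing a specific thread interleaving to reproduce a race, as the SyncPoint commit above describes, can be sketched as follows. This is a minimal illustration and not the RocksDB SyncPoint class: `MiniSyncPoint`, `AddDependency`, `Process`, the point names, and the "first/second directory" ordering are all hypothetical and chosen only for the example.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <set>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a sync-point test utility.
// It is NOT the RocksDB SyncPoint class; names and behavior are assumptions.
class MiniSyncPoint {
 public:
  // Declare that `successor` may not pass until `predecessor` was reached.
  void AddDependency(const std::string& predecessor,
                     const std::string& successor) {
    std::lock_guard<std::mutex> lock(mu_);
    deps_.emplace_back(predecessor, successor);
  }

  // Called by the code under test at a named point. Blocks until every
  // predecessor of `point` has been reached, then marks `point` as reached.
  void Process(const std::string& point) {
    std::unique_lock<std::mutex> lock(mu_);
    cv_.wait(lock, [&] { return PredecessorsCleared(point); });
    cleared_.insert(point);
    cv_.notify_all();
  }

 private:
  bool PredecessorsCleared(const std::string& point) const {
    for (const auto& d : deps_) {
      if (d.second == point && cleared_.count(d.first) == 0) return false;
    }
    return true;
  }

  std::mutex mu_;
  std::condition_variable cv_;
  std::vector<std::pair<std::string, std::string>> deps_;
  std::set<std::string> cleared_;
};

// Forces the file move to land exactly between two directory scans and
// returns the order in which the three steps ran.
std::vector<std::string> RunForcedInterleaving() {
  MiniSyncPoint sp;
  sp.AddDependency("scan:after_first_dir", "archive:before_move");
  sp.AddDependency("archive:after_move", "scan:before_second_dir");

  std::vector<std::string> events;
  std::mutex events_mu;
  auto record = [&](const std::string& e) {
    std::lock_guard<std::mutex> lock(events_mu);
    events.push_back(e);
  };

  std::thread scanner([&] {
    record("scan first dir");
    sp.Process("scan:after_first_dir");
    sp.Process("scan:before_second_dir");
    record("scan second dir");
  });
  std::thread archiver([&] {
    sp.Process("archive:before_move");
    record("move file");
    sp.Process("archive:after_move");
  });
  scanner.join();
  archiver.join();
  return events;
}
```

A test would call `RunForcedInterleaving()` and then check that the scan result still contains the moved file; with the dependencies above the move always happens between the two scans, so the race no longer depends on timing.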
    • S
      Move a info logging out of DB Mutex · 158845ba
      sdong committed
      Summary: As we know, logging can be slow, or even hang, on some file systems. Move one more logging call out of the DB mutex.
      
      Test Plan: make all check
      
      Reviewers: haobo, igor, ljin
      
      Reviewed By: igor
      
      CC: yhchiang, nkg-, leveldb
      
      Differential Revision: https://reviews.facebook.net/D17427
      158845ba
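
The pattern behind the commit above (prepare the message while holding the mutex, perform the slow write after releasing it) might look like this. It is a sketch under assumptions, not RocksDB code: the `Counter` class, its members, and the in-memory `sink` standing in for a log file are all hypothetical.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical sketch, not RocksDB code: state is updated under mu_, while
// the (potentially slow) log write happens only after mu_ is released.
class Counter {
 public:
  // `sink` stands in for a log file; it must outlive this object.
  explicit Counter(std::vector<std::string>* sink) : sink_(sink) {
    assert(sink != nullptr);
  }

  void Increment() {
    std::string pending;  // message is formatted under the lock
    {
      std::lock_guard<std::mutex> lock(mu_);
      ++value_;
      pending = "value is now " + std::to_string(value_);
    }  // mu_ is released here
    // Slow I/O runs without holding mu_, so other threads are not blocked.
    WriteLog(pending);
  }

  int value() {
    std::lock_guard<std::mutex> lock(mu_);
    return value_;
  }

 private:
  void WriteLog(const std::string& line) {
    // A separate lock protects only the sink, never the counter state.
    std::lock_guard<std::mutex> lock(log_mu_);
    sink_->push_back(line);
  }

  std::mutex mu_;
  std::mutex log_mu_;
  int value_ = 0;
  std::vector<std::string>* sink_;
};
```

The trade-off is that log lines from concurrent callers may appear in a different order than the state changes they describe, which is usually acceptable for informational logging.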
    • S
      Compaction Filter V1 to use old context struct to keep backward compatible · 4af1954f
      sdong committed
      Summary: The previous change D15087 modified the existing compaction filter, which made this commonly used class not backward compatible. Restore the older interface for V1 and use a new interface for V2 instead.
      
      Test Plan: make all check
      
      Reviewers: haobo, yhchiang, igor
      
      CC: danguo, dhruba, ljin, igor, leveldb
      
      Differential Revision: https://reviews.facebook.net/D17223
      4af1954f
  5. 01 Apr, 2014 1 commit
  6. 28 Mar, 2014 1 commit
  7. 27 Mar, 2014 2 commits
  8. 26 Mar, 2014 1 commit
    • D
      [rocksdb] make init prefix more robust · d9ca83df
      Danny Guo committed
      Summary:
      Currently, if a client uses kNULLString as the prefix, it will confuse
      compaction filter v2. This diff adds a bool to indicate whether the prefix
      has been initialized. I also added a unit test to cover this case and
      make sure the new code path is hit.
      
      Test Plan: db_test
      
      Reviewers: igor, haobo
      
      Reviewed By: igor
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D17151
      d9ca83df
  9. 25 Mar, 2014 2 commits
    • D
      [rocksdb] new CompactionFilterV2 API · b47812fb
      Danny Guo committed
      Summary:
      This diff adds a new CompactionFilterV2 API that rolls up the
      decisions for kv pairs during compactions. These kv pairs must share the
      same key prefix. They are buffered inside the db.
      
          typedef std::vector<Slice> SliceVector;
          virtual std::vector<bool> Filter(int level,
                                       const SliceVector& keys,
                                       const SliceVector& existing_values,
                                       std::vector<std::string>* new_values,
                                       std::vector<bool>* values_changed
                                       ) const = 0;
      
      Application can override the Filter() function to operate
      on the buffered kv pairs. More details in the inline documentation.
      
      Test Plan:
      make check. Added unit tests to make sure Keep, Delete,
      and Change all work.
      
      Reviewers: haobo
      
      CCs: leveldb
      
      Differential Revision: https://reviews.facebook.net/D15087
      b47812fb
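
A batch filter shaped like the Filter() signature quoted in the CompactionFilterV2 commit above could be sketched as below. This is a standalone illustration, not a subclass of the real RocksDB class: `std::string` stands in for `Slice`, `FilterBatch` is a hypothetical free function, and the meaning assigned to the return value and to `new_values` (one slot per key, `true` meaning "drop") is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for rocksdb::Slice so the sketch compiles on its own (assumption).
using SliceVector = std::vector<std::string>;

// Hypothetical batch filter shaped like the Filter() signature quoted above.
// Assumptions made for this sketch only:
//   - a returned `true` means "drop this entry";
//   - new_values and values_changed get one slot per input key.
std::vector<bool> FilterBatch(int /*level*/, const SliceVector& keys,
                              const SliceVector& existing_values,
                              std::vector<std::string>* new_values,
                              std::vector<bool>* values_changed) {
  assert(keys.size() == existing_values.size());
  assert(new_values != nullptr && values_changed != nullptr);

  new_values->assign(keys.size(), std::string());
  values_changed->assign(keys.size(), false);
  std::vector<bool> to_delete(keys.size(), false);

  const std::string old_tag = "v1:";
  for (std::size_t i = 0; i < keys.size(); ++i) {
    const std::string& value = existing_values[i];
    if (value == "expired") {
      // Delete: the entry is dropped from the compaction output.
      to_delete[i] = true;
    } else if (value.compare(0, old_tag.size(), old_tag) == 0) {
      // Change: rewrite the value and flag it as modified.
      (*new_values)[i] = "v2:" + value.substr(old_tag.size());
      (*values_changed)[i] = true;
    }
    // Keep: anything else is left untouched.
  }
  return to_delete;
}
```

Because all keys in one call share a prefix, the filter can make its decision with the whole group in view, for example keeping only the newest few entries of a group, which a per-key filter cannot do.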
    • Y
      Enhance partial merge to support multiple arguments · cda4006e
      Yueh-Hsuan Chiang committed
      Summary:
      * PartialMerge API now takes a list of operands instead of two operands.
      * Add min_partial_merge_operands to Options, indicating the minimum
        number of operands to trigger partial merge.
      * This diff is based on Schalk's previous diff (D14601), but it also
        includes necessary changes such as updating the pure C api for
        partial merge.
      
      Test Plan:
      * make check all
      * develop tests for cases where partial merge takes more than two
        operands.
      
      TODOs (from Schalk):
      * Add test with min_partial_merge_operands > 2.
      * Perform benchmarks to measure the performance improvements (can probably
        use results of task #2837810.)
      * Add description of problem to doc/index.html.
      * Change wiki pages to reflect the interface changes.
      
      Reviewers: haobo, igor, vamsi
      
      Reviewed By: haobo
      
      CC: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D16815
      cda4006e
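
To illustrate what a partial merge over a list of operands (rather than exactly two) can look like, here is a sketch for a counter-style operator whose operands are decimal increments. It does not use the RocksDB MergeOperator interface: `PartialMergeCounters`, `ParseUint64`, and the `min_operands` parameter (modeled loosely on the min_partial_merge_operands option) are hypothetical names for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <limits>
#include <string>

// Parses a non-empty string of decimal digits; rejects anything else,
// including values that do not fit in 64 bits.
bool ParseUint64(const std::string& s, std::uint64_t* out) {
  if (s.empty()) return false;
  const std::uint64_t kMax = std::numeric_limits<std::uint64_t>::max();
  std::uint64_t v = 0;
  for (char c : s) {
    if (c < '0' || c > '9') return false;
    const std::uint64_t d = static_cast<std::uint64_t>(c - '0');
    if (v > (kMax - d) / 10) return false;  // v * 10 + d would overflow
    v = v * 10 + d;
  }
  *out = v;
  return true;
}

// Hypothetical partial merge: collapses a list of increments into a single
// operand holding their sum. Returns false (leaving the operands unmerged)
// if there are fewer than `min_operands`, if one is malformed, or if the
// sum would overflow.
bool PartialMergeCounters(const std::deque<std::string>& operands,
                          std::size_t min_operands, std::string* merged) {
  assert(merged != nullptr);
  if (operands.size() < min_operands) return false;
  const std::uint64_t kMax = std::numeric_limits<std::uint64_t>::max();
  std::uint64_t sum = 0;
  for (const std::string& op : operands) {
    std::uint64_t v = 0;
    if (!ParseUint64(op, &v)) return false;
    if (sum > kMax - v) return false;
    sum += v;
  }
  *merged = std::to_string(sum);
  return true;
}
```

Merging N operands in one call avoids producing N-1 intermediate strings, which is the kind of saving a list-based interface makes possible compared with repeated pairwise merges.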
  10. 21 Mar, 2014 2 commits
  11. 19 Mar, 2014 3 commits
  12. 18 Mar, 2014 2 commits
    • I
      Optimize fallocation · f26cb0f0
      Igor Canadi committed
      Summary:
      Based on my recent findings (posted in our internal group), if we use fallocate without the KEEP_SIZE flag, we get superior fdatasync() performance in append-only workloads.
      
      This diff provides an option for the user to not use the KEEP_SIZE flag, improving sync performance by up to 2x-3x.
      
      At one point we also just called posix_fallocate instead of fallocate, which isn't very fast: http://code.woboq.org/userspace/glibc/sysdeps/posix/posix_fallocate.c.html (tl;dr it manually writes out zero bytes to allocate storage). This diff also fixes that, by first calling fallocate and then posix_fallocate if fallocate is not supported.
      
      Test Plan: make check
      
      Reviewers: dhruba, sdong, haobo, ljin
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D16761
      f26cb0f0
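
The "try fallocate first, fall back to posix_fallocate only if it is unsupported" approach described in the commit above can be sketched like this. It is a Linux-only illustration, not the code from the diff: `Preallocate` and its `keep_size` parameter are hypothetical names, and the set of errno values treated as "unsupported" is an assumption.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>

#include <fcntl.h>
#include <linux/falloc.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper (Linux only). Preallocates `len` bytes at `offset`.
// Returns 0 on success or an errno-style error number on failure.
int Preallocate(int fd, off_t offset, off_t len, bool keep_size) {
  assert(fd >= 0 && offset >= 0 && len > 0);
  // With FALLOC_FL_KEEP_SIZE the reported file size is left unchanged;
  // with mode 0 the file size grows to cover the allocated range.
  const int mode = keep_size ? FALLOC_FL_KEEP_SIZE : 0;
  if (fallocate(fd, mode, offset, len) == 0) {
    return 0;
  }
  // Assumption: only these errors mean "fallocate is unavailable here".
  if (errno != EOPNOTSUPP && errno != ENOSYS) {
    return errno;
  }
  // Fallback. posix_fallocate returns the error number directly and always
  // extends the file size, so keep_size cannot be honored on this path.
  return posix_fallocate(fd, offset, len);
}
```

A caller that wants the behavior the commit recommends for append-only files would pass `keep_size = false`, so the file size is extended up front.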
    • I
      Fix race condition in manifest roll · ae25742a
      Igor Canadi committed
      Summary:
      When the manifest is getting rolled the following happens:
      1) manifest_file_number_ is assigned to a new manifest number (even though the old one is still current)
      2) mutex is unlocked
      3) SetCurrentFile() creates temporary file manifest_file_number_.dbtmp
      4) SetCurrentFile() renames manifest_file_number_.dbtmp to CURRENT
      5) mutex is locked
      
      If FindObsoleteFiles happens between (3) and (4) it will:
      1) Delete manifest_file_number_.dbtmp (because it's not in pending_outputs_)
      2) Delete old manifest (because the manifest_file_number_ already points to a new one)
      
      I introduce the concept of prev_manifest_file_number_ that will avoid the race condition.
      
      However, we should discuss the future of MANIFEST file rolling. We found some race conditions with it last week, and who knows how many more there are. Nobody is using it in production because we don't trust the implementation. Should we even support it?
      
      Test Plan: make check
      
      Reviewers: ljin, dhruba, haobo, sdong
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D16929
      ae25742a
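
One way to read the prev_manifest_file_number_ idea from the commit above: while a roll is in progress, the obsolete-file check protects both the new and the previous manifest. The sketch below only illustrates that reading and is a guess at the shape of the check, not the actual change; `ManifestState`, `BeginRoll`, `FinishRoll`, and `IsManifestObsolete` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bookkeeping for a manifest roll. A value of 0 means "none".
struct ManifestState {
  std::uint64_t current = 0;   // manifest number new writes go to
  std::uint64_t previous = 0;  // manifest that CURRENT may still point to
};

// Step 1 of the roll: switch to the new number but remember the old one,
// because the CURRENT file has not been renamed yet.
void BeginRoll(ManifestState* s, std::uint64_t new_number) {
  assert(s != nullptr && new_number != 0);
  s->previous = s->current;
  s->current = new_number;
}

// After CURRENT has been renamed, the old manifest is no longer needed.
void FinishRoll(ManifestState* s) {
  assert(s != nullptr);
  s->previous = 0;
}

// A manifest file may be deleted only if neither number refers to it.
bool IsManifestObsolete(std::uint64_t file_number, const ManifestState& s) {
  if (file_number == s.current) return false;
  if (s.previous != 0 && file_number == s.previous) return false;
  return true;
}
```

With this bookkeeping, an obsolete-file scan that runs between `BeginRoll` and `FinishRoll` keeps the old manifest alive; protecting the temporary `.dbtmp` file is a separate concern that this sketch does not cover.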
  13. 15 Mar, 2014 1 commit
  14. 14 Mar, 2014 1 commit
    • S
      Fix extra compaction tasks scheduled after D16767 in some cases · 5aa81f04
      sdong committed
      Summary:
      With D16767, there is a case where compaction tasks are scheduled indefinitely:
      (1) no flush thread is configured and there is more than one compaction thread
      (2) a flush is being run by one compaction thread
      (3) the SST files are in a state where versions_->current()->NeedsCompaction() generates a false positive (returns true although there is no work to be done)
      In that case, an infinite loop forms.
      
      This patch fixes it.
      
      Test Plan: make all check
      
      Reviewers: haobo, igor, ljin
      
      Reviewed By: igor
      
      CC: dhruba, yhchiang, leveldb
      
      Differential Revision: https://reviews.facebook.net/D16863
      5aa81f04
  15. 13 Mar, 2014 5 commits
  16. 12 Mar, 2014 4 commits
    • S
      Fix data race against logging data structure because of LogBuffer · bd45633b
      sdong committed
      Summary:
      @igor pointed out that there is a potential data race because of the way we use the newly introduced LogBuffer. After "bg_compaction_scheduled_--" or "bg_flush_scheduled_--", both counters can become 0. As soon as the lock is released after that, DBImpl's destructor can go ahead and destroy all the state inside the DB, including the info_log object held in a shared pointer of the options object it keeps. At that point it is no longer safe to keep using the info logger to write the delayed logs.
      
      With the patch, the lock is released temporarily so the log buffer can be flushed before "bg_compaction_scheduled_--" or "bg_flush_scheduled_--". To make sure we don't miss any pending flush or compaction, a new flag bg_schedule_needed_ is added; it is set to true if there is a pending flush or compaction that was not scheduled because of the max thread limit. If the flag is set, the scheduling function is called before the compaction or flush thread finishes.
      
      Thanks @igor for this finding!
      
      Test Plan: make all check
      
      Reviewers: haobo, igor
      
      Reviewed By: haobo
      
      CC: dhruba, ljin, yhchiang, igor, leveldb
      
      Differential Revision: https://reviews.facebook.net/D16767
      bd45633b
    • I
      [CF] db_stress for column families · 457c78eb
      Igor Canadi committed
      Summary:
      I have had this diff for a while to test the column families implementation. Last night, I ran it successfully for 10 hours with the command:
      
         time ./db_stress --threads=30 --ops_per_thread=200000000 --max_key=5000 --column_families=20 --clear_column_family_one_in=3000000 --verify_before_write=1  --reopen=50 --max_background_compactions=10 --max_background_flushes=10 --db=/tmp/db_stress
      
      It is ready to be committed :)
      
      Test Plan: Ran it for 10 hours
      
      Reviewers: dhruba, haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D16797
      457c78eb
    • S
      Temp Fix of LogBuffer flushing · 6c66bc08
      sdong committed
      Summary: Temporary fix for LogBuffer flushing: flush the buffer inside the lock. This cleans up trunk until we find a permanent fix.
      
      Test Plan: make all check
      
      Reviewers: haobo, igor
      
      Reviewed By: igor
      
      CC: ljin, leveldb, yhchiang
      
      Differential Revision: https://reviews.facebook.net/D16791
      6c66bc08
    • I
      Add a comment after SignalAll() · cb980216
      Igor Canadi committed
      Summary: Having code after SignalAll has already caused 2 bugs. Let's make sure this doesn't happen again.
      
      Test Plan: no test
      
      Reviewers: sdong, dhruba, haobo
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D16785
      cb980216
  17. 11 Mar, 2014 3 commits