1. 02 Oct, 2014 1 commit
    • L
      make compaction related options changeable · 5ec53f3e
      Lei Jin committed
      Summary:
      Make compaction-related options changeable. Most of the changes are
      tedious and follow the same convention: grab MutableCFOptions at the
      beginning of compaction under the mutex, pass it through the job, and
      register it in SuperVersion at the end.
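
      The convention described above can be sketched roughly as follows. This is a minimal illustration, not the code from the diff: `MutableOptions`, `Snapshot`, `ColumnFamily`, and `NeedsCompaction` are hypothetical stand-ins for MutableCFOptions, SuperVersion, and the compaction job, and the two option fields are chosen only for the example.

      ```cpp
      #include <cassert>
      #include <cstdint>
      #include <memory>
      #include <mutex>

      // Hypothetical stand-in for MutableCFOptions: options that may change at runtime.
      struct MutableOptions {
        uint64_t max_bytes_for_level_base = 10 << 20;
        int level0_file_num_compaction_trigger = 4;
      };

      // Hypothetical stand-in for the state published to readers (SuperVersion).
      struct Snapshot {
        MutableOptions options;
      };

      class ColumnFamily {
       public:
        // Change options at runtime; only jobs that start afterwards see them.
        void SetOptions(const MutableOptions& o) {
          std::lock_guard<std::mutex> l(mu_);
          current_ = o;
        }

        // Step 1: copy the options under the mutex when the job begins.
        MutableOptions GrabOptions() {
          std::lock_guard<std::mutex> l(mu_);
          return current_;
        }

        // Step 3: register the options the job actually used in the snapshot.
        void Install(const MutableOptions& used) {
          auto s = std::make_shared<Snapshot>();
          s->options = used;
          std::lock_guard<std::mutex> l(mu_);
          published_ = s;
        }

        std::shared_ptr<const Snapshot> Published() {
          std::lock_guard<std::mutex> l(mu_);
          return published_;
        }

       private:
        std::mutex mu_;
        MutableOptions current_;
        std::shared_ptr<const Snapshot> published_;
      };

      // Step 2: the job only reads its private copy, so a concurrent
      // SetOptions() cannot change its behavior halfway through.
      inline bool NeedsCompaction(const MutableOptions& opts, int level0_files) {
        return level0_files >= opts.level0_file_num_compaction_trigger;
      }
      ```

      A job would call `GrabOptions()` once, pass the copy to every helper such as `NeedsCompaction()`, and finish with `Install()`.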
      
      Test Plan: make all check
      
      Reviewers: igor, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D23349
  2. 01 Oct, 2014 7 commits
  3. 30 Sep, 2014 7 commits
  4. 27 Sep, 2014 1 commit
  5. 26 Sep, 2014 3 commits
    • L
      CompactedDBImpl::MultiGet() for better CuckooTable performance · fbd2dafc
      Lei Jin committed
      Summary:
      Add the MultiGet API to allow prefetching.
      With a file size of 1.5G, I configured a hash ratio of 0.9, which fits
      115M keys and results in 2 hash functions. The lookup QPS is
      ~4.9M/s vs. 3M/s for Get().
      It is tricky to set the parameters right. Since file size is determined
      by a power-of-two factor, the # of keys is fixed in each file. With a
      big file size (and thus a smaller # of files), we are more likely to
      waste a lot of space in the last file, which lowers space utilization.
      Using a smaller file size can improve the situation, but that
      harms lookup speed.
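
      The idea of batching lookups so that memory can be prefetched might look like the sketch below. It is not the CuckooTable reader: `FlatTable` is a hypothetical linear-probing table with a made-up multiplicative hash, and `SKETCH_PREFETCH` is an illustrative macro that maps to the GCC/Clang builtin `__builtin_prefetch` where available. The point is only the two-pass shape of `MultiGet()`: compute every slot and request it first, then probe.

      ```cpp
      #include <cassert>
      #include <cstddef>
      #include <cstdint>
      #include <vector>

      #if defined(__GNUC__) || defined(__clang__)
      #define SKETCH_PREFETCH(addr) __builtin_prefetch(addr)
      #else
      #define SKETCH_PREFETCH(addr) ((void)0)
      #endif

      struct Bucket {
        uint64_t key;
        uint64_t value;
        bool used;
      };

      // Hypothetical open-addressing table; the bucket count must be a power of two.
      class FlatTable {
       public:
        explicit FlatTable(size_t pow2_buckets) : buckets_(pow2_buckets) {
          assert(pow2_buckets > 0 && (pow2_buckets & (pow2_buckets - 1)) == 0);
        }

        void Put(uint64_t key, uint64_t value) {
          const size_t mask = buckets_.size() - 1;
          size_t i = Slot(key);
          for (size_t n = 0; n < buckets_.size(); ++n, i = (i + 1) & mask) {
            if (!buckets_[i].used || buckets_[i].key == key) {
              buckets_[i] = Bucket{key, value, true};
              return;
            }
          }
          assert(false && "table is full");
        }

        bool Get(uint64_t key, uint64_t* value) const {
          return ProbeFrom(Slot(key), key, value);
        }

        // Pass 1 computes every slot and asks the CPU to start loading it.
        // Pass 2 probes, by which time the cache lines are likely in flight.
        std::vector<bool> MultiGet(const std::vector<uint64_t>& keys,
                                   std::vector<uint64_t>* values) const {
          std::vector<size_t> slots(keys.size());
          for (size_t i = 0; i < keys.size(); ++i) {
            slots[i] = Slot(keys[i]);
            SKETCH_PREFETCH(&buckets_[slots[i]]);
          }
          values->assign(keys.size(), 0);
          std::vector<bool> found(keys.size());
          for (size_t i = 0; i < keys.size(); ++i) {
            uint64_t v = 0;
            found[i] = ProbeFrom(slots[i], keys[i], &v);
            (*values)[i] = v;
          }
          return found;
        }

       private:
        size_t Slot(uint64_t key) const {
          return static_cast<size_t>(key * 0x9E3779B97F4A7C15ULL) &
                 (buckets_.size() - 1);
        }

        bool ProbeFrom(size_t start, uint64_t key, uint64_t* value) const {
          const size_t mask = buckets_.size() - 1;
          size_t i = start;
          for (size_t n = 0; n < buckets_.size(); ++n, i = (i + 1) & mask) {
            if (!buckets_[i].used) return false;
            if (buckets_[i].key == key) {
              *value = buckets_[i].value;
              return true;
            }
          }
          return false;
        }

        std::vector<Bucket> buckets_;
      };
      ```

      `MultiGet()` returns the same answers as repeated `Get()` calls; any speedup comes only from overlapping the memory loads.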
      
      Test Plan: db_bench
      
      Reviewers: yhchiang, sdong, igor
      
      Reviewed By: sdong
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D23673
    • L
      CompactedDBImpl · 3c680061
      Lei Jin committed
      Summary:
      Add a CompactedDBImpl that will be enabled when calling OpenForReadOnly()
      and the DB only has one level (>0) of files. As a performance comparison,
      CuckooTable performs 2.1M/s with CompactedDBImpl vs. 1.78M/s with
      ReadOnlyDBImpl.
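
      One way to picture a lookup in a DB whose data sits in a single level above 0 is sketched below. It assumes the files of that level hold sorted, non-overlapping key ranges, so one binary search picks the only file that can contain a key. `FileMeta` and `FindFile` are hypothetical names for this example, and the commit message does not say that this is how CompactedDBImpl gets its speedup.

      ```cpp
      #include <algorithm>
      #include <cassert>
      #include <string>
      #include <vector>

      // Hypothetical per-file metadata: the key range covered by one table file.
      struct FileMeta {
        std::string smallest;
        std::string largest;
        int file_number;
      };

      // `files` must be sorted by key range and must not overlap.
      // Returns the number of the only file that can hold `key`, or -1.
      inline int FindFile(const std::vector<FileMeta>& files,
                          const std::string& key) {
        // First file whose largest key is >= key.
        auto it = std::lower_bound(
            files.begin(), files.end(), key,
            [](const FileMeta& f, const std::string& k) { return f.largest < k; });
        if (it == files.end() || key < it->smallest) return -1;
        return it->file_number;
      }
      ```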
      
      Test Plan: db_bench
      
      Reviewers: yhchiang, igor, sdong
      
      Reviewed By: sdong
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D23553
    • I
      Fix double deletes · f7375f39
      Igor Canadi committed
      Summary: While debugging client compaction issues, I noticed a bunch of delete bugs: P16329995. MakeTableName returns the sst file name with a "/" prefix. We also need the "/" prefix when we get the files through GetChildren(), so that we can properly dedup the files.
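
      The dedup problem can be illustrated with a small sketch: if one source reports "/000012.sst" and the other "000012.sst", the two names only compare equal after both get the same prefix. `WithSlashPrefix` and `DedupCandidates` are hypothetical helpers written for this example, not functions from the diff.

      ```cpp
      #include <cassert>
      #include <set>
      #include <string>
      #include <vector>

      // Hypothetical helper: give every file name the same "/" prefix.
      inline std::string WithSlashPrefix(const std::string& name) {
        return (!name.empty() && name[0] == '/') ? name : "/" + name;
      }

      // Merge two candidate lists so that each file appears exactly once,
      // even if one list has the "/" prefix and the other does not.
      inline std::vector<std::string> DedupCandidates(
          const std::vector<std::string>& from_table_names,
          const std::vector<std::string>& from_dir_listing) {
        std::set<std::string> seen;
        std::vector<std::string> out;
        auto add = [&](const std::vector<std::string>& names) {
          for (const auto& name : names) {
            std::string normalized = WithSlashPrefix(name);
            if (seen.insert(normalized).second) out.push_back(normalized);
          }
        };
        add(from_table_names);
        add(from_dir_listing);
        return out;
      }
      ```

      Without the normalization step, the same file would appear twice in the output and could be scheduled for deletion twice.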
      
      Test Plan: none
      
      Reviewers: sdong, yhchiang, ljin
      
      Reviewed By: ljin
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D23457
  6. 25 Sep, 2014 1 commit
  7. 24 Sep, 2014 1 commit
  8. 23 Sep, 2014 3 commits
  9. 19 Sep, 2014 4 commits
    • V
      RocksDB: Format uint64 using PRIu64 in db_impl.cc · f4459474
      Venkatesh Radhakrishnan committed
      Summary: Use PRIu64 to format uint64 in a portable manner
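
      For reference, portable formatting of a uint64_t with PRIu64 looks like the sketch below. `DescribeFile` and its message text are made up for the example; only the `<cinttypes>` macro and `std::snprintf` are standard.

      ```cpp
      #include <cassert>
      #include <cinttypes>
      #include <cstdint>
      #include <cstdio>
      #include <string>

      // PRIu64 expands to the right conversion specifier for uint64_t on the
      // current platform (for example "lu" or "llu"), so the format string
      // stays correct on both 32-bit and 64-bit targets.
      inline std::string DescribeFile(uint64_t file_number, uint64_t file_size) {
        char buf[96];
        std::snprintf(buf, sizeof(buf), "file #%" PRIu64 ": %" PRIu64 " bytes",
                      file_number, file_size);
        return std::string(buf);
      }
      ```

      Hard-coding "%lu" or "%llu" instead compiles on some platforms and produces warnings or wrong output on others, which is what the macro avoids.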
      
      Test Plan: Run "make all check"
      
      Reviewers: sdong
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D23595
    • I
      Fix unit tests errors · 90b8c07b
      Igor Canadi committed
      Summary: Those were introduced with https://github.com/facebook/rocksdb/commit/2fb1fea30fd027bbd824a26b682d04d91a8661dc because the flushing behavior changed when max_background_flushes is > 0.
      
      Test Plan: make check
      
      Reviewers: ljin, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D23577
    • L
      CuckooTable: add one option to allow identity function for the first hash function · 51af7c32
      Lei Jin committed
      Summary:
      MurmurHash becomes expensive when we do millions of Get() calls a second
      in one thread. Add this option to allow the first hash function to be the
      identity function. It increases QPS from 3.7M/s to
      ~4.3M/s. I did not observe an improvement in end-to-end RocksDB
      performance. This may be caused by other bottlenecks that I will address
      in a separate diff.
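
      A rough sketch of what "identity as the first hash function" means is given below. It is not the CuckooTable code: `SeededHash` is an FNV-1a stand-in for MurmurHash, and `IdentityHash` and `BucketFor` are hypothetical names. The sketch assumes keys whose first 8 bytes are already well distributed, since the identity function does no mixing of its own.

      ```cpp
      #include <cassert>
      #include <cstdint>
      #include <cstring>
      #include <string>

      // Stand-in for a real hash such as MurmurHash (this one is seeded FNV-1a).
      inline uint64_t SeededHash(const std::string& key, uint64_t seed) {
        uint64_t h = 14695981039346656037ULL ^ seed;
        for (unsigned char c : key) {
          h ^= c;
          h *= 1099511628211ULL;
        }
        return h;
      }

      // "Identity": reinterpret the first 8 bytes of the key as an integer.
      // Keys shorter than 8 bytes are zero-padded.
      inline uint64_t IdentityHash(const std::string& key) {
        uint64_t v = 0;
        std::memcpy(&v, key.data(),
                    key.size() < sizeof(v) ? key.size() : sizeof(v));
        return v;
      }

      // Bucket for the hash_index-th hash function of a cuckoo-style table
      // whose bucket count is a power of two.
      inline uint64_t BucketFor(const std::string& key, uint32_t hash_index,
                                bool identity_as_first_hash,
                                uint64_t num_buckets_pow2) {
        const uint64_t h = (hash_index == 0 && identity_as_first_hash)
                               ? IdentityHash(key)
                               : SeededHash(key, hash_index);
        return h & (num_buckets_pow2 - 1);
      }
      ```

      Only the first probe skips the hash computation; later hash functions still use the real hash, so a key that misses its first bucket costs the same as before.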
      
      Test Plan:
      ```
      [ljin@dev1964 rocksdb] ./cuckoo_table_reader_test --enable_perf --file_dir=/dev/shm --write --identity_as_first_hash=0
      ==== Test CuckooReaderTest.WhenKeyExists
      ==== Test CuckooReaderTest.WhenKeyExistsWithUint64Comparator
      ==== Test CuckooReaderTest.CheckIterator
      ==== Test CuckooReaderTest.CheckIteratorUint64
      ==== Test CuckooReaderTest.WhenKeyNotFound
      ==== Test CuckooReaderTest.TestReadPerformance
      With 125829120 items, utilization is 93.75%, number of hash functions: 2.
      Time taken per op is 0.272us (3.7 Mqps) with batch size of 0, # of found keys 125829120
      With 125829120 items, utilization is 93.75%, number of hash functions: 2.
      Time taken per op is 0.138us (7.2 Mqps) with batch size of 10, # of found keys 125829120
      With 125829120 items, utilization is 93.75%, number of hash functions: 2.
      Time taken per op is 0.142us (7.1 Mqps) with batch size of 25, # of found keys 125829120
      With 125829120 items, utilization is 93.75%, number of hash functions: 2.
      Time taken per op is 0.142us (7.0 Mqps) with batch size of 50, # of found keys 125829120
      With 125829120 items, utilization is 93.75%, number of hash functions: 2.
      Time taken per op is 0.144us (6.9 Mqps) with batch size of 100, # of found keys 125829120
      
      With 104857600 items, utilization is 78.12%, number of hash functions: 2.
      Time taken per op is 0.201us (5.0 Mqps) with batch size of 0, # of found keys 104857600
      With 104857600 items, utilization is 78.12%, number of hash functions: 2.
      Time taken per op is 0.121us (8.3 Mqps) with batch size of 10, # of found keys 104857600
      With 104857600 items, utilization is 78.12%, number of hash functions: 2.
      Time taken per op is 0.123us (8.1 Mqps) with batch size of 25, # of found keys 104857600
      With 104857600 items, utilization is 78.12%, number of hash functions: 2.
      Time taken per op is 0.121us (8.3 Mqps) with batch size of 50, # of found keys 104857600
      With 104857600 items, utilization is 78.12%, number of hash functions: 2.
      Time taken per op is 0.112us (8.9 Mqps) with batch size of 100, # of found keys 104857600
      
      With 83886080 items, utilization is 62.50%, number of hash functions: 2.
      Time taken per op is 0.251us (4.0 Mqps) with batch size of 0, # of found keys 83886080
      With 83886080 items, utilization is 62.50%, number of hash functions: 2.
      Time taken per op is 0.107us (9.4 Mqps) with batch size of 10, # of found keys 83886080
      With 83886080 items, utilization is 62.50%, number of hash functions: 2.
      Time taken per op is 0.099us (10.1 Mqps) with batch size of 25, # of found keys 83886080
      With 83886080 items, utilization is 62.50%, number of hash functions: 2.
      Time taken per op is 0.100us (10.0 Mqps) with batch size of 50, # of found keys 83886080
      With 83886080 items, utilization is 62.50%, number of hash functions: 2.
      Time taken per op is 0.116us (8.6 Mqps) with batch size of 100, # of found keys 83886080
      
      With 73400320 items, utilization is 54.69%, number of hash functions: 2.
      Time taken per op is 0.189us (5.3 Mqps) with batch size of 0, # of found keys 73400320
      With 73400320 items, utilization is 54.69%, number of hash functions: 2.
      Time taken per op is 0.095us (10.5 Mqps) with batch size of 10, # of found keys 73400320
      With 73400320 items, utilization is 54.69%, number of hash functions: 2.
      Time taken per op is 0.096us (10.4 Mqps) with batch size of 25, # of found keys 73400320
      With 73400320 items, utilization is 54.69%, number of hash functions: 2.
      Time taken per op is 0.098us (10.2 Mqps) with batch size of 50, # of found keys 73400320
      With 73400320 items, utilization is 54.69%, number of hash functions: 2.
      Time taken per op is 0.105us (9.5 Mqps) with batch size of 100, # of found keys 73400320
      
      [ljin@dev1964 rocksdb] ./cuckoo_table_reader_test --enable_perf --file_dir=/dev/shm --write --identity_as_first_hash=1
      ==== Test CuckooReaderTest.WhenKeyExists
      ==== Test CuckooReaderTest.WhenKeyExistsWithUint64Comparator
      ==== Test CuckooReaderTest.CheckIterator
      ==== Test CuckooReaderTest.CheckIteratorUint64
      ==== Test CuckooReaderTest.WhenKeyNotFound
      ==== Test CuckooReaderTest.TestReadPerformance
      With 125829120 items, utilization is 93.75%, number of hash functions: 2.
      Time taken per op is 0.230us (4.3 Mqps) with batch size of 0, # of found keys 125829120
      With 125829120 items, utilization is 93.75%, number of hash functions: 2.
      Time taken per op is 0.086us (11.7 Mqps) with batch size of 10, # of found keys 125829120
      With 125829120 items, utilization is 93.75%, number of hash functions: 2.
      Time taken per op is 0.088us (11.3 Mqps) with batch size of 25, # of found keys 125829120
      With 125829120 items, utilization is 93.75%, number of hash functions: 2.
      Time taken per op is 0.083us (12.1 Mqps) with batch size of 50, # of found keys 125829120
      With 125829120 items, utilization is 93.75%, number of hash functions: 2.
      Time taken per op is 0.083us (12.1 Mqps) with batch size of 100, # of found keys 125829120
      
      With 104857600 items, utilization is 78.12%, number of hash functions: 2.
      Time taken per op is 0.159us (6.3 Mqps) with batch size of 0, # of found keys 104857600
      With 104857600 items, utilization is 78.12%, number of hash functions: 2.
      Time taken per op is 0.078us (12.8 Mqps) with batch size of 10, # of found keys 104857600
      With 104857600 items, utilization is 78.12%, number of hash functions: 2.
      Time taken per op is 0.080us (12.6 Mqps) with batch size of 25, # of found keys 104857600
      With 104857600 items, utilization is 78.12%, number of hash functions: 2.
      Time taken per op is 0.080us (12.5 Mqps) with batch size of 50, # of found keys 104857600
      With 104857600 items, utilization is 78.12%, number of hash functions: 2.
      Time taken per op is 0.082us (12.2 Mqps) with batch size of 100, # of found keys 104857600
      
      With 83886080 items, utilization is 62.50%, number of hash functions: 2.
      Time taken per op is 0.154us (6.5 Mqps) with batch size of 0, # of found keys 83886080
      With 83886080 items, utilization is 62.50%, number of hash functions: 2.
      Time taken per op is 0.077us (13.0 Mqps) with batch size of 10, # of found keys 83886080
      With 83886080 items, utilization is 62.50%, number of hash functions: 2.
      Time taken per op is 0.077us (12.9 Mqps) with batch size of 25, # of found keys 83886080
      With 83886080 items, utilization is 62.50%, number of hash functions: 2.
      Time taken per op is 0.078us (12.8 Mqps) with batch size of 50, # of found keys 83886080
      With 83886080 items, utilization is 62.50%, number of hash functions: 2.
      Time taken per op is 0.079us (12.6 Mqps) with batch size of 100, # of found keys 83886080
      
      With 73400320 items, utilization is 54.69%, number of hash functions: 2.
      Time taken per op is 0.218us (4.6 Mqps) with batch size of 0, # of found keys 73400320
      With 73400320 items, utilization is 54.69%, number of hash functions: 2.
      Time taken per op is 0.083us (12.0 Mqps) with batch size of 10, # of found keys 73400320
      With 73400320 items, utilization is 54.69%, number of hash functions: 2.
      Time taken per op is 0.085us (11.7 Mqps) with batch size of 25, # of found keys 73400320
      With 73400320 items, utilization is 54.69%, number of hash functions: 2.
      Time taken per op is 0.086us (11.6 Mqps) with batch size of 50, # of found keys 73400320
      With 73400320 items, utilization is 54.69%, number of hash functions: 2.
      Time taken per op is 0.078us (12.8 Mqps) with batch size of 100, # of found keys 73400320
      ```
      
      Reviewers: sdong, igor, yhchiang
      
      Reviewed By: igor
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D23451
    • I
      Fix synchronization issues · 2fb1fea3
      Igor Canadi committed
  10. 18 Sep, 2014 2 commits
  11. 16 Sep, 2014 1 commit
  12. 14 Sep, 2014 1 commit
  13. 13 Sep, 2014 3 commits
    • I
      WriteThread · dee91c25
      Igor Canadi committed
      Summary: This diff just moves the write thread control out of the DBImpl. I will need this as I will control column family data concurrency by only accessing some data in the write thread. That way, we won't have to lock our accesses to column family hash table (mappings from IDs to CFDs).
      
      Test Plan: make check
      
      Reviewers: sdong, yhchiang, ljin
      
      Reviewed By: ljin
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D23301
    • I
      Fix WAL synced · 540a257f
      Igor Canadi committed
      Summary: Uhm...
      
      Test Plan: nope
      
      Reviewers: sdong, yhchiang, tnovak, ljin
      
      Reviewed By: ljin
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D23343
    • C
      Fix build issue under macosx · 49fe329e
      Chilledheart committed
  14. 12 Sep, 2014 4 commits
    • F
      add_wrapped_bloom_test · 0352a9fa
      Feng Zhu committed
      Summary:
      1. Wrap a filter policy the way fbcode/multifeed/rocksdb/MultifeedRocksDbKey.h does,
         to ensure that rocksdb works fine after the filterpolicy interface change
      
      Test Plan: 1. valgrind ./bloom_test
      
      Reviewers: ljin, igor, yhchiang, dhruba, sdong
      
      Reviewed By: sdong
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D23229
    • I
      Don't run background jobs (flush, compactions) when bg_error_ is set · 9c0e66ce
      Igor Canadi committed
      Summary:
      If bg_error_ is set, that means that we mark the DB read-only. However, the current behavior still continues the flushes and compactions, even though bg_error_ is set.
      
      In addition, if bg_error_ is set, we return Status::OK() from CompactRange(), although the compaction didn't actually succeed.
      
      This is clearly not desired behavior. I found this when I was debugging t5132159, although I'm pretty sure these aren't related.
      
      Also, when we're shutting down, it's dangerous to exit RunManualCompaction(), since that will destruct ManualCompaction object. Background compaction job might still hold a reference to manual_compaction_ and this will lead to undefined behavior. I changed the behavior so that we only exit RunManualCompaction when manual compaction job is marked done.
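
      The intended behavior around bg_error_ can be sketched as below. This is an illustration of the two checks only, not the DBImpl code, and it does not model the RunManualCompaction() shutdown change: `MiniDB`, the simplified `Status`, and all method names other than CompactRange() are hypothetical.

      ```cpp
      #include <cassert>
      #include <mutex>
      #include <string>
      #include <utility>

      // Simplified stand-in for a status type.
      struct Status {
        bool ok;
        std::string msg;
        static Status OK() { return {true, ""}; }
        static Status IOError(std::string m) { return {false, std::move(m)}; }
      };

      class MiniDB {
       public:
        // The first background error sticks and turns the DB read-only.
        void SetBackgroundError(const Status& s) {
          std::lock_guard<std::mutex> l(mu_);
          if (bg_error_.ok) bg_error_ = s;
        }

        // Returns true if a background job was scheduled.
        bool MaybeScheduleFlushOrCompaction() {
          std::lock_guard<std::mutex> l(mu_);
          if (!bg_error_.ok) return false;  // do not run jobs after an error
          ++scheduled_jobs_;
          return true;
        }

        // A manual compaction reports the stored error instead of OK.
        Status CompactRange() {
          std::lock_guard<std::mutex> l(mu_);
          if (!bg_error_.ok) return bg_error_;
          ++scheduled_jobs_;
          return Status::OK();
        }

        int scheduled_jobs() {
          std::lock_guard<std::mutex> l(mu_);
          return scheduled_jobs_;
        }

       private:
        std::mutex mu_;
        Status bg_error_ = Status::OK();
        int scheduled_jobs_ = 0;
      };
      ```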
      
      Test Plan: make check
      
      Reviewers: sdong, ljin, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D23223
    • I
      Fix valgrind test · a9639bda
      Igor Canadi committed
      Summary: Get valgrind to stop complaining about uninitialized value
      
      Test Plan: valgrind not complaining anymore
      
      Reviewers: sdong, yhchiang, ljin
      
      Reviewed By: ljin
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D23289
    • I
      Relax FlushSchedule test · d1f24dc7
      Igor Canadi committed
      Summary: The test makes sure that we don't call flush too often. For that, it's ok to check that we have fewer than 10 table files. Otherwise, the test is flaky because it's hard to estimate the number of entries in the memtable before it gets flushed (any ideas?)
      
      Test Plan: Still works, but hopefully less flaky.
      
      Reviewers: ljin, sdong, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D23241
  15. 11 Sep, 2014 1 commit
    • I
      Push model for flushing memtables · 3d9e6f77
      Igor Canadi committed
      Summary:
      When a memtable is full, it calls the registered callback. That callback then registers the column family as needing a flush. Every write checks whether any column families need to be flushed. This completely eliminates the need for the MakeRoomForWrite() function and simplifies our Write code path.
      
      There is some complexity with the concurrency when the column family is dropped. I made it a bit less complex by dropping the column family from the write thread in https://reviews.facebook.net/D22965. Let me know if you want to discuss this.
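
      The push model from the first paragraph can be sketched as follows. It is a single-threaded toy under stated assumptions, not the code from the diff: `MemTable`, `FlushScheduler`, `MiniDB`, and `kDefaultColumnFamilyId` are hypothetical names, a "flush" is modeled by simply resetting the memtable, and the concurrency around dropped column families is left out.

      ```cpp
      #include <cassert>
      #include <cstddef>
      #include <deque>
      #include <functional>
      #include <string>
      #include <utility>

      const int kDefaultColumnFamilyId = 0;  // hypothetical id for the example

      // The memtable pushes a notification once when it becomes full.
      class MemTable {
       public:
        MemTable(size_t capacity_bytes, std::function<void()> on_full)
            : capacity_(capacity_bytes), on_full_(std::move(on_full)) {}

        void Add(const std::string& key, const std::string& value) {
          used_ += key.size() + value.size();
          if (!notified_ && used_ >= capacity_) {
            notified_ = true;
            on_full_();
          }
        }

        void Reset() {
          used_ = 0;
          notified_ = false;
        }

       private:
        size_t capacity_;
        std::function<void()> on_full_;
        size_t used_ = 0;
        bool notified_ = false;
      };

      // Queue of column families that have asked for a flush.
      class FlushScheduler {
       public:
        void ScheduleFlush(int cf_id) { pending_.push_back(cf_id); }
        bool Empty() const { return pending_.empty(); }
        int TakeNext() {
          int id = pending_.front();
          pending_.pop_front();
          return id;
        }

       private:
        std::deque<int> pending_;
      };

      class MiniDB {
       public:
        explicit MiniDB(size_t memtable_capacity)
            : mem_(memtable_capacity,
                   [this] { scheduler_.ScheduleFlush(kDefaultColumnFamilyId); }) {}
        MiniDB(const MiniDB&) = delete;
        MiniDB& operator=(const MiniDB&) = delete;

        void Write(const std::string& key, const std::string& value) {
          mem_.Add(key, value);
          // Every write checks whether some column family needs a flush.
          while (!scheduler_.Empty()) {
            scheduler_.TakeNext();
            mem_.Reset();  // stands in for switching and flushing the memtable
            ++flushes_;
          }
        }

        int flushes() const { return flushes_; }

       private:
        FlushScheduler scheduler_;
        MemTable mem_;
        int flushes_ = 0;
      };
      ```

      The write path never has to ask the memtable whether there is room; it only drains the queue that the callback fills.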
      
      Test Plan: make check works. I'll also run db_stress with creating and dropping column families for a while.
      
      Reviewers: yhchiang, sdong, ljin
      
      Reviewed By: ljin
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D23067