1. 30 March 2020, 1 commit
    • Use FileChecksumGenFactory for SST file checksum (#6600) · e8d332d9
      Committed by Zhichao Cao
      Summary:
      In the current implementation, the SST file checksum is calculated by a shared checksum function object, which makes checksum functions such as SHA1 hard to apply here. With this change, each SST file gets its own checksum generator object, created by FileChecksumGenFactory. Users need to implement their own FileChecksumGenerator and FileChecksumGenFactory to plug in a custom checksum calculation method; a hedged sketch follows this entry.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6600
      
      Test Plan: tested with make asan_check
      
      Reviewed By: riversand963
      
      Differential Revision: D20717670
      
      Pulled By: zhichao-cao
      
      fbshipit-source-id: 2a74c1c280ac11a07a1980185b43b671acaa71c6
      e8d332d9
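      A minimal sketch of such a plugin, assuming the interface shape in rocksdb/file_checksum.h (method names Update/Finalize/GetChecksum/Name, FileChecksumGenContext) and using a toy XOR "checksum" purely for illustration:
      ```cpp
      // Hedged sketch of a custom per-SST-file checksum plugin. Interface names
      // are assumed from rocksdb/file_checksum.h; the XOR checksum is a toy.
      #include <memory>
      #include <string>
      #include "rocksdb/file_checksum.h"

      class XorChecksumGen : public rocksdb::FileChecksumGenerator {
       public:
        void Update(const char* data, size_t n) override {
          for (size_t i = 0; i < n; ++i) {
            state_ ^= static_cast<unsigned char>(data[i]);
          }
        }
        void Finalize() override {
          checksum_ = std::string(1, static_cast<char>(state_));
        }
        std::string GetChecksum() const override { return checksum_; }
        const char* Name() const override { return "XorChecksum"; }

       private:
        unsigned char state_ = 0;
        std::string checksum_;
      };

      class XorChecksumGenFactory : public rocksdb::FileChecksumGenFactory {
       public:
        // A fresh generator is created for every SST file, so stateful methods
        // such as SHA1 can now be plugged in.
        std::unique_ptr<rocksdb::FileChecksumGenerator> CreateFileChecksumGenerator(
            const rocksdb::FileChecksumGenContext& /*context*/) override {
          return std::unique_ptr<rocksdb::FileChecksumGenerator>(new XorChecksumGen());
        }
        const char* Name() const override { return "XorChecksumGenFactory"; }
      };
      ```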
  2. 28 March 2020, 1 commit
    • Pass IOStatus to write path and set retryable IO Error as hard error in BG jobs (#6487) · 42468881
      Committed by Zhichao Cao
      Summary:
      In the current code base, we use Status to get and store the returned status from a call. For IO-related functions in particular, Status cannot reflect IO error details such as the error scope and whether the error is retryable. With the implementation of https://github.com/facebook/rocksdb/issues/5761, we have the new wrapper for IO, which returns IOStatus instead of Status. However, the IOStatus is currently dropped at the lower levels of the write path and converted to Status.
      
      The first job of this PR is to pass the IOStatus through the write path (flush, WAL write, and compaction). The second is to classify a retryable IO error as a HardError and set bg_error_ accordingly. In that case the DB instance becomes read-only; the user is informed via the returned Status and needs to take action to deal with it (e.g., call db->Resume(); a hedged sketch follows this entry).
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6487
      
      Test Plan: Added the testing case to error_handler_fs_test. Pass make asan_check
      
      Reviewed By: anand1976
      
      Differential Revision: D20685017
      
      Pulled By: zhichao-cao
      
      fbshipit-source-id: ff85f042896243abcd6ef37877834e26f36b6eb0
      42468881
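      A minimal usage sketch of the recovery flow, assuming only the public DB::Put and DB::Resume API (the helper function name is hypothetical): once a retryable IO error has been promoted to a hard background error, writes fail until the cause is fixed and Resume() succeeds.
      ```cpp
      // Hedged sketch: a hard background error puts the DB in read-only mode;
      // after fixing the underlying problem, the user calls Resume().
      #include <iostream>
      #include "rocksdb/db.h"

      void PutWithRecovery(rocksdb::DB* db) {
        rocksdb::Status s = db->Put(rocksdb::WriteOptions(), "key", "value");
        if (!s.ok()) {
          std::cerr << "write failed: " << s.ToString() << std::endl;
          // Fix the underlying issue (e.g. free disk space), then try to resume.
          rocksdb::Status r = db->Resume();
          if (r.ok()) {
            s = db->Put(rocksdb::WriteOptions(), "key", "value");  // retry
          }
        }
      }
      ```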
  3. 21 February 2020, 1 commit
    • Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) · fdf882de
      Committed by sdong
      Summary:
      When two binaries are dynamically linked together, RocksDB builds coming from two different sources can cause symbol clashes. To give users a tool to solve this, the RocksDB namespace is changed to a macro that can be overridden at build time; a hedged sketch follows this entry.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6433
      
      Test Plan: Build the release, all, and jtest targets. Also build with ROCKSDB_NAMESPACE overridden to another name.
      
      Differential Revision: D19977691
      
      fbshipit-source-id: aa7f2d0972e1c31d75339ac48478f34f6cfcfb3e
      fdf882de
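      A hedged illustration of how the flag is meant to be used; the alternate namespace name below is hypothetical, and only the macro name comes from this entry.
      ```cpp
      // Hedged sketch: building RocksDB with e.g. -DROCKSDB_NAMESPACE=my_rocksdb
      // (hypothetical name) moves all symbols into that namespace; the default
      // remains "rocksdb". User code spelled via the macro compiles either way.
      #include "rocksdb/db.h"

      ROCKSDB_NAMESPACE::Options MakeOptions() {
        ROCKSDB_NAMESPACE::Options options;
        options.create_if_missing = true;
        return options;
      }
      ```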
  4. 11 February 2020, 1 commit
    • Checksum for each SST file, stored in MANIFEST (#6216) · 4369f2c7
      Committed by Zhichao Cao
      Summary:
      In the current code base, RocksDB generates a checksum for each block and verifies it on use. This PR enables whole-SST-file checksums: after an SST file is generated by flush or compaction, RocksDB computes the file checksum and stores the checksum value and checksum method name in vs_info and the MANIFEST as part of the FileMetadata.
      
      Added enable_sst_file_checksum to Options to enable or disable file checksums. Added sst_file_checksum to Options so that users can plug in their own SST file checksum calculation method by overriding the SstFileChecksum class. The checksum information consists of a uint32_t checksum value and a checksum method name (string). A new tool is added to LDB so that users can dump a list of file checksum information from the MANIFEST. If the user enables file checksums but does not provide an sst_file_checksum instance, RocksDB uses the default crc32c checksum implemented in table/sst_file_checksum_crc32c.h. A conceptual sketch of the persisted record follows this entry.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6216
      
      Test Plan: Added test cases in table_test and ldb_cmd_test to verify the checksum is correct at different levels. Passes make asan_check.
      
      Differential Revision: D19171461
      
      Pulled By: zhichao-cao
      
      fbshipit-source-id: b2e53479eefc5bb0437189eaa1941670e5ba8b87
      4369f2c7
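      A purely conceptual sketch of what is persisted per file; the struct below is hypothetical and is not RocksDB's FileMetaData, it only mirrors the fields this entry describes.
      ```cpp
      // Hypothetical illustration only: each SST file's MANIFEST entry now also
      // carries a checksum value plus the name of the method that produced it.
      #include <cstdint>
      #include <string>

      struct SstFileChecksumRecord {       // illustrative, not RocksDB's struct
        uint64_t file_number = 0;
        uint32_t checksum_value = 0;        // e.g. crc32c over the whole file
        std::string checksum_method_name;   // e.g. "crc32c"
      };
      ```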
  5. 17 September 2019, 1 commit
    • Divide file_reader_writer.h and .cc (#5803) · b931f84e
      Committed by sdong
      Summary:
      file_reader_writer.h and .cc contain several classes and helper functions, which makes them hard to navigate. Split them into multiple files and put them under file/.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5803
      
      Test Plan: Build whole project using make and cmake.
      
      Differential Revision: D17374550
      
      fbshipit-source-id: 10efca907721e7a78ed25bbf74dc5410dea05987
      b931f84e
  6. 31 May 2019, 1 commit
  7. 31 October 2018, 1 commit
    • Promote rocksdb.{deleted.keys,merge.operands} to main table properties (#4594) · eaaf1a6f
      Committed by Abhishek Madan
      Summary:
      Since the number of range deletions is reported in
      TableProperties, it is confusing not to report the number of merge
      operands and point deletions as top-level properties; they are
      accessible through the public API, but since they are not "main"
      properties, they do not appear in aggregated table properties or in the
      string representation of table properties.
      
      This change promotes those two property keys to
      `rocksdb/table_properties.h`, adds corresponding uint64 members for
      them, deprecates the old access methods `GetDeletedKeys()` and
      `GetMergeOperands()` (though they are still usable for now), and removes
      `InternalKeyPropertiesCollector`. The property key strings are the same
      as before this change, so DBs written by older versions should still be
      readable (though I haven't tested this yet). A hedged sketch of reading
      the new members follows this entry.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4594
      
      Differential Revision: D12826893
      
      Pulled By: abhimadan
      
      fbshipit-source-id: 9e4e4fbdc5b0da161c89582566d184101ba8eb68
      eaaf1a6f
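      A hedged usage sketch, assuming the new uint64 members are named num_deletions and num_merge_operands and are read through DB::GetPropertiesOfAllTables(); the helper function name is hypothetical.
      ```cpp
      // Hedged sketch: sum the promoted counters across all table files of the
      // default column family. Member names are assumed from this entry.
      #include <cstdint>
      #include <iostream>
      #include "rocksdb/db.h"
      #include "rocksdb/table_properties.h"

      void PrintDeletionAndMergeCounts(rocksdb::DB* db) {
        rocksdb::TablePropertiesCollection props;
        if (!db->GetPropertiesOfAllTables(&props).ok()) return;
        uint64_t deletions = 0, merge_operands = 0;
        for (const auto& file_and_props : props) {
          deletions += file_and_props.second->num_deletions;
          merge_operands += file_and_props.second->num_merge_operands;
        }
        std::cout << "deleted keys: " << deletions
                  << ", merge operands: " << merge_operands << std::endl;
      }
      ```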
  8. 06 September 2018, 1 commit
  9. 20 October 2017, 1 commit
  10. 16 July 2017, 1 commit
  11. 28 April 2017, 1 commit
  12. 07 April 2016, 1 commit
    • Embed column family name in SST file · 2391ef72
      Committed by Andrew Kryczka
      Summary:
      Added the column family name to the properties block. This property
      is omitted only if the property is unavailable, such as when RepairDB()
      writes SST files.
      
      In a follow-up diff, I will change RepairDB to use this new property for
      deciding to which column family an existing SST file belongs. If this
      property is missing, the file will be added to the "unknown" column family
      (same as the existing behavior). A hedged sketch of reading the property
      follows this entry.
      
      Test Plan:
      New unit test:
      
        $ ./db_table_properties_test --gtest_filter=DBTablePropertiesTest.GetColumnFamilyNameProperty
      
      Reviewers: IslamAbdelRahman, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D55605
      2391ef72
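      A hedged sketch of consuming the new property, assuming it is exposed as a column_family_name field on TableProperties and read via GetPropertiesOfAllTables(); the helper is hypothetical.
      ```cpp
      // Hedged sketch: print which column family each live SST file claims to
      // belong to, based on the embedded property.
      #include <iostream>
      #include "rocksdb/db.h"
      #include "rocksdb/table_properties.h"

      void PrintSstColumnFamilies(rocksdb::DB* db) {
        rocksdb::TablePropertiesCollection props;
        if (!db->GetPropertiesOfAllTables(&props).ok()) return;
        for (const auto& file_and_props : props) {
          std::cout << file_and_props.first << " -> "
                    << file_and_props.second->column_family_name << std::endl;
        }
      }
      ```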
  13. 10 February 2016, 1 commit
  14. 19 December 2015, 1 commit
  15. 12 December 2015, 1 commit
  16. 18 July 2015, 1 commit
    • Move rate_limiter, write buffering, most perf context instrumentation and most... · 6e9fbeb2
      Committed by sdong
      Move rate_limiter, write buffering, most perf context instrumentation and most random kill out of Env
      
      Summary: We want to keep Env a thin layer for better portability. Less platform-dependent code should be moved out of Env. In this patch, I create wrappers for file readers and writers, and move rate limiting, write buffering, most perf context instrumentation, and random kill out of Env. This will make it easier to maintain multiple Env implementations in the future. A conceptual sketch of the wrapper approach follows this entry.
      
      Test Plan: Run all existing unit tests.
      
      Reviewers: anthony, kradhakrishnan, IslamAbdelRahman, yhchiang, igor
      
      Reviewed By: igor
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D42321
      6e9fbeb2
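      A conceptual sketch of the wrapper approach, not the actual WritableFileWriter: the class below is hypothetical and only assumes WritableFile::Append from rocksdb/env.h. Buffering lives in the wrapper, so Env itself stays a portable, minimal interface.
      ```cpp
      // Conceptual illustration: buffer writes in a thin layer around Env's
      // WritableFile instead of inside Env itself.
      #include <memory>
      #include <string>
      #include <utility>
      #include "rocksdb/env.h"

      class BufferedFileWriter {  // hypothetical, for illustration only
       public:
        BufferedFileWriter(std::unique_ptr<rocksdb::WritableFile> file, size_t cap)
            : file_(std::move(file)), cap_(cap) {}

        rocksdb::Status Append(const rocksdb::Slice& data) {
          buf_.append(data.data(), data.size());
          return buf_.size() >= cap_ ? Flush() : rocksdb::Status::OK();
        }

        rocksdb::Status Flush() {
          rocksdb::Status s = file_->Append(rocksdb::Slice(buf_));
          buf_.clear();
          return s;
        }

       private:
        std::unique_ptr<rocksdb::WritableFile> file_;
        std::string buf_;
        size_t cap_;
      };
      ```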
  17. 14 January 2015, 1 commit
  18. 25 November 2014, 1 commit
  19. 12 November 2014, 1 commit
    • Turn on -Wshorten-64-to-32 and fix all the errors · 767777c2
      Committed by Igor Canadi
      Summary:
      We need to turn on -Wshorten-64-to-32 for mobile. See D1671432 (internal phabricator) for details.
      
      This diff turns on the warning flag and fixes all the errors. There were also some interesting errors that I might call bugs, especially in plain table. Going forward, I think it makes sense to keep this flag turned on and be very careful when converting 64-bit to 32-bit variables (see the illustration after this entry).
      
      Test Plan: compiles
      
      Reviewers: ljin, rven, yhchiang, sdong
      
      Reviewed By: yhchiang
      
      Subscribers: bobbaldwin, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D28689
      767777c2
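      An illustration of the class of fix the flag forces; the helper below is hypothetical and not from the diff.
      ```cpp
      // With -Wshorten-64-to-32, silent 64-to-32 truncation becomes an error,
      // so the conversion must be made explicit and checked.
      #include <cassert>
      #include <cstdint>

      uint32_t ToU32(uint64_t n) {
        assert(n <= UINT32_MAX);           // make the narrowing deliberate
        return static_cast<uint32_t>(n);
      }
      ```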
  20. 01 November 2014, 1 commit
    • Turn on -Wshadow · 9f7fc3ac
      Committed by Igor Canadi
      Summary:
      ...and fix all the errors :)
      
      Jim suggested turning on -Wshadow because it helped him fix a number of critical bugs in fbcode. I think it's a good idea to be -Wshadow clean (see the illustration after this entry).
      
      Test Plan: compiles
      
      Reviewers: yhchiang, rven, sdong, ljin
      
      Reviewed By: ljin
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D27711
      9f7fc3ac
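      An illustration of the bug class -Wshadow catches; the struct below is hypothetical and not from the diff.
      ```cpp
      // The local `count` shadows the member, so Add() silently never updates
      // Counter::count. -Wshadow turns this into a warning.
      struct Counter {
        int count = 0;
        void Add(int n) {
          int count = n;  // -Wshadow: declaration shadows Counter::count
          (void)count;    // the member `count` stays 0
        }
      };
      ```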
  21. 30 September 2014, 1 commit
    • handle kDeletion type in cuckoo builder · 2dc6f62b
      Committed by Lei Jin
      Summary:
      When I changed std::vector<std::pair<std::string, std::string>> to std::string to
      store key/value pairs in the builder, I missed the handling for the kDeletion
      type. As a result, value_size_ can be wrong if the first added key is a
      deletion.
      This is caught by ./cuckoo_table_db_test.
      
      Test Plan:
      ./cuckoo_table_db_test
      ./cuckoo_table_reader_test
      ./cuckoo_table_builder_test
      
      Reviewers: sdong, yhchiang, igor
      
      Reviewed By: igor
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D24045
      2dc6f62b
  22. 26 September 2014, 3 commits
    • reduce memory usage of cuckoo table builder · 94997eab
      Committed by Lei Jin
      Summary:
      The builder currently buffers all key/value pairs as a vector of
      pair<string, string>. That costs too much due to std::string
      overhead; it wasn't able to fit 1B key/values (12 bytes total) in 100GB
      of RAM. Switch to a plain string to store the key/value sequence, which
      uses only 12GB of RAM as a result. A conceptual sketch of the encoding
      follows this entry.
      
      Test Plan: db_bench
      
      Reviewers: igor, sdong, yhchiang
      
      Reviewed By: sdong
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D23763
      94997eab
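      A conceptual sketch of the idea; the length-prefixed encoding and helper below are hypothetical, not the builder's actual format.
      ```cpp
      // Packing key/value pairs into one flat, length-prefixed string avoids the
      // per-std::string allocation overhead of vector<pair<string, string>>.
      #include <cstdint>
      #include <string>

      void AppendPair(std::string* buf, const std::string& key,
                      const std::string& value) {
        uint32_t klen = static_cast<uint32_t>(key.size());
        uint32_t vlen = static_cast<uint32_t>(value.size());
        buf->append(reinterpret_cast<const char*>(&klen), sizeof(klen));
        buf->append(key);
        buf->append(reinterpret_cast<const char*>(&vlen), sizeof(vlen));
        buf->append(value);
      }
      ```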
    • improve memory efficiency of cuckoo reader · c6275956
      Committed by Lei Jin
      Summary:
      When creating a new iterator, instead of storing a mapping from key to
      bucket id for sorting, store only the bucket id and read the key from the
      mmap'd file based on that id. This reduces memory from 20 bytes per entry to only 4 bytes.
      
      Test Plan: db_bench
      
      Reviewers: igor, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D23757
      c6275956
    • option to choose modulo when calculating CuckooTable hash · 581442d4
      Committed by Lei Jin
      Summary:
      Using modulo to calculate the hash bucket makes lookup ~8% slower, but it has its
      benefits: the file size is more predictable and more space efficient.
      
      Test Plan: db_bench
      
      Reviewers: igor, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D23691
      581442d4
  23. 19 September 2014, 1 commit
    • CuckooTable: add one option to allow identity function for the first hash function · 51af7c32
      Committed by Lei Jin
      Summary:
      MurmurHash becomes expensive when we do millions of Get()s per second in one
      thread. Add this option to allow the first hash function to use the identity
      function as the hash function. It results in a QPS increase from 3.7M/s to
      ~4.3M/s. I did not observe an improvement in end-to-end RocksDB
      performance; this may be caused by other bottlenecks that I will address
      in a separate diff. A hedged sketch of the option follows this entry.
      
      Test Plan:
      ```
      [ljin@dev1964 rocksdb] ./cuckoo_table_reader_test --enable_perf --file_dir=/dev/shm --write --identity_as_first_hash=0
      ==== Test CuckooReaderTest.WhenKeyExists
      ==== Test CuckooReaderTest.WhenKeyExistsWithUint64Comparator
      ==== Test CuckooReaderTest.CheckIterator
      ==== Test CuckooReaderTest.CheckIteratorUint64
      ==== Test CuckooReaderTest.WhenKeyNotFound
      ==== Test CuckooReaderTest.TestReadPerformance
      With 125829120 items, utilization is 93.75%, number of hash functions: 2.
      Time taken per op is 0.272us (3.7 Mqps) with batch size of 0, # of found keys 125829120
      With 125829120 items, utilization is 93.75%, number of hash functions: 2.
      Time taken per op is 0.138us (7.2 Mqps) with batch size of 10, # of found keys 125829120
      With 125829120 items, utilization is 93.75%, number of hash functions: 2.
      Time taken per op is 0.142us (7.1 Mqps) with batch size of 25, # of found keys 125829120
      With 125829120 items, utilization is 93.75%, number of hash functions: 2.
      Time taken per op is 0.142us (7.0 Mqps) with batch size of 50, # of found keys 125829120
      With 125829120 items, utilization is 93.75%, number of hash functions: 2.
      Time taken per op is 0.144us (6.9 Mqps) with batch size of 100, # of found keys 125829120
      
      With 104857600 items, utilization is 78.12%, number of hash functions: 2.
      Time taken per op is 0.201us (5.0 Mqps) with batch size of 0, # of found keys 104857600
      With 104857600 items, utilization is 78.12%, number of hash functions: 2.
      Time taken per op is 0.121us (8.3 Mqps) with batch size of 10, # of found keys 104857600
      With 104857600 items, utilization is 78.12%, number of hash functions: 2.
      Time taken per op is 0.123us (8.1 Mqps) with batch size of 25, # of found keys 104857600
      With 104857600 items, utilization is 78.12%, number of hash functions: 2.
      Time taken per op is 0.121us (8.3 Mqps) with batch size of 50, # of found keys 104857600
      With 104857600 items, utilization is 78.12%, number of hash functions: 2.
      Time taken per op is 0.112us (8.9 Mqps) with batch size of 100, # of found keys 104857600
      
      With 83886080 items, utilization is 62.50%, number of hash functions: 2.
      Time taken per op is 0.251us (4.0 Mqps) with batch size of 0, # of found keys 83886080
      With 83886080 items, utilization is 62.50%, number of hash functions: 2.
      Time taken per op is 0.107us (9.4 Mqps) with batch size of 10, # of found keys 83886080
      With 83886080 items, utilization is 62.50%, number of hash functions: 2.
      Time taken per op is 0.099us (10.1 Mqps) with batch size of 25, # of found keys 83886080
      With 83886080 items, utilization is 62.50%, number of hash functions: 2.
      Time taken per op is 0.100us (10.0 Mqps) with batch size of 50, # of found keys 83886080
      With 83886080 items, utilization is 62.50%, number of hash functions: 2.
      Time taken per op is 0.116us (8.6 Mqps) with batch size of 100, # of found keys 83886080
      
      With 73400320 items, utilization is 54.69%, number of hash functions: 2.
      Time taken per op is 0.189us (5.3 Mqps) with batch size of 0, # of found keys 73400320
      With 73400320 items, utilization is 54.69%, number of hash functions: 2.
      Time taken per op is 0.095us (10.5 Mqps) with batch size of 10, # of found keys 73400320
      With 73400320 items, utilization is 54.69%, number of hash functions: 2.
      Time taken per op is 0.096us (10.4 Mqps) with batch size of 25, # of found keys 73400320
      With 73400320 items, utilization is 54.69%, number of hash functions: 2.
      Time taken per op is 0.098us (10.2 Mqps) with batch size of 50, # of found keys 73400320
      With 73400320 items, utilization is 54.69%, number of hash functions: 2.
      Time taken per op is 0.105us (9.5 Mqps) with batch size of 100, # of found keys 73400320
      
      [ljin@dev1964 rocksdb] ./cuckoo_table_reader_test --enable_perf --file_dir=/dev/shm --write --identity_as_first_hash=1
      ==== Test CuckooReaderTest.WhenKeyExists
      ==== Test CuckooReaderTest.WhenKeyExistsWithUint64Comparator
      ==== Test CuckooReaderTest.CheckIterator
      ==== Test CuckooReaderTest.CheckIteratorUint64
      ==== Test CuckooReaderTest.WhenKeyNotFound
      ==== Test CuckooReaderTest.TestReadPerformance
      With 125829120 items, utilization is 93.75%, number of hash functions: 2.
      Time taken per op is 0.230us (4.3 Mqps) with batch size of 0, # of found keys 125829120
      With 125829120 items, utilization is 93.75%, number of hash functions: 2.
      Time taken per op is 0.086us (11.7 Mqps) with batch size of 10, # of found keys 125829120
      With 125829120 items, utilization is 93.75%, number of hash functions: 2.
      Time taken per op is 0.088us (11.3 Mqps) with batch size of 25, # of found keys 125829120
      With 125829120 items, utilization is 93.75%, number of hash functions: 2.
      Time taken per op is 0.083us (12.1 Mqps) with batch size of 50, # of found keys 125829120
      With 125829120 items, utilization is 93.75%, number of hash functions: 2.
      Time taken per op is 0.083us (12.1 Mqps) with batch size of 100, # of found keys 125829120
      
      With 104857600 items, utilization is 78.12%, number of hash functions: 2.
      Time taken per op is 0.159us (6.3 Mqps) with batch size of 0, # of found keys 104857600
      With 104857600 items, utilization is 78.12%, number of hash functions: 2.
      Time taken per op is 0.078us (12.8 Mqps) with batch size of 10, # of found keys 104857600
      With 104857600 items, utilization is 78.12%, number of hash functions: 2.
      Time taken per op is 0.080us (12.6 Mqps) with batch size of 25, # of found keys 104857600
      With 104857600 items, utilization is 78.12%, number of hash functions: 2.
      Time taken per op is 0.080us (12.5 Mqps) with batch size of 50, # of found keys 104857600
      With 104857600 items, utilization is 78.12%, number of hash functions: 2.
      Time taken per op is 0.082us (12.2 Mqps) with batch size of 100, # of found keys 104857600
      
      With 83886080 items, utilization is 62.50%, number of hash functions: 2.
      Time taken per op is 0.154us (6.5 Mqps) with batch size of 0, # of found keys 83886080
      With 83886080 items, utilization is 62.50%, number of hash functions: 2.
      Time taken per op is 0.077us (13.0 Mqps) with batch size of 10, # of found keys 83886080
      With 83886080 items, utilization is 62.50%, number of hash functions: 2.
      Time taken per op is 0.077us (12.9 Mqps) with batch size of 25, # of found keys 83886080
      With 83886080 items, utilization is 62.50%, number of hash functions: 2.
      Time taken per op is 0.078us (12.8 Mqps) with batch size of 50, # of found keys 83886080
      With 83886080 items, utilization is 62.50%, number of hash functions: 2.
      Time taken per op is 0.079us (12.6 Mqps) with batch size of 100, # of found keys 83886080
      
      With 73400320 items, utilization is 54.69%, number of hash functions: 2.
      Time taken per op is 0.218us (4.6 Mqps) with batch size of 0, # of found keys 73400320
      With 73400320 items, utilization is 54.69%, number of hash functions: 2.
      Time taken per op is 0.083us (12.0 Mqps) with batch size of 10, # of found keys 73400320
      With 73400320 items, utilization is 54.69%, number of hash functions: 2.
      Time taken per op is 0.085us (11.7 Mqps) with batch size of 25, # of found keys 73400320
      With 73400320 items, utilization is 54.69%, number of hash functions: 2.
      Time taken per op is 0.086us (11.6 Mqps) with batch size of 50, # of found keys 73400320
      With 73400320 items, utilization is 54.69%, number of hash functions: 2.
      Time taken per op is 0.078us (12.8 Mqps) with batch size of 100, # of found keys 73400320
      ```
      
      Reviewers: sdong, igor, yhchiang
      
      Reviewed By: igor
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D23451
      51af7c32
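      A hedged illustration of identity_as_first_hash; the helper below is hypothetical, and std::hash stands in for the real MurmurHash call.
      ```cpp
      // When keys are already well-distributed fixed-width integers, the first
      // probe can reinterpret the key bytes directly instead of hashing them.
      #include <cstdint>
      #include <cstring>
      #include <functional>
      #include <string>

      uint64_t FirstHash(const std::string& key, bool identity_as_first_hash) {
        if (identity_as_first_hash && key.size() >= sizeof(uint64_t)) {
          uint64_t v;
          std::memcpy(&v, key.data(), sizeof(v));  // identity: just the key bytes
          return v;
        }
        return std::hash<std::string>{}(key);      // stand-in for MurmurHash
      }
      ```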
  24. 18 September 2014, 1 commit
  25. 06 September 2014, 1 commit
  26. 30 August 2014, 1 commit
    • Improve Cuckoo Table Reader performance. Inlined hash function and number of... · d20b8cfa
      Committed by Radheshyam Balasundaram
      Improve Cuckoo Table Reader performance. Inlined hash function and number of buckets a power of two.
      
      Summary:
      Use inlined hash functions instead of function pointers. Make the number of buckets a power of two and use bitwise AND instead of mod (see the illustration after this entry).
      After these changes, we get almost a 50% improvement in performance.
      
      Results:
      With 120000000 items, utilization is 89.41%, number of hash functions: 2.
      Time taken per op is 0.231us (4.3 Mqps) with batch size of 0
      Time taken per op is 0.229us (4.4 Mqps) with batch size of 0
      Time taken per op is 0.185us (5.4 Mqps) with batch size of 0
      With 120000000 items, utilization is 89.41%, number of hash functions: 2.
      Time taken per op is 0.108us (9.3 Mqps) with batch size of 10
      Time taken per op is 0.100us (10.0 Mqps) with batch size of 10
      Time taken per op is 0.103us (9.7 Mqps) with batch size of 10
      With 120000000 items, utilization is 89.41%, number of hash functions: 2.
      Time taken per op is 0.101us (9.9 Mqps) with batch size of 25
      Time taken per op is 0.098us (10.2 Mqps) with batch size of 25
      Time taken per op is 0.097us (10.3 Mqps) with batch size of 25
      With 120000000 items, utilization is 89.41%, number of hash functions: 2.
      Time taken per op is 0.100us (10.0 Mqps) with batch size of 50
      Time taken per op is 0.097us (10.3 Mqps) with batch size of 50
      Time taken per op is 0.097us (10.3 Mqps) with batch size of 50
      With 120000000 items, utilization is 89.41%, number of hash functions: 2.
      Time taken per op is 0.102us (9.8 Mqps) with batch size of 100
      Time taken per op is 0.098us (10.2 Mqps) with batch size of 100
      Time taken per op is 0.115us (8.7 Mqps) with batch size of 100
      
      With 100000000 items, utilization is 74.51%, number of hash functions: 2.
      Time taken per op is 0.201us (5.0 Mqps) with batch size of 0
      Time taken per op is 0.155us (6.5 Mqps) with batch size of 0
      Time taken per op is 0.152us (6.6 Mqps) with batch size of 0
      With 100000000 items, utilization is 74.51%, number of hash functions: 2.
      Time taken per op is 0.089us (11.3 Mqps) with batch size of 10
      Time taken per op is 0.084us (11.9 Mqps) with batch size of 10
      Time taken per op is 0.086us (11.6 Mqps) with batch size of 10
      With 100000000 items, utilization is 74.51%, number of hash functions: 2.
      Time taken per op is 0.087us (11.5 Mqps) with batch size of 25
      Time taken per op is 0.085us (11.7 Mqps) with batch size of 25
      Time taken per op is 0.093us (10.8 Mqps) with batch size of 25
      With 100000000 items, utilization is 74.51%, number of hash functions: 2.
      Time taken per op is 0.094us (10.6 Mqps) with batch size of 50
      Time taken per op is 0.094us (10.7 Mqps) with batch size of 50
      Time taken per op is 0.093us (10.8 Mqps) with batch size of 50
      With 100000000 items, utilization is 74.51%, number of hash functions: 2.
      Time taken per op is 0.092us (10.9 Mqps) with batch size of 100
      Time taken per op is 0.089us (11.2 Mqps) with batch size of 100
      Time taken per op is 0.088us (11.3 Mqps) with batch size of 100
      
      With 80000000 items, utilization is 59.60%, number of hash functions: 2.
      Time taken per op is 0.154us (6.5 Mqps) with batch size of 0
      Time taken per op is 0.168us (6.0 Mqps) with batch size of 0
      Time taken per op is 0.190us (5.3 Mqps) with batch size of 0
      With 80000000 items, utilization is 59.60%, number of hash functions: 2.
      Time taken per op is 0.081us (12.4 Mqps) with batch size of 10
      Time taken per op is 0.077us (13.0 Mqps) with batch size of 10
      Time taken per op is 0.083us (12.1 Mqps) with batch size of 10
      With 80000000 items, utilization is 59.60%, number of hash functions: 2.
      Time taken per op is 0.077us (13.0 Mqps) with batch size of 25
      Time taken per op is 0.073us (13.7 Mqps) with batch size of 25
      Time taken per op is 0.073us (13.7 Mqps) with batch size of 25
      With 80000000 items, utilization is 59.60%, number of hash functions: 2.
      Time taken per op is 0.076us (13.1 Mqps) with batch size of 50
      Time taken per op is 0.072us (13.8 Mqps) with batch size of 50
      Time taken per op is 0.072us (13.8 Mqps) with batch size of 50
      With 80000000 items, utilization is 59.60%, number of hash functions: 2.
      Time taken per op is 0.077us (13.0 Mqps) with batch size of 100
      Time taken per op is 0.074us (13.6 Mqps) with batch size of 100
      Time taken per op is 0.073us (13.6 Mqps) with batch size of 100
      
      With 70000000 items, utilization is 52.15%, number of hash functions: 2.
      Time taken per op is 0.190us (5.3 Mqps) with batch size of 0
      Time taken per op is 0.186us (5.4 Mqps) with batch size of 0
      Time taken per op is 0.184us (5.4 Mqps) with batch size of 0
      With 70000000 items, utilization is 52.15%, number of hash functions: 2.
      Time taken per op is 0.079us (12.7 Mqps) with batch size of 10
      Time taken per op is 0.070us (14.2 Mqps) with batch size of 10
      Time taken per op is 0.072us (14.0 Mqps) with batch size of 10
      With 70000000 items, utilization is 52.15%, number of hash functions: 2.
      Time taken per op is 0.080us (12.5 Mqps) with batch size of 25
      Time taken per op is 0.072us (14.0 Mqps) with batch size of 25
      Time taken per op is 0.071us (14.1 Mqps) with batch size of 25
      With 70000000 items, utilization is 52.15%, number of hash functions: 2.
      Time taken per op is 0.082us (12.1 Mqps) with batch size of 50
      Time taken per op is 0.071us (14.1 Mqps) with batch size of 50
      Time taken per op is 0.073us (13.6 Mqps) with batch size of 50
      With 70000000 items, utilization is 52.15%, number of hash functions: 2.
      Time taken per op is 0.080us (12.5 Mqps) with batch size of 100
      Time taken per op is 0.077us (13.0 Mqps) with batch size of 100
      Time taken per op is 0.078us (12.8 Mqps) with batch size of 100
      
      Test Plan:
      make check all
      make valgrind_check
      make asan_check
      
      Reviewers: sdong, ljin
      
      Reviewed By: ljin
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D22539
      d20b8cfa
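      An illustration of the bucket-selection change; the two helpers below are hypothetical stand-ins for the reader's internal logic.
      ```cpp
      // With a power-of-two bucket count, a bitwise AND with (num_buckets - 1)
      // yields the same bucket range as the slower modulo.
      #include <cstdint>

      inline uint64_t BucketIdMod(uint64_t hash, uint64_t num_buckets) {
        return hash % num_buckets;             // general, but slower
      }

      inline uint64_t BucketIdMask(uint64_t hash, uint64_t num_buckets_pow2) {
        return hash & (num_buckets_pow2 - 1);  // requires num_buckets_pow2 == 2^k
      }
      ```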
  27. 29 August 2014, 1 commit
    • Implementing a cache-friendly version of Cuckoo Hash · 7f714483
      Committed by Radheshyam Balasundaram
      Summary: This implements a cache-friendly version of Cuckoo Hash in which, in case of collision, we try to insert into the next few locations. The size of the neighborhood to check is taken as an input parameter to the builder and stored in the table. A conceptual sketch follows this entry.
      
      Test Plan:
      make check all
      cuckoo_table_{db,reader,builder}_test
      
      Reviewers: sdong, ljin
      
      Reviewed By: ljin
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D22455
      7f714483
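      A conceptual, simplified sketch of the neighborhood probing; the function and empty-bucket convention below are hypothetical, not the builder's actual code.
      ```cpp
      // On a collision, probe the next few adjacent buckets (which likely share
      // a cache line) before resorting to cuckoo displacement with another hash.
      #include <cstdint>
      #include <vector>

      bool TryInsertInNeighborhood(std::vector<int64_t>* buckets, uint64_t base,
                                   int64_t value, uint32_t neighborhood) {
        for (uint32_t i = 0; i < neighborhood; ++i) {
          uint64_t idx = (base + i) % buckets->size();
          if ((*buckets)[idx] == -1) {  // -1 marks an empty bucket here
            (*buckets)[idx] = value;
            return true;
          }
        }
        return false;  // caller falls back to the next hash / displacement
      }
      ```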
  28. 28 August 2014, 1 commit
  29. 12 August 2014, 1 commit
    • Integrating Cuckoo Hash SST Table format into RocksDB · 9674c11d
      Committed by Radheshyam Balasundaram
      Summary:
      Contains the following changes:
      - Implementation of cuckoo_table_factory
      - Adding the cuckoo table into AdaptiveTableFactory (a hedged usage sketch follows this entry)
      - Adding cuckoo_table_db_test, similar to lines of plain_table_db_test
      - Minor fixes to Reader: When a key is found in the table, return the key found instead of the search key.
      - Minor fixes to Builder: Add table properties that are required by Version::UpdateTemporaryStats() during Get operation. Don't define curr_node as a reference variable as the memory locations may get reassigned during tree.push_back operation, leading to invalid memory access.
      
      Test Plan:
      cuckoo_table_reader_test --enable_perf
      cuckoo_table_builder_test
      cuckoo_table_db_test
      make check all
      make valgrind_check
      make asan_check
      
      Reviewers: sdong, igor, yhchiang, ljin
      
      Reviewed By: ljin
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D21219
      9674c11d
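      A hedged sketch of selecting the cuckoo SST format for a DB; the factory function and options struct (NewCuckooTableFactory, CuckooTableOptions) are assumed to match rocksdb/table.h, and its parameters are left at defaults.
      ```cpp
      // Hedged sketch: plug the cuckoo table factory into Options so new SST
      // files use the cuckoo hash format.
      #include "rocksdb/db.h"
      #include "rocksdb/table.h"

      rocksdb::Options MakeCuckooOptions() {
        rocksdb::Options options;
        rocksdb::CuckooTableOptions cuckoo_opts;  // hash_table_ratio etc. left at defaults
        options.table_factory.reset(rocksdb::NewCuckooTableFactory(cuckoo_opts));
        return options;
      }
      ```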
  30. 06 August 2014, 1 commit
    • Changing implementation of CuckooTableBuilder to not take file_size, key_length, value_length. · 606a1267
      Committed by Radheshyam Balasundaram
      Summary:
       - Maintain a list of key-value pairs as vectors during Add operation.
       - Start building hash table only when Finish() is called.
       - This approach takes more time and space but avoids taking file_size, key and value lengths.
       - Rewrote cuckoo_table_builder_test
      
      I did not know about IterKey while writing this diff. I shall change the places where IterKey could be used instead of std::string tomorrow. Please review the rest of the logic.
      
      Test Plan:
      cuckoo_table_reader_test --enable_perf
      cuckoo_table_builder_test
      valgrind_check
      asan_check
      
      Reviewers: sdong, igor, yhchiang, ljin
      
      Reviewed By: ljin
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D20907
      606a1267
  31. 01 August 2014, 1 commit
  32. 29 July 2014, 1 commit
    • Minor changes to CuckooTableBuilder · 91c01485
      Committed by Radheshyam Balasundaram
      Summary:
      - Copy the key and value to the in-memory hash table during the Add operation. Also modified cuckoo_table_reader_test to use this.
      - Store only the user_key in the in-memory hash table if it is a last-level file.
      - Handle carryover while choosing the unused key in the Finish() method, in case an unused key was never found before the Finish() call.
      
      Test Plan:
      cuckoo_table_reader_test --enable_perf
      cuckoo_table_builder_test
      valgrind_check
      asan_check
      
      Reviewers: sdong, yhchiang, igor, ljin
      
      Reviewed By: ljin
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D20715
      91c01485
  33. 26 July 2014, 1 commit
    • Implementation of CuckooTableReader · 62f9b071
      Committed by Radheshyam Balasundaram
      Summary:
      Contains:
      - Implementation of TableReader based on Cuckoo Hashing
      - Unittests for CuckooTableReader
      - Performance test for TableReader
      
      Test Plan:
      make cuckoo_table_reader_test
      ./cuckoo_table_reader_test
      make valgrind_check
      make asan_check
      
      Reviewers: yhchiang, sdong, igor, ljin
      
      Reviewed By: ljin
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D20511
      62f9b071
  34. 25 July 2014, 1 commit
    • Addressing TODOs in CuckooTableBuilder · 07a7d870
      Committed by Radheshyam Balasundaram
      Summary:
      Contains the following changes in CuckooTableBuilder:
      - Take an extra parameter in constructor to identify last level file.
      - Implement a better way to identify if a bucket has been inserted into the tree already during BFS search.
      - Minor typos
      
      Test Plan:
      make cuckoo_table_builder
      ./cuckoo_table_builder
      make valgrind_check
      
      Reviewers: sdong, igor, yhchiang, ljin
      
      Reviewed By: ljin
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D20445
      07a7d870
  35. 22 July 2014, 2 commits