1. 24 Aug, 2018 1 commit
  2. 10 Aug, 2018 1 commit
    • Index value delta encoding (#3983) · caf0f53a
      Maysam Yabandeh committed
      Summary:
      Given that an index value is a BlockHandle, which is basically an <offset, size> pair, we can apply delta encoding to the values. The first value at each index restart interval encodes the full BlockHandle, but the rest encode only the size. Refer to IndexBlockIter::DecodeCurrentValue for the details of the encoding. This reduces the index size, which helps use the block cache more efficiently. The feature is enabled by using format_version 4.
      
      The feature comes with a bit of CPU overhead, which should be paid back by the higher cache hit rate due to the smaller index block size.
      Results with sysbench read-only using 4k blocks and an index restart interval of 16:
      Format 2:
      19585   rocksdb read-only range=100
      Format 3:
      19569   rocksdb read-only range=100
      Format 4:
      19352   rocksdb read-only range=100
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/3983
      
      Differential Revision: D8361343
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: f882ee082322acac32b0072e2bdbb0b5f854e651
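The delta encoding described in this commit can be illustrated with a simplified sketch. It is not the RocksDB implementation (see IndexBlockIter::DecodeCurrentValue for that). The `Handle` struct, the `kTrailer` constant, and the helper names are assumptions made for illustration, as is the premise that consecutive blocks are laid out back to back with a fixed-size trailer between them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a block handle: an <offset, size> pair.
struct Handle {
  uint64_t offset;
  uint64_t size;
};

// Assumed number of bytes between the end of one block and the start of the next.
constexpr uint64_t kTrailer = 5;

// Appends v to dst as a little-endian base-128 varint.
void PutVarint64(std::string* dst, uint64_t v) {
  while (v >= 0x80) {
    dst->push_back(static_cast<char>((v & 0x7f) | 0x80));
    v >>= 7;
  }
  dst->push_back(static_cast<char>(v));
}

// Reads a varint starting at *pos; returns false on truncated input.
bool GetVarint64(const std::string& src, size_t* pos, uint64_t* v) {
  uint64_t result = 0;
  for (int shift = 0; shift <= 63 && *pos < src.size(); shift += 7) {
    uint64_t byte = static_cast<unsigned char>(src[(*pos)++]);
    result |= (byte & 0x7f) << shift;
    if ((byte & 0x80) == 0) {
      *v = result;
      return true;
    }
  }
  return false;
}

// The first handle of each restart interval stores offset and size;
// the others store only the size.
std::string EncodeHandles(const std::vector<Handle>& handles,
                          size_t restart_interval) {
  assert(restart_interval > 0);
  std::string out;
  for (size_t i = 0; i < handles.size(); ++i) {
    if (i % restart_interval == 0) PutVarint64(&out, handles[i].offset);
    PutVarint64(&out, handles[i].size);
  }
  return out;
}

// Rebuilds the handles; a missing offset is derived from the previous handle.
std::vector<Handle> DecodeHandles(const std::string& in, size_t count,
                                  size_t restart_interval) {
  assert(restart_interval > 0);
  std::vector<Handle> handles;
  size_t pos = 0;
  for (size_t i = 0; i < count; ++i) {
    Handle h{0, 0};
    if (i % restart_interval == 0) {
      if (!GetVarint64(in, &pos, &h.offset)) break;
    } else {
      const Handle& prev = handles.back();
      h.offset = prev.offset + prev.size + kTrailer;
    }
    if (!GetVarint64(in, &pos, &h.size)) break;
    handles.push_back(h);
  }
  return handles;
}
```

With a restart interval of 16, only one handle in sixteen pays for a full offset in this sketch, which is where the space saving would come from.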
  3. 17 Jul, 2018 1 commit
    • Separate some IndexBlockIter logic from BlockIter (#4136) · 8f06b4fa
      Siying Dong committed
      Summary:
      Some logic related only to IndexBlockIter is moved from BlockIter to IndexBlockIter. This is done by writing a dedicated Seek() and SeekForPrev() for DataBlockIter; all metadata block iterators and tombstone block iterators now use the data block iterator. The BinarySeek() sharing problem is handled by passing in the comparator to use.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4136
      
      Reviewed By: maysamyabandeh
      
      Differential Revision: D8859673
      
      Pulled By: siying
      
      fbshipit-source-id: 703e5e6824b82b7cbf4721f3594b94127797ca9e
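The idea of sharing one binary search between iterator types by passing in the comparator can be sketched as follows. This is an illustration only: the function operates on a plain vector of keys rather than on restart points in a block, and its signature is an assumption, not the signature of RocksDB's BinarySeek().

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns the index of the last key that compares less than target,
// or -1 if no such key exists. The ordering is supplied by the caller,
// so the same search can serve callers that order keys differently.
template <typename Less>
int BinarySeek(const std::vector<std::string>& keys, const std::string& target,
               Less less) {
  // The keys must already be sorted under the comparator that is passed in.
  assert(std::is_sorted(keys.begin(), keys.end(), less));
  size_t lo = 0;
  size_t hi = keys.size();
  // Invariant: keys before lo are less than target, keys from hi on are not.
  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    if (less(keys[mid], target)) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  return static_cast<int>(lo) - 1;
}
```

Two callers that disagree on what "less" means, for example one comparing whole keys and one comparing only a prefix, can get different positions from the same key list, which is why the comparator has to be a parameter.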
  4. 13 Jul, 2018 1 commit
    • Refactor BlockIter (#4121) · d4ad32d7
      Maysam Yabandeh committed
      Summary:
      BlockIter is getting crowded with details that are specific to only index blocks or data blocks. The patch moves such details down to DataBlockIter and IndexBlockIter, both of which inherit from BlockIter.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4121
      
      Differential Revision: D8816832
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: d492e74155c11d8a0c1c85cd7ee33d24c7456197
  5. 26 May, 2018 1 commit
    • Exclude seq from index keys · 402b7aa0
      Maysam Yabandeh committed
      Summary:
      Index blocks have the same format as data blocks. The keys are therefore internal keys, as in data blocks, which means that in addition to the user key each one also has 8 bytes that encode the sequence number and value type. These extra 8 bytes, however, are not necessary in index blocks, since the index keys act as separators between two data blocks. The only exception is when the last key of a block and the first key of the next block share the same user key, in which case the sequence number is required to act as a separator.
      The patch excludes the sequence number from index keys only if the above special case does not happen for any of the index keys. It then records that in the property block. The reader looks at the property block to see if it should expect sequence numbers in the keys of the index block.
      Closes https://github.com/facebook/rocksdb/pull/3894
      
      Differential Revision: D8118775
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 915479f028b5799ca91671d67455ecdefbd873bd
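The condition under which the sequence number can be dropped from index keys can be sketched as a check over block boundaries. The helper names and the `boundaries` representation are hypothetical; the only fact taken from the commit message is that an internal key is a user key followed by 8 bytes of sequence number and value type.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Size of the suffix that packs the sequence number and value type.
constexpr size_t kFooterSize = 8;

// Returns the user-key part of an internal key (everything but the last 8 bytes).
std::string ExtractUserKey(const std::string& internal_key) {
  assert(internal_key.size() >= kFooterSize);
  return internal_key.substr(0, internal_key.size() - kFooterSize);
}

// Each element holds {last internal key of block i, first internal key of block i+1}.
// The sequence number can be dropped from every index key only if no boundary
// has the same user key on both sides.
bool CanDropSequence(
    const std::vector<std::pair<std::string, std::string>>& boundaries) {
  for (const auto& boundary : boundaries) {
    if (ExtractUserKey(boundary.first) == ExtractUserKey(boundary.second)) {
      return false;
    }
  }
  return true;
}
```

In this sketch a single boundary with a shared user key forces sequence numbers to be kept for the whole index, which mirrors the all-or-nothing decision the commit records in the property block.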
  6. 06 Mar, 2018 1 commit
  7. 23 Feb, 2018 2 commits
  8. 03 Jan, 2018 1 commit
    • Speed up BlockTest.BlockReadAmpBitmap · ccc095a0
      Siying Dong committed
      Summary:
      BlockTest.BlockReadAmpBitmap is too slow and times out in some environments. Speed it up by:
      (1) improving the way the verification is done. With this it is 5 times faster.
      (2) running fewer tests for large blocks. This cuts it down by another 10 times.
      Now it can finish in a similar time to the other tests.
      Closes https://github.com/facebook/rocksdb/pull/3313
      
      Differential Revision: D6643711
      
      Pulled By: siying
      
      fbshipit-source-id: c2397d666eab5421a78ca87e1e45491e0f832a6d
  9. 22 Jul, 2017 2 commits
  10. 16 Jul, 2017 1 commit
  11. 10 May, 2017 1 commit
    • unbiase readamp bitmap · 259a00ea
      Aaron Gao committed
      Summary:
      Consider BlockReadAmpBitmap with bytes_per_bit = 32. Suppose bytes [a, b) were used, while bytes [a-32, a) and [b+1, b+33) weren't used; more formally, the union of ranges passed to BlockReadAmpBitmap::Mark() contains [a, b) and doesn't intersect with [a-32, a) and [b+1, b+33). Then bits [floor(a/32), ceil(b/32)] will be set, and so the number of useful bytes will be estimated as (ceil(b/32) - floor(a/32)) * 32, which is on average equal to b-a+31.
      
      An extreme example: if we use 1 byte from each block, it'll be counted as 32 bytes from each block.
      
      It's easy to remove this bias by slightly changing the semantics of the bitmap. Currently each bit represents a byte range [i*32, (i+1)*32).
      
      This diff makes each bit represent a single byte: i*32 + X, where X is a random number in [0, 31] generated when bitmap is created. So, e.g., if you read a single byte at random, with probability 31/32 it won't be counted at all, and with probability 1/32 it will be counted as 32 bytes; so, on average it's counted as 1 byte.
      
      *But there is one exception: the last bit will always be set the old way.*
      
      (*) - assuming read_amp_bytes_per_bit = 32.
      Closes https://github.com/facebook/rocksdb/pull/2259
      
      Differential Revision: D5035652
      
      Pulled By: lightmark
      
      fbshipit-source-id: bd98b1b9b49fbe61f9e3781d07f624e3cbd92356
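The sampling idea in this commit can be sketched as follows. This is a simplified illustration, not BlockReadAmpBitmap itself: the class name is hypothetical, the random offset X is passed in by the caller so that the behavior is reproducible, Mark() takes a half-open range, and the last-bit exception mentioned above is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: bit i represents the single byte i * bytes_per_bit + X.
class SampledBitmap {
 public:
  // bytes_per_bit must be nonzero and sample_offset (X) must be smaller than it.
  SampledBitmap(size_t block_size, size_t bytes_per_bit, size_t sample_offset)
      : bytes_per_bit_(bytes_per_bit),
        sample_offset_(sample_offset),
        bits_((block_size + bytes_per_bit - 1) / bytes_per_bit, false) {
    assert(bytes_per_bit > 0 && sample_offset < bytes_per_bit);
  }

  // Marks the half-open byte range [begin, end) as used. A bit is set only
  // if its sampled byte lies inside the range.
  void Mark(size_t begin, size_t end) {
    if (begin >= end) return;
    // First bit whose sampled byte is at or after begin.
    size_t first =
        begin <= sample_offset_
            ? 0
            : (begin - sample_offset_ + bytes_per_bit_ - 1) / bytes_per_bit_;
    for (size_t i = first;
         i < bits_.size() && i * bytes_per_bit_ + sample_offset_ < end; ++i) {
      bits_[i] = true;
    }
  }

  // Each set bit counts as bytes_per_bit useful bytes.
  size_t EstimatedUsefulBytes() const {
    size_t set_bits = 0;
    for (bool bit : bits_) set_bits += bit ? 1 : 0;
    return set_bits * bytes_per_bit_;
  }

 private:
  size_t bytes_per_bit_;
  size_t sample_offset_;
  std::vector<bool> bits_;
};
```

Reading one byte hits the sampled position for exactly one of the 32 possible values of X, so averaged over X the estimate for a single-byte read in this sketch is 1 byte, matching the argument in the commit message.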
  12. 28 Apr, 2017 1 commit
  13. 07 Apr, 2017 1 commit
  14. 19 Oct, 2016 1 commit
    • Support SST files with Global sequence numbers [reland] · b88f8e87
      Islam AbdelRahman committed
      Summary:
      reland https://reviews.facebook.net/D62523
      
      - Update SstFileWriter to include a property for a global sequence number in the SST file `rocksdb.external_sst_file.global_seqno`
      - Update TableProperties to be aware of the offset of each property in the file
      - Update BlockBasedTableReader and Block to be able to honor the sequence number in the `rocksdb.external_sst_file.global_seqno` property and use it to overwrite all sequence numbers in the file
      
      Something worth mentioning is that we don't update the seqno in the index block or when doing a binary search. The reason is that SST files with a global seqno are guaranteed to have only one user_key, and each key will have seqno=0 encoded in it. This means that this key is greater than any other key with seqno > 0, so we can keep the current logic for these blocks.
      
      Test Plan: unit tests
      
      Reviewers: sdong, yhchiang
      
      Subscribers: andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D65211
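The claim that a key with seqno=0 is greater than any key with the same user key and seqno > 0 follows from how internal keys are ordered: by user key ascending, then by sequence number descending. A minimal sketch of that ordering, using a hypothetical `ParsedKey` struct and a bytewise user-key comparison as assumptions:

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical parsed form of an internal key (value type omitted).
struct ParsedKey {
  std::string user_key;
  uint64_t seqno;
};

// Returns -1, 0, or 1. User keys sort ascending; for equal user keys the
// larger sequence number sorts first, so seqno=0 sorts last.
int CompareInternal(const ParsedKey& a, const ParsedKey& b) {
  int r = a.user_key.compare(b.user_key);
  if (r != 0) return r < 0 ? -1 : 1;
  if (a.seqno > b.seqno) return -1;
  if (a.seqno < b.seqno) return 1;
  return 0;
}
```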
  15. 08 Oct, 2016 1 commit
  16. 04 Oct, 2016 1 commit
    • Support SST files with Global sequence numbers · ab01da54
      Islam AbdelRahman committed
      Summary:
      - Update SstFileWriter to include a property for a global sequence number in the SST file `rocksdb.external_sst_file.global_seqno`
      - Update TableProperties to be aware of the offset of each property in the file
      - Update BlockBasedTableReader and Block to be able to honor the sequence number in the `rocksdb.external_sst_file.global_seqno` property and use it to overwrite all sequence numbers in the file
      
      Something worth mentioning is that we don't update the seqno in the index block or when doing a binary search. The reason is that SST files with a global seqno are guaranteed to have only one user_key, and each key will have seqno=0 encoded in it. This means that this key is greater than any other key with seqno > 0, so we can keep the current logic for these blocks.
      
      Test Plan: unit tests
      
      Reviewers: andrewkr, yhchiang, yiwu, sdong
      
      Reviewed By: sdong
      
      Subscribers: hcz, andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D62523
  17. 17 Sep, 2016 1 commit
  18. 06 Sep, 2016 1 commit
  19. 01 Sep, 2016 1 commit
  20. 27 Aug, 2016 1 commit
    • Introduce Read amplification bitmap (read amp statistics) · b49b92cf
      Islam AbdelRahman committed
      Summary:
      Add the ReadOptions::read_amp_bytes_per_bit option, which allows us to create a bitmap for every data block we read. The bitmap will contain (block_size / read_amp_bytes_per_bit) bits.
      
      We will use this bitmap to mark which bytes of the block have been used, so that we can calculate the read amplification.
      
      Test Plan: added new tests
      
      Reviewers: andrewkr, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: yiwu, leveldb, march, andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D58707
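A per-block bitmap of this kind can be sketched as follows, with each bit covering a range of bytes_per_bit bytes. The class and function names are hypothetical, the bit count is rounded up for blocks whose size is not a multiple of bytes_per_bit, and read amplification is taken to mean bytes read divided by bytes estimated as useful; all three are assumptions of the sketch rather than details from the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: bit i covers bytes [i * bytes_per_bit, (i + 1) * bytes_per_bit).
class RangeBitmap {
 public:
  RangeBitmap(size_t block_size, size_t bytes_per_bit)
      : bytes_per_bit_(bytes_per_bit),
        bits_((block_size + bytes_per_bit - 1) / bytes_per_bit, false) {
    assert(bytes_per_bit > 0);
  }

  size_t NumBits() const { return bits_.size(); }

  // Marks every bit whose range overlaps the half-open byte range [begin, end).
  void Mark(size_t begin, size_t end) {
    if (begin >= end) return;
    size_t first = begin / bytes_per_bit_;
    size_t last = (end - 1) / bytes_per_bit_;
    for (size_t i = first; i <= last && i < bits_.size(); ++i) bits_[i] = true;
  }

  // Each set bit counts as bytes_per_bit useful bytes.
  size_t UsefulBytes() const {
    size_t set_bits = 0;
    for (bool bit : bits_) set_bits += bit ? 1 : 0;
    return set_bits * bytes_per_bit_;
  }

 private:
  size_t bytes_per_bit_;
  std::vector<bool> bits_;
};

// Assumed definition: bytes read from storage per byte estimated as useful.
double ReadAmplification(size_t block_size, size_t useful_bytes) {
  return useful_bytes == 0 ? 0.0
                           : static_cast<double>(block_size) / useful_bytes;
}
```

A 4096-byte block with bytes_per_bit = 32 gets 128 bits in this sketch, and touching a single byte sets one bit, so it is counted as 32 useful bytes.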
  21. 21 May, 2016 1 commit
  22. 10 Feb, 2016 1 commit
  23. 14 Oct, 2015 1 commit
    • Seperate InternalIterator from Iterator · 35ad531b
      sdong committed
      Summary:
      Separate a new class InternalIterator from class Iterator for look-ups that are done internally, which also means that they operate on keys with sequence ID and type.
      
      This change will enable potential future optimizations, but for now InternalIterator's functions are still the same as Iterator's.
      At the same time, move the cleanup function to a separate class and let both InternalIterator and Iterator inherit from it.
      
      Test Plan: Run all existing tests.
      
      Reviewers: igor, yhchiang, anthony, kradhakrishnan, IslamAbdelRahman, rven
      
      Reviewed By: rven
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D48549
  24. 18 Mar, 2015 1 commit
    • rocksdb: switch to gtest · b4b69e4f
      Igor Sugak committed
      Summary:
      Our existing test notation is very similar to what is used in gtest. It makes it easy to adopt what is different.
      In this diff I modify existing [[ https://code.google.com/p/googletest/wiki/Primer#Test_Fixtures:_Using_the_Same_Data_Configuration_for_Multiple_Te | test fixture ]] classes to inherit from `testing::Test`. Also for unit tests that use fixture class, `TEST` is replaced with `TEST_F` as required in gtest.
      
      There are several custom `main` functions in our existing tests. To make this transition easier, I modify all `main` functions to follow gtest notation. But eventually we can remove them and use the implementation of `main` that gtest provides.
      
      ```lang=bash
      % cat ~/transform
      #!/bin/sh
      files=$(git ls-files '*test\.cc')
      for file in $files
      do
        if grep -q "rocksdb::test::RunAllTests()" $file
        then
          if grep -Eq '^class \w+Test {' $file
          then
            perl -pi -e 's/^(class \w+Test) {/${1}: public testing::Test {/g' $file
            perl -pi -e 's/^(TEST)/${1}_F/g' $file
          fi
          perl -pi -e 's/(int main.*\{)/${1}::testing::InitGoogleTest(&argc, argv);/g' $file
          perl -pi -e 's/rocksdb::test::RunAllTests/RUN_ALL_TESTS/g' $file
        fi
      done
      % sh ~/transform
      % make format
      ```
      
      Second iteration of this diff contains only scripted changes.
      
      Third iteration contains manual changes to fix last errors and make it compilable.
      
      Test Plan:
      Build and notice no errors.
      ```lang=bash
      % USE_CLANG=1 make check -j55
      ```
      Tests are still testing.
      
      Reviewers: meyering, sdong, rven, igor
      
      Reviewed By: igor
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D35157
  25. 12 Nov, 2014 1 commit
    • Turn on -Wshorten-64-to-32 and fix all the errors · 767777c2
      Igor Canadi committed
      Summary:
      We need to turn on -Wshorten-64-to-32 for mobile. See D1671432 (internal phabricator) for details.
      
      This diff turns on the warning flag and fixes all the errors. There were also some interesting errors that I might call bugs, especially in plain table. Going forward, I think it makes sense to have this flag turned on and be very very careful when converting 64-bit to 32-bit variables.
      
      Test Plan: compiles
      
      Reviewers: ljin, rven, yhchiang, sdong
      
      Reviewed By: yhchiang
      
      Subscribers: bobbaldwin, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D28689
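One way to be careful when converting 64-bit values to 32-bit ones is to make the range check explicit instead of relying on an implicit conversion. The helper below is a hypothetical illustration and is not taken from the RocksDB code base.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: narrows value to 32 bits only if it fits.
// Returns false and leaves *out untouched when the value is too large.
bool CheckedNarrow(uint64_t value, uint32_t* out) {
  if (value > std::numeric_limits<uint32_t>::max()) return false;
  // The explicit cast documents the narrowing and keeps the warning quiet.
  *out = static_cast<uint32_t>(value);
  return true;
}
```

An implicit assignment such as `uint32_t n = some_uint64;` is the pattern that -Wshorten-64-to-32 is meant to flag; the explicit check makes the overflow case a visible code path.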
  26. 18 Sep, 2014 2 commits
  27. 03 Sep, 2014 1 commit
    • Fix compile · 076bd01a
      Igor Canadi committed
      Summary: gcc on our dev boxes is not happy about __attribute__((unused))
      
      Test Plan: compiles now
      
      Reviewers: sdong, ljin
      
      Reviewed By: ljin
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D22707
  28. 26 Aug, 2014 2 commits
  29. 16 May, 2014 1 commit
  30. 11 Apr, 2014 1 commit
  31. 04 Feb, 2014 1 commit
  32. 17 Nov, 2013 1 commit
    • Make the options in table_builder/block_builder less misleading · 7604e2f7
      Kai Liu committed
      Summary:
      By original design, the regular `block options` and index `block options` in table_builder are mutable. We can use ChangeOptions to change the options directly.
      
      However, with my last change, `BlockBuilder` no longer holds the reference to the index_block_options. As a result, any changes made after the creation of the index block builder will have no effect.
      
      But the code is still very error-prone, and developers can easily fall into the trap without being aware of it. To keep this problem from happening in the future, I deleted `ChangeOptions` and `index_block_options`, and made many other changes to make the code less misleading.
      
      Test Plan:
      make
      make check
      make release
      
      Reviewers: dhruba, haobo
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13707
  33. 29 Oct, 2013 1 commit
    • Make "Table" pluggable · d4eec30e
      Siying Dong committed
      Summary: This patch makes Table and TableBuilder abstract classes and moves all the implementation of the current table into BlockedBasedTable and BlockedBasedTable Builder.
      
      Test Plan: Make db_test.cc work with the block based table. Add a new test simple_table_db_test.cc where a different simple table format is implemented.
      
      Reviewers: dhruba, haobo, kailiu, emayanke, vamsi
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D13521
  34. 17 Oct, 2013 1 commit
  35. 05 Oct, 2013 1 commit
  36. 24 Aug, 2013 1 commit