1. 24 Oct, 2020 1 commit
    • Allow compaction iterator to perform garbage collection (#7556) · 65952679
      Yanqin Jin committed
      Summary:
      Add a threshold timestamp, full_history_ts_low_ of type `std::string*` to
      `CompactionIterator`, so that RocksDB can also perform garbage collection during
      compaction.
      * If full_history_ts_low_ is nullptr, then compaction iterator does not perform
        GC, preserving all timestamp history for all keys. Compaction iterator will
        treat user keys with different timestamps as different user keys.
      * If full_history_ts_low_ is not nullptr, then compaction iterator performs
        GC. GC will look at keys older than `*full_history_ts_low_` and determine their
        eligibility based on factors including snapshots.
      
      Current rules of GC:
      * If an internal key is in the same snapshot as a previous counterpart
        with the same user key, and this key is eligible for GC, and the key is
        not single-delete or merge operand, then this key can be dropped. Note
        that the previous internal key cannot be a merge operand either.
      * If a tombstone is the most recent one in the earliest snapshot and it
        is eligible for GC, and keyNotExistsBeyondLevel() is true, then this
        tombstone can be dropped.
      * If a tombstone is the most recent one in a snapshot and it is eligible
        for GC, and the compaction is at bottommost level, then all other older
        internal keys of the same user key must also be eligible for GC, thus
        can be dropped.
      * Single-delete, delete-range and merge are not currently supported.
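The first rule above can be illustrated with a minimal sketch. It is not the `CompactionIterator` code: `ToyVersion`, `SnapshotStripe` and `ApplyFirstGcRule` are hypothetical names, timestamps are modeled as plain integers, and "eligible for GC" is assumed to mean "timestamp older than the threshold".

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical, simplified model of one internal key of a single user key.
enum class ToyType { kValue, kDeletion, kSingleDeletion, kMerge };

struct ToyVersion {
  uint64_t seq;  // sequence number
  uint64_t ts;   // user timestamp, modeled as an integer (assumption)
  ToyType type;
};

// Two keys are "in the same snapshot" when the earliest snapshot that can see
// them is the same. `snapshots` must be sorted in ascending order.
uint64_t SnapshotStripe(uint64_t seq, const std::vector<uint64_t>& snapshots) {
  assert(std::is_sorted(snapshots.begin(), snapshots.end()));
  auto it = std::lower_bound(snapshots.begin(), snapshots.end(), seq);
  return it == snapshots.end() ? UINT64_MAX : *it;
}

// `versions` holds all versions of one user key, newest first.
// An empty threshold models full_history_ts_low_ == nullptr (no GC).
std::vector<ToyVersion> ApplyFirstGcRule(
    const std::vector<ToyVersion>& versions,
    const std::vector<uint64_t>& snapshots,
    const std::optional<uint64_t>& full_history_ts_low) {
  std::vector<ToyVersion> kept;
  for (std::size_t i = 0; i < versions.size(); ++i) {
    const ToyVersion& cur = versions[i];
    bool drop = false;
    if (full_history_ts_low.has_value() && i > 0) {
      const ToyVersion& prev = versions[i - 1];  // newer counterpart
      const bool same_snapshot = SnapshotStripe(prev.seq, snapshots) ==
                                 SnapshotStripe(cur.seq, snapshots);
      const bool eligible = cur.ts < *full_history_ts_low;
      const bool supported = cur.type != ToyType::kSingleDeletion &&
                             cur.type != ToyType::kMerge &&
                             prev.type != ToyType::kMerge;
      drop = same_snapshot && eligible && supported;
    }
    if (!drop) {
      kept.push_back(cur);
    }
  }
  return kept;
}
```

With versions at sequence numbers 30, 20 and 10, one snapshot at 15 and a threshold of 25, the version at 20 is dropped because it shares a snapshot with the version at 30. The version at 10 is kept because the snapshot at 15 still needs it.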
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7556
      
      Test Plan: make check
      
      Reviewed By: ltamasi
      
      Differential Revision: D24507728
      
      Pulled By: riversand963
      
      fbshipit-source-id: 3c09c7301f41eed76dfcf4d1527e68cf6e0a8bb3
  2. 10 Jul, 2020 1 commit
    • More Makefile Cleanup (#7097) · c7c7b07f
      mrambacher committed
      Summary:
      Cleans up some of the dependencies on test code in the Makefile while building tools:
      - Moves the test::RandomString, DBBaseTest::RandomString into Random
      - Moves the test::RandomHumanReadableString into Random
      - Moves the DestroyDir method into file_utils
      - Moves the SetupSyncPointsToMockDirectIO into sync_point.
      - Moves the FaultInjection Env and FS classes under env
      
      These changes allow all of the tools to build without dependencies on test_util, thereby simplifying the build dependencies.  By moving the FaultInjection code, the dependency in db_stress on different libraries for debug vs release was eliminated.
      
      Tested both release and debug builds via Make and CMake for both static and shared libraries.
      
      More work remains to clean up how the tools are built and remove some unnecessary dependencies.  There is also more work that should be done to get the Makefile and CMake to align in their builds -- what is in the libraries and the sizes of the executables are different.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7097
      
      Reviewed By: riversand963
      
      Differential Revision: D22463160
      
      Pulled By: pdillinger
      
      fbshipit-source-id: e19462b53324ab3f0b7c72459dbc73165cc382b2
  3. 06 Jun, 2020 1 commit
    • Check iterator status BlockBasedTableReader::VerifyChecksumInBlocks() (#6909) · 98b0cbea
      anand76 committed
      Summary:
      The ```for``` loop in ```VerifyChecksumInBlocks``` only checks ```index_iter->Valid()```, which could be ```false``` either because the end of the index was reached or, in the case of a partitioned index, because of a checksum mismatch error when reading a 2nd level index block. Instead of throwing away the index iterator status, we need to return any errors back to the caller.
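A minimal sketch of the pattern, assuming a toy iterator rather than the real index iterator (`ToyStatus`, `ToyIndexIter` and `VerifyAllBlocks` are hypothetical names): the loop ends when `Valid()` turns false, and only `status()` can tell a clean end from an error.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a status object.
struct ToyStatus {
  bool ok = true;
  std::string message;
};

// Toy iterator over `num_blocks` index entries. It becomes invalid either at
// the end or when it reaches a corrupt entry. Pass corrupt_at == num_blocks
// to model an index without corruption.
class ToyIndexIter {
 public:
  ToyIndexIter(std::size_t num_blocks, std::size_t corrupt_at)
      : num_blocks_(num_blocks), corrupt_at_(corrupt_at) {}

  bool Valid() const { return pos_ < num_blocks_ && pos_ != corrupt_at_; }

  void Next() {
    assert(Valid());
    ++pos_;
  }

  ToyStatus status() const {
    if (pos_ < num_blocks_ && pos_ == corrupt_at_) {
      return {false, "checksum mismatch"};
    }
    return {};
  }

 private:
  std::size_t num_blocks_;
  std::size_t corrupt_at_;
  std::size_t pos_ = 0;
};

// Visits every entry, then reports the iterator status instead of assuming
// that the loop stopped because the end was reached.
ToyStatus VerifyAllBlocks(ToyIndexIter* iter, std::size_t* verified) {
  for (; iter->Valid(); iter->Next()) {
    ++*verified;
  }
  return iter->status();
}
```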
      
      Tests:
      Add a test in block_based_table_reader_test.cc.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6909
      
      Reviewed By: pdillinger
      
      Differential Revision: D21833922
      
      Pulled By: anand1976
      
      fbshipit-source-id: bc778ebf1121dbbdd768689de5183f07a9f0beae
  4. 09 May, 2020 1 commit
    • prototype status check enforcement (#6798) · 1c846604
      Andrew Kryczka committed
      Summary:
      Tried making Status object enforce that it is checked in some way. In cases it is not checked, `PermitUncheckedError()` must be called explicitly.
      
      Added a way to run tests (`ASSERT_STATUS_CHECKED=1 make -j48 check`) on a
      whitelist. The effort appears significant to get each test to pass with
      this assertion, so I only fixed up enough to get one test (`options_test`)
      working and added it to the whitelist.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6798
      
      Reviewed By: pdillinger
      
      Differential Revision: D21377404
      
      Pulled By: ajkr
      
      fbshipit-source-id: 73236f9c8df38f01cf24ecac4a6d1661b72d077e
  5. 25 Apr, 2020 2 commits
    • Disable O_DIRECT in stress test when db directory does not support direct IO (#6727) · 0a776178
      Cheng Chang committed
      Summary:
      In crash test, the db directory might be set to /dev/shm or /tmp. In certain environments, such as internal testing infrastructure, neither of these directories supports direct IO, so direct IO is never enabled in crash test.
      
      This PR sets up SyncPoints in direct IO related code paths to disable O_DIRECT flag in calls to `open`, so the direct IO code paths will be executed, all direct IO related assertions will be checked, but no real direct IO request will be issued to the file system.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6727
      
      Test Plan:
      export CRASH_TEST_EXT_ARGS="--use_direct_reads=1 --mmap_read=0"
      make -j24 crash_test
      
      Reviewed By: zhichao-cao
      
      Differential Revision: D21139250
      
      Pulled By: cheng-chang
      
      fbshipit-source-id: db9adfe78d91aa4759835b1af91c5db7b27b62ee
    • Reduce memory copies when fetching and uncompressing blocks from SST files (#6689) · 40497a87
      Cheng Chang committed
      Summary:
      In https://github.com/facebook/rocksdb/pull/6455, we modified the interface of `RandomAccessFileReader::Read` to be able to get rid of memcpy in direct IO mode.
      This PR applies the new interface to `BlockFetcher` when reading blocks from SST files in direct IO mode.
      
      Without this PR, in direct IO mode, when fetching and uncompressing compressed blocks, `BlockFetcher` will first copy the raw compressed block into `BlockFetcher::compressed_buf_` or `BlockFetcher::stack_buf_` inside `RandomAccessFileReader::Read` depending on the block size. Then, during uncompressing, it will copy the uncompressed block into `BlockFetcher::heap_buf_`.
      
      In this PR, we get rid of the first memcpy and directly uncompress the block from `direct_io_buf_` to `heap_buf_`.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6689
      
      Test Plan: A new unit test `block_fetcher_test` is added.
      
      Reviewed By: anand1976
      
      Differential Revision: D21006729
      
      Pulled By: cheng-chang
      
      fbshipit-source-id: 2370b92c24075692423b81277415feb2aed5d980
  6. 11 Apr, 2020 1 commit
    • Compaction with timestamp: input boundaries (#6645) · 0c05624d
      Yanqin Jin committed
      Summary:
      Towards making compaction logic compatible with user timestamp.
      When computing boundaries and overlapping ranges for inputs of compaction, we need to compare SSTs by user key without timestamp.
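A minimal sketch of comparing user keys without their timestamps. It assumes the timestamp is a fixed-size suffix of the user key; `StripTimestamp` and `CompareWithoutTimestamp` are hypothetical helper names, not the comparator API.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Assumption: the timestamp occupies the last `ts_size` bytes of the key.
std::string_view StripTimestamp(std::string_view key_with_ts,
                                std::size_t ts_size) {
  assert(key_with_ts.size() >= ts_size);
  return key_with_ts.substr(0, key_with_ts.size() - ts_size);
}

// Returns <0, 0 or >0, ignoring the timestamp suffix of both keys.
int CompareWithoutTimestamp(std::string_view a, std::string_view b,
                            std::size_t ts_size) {
  return StripTimestamp(a, ts_size).compare(StripTimestamp(b, ts_size));
}
```

Under this comparison, two versions of the same user key written at different timestamps fall on the same boundary, which is what range-overlap computations need.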
      
      Test plan (devserver):
      ```
      make check
      ```
      Several individual tests:
      ```
      ./version_set_test --gtest_filter=VersionStorageInfoTimestampTest.GetOverlappingInputs
      ./db_with_timestamp_compaction_test
      ./db_with_timestamp_basic_test
      ```
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6645
      
      Reviewed By: ltamasi
      
      Differential Revision: D20960012
      
      Pulled By: riversand963
      
      fbshipit-source-id: ad377fa9eb481bf7a8a3e1824aaade48cdc653a4
  7. 21 Feb, 2020 1 commit
    • Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) · fdf882de
      sdong committed
      Summary:
      When dynamically linking two binaries together, different builds of RocksDB from two sources might cause errors. To give users a tool to solve the problem, the RocksDB namespace is changed to a flag which can be overridden at build time.
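A minimal sketch of a namespace that can be overridden at build time. The default value and the `ToyVersionTag` function are assumptions made for illustration.

```cpp
#include <cassert>

// If the build does not pass -DROCKSDB_NAMESPACE=<name>, fall back to a
// default name (assumed here to be `rocksdb`).
#ifndef ROCKSDB_NAMESPACE
#define ROCKSDB_NAMESPACE rocksdb
#endif

namespace ROCKSDB_NAMESPACE {

// Hypothetical symbol; its qualified name depends on the macro above.
inline int ToyVersionTag() { return 1; }

}  // namespace ROCKSDB_NAMESPACE
```

Compiling one copy with, for example, `-DROCKSDB_NAMESPACE=rocksdb_a` and another with `-DROCKSDB_NAMESPACE=rocksdb_b` gives the two builds distinct symbol names, so they do not collide when loaded into the same process.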
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6433
      
      Test Plan: Build release, all and jtest. Try to build with ROCKSDB_NAMESPACE with another flag.
      
      Differential Revision: D19977691
      
      fbshipit-source-id: aa7f2d0972e1c31d75339ac48478f34f6cfcfb3e
  8. 14 Feb, 2020 1 commit
    • Fix flaky test DecreaseNumBgThreads (#6393) · 46516778
      Cheng Chang committed
      Summary:
      The DecreaseNumBgThreads test keeps failing on Windows in AppVeyor.
      It fails because it depends on a timed wait for the tasks to be dequeued from the threadpool's internal queue, but within the specified time, the task might not have been scheduled onto the newly created threads.
      https://github.com/facebook/rocksdb/pull/6232 tries to fix this by waiting longer to let the threads get scheduled.
      This PR tries to fix this by replacing the timed wait with a synchronization on the task's internal condition variable.
      When the number of threads increases, instead of guessing the time needed for the task to be scheduled, it directly blocks on the condition variable until the task starts running.
      When the number of threads is reduced, it still does a timed wait, but this no longer leads to flakiness. A future PR will try to remove these timed waits.
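The difference between guessing a sleep duration and blocking until the task reports that it started can be sketched with the standard library. `ToyStartSignal` and `RunTaskAndWaitForStart` are hypothetical names and do not come from the test itself.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical helper: a task calls NotifyStarted() when it begins running,
// and the test blocks in WaitUntilStarted() instead of sleeping for a guess.
class ToyStartSignal {
 public:
  void NotifyStarted() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      started_ = true;
    }
    cv_.notify_all();
  }

  void WaitUntilStarted() {
    std::unique_lock<std::mutex> lock(mu_);
    // The predicate guards against spurious wakeups and against the
    // notification arriving before the wait begins.
    cv_.wait(lock, [this] { return started_; });
  }

  bool started() {
    std::lock_guard<std::mutex> lock(mu_);
    return started_;
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  bool started_ = false;
};

// Starts one worker and returns only after the worker has begun running.
bool RunTaskAndWaitForStart() {
  ToyStartSignal signal;
  std::thread worker([&signal] { signal.NotifyStarted(); });
  signal.WaitUntilStarted();
  const bool started = signal.started();
  worker.join();
  return started;
}
```

Because the wait has no timeout, the outcome no longer depends on how quickly the scheduler runs the worker thread.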
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6393
      
      Test Plan: Wait to see whether AppVeyor tests pass.
      
      Differential Revision: D19890928
      
      Pulled By: cheng-chang
      
      fbshipit-source-id: 4e56e4addf625c98c0876e62d9d57a6f0a156f76
  9. 08 Feb, 2020 1 commit
    • Allow readahead when reading option files. (#6372) · 876c2dbf
      sdong committed
      Summary:
      Right now, when reading from option files, no readahead is used and an 8KB buffer is used. This might introduce high latency if the file system has high latency and doesn't do readahead. Instead, introduce a readahead to the file. When calling inside DB, infer the value from options.log_readahead. Otherwise, a default 512KB readahead size is used.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6372
      
      Test Plan: Add --log_readahead_size in db_bench. Run it with several options and observe read size from option files using strace.
      
      Differential Revision: D19727739
      
      fbshipit-source-id: e6d8053b0a64259abc087f1f388b9cd66fa8a583
  10. 14 Dec, 2019 1 commit
    • Introduce a new storage specific Env API (#5761) · afa2420c
      anand76 committed
      Summary:
      The current Env API encompasses both storage/file operations and OS related operations. Most of the APIs return a Status, which does not have enough metadata about an error, such as whether it's retryable or not, the scope (i.e., fault domain) of the error, etc., that may be required in order to properly handle a storage error. The file APIs also do not provide enough control over the IO SLA, such as timeout, prioritization, hinting about placement and redundancy, etc.
      
      This PR separates out the file/storage APIs from Env into a new FileSystem class. The APIs are updated to return an IOStatus with metadata about the error, as well as to take an IOOptions structure as input in order to allow more control over the IO.
      
      The user can set both ```options.env``` and ```options.file_system``` to specify that RocksDB should use the former for OS related operations and the latter for storage operations. Internally, a ```CompositeEnvWrapper``` has been introduced that inherits from ```Env``` and redirects individual methods to either an ```Env``` implementation or the ```FileSystem``` as appropriate. When options are sanitized during ```DB::Open```, ```options.env``` is replaced with a newly allocated ```CompositeEnvWrapper``` instance if both env and file_system have been specified. This way, the rest of the RocksDB code can continue to function as before.
      
      This PR also ports PosixEnv to the new API by splitting it into two - PosixEnv and PosixFileSystem. PosixEnv is defined as a sub-class of CompositeEnvWrapper, and threading/time functions are overridden with Posix specific implementations in order to avoid an extra level of indirection.
      
      The ```CompositeEnvWrapper``` translates ```IOStatus``` return code to ```Status```, and sets the severity to ```kSoftError``` if the io_status is retryable. The error handling code in RocksDB can then recover the DB automatically.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5761
      
      Differential Revision: D18868376
      
      Pulled By: anand1976
      
      fbshipit-source-id: 39efe18a162ea746fabac6360ff529baba48486f
  11. 17 Jul, 2019 1 commit
  12. 10 Jul, 2019 1 commit
  13. 04 Jun, 2019 1 commit
  14. 31 May, 2019 2 commits
  15. 14 Nov, 2018 1 commit
    • Backup engine support for direct I/O reads (#4640) · ea945470
      Andrew Kryczka committed
      Summary:
      Use the `DBOptions` that the backup engine already holds to figure out the right `EnvOptions` to use when reading the DB files. This means that, if a user opened a DB instance with `use_direct_reads=true`, then using `BackupEngine` to back up that DB instance will use direct I/O to read files when calculating checksums and copying. Currently the WALs and manifests would still be read using buffered I/O to prevent mixing direct I/O reads with concurrent buffered I/O writes.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4640
      
      Differential Revision: D13015268
      
      Pulled By: ajkr
      
      fbshipit-source-id: 77006ad6f3e00ce58374ca4793b785eea0db6269
  16. 10 Nov, 2018 1 commit
    • Update all unique/shared_ptr instances to be qualified with namespace std (#4638) · dc352807
      Sagar Vemuri committed
      Summary:
      Ran the following commands to recursively change all the files under RocksDB:
      ```
      find . -type f -name "*.cc" -exec sed -i 's/ unique_ptr/ std::unique_ptr/g' {} +
      find . -type f -name "*.cc" -exec sed -i 's/<unique_ptr/<std::unique_ptr/g' {} +
      find . -type f -name "*.cc" -exec sed -i 's/ shared_ptr/ std::shared_ptr/g' {} +
      find . -type f -name "*.cc" -exec sed -i 's/<shared_ptr/<std::shared_ptr/g' {} +
      ```
      Running `make format` updated some formatting on the files touched.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4638
      
      Differential Revision: D12934992
      
      Pulled By: sagar0
      
      fbshipit-source-id: 45a15d23c230cdd64c08f9c0243e5183934338a8
  17. 06 Sep, 2018 1 commit
  18. 24 Aug, 2018 1 commit
  19. 15 Aug, 2018 1 commit
  20. 21 Jun, 2018 1 commit
  21. 05 Jun, 2018 1 commit
    • Extend some tests to format_version=3 (#3942) · d0c38c0c
      Maysam Yabandeh committed
      Summary:
      format_version=3 changes the format of SST index. This is however not being tested currently since tests only work with the default format_version which is currently 2. The patch extends the most related tests to also test for format_version=3.
      Closes https://github.com/facebook/rocksdb/pull/3942
      
      Differential Revision: D8238413
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 915725f55753dd8e9188e802bf471c23645ad035
  22. 26 May, 2018 1 commit
    • Exclude seq from index keys · 402b7aa0
      Maysam Yabandeh committed
      Summary:
      Index blocks have the same format as data blocks. The keys are therefore internal keys, like the keys in the data blocks, which means that in addition to the user key each one also has 8 bytes that encode the sequence number and value type. These extra 8 bytes are however not necessary in index blocks, since the index keys act as a separator between two data blocks. The only exception is when the last key of a block and the first key of the next block share the same user key, in which case the sequence number is required to act as a separator.
      The patch excludes the sequence from index keys only if the above special case does not happen for any of the index keys. It then records that in the property block. The reader looks at the property block to see if it should expect sequence numbers in the keys of the index block.
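The decision described above can be sketched as follows. This is an illustration only: `ToyBlockBoundary`, `ExtractUserKey` as written here, and `CanOmitSeqInIndex` are hypothetical, and the internal key is modeled simply as the user key followed by an 8-byte trailer.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// An internal key is modeled as <user key><8-byte seq + type trailer>.
constexpr std::size_t kTrailerSize = 8;

std::string_view ExtractUserKey(std::string_view internal_key) {
  assert(internal_key.size() >= kTrailerSize);
  return internal_key.substr(0, internal_key.size() - kTrailerSize);
}

// The two internal keys on either side of a data block boundary.
struct ToyBlockBoundary {
  std::string_view last_key_of_block;
  std::string_view first_key_of_next_block;
};

// The trailer can be dropped from all index keys only if no boundary splits
// two entries that share the same user key.
bool CanOmitSeqInIndex(const std::vector<ToyBlockBoundary>& boundaries) {
  for (const ToyBlockBoundary& b : boundaries) {
    if (ExtractUserKey(b.last_key_of_block) ==
        ExtractUserKey(b.first_key_of_next_block)) {
      return false;
    }
  }
  return true;
}
```

A writer following this sketch would evaluate the check over the whole file and record the result once, matching the description that the choice is stored in the property block.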
      Closes https://github.com/facebook/rocksdb/pull/3894
      
      Differential Revision: D8118775
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 915479f028b5799ca91671d67455ecdefbd873bd
  23. 22 May, 2018 1 commit
    • Assert keys/values pinned by range deletion meta-block iterators · 7b655214
      Andrew Kryczka committed
      Summary:
      `RangeDelAggregator` holds the pointers returned by `BlockIter::key()` and `BlockIter::value()`, so it requires that the data to which they point be pinned. `BlockIter::key()` points into block memory and is guaranteed to be pinned if and only if prefix encoding is disabled (or, equivalently, restart interval is set to one). I think `BlockIter::value()` is always pinned. Added an assert for these and removed the wrong TODO about increasing restart interval, which would enable key prefix encoding and break the assertion.
      Closes https://github.com/facebook/rocksdb/pull/3875
      
      Differential Revision: D8063667
      
      Pulled By: ajkr
      
      fbshipit-source-id: 60b5ebcc0cdd610dd6aad9e74a23378793672c41
  24. 06 Mar, 2018 1 commit
  25. 23 Feb, 2018 2 commits
  26. 12 Sep, 2017 1 commit
    • Make InternalKeyComparator final and directly use it in merging iterator · 64b6452e
      Siying Dong committed
      Summary:
      Merging iterator invokes InternalKeyComparator.Compare() frequently during heap merge. By making InternalKeyComparator final and having the merging iterator use InternalKeyComparator directly rather than through the Iterator interface, we give the compiler a chance to avoid one more virtual function call where possible. I ran the readseq benchmark in a memory-only use case to make sure the performance at least doesn't regress.

      I have to disable the final keyword in debug builds, as a hack test class depends on overriding the class.
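A minimal sketch of the technique, using hypothetical `ToyComparator` and `ToyInternalKeyComparator` classes: when a call site holds a reference to a `final` class, the compiler knows that no further override can exist and may call the method directly.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical abstract interface with a virtual Compare().
class ToyComparator {
 public:
  virtual ~ToyComparator() = default;
  virtual int Compare(std::string_view a, std::string_view b) const = 0;
};

// `final` tells the compiler that no subclass can override Compare().
class ToyInternalKeyComparator final : public ToyComparator {
 public:
  int Compare(std::string_view a, std::string_view b) const override {
    const int r = a.compare(b);
    return r < 0 ? -1 : (r > 0 ? 1 : 0);
  }
};

// Taking the concrete final type (not the base interface) is what makes
// devirtualization of this call possible.
inline bool ToyLess(const ToyInternalKeyComparator& cmp, std::string_view a,
                    std::string_view b) {
  return cmp.Compare(a, b) < 0;
}
```

Whether the virtual call is actually elided depends on the compiler and optimization level, which is why the summary describes it as something the compiler can do where possible.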
      Closes https://github.com/facebook/rocksdb/pull/2860
      
      Differential Revision: D5800461
      
      Pulled By: siying
      
      fbshipit-source-id: ab876f22a09bb5c560740911412336e0e25ccb53
  27. 22 Jul, 2017 2 commits
  28. 16 Jul, 2017 1 commit
  29. 28 Apr, 2017 1 commit
  30. 19 Oct, 2016 1 commit
    • Support SST files with Global sequence numbers [reland] · b88f8e87
      Islam AbdelRahman committed
      Summary:
      reland https://reviews.facebook.net/D62523
      
      - Update SstFileWriter to include a property for a global sequence number in the SST file `rocksdb.external_sst_file.global_seqno`
      - Update TableProperties to be aware of the offset of each property in the file
      - Update BlockBasedTableReader and Block to be able to honor the sequence number in `rocksdb.external_sst_file.global_seqno` property and use it to overwrite all sequence numbers in the file

      Something worth mentioning is that we don't update the seqno in the index block or when doing a binary search. The reason is that it's guaranteed that SST files with global seqno will have only one user_key and each key will have seqno=0 encoded in it. This means that this key is greater than any other key with seqno > 0, so we can actually keep the current logic for these blocks.
      
      Test Plan: unit tests
      
      Reviewers: sdong, yhchiang
      
      Subscribers: andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D65211
  31. 08 Oct, 2016 1 commit
  32. 04 Oct, 2016 2 commits
    • Fix Mac build · 9d6c9613
      Islam AbdelRahman committed
    • Support SST files with Global sequence numbers · ab01da54
      Islam AbdelRahman committed
      Summary:
      - Update SstFileWriter to include a property for a global sequence number in the SST file `rocksdb.external_sst_file.global_seqno`
      - Update TableProperties to be aware of the offset of each property in the file
      - Update BlockBasedTableReader and Block to be able to honor the sequence number in `rocksdb.external_sst_file.global_seqno` property and use it to overwrite all sequence numbers in the file

      Something worth mentioning is that we don't update the seqno in the index block or when doing a binary search. The reason is that it's guaranteed that SST files with global seqno will have only one user_key and each key will have seqno=0 encoded in it. This means that this key is greater than any other key with seqno > 0, so we can actually keep the current logic for these blocks.
      
      Test Plan: unit tests
      
      Reviewers: andrewkr, yhchiang, yiwu, sdong
      
      Reviewed By: sdong
      
      Subscribers: hcz, andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D62523
  33. 28 Sep, 2016 1 commit
    • Add SeekForPrev() to Iterator · f517d9dd
      Aaron Gao committed
      Summary:
      Add a new Iterator API, `SeekForPrev`: find the last key that is <= the target key.
      - supports prefix_extractor
      - supports prefix_same_as_start
      - supports upper_bound
      - not supported in iterators without Prev()
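The semantics ("last key that is <= target") can be sketched on a sorted `std::map`. `ToySeekForPrev` is a hypothetical function that only illustrates the lookup rule, not the iterator implementation.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Returns the largest key that is <= target, or nothing if every key is
// greater than target.
std::optional<std::string> ToySeekForPrev(
    const std::map<std::string, std::string>& sorted,
    const std::string& target) {
  // upper_bound gives the first key strictly greater than target, so the
  // entry just before it (if any) is the last key <= target.
  auto it = sorted.upper_bound(target);
  if (it == sorted.begin()) {
    return std::nullopt;
  }
  --it;
  return it->first;
}
```

Stepping back from `upper_bound` needs a backward move, which matches the note that the API is not supported in iterators without Prev().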
      
      Also add tests in db_iter_test and db_iterator_test
      
      Pass all tests
      Cheers!
      
      Test Plan: make all check -j64
      
      Reviewers: andrewkr, yiwu, IslamAbdelRahman, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D64149
  34. 08 Sep, 2016 1 commit
  35. 21 Jul, 2016 1 commit
    • Introduce FullMergeV2 (eliminate memcpy from merge operators) · 68a8e6b8
      Islam AbdelRahman committed
      Summary:
      This diff updates the code to pin the merge operator operands while the merge operation is done, so that we can eliminate the memcpy cost. To do that, we need a new public API for FullMerge that replaces the std::deque<std::string> with std::vector<Slice>.
      
      This diff is stacked on top of D56493 and D56511
      
      In this diff we
      - Update FullMergeV2 arguments to be encapsulated in MergeOperationInput and MergeOperationOutput which will make it easier to add new arguments in the future
      - Replace std::deque<std::string> with std::vector<Slice> to pass operands
      - Replace MergeContext std::deque with std::vector (based on a simple benchmark I ran https://gist.github.com/IslamAbdelRahman/78fc86c9ab9f52b1df791e58943fb187)
      - Allow FullMergeV2 output to be an existing operand
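The last point, letting the output refer to an existing operand instead of copying it, can be sketched for a "max" operator. The types below are hypothetical stand-ins (`ToyMergeInput`, `ToyMergeOutput`, `ToyMaxFullMerge`) that use `std::string_view` in place of `Slice`. They are not the `MergeOperationInput` and `MergeOperationOutput` definitions.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical input: views into pinned memory, so no copies are made.
struct ToyMergeInput {
  std::optional<std::string_view> existing_value;
  std::vector<std::string_view> operands;
};

// Hypothetical output: either a freshly built value, or a view that points
// at one of the inputs (existing value or operand).
struct ToyMergeOutput {
  std::string new_value;
  std::string_view existing_operand;
};

// "max" merge: the result is the lexicographically largest input. Because
// the result is always one of the inputs, only a view is returned.
bool ToyMaxFullMerge(const ToyMergeInput& in, ToyMergeOutput* out) {
  assert(out != nullptr);
  if (!in.existing_value.has_value() && in.operands.empty()) {
    return false;  // nothing to merge
  }
  std::string_view best =
      in.existing_value.has_value() ? *in.existing_value : in.operands.front();
  for (std::string_view op : in.operands) {
    if (op > best) {
      best = op;
    }
  }
  out->new_value.clear();
  out->existing_operand = best;  // no memcpy of the value bytes
  return true;
}
```

The returned view is only valid while the inputs stay pinned, which is why the summary ties this API to pinning the operands for the duration of the merge.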
      
      ```
      [Everything in Memtable | 10K operands | 10 KB each | 1 operand per key]
      
      DEBUG_LEVEL=0 make db_bench -j64 && ./db_bench --benchmarks="mergerandom,readseq,readseq,readseq,readseq,readseq" --merge_operator="max" --merge_keys=10000 --num=10000 --disable_auto_compactions --value_size=10240 --write_buffer_size=1000000000
      
      [FullMergeV2]
      readseq      :       0.607 micros/op 1648235 ops/sec; 16121.2 MB/s
      readseq      :       0.478 micros/op 2091546 ops/sec; 20457.2 MB/s
      readseq      :       0.252 micros/op 3972081 ops/sec; 38850.5 MB/s
      readseq      :       0.237 micros/op 4218328 ops/sec; 41259.0 MB/s
      readseq      :       0.247 micros/op 4043927 ops/sec; 39553.2 MB/s
      
      [master]
      readseq      :       3.935 micros/op 254140 ops/sec; 2485.7 MB/s
      readseq      :       3.722 micros/op 268657 ops/sec; 2627.7 MB/s
      readseq      :       3.149 micros/op 317605 ops/sec; 3106.5 MB/s
      readseq      :       3.125 micros/op 320024 ops/sec; 3130.1 MB/s
      readseq      :       4.075 micros/op 245374 ops/sec; 2400.0 MB/s
      ```
      
      ```
      [Everything in Memtable | 10K operands | 10 KB each | 10 operand per key]
      
      DEBUG_LEVEL=0 make db_bench -j64 && ./db_bench --benchmarks="mergerandom,readseq,readseq,readseq,readseq,readseq" --merge_operator="max" --merge_keys=1000 --num=10000 --disable_auto_compactions --value_size=10240 --write_buffer_size=1000000000
      
      [FullMergeV2]
      readseq      :       3.472 micros/op 288018 ops/sec; 2817.1 MB/s
      readseq      :       2.304 micros/op 434027 ops/sec; 4245.2 MB/s
      readseq      :       1.163 micros/op 859845 ops/sec; 8410.0 MB/s
      readseq      :       1.192 micros/op 838926 ops/sec; 8205.4 MB/s
      readseq      :       1.250 micros/op 800000 ops/sec; 7824.7 MB/s
      
      [master]
      readseq      :      24.025 micros/op 41623 ops/sec;  407.1 MB/s
      readseq      :      18.489 micros/op 54086 ops/sec;  529.0 MB/s
      readseq      :      18.693 micros/op 53495 ops/sec;  523.2 MB/s
      readseq      :      23.621 micros/op 42335 ops/sec;  414.1 MB/s
      readseq      :      18.775 micros/op 53262 ops/sec;  521.0 MB/s
      
      ```
      
      ```
      [Everything in Block cache | 10K operands | 10 KB each | 1 operand per key]
      
      [FullMergeV2]
      $ DEBUG_LEVEL=0 make db_bench -j64 && ./db_bench --benchmarks="readseq,readseq,readseq,readseq,readseq" --merge_operator="max" --num=100000 --db="/dev/shm/merge-random-10K-10KB" --cache_size=1000000000 --use_existing_db --disable_auto_compactions
      readseq      :      14.741 micros/op 67837 ops/sec;  663.5 MB/s
      readseq      :       1.029 micros/op 971446 ops/sec; 9501.6 MB/s
      readseq      :       0.974 micros/op 1026229 ops/sec; 10037.4 MB/s
      readseq      :       0.965 micros/op 1036080 ops/sec; 10133.8 MB/s
      readseq      :       0.943 micros/op 1060657 ops/sec; 10374.2 MB/s
      
      [master]
      readseq      :      16.735 micros/op 59755 ops/sec;  584.5 MB/s
      readseq      :       3.029 micros/op 330151 ops/sec; 3229.2 MB/s
      readseq      :       3.136 micros/op 318883 ops/sec; 3119.0 MB/s
      readseq      :       3.065 micros/op 326245 ops/sec; 3191.0 MB/s
      readseq      :       3.014 micros/op 331813 ops/sec; 3245.4 MB/s
      ```
      
      ```
      [Everything in Block cache | 10K operands | 10 KB each | 10 operand per key]
      
      DEBUG_LEVEL=0 make db_bench -j64 && ./db_bench --benchmarks="readseq,readseq,readseq,readseq,readseq" --merge_operator="max" --num=100000 --db="/dev/shm/merge-random-10-operands-10K-10KB" --cache_size=1000000000 --use_existing_db --disable_auto_compactions
      
      [FullMergeV2]
      readseq      :      24.325 micros/op 41109 ops/sec;  402.1 MB/s
      readseq      :       1.470 micros/op 680272 ops/sec; 6653.7 MB/s
      readseq      :       1.231 micros/op 812347 ops/sec; 7945.5 MB/s
      readseq      :       1.091 micros/op 916590 ops/sec; 8965.1 MB/s
      readseq      :       1.109 micros/op 901713 ops/sec; 8819.6 MB/s
      
      [master]
      readseq      :      27.257 micros/op 36687 ops/sec;  358.8 MB/s
      readseq      :       4.443 micros/op 225073 ops/sec; 2201.4 MB/s
      readseq      :       5.830 micros/op 171526 ops/sec; 1677.7 MB/s
      readseq      :       4.173 micros/op 239635 ops/sec; 2343.8 MB/s
      readseq      :       4.150 micros/op 240963 ops/sec; 2356.8 MB/s
      ```
      
      Test Plan: COMPILE_WITH_ASAN=1 make check -j64
      
      Reviewers: yhchiang, andrewkr, sdong
      
      Reviewed By: sdong
      
      Subscribers: lovro, andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D57075