1. 30 Aug 2022 (3 commits)
    • Don't wait for indirect flush in read-only DB (#10569) · c5afbbfe
      Peter Dillinger committed
      Summary:
      Some APIs for getting live files, which are used by Checkpoint
      and BackupEngine, can optionally trigger and wait for a flush. These
      would deadlock when used on a read-only DB. Here we fix that by assuming
      the user wants the overall operation to succeed and is OK without
      flushing (because the DB is read-only).
      
      Follow-up work: the same or other issues can be hit by directly invoking
      some DB functions that are clearly not appropriate for read-only
      instance, but are not covered by overrides in DBImplReadOnly and
      CompactedDBImpl. These should be fixed to avoid similar problems on
      accidental misuse. (Long term, it would be nice to have a DBReadOnly
      class without those members, like BackupEngineReadOnly.)
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10569
      
      Test Plan: tests updated to catch regression (hang before the fix)
      
      Reviewed By: riversand963
      
      Differential Revision: D38995759
      
      Pulled By: pdillinger
      
      fbshipit-source-id: f5f8bc7123e13cb45bd393dd974d7d6eda20bc68
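The skip-flush-on-read-only behavior described above can be sketched as follows. This is a minimal Python sketch of the idea, not RocksDB code; the class and method names are illustrative.

```python
# Sketch (illustrative, not RocksDB code): a GetLiveFiles-style API that
# optionally flushes, but skips the flush on a read-only DB instead of
# waiting forever for a flush that can never happen.
class DB:
    def __init__(self, read_only):
        self.read_only = read_only
        self.flushed = False

    def flush(self):
        if self.read_only:
            raise RuntimeError("cannot flush a read-only DB")
        self.flushed = True

    def get_live_files(self, flush_memtable=True):
        # Before the fix: the flush was attempted unconditionally, which
        # deadlocked on a read-only DB. After the fix: assume the caller is
        # OK without flushing, since a read-only DB has nothing buffered.
        if flush_memtable and not self.read_only:
            self.flush()
        return ["MANIFEST", "000001.sst"]
```

With this shape, `DB(read_only=True).get_live_files()` returns immediately instead of hanging, while a read-write DB still flushes first.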
    • Verify Iterator/Get() against expected state in only `no_batched_ops_test` (#10590) · 5532b462
      Changyu Bi committed
      Summary:
      https://github.com/facebook/rocksdb/issues/10538 added `TestIterateAgainstExpected()` in `no_batched_ops_test` to verify iterator correctness against the in-memory expected state. It is not compatible with runs that follow some other stress tests, e.g. `TestPut()` in `batched_op_stress`, which either do not set the expected state when writing to the DB or use keys that cannot be parsed by `GetIntVal()`; in that case the assert [here](https://github.com/facebook/rocksdb/blob/d17be55aab80b856f96f4af89f8d18fef96646b4/db_stress_tool/db_stress_common.h#L520) could fail. This PR fixes the issue by setting the iterator upper bound to `max_key` when `destroy_db_initially=0`, avoiding the key space that `batched_op_stress` touches.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10590
      
      Test Plan:
      ```
      # set up DB with batched_op_stress
      ./db_stress --test_batches_snapshots=1 --verify_iterator_with_expected_state_one_in=1 --max_key_len=3 --max_key=100000000 --skip_verifydb=1 --continuous_verification_interval=0 --writepercent=85 --delpercent=3 --delrangepercent=0 --iterpercent=10 --nooverwritepercent=1 --prefixpercent=0 --readpercent=2 --key_len_percent_dist=1,30,69
      
      # Before this PR, the following test will fail the asserts with error msg like the following
      # Assertion failed: (size_key <= key_gen_ctx.weights.size() * sizeof(uint64_t)), function GetIntVal, file db_stress_common.h, line 524.
      ./db_stress --verify_iterator_with_expected_state_one_in=1 --max_key_len=3 --max_key=100000000 --skip_verifydb=1 --continuous_verification_interval=0 --writepercent=0 --delpercent=3 --delrangepercent=0 --iterpercent=95 --nooverwritepercent=1 --prefixpercent=0 --readpercent=2 --key_len_percent_dist=1,30,69 --destroy_db_initially=0
      ```
      
      Reviewed By: ajkr
      
      Differential Revision: D39085243
      
      Pulled By: cbi42
      
      fbshipit-source-id: a7dfee2320c330773b623b442d730fd014ec7056
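The upper-bound idea can be sketched as follows. This is an illustrative Python sketch under simplified assumptions (integer keys, with keys at or above `max_key` standing in for the `batched_op_stress` key space); it is not the db_stress implementation.

```python
# Sketch: when destroy_db_initially=0, cap iteration at max_key so
# verification never visits keys left behind by an earlier
# batched_op_stress run, whose encoding GetIntVal()-style parsing
# cannot decode.
def iterate_verified(db_keys, max_key, destroy_db_initially):
    # With a fresh DB (destroy_db_initially=1) no foreign keys can exist,
    # so no upper bound is needed.
    upper_bound = None if destroy_db_initially else max_key
    for k in sorted(db_keys):
        if upper_bound is not None and k >= upper_bound:
            break
        yield k
```

The design choice mirrors the PR: rather than teaching the verifier to parse foreign keys, simply fence them out of the iteration range.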
    • Use the default metadata charge policy when creating an LRU cache via the Java API (#10577) · 64e74723
      Levi Tamasi committed
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/10577
      
      Reviewed By: akankshamahajan15
      
      Differential Revision: D39035884
      
      Pulled By: ltamasi
      
      fbshipit-source-id: 48f116f8ca172b7eb5eb3651f39ddb891a7ffade
  2. 28 Aug 2022 (1 commit)
  3. 27 Aug 2022 (2 commits)
    • Make header more natural. (#10580) · d17be55a
      zhangenming committed
      Summary:
      Fixed #10381 for the blog's navigation bar UI.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10580
      
      Reviewed By: hx235
      
      Differential Revision: D39079045
      
      Pulled By: cbi42
      
      fbshipit-source-id: 922cf2624f201c0af42815b23d97361fc0151d93
    • Improve the accounting of memory used by cached blobs (#10583) · 23376aa5
      Levi Tamasi committed
      Summary:
      The patch improves the bookkeeping around the memory usage of
      cached blobs in two ways: 1) it uses `malloc_usable_size`, which accounts
      for allocator bin sizes etc., and 2) it also considers the memory usage
      of the `BlobContents` object in addition to the blob itself. Note: some unit
      tests had been relying on the cache charge being equal to the size of the
      cached blob; these were updated.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10583
      
      Test Plan: `make check`
      
      Reviewed By: riversand963
      
      Differential Revision: D39060680
      
      Pulled By: ltamasi
      
      fbshipit-source-id: 3583adce2b4ce6e84861f3fadccbfd2e5a3cc482
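The improved charge calculation can be sketched as follows. This is an illustrative Python sketch only: the bin sizes and the object overhead are made-up numbers standing in for what `malloc_usable_size` and `sizeof(BlobContents)` would report in the real C++ code.

```python
# Sketch of the improved cache charge: account for (1) the usable size of
# the blob allocation, which the allocator may round up to a bin size, and
# (2) the fixed overhead of the owning BlobContents-style object, instead
# of charging just len(blob).
def cache_charge(blob_len, bin_sizes=(16, 32, 64, 128, 256, 512, 1024),
                 object_overhead=48):
    # Round up to the smallest allocator bin that fits; very large
    # allocations fall through unrounded (illustrative simplification).
    usable = next((b for b in bin_sizes if b >= blob_len), blob_len)
    return usable + object_overhead
```

This is also why the unit tests mentioned above had to change: the charge is now strictly larger than the raw blob size.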
  4. 26 Aug 2022 (7 commits)
    • fix trace_analyzer_tool args column position (#10576) · 7670fdd6
      bilyz committed
      Summary:
      The explanation of the column meanings was not correct according to the parsed human-readable trace file.
      
      Following are the results data from parsed trace human-readable file format.
      The key is in the first column.
      
      ```
      0x00000005 6 1 0 1661317998095439
      0x00000007 0 1 0 1661317998095479
      0x00000008 6 1 0 1661317998095493
      0x0000000300000001 1 1 6 1661317998101508
      0x0000000300000000 1 1 6 1661317998101508
      0x0000000300000001 0 1 0 1661317998106486
      0x0000000300000000 0 1 0 1661317998106498
      0x0000000A 6 1 0 1661317998106515
      0x00000007 0 1 0 1661317998111887
      0x00000001 6 1 0 1661317998111923
      ```
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10576
      
      Reviewed By: ajkr
      
      Differential Revision: D39039110
      
      Pulled By: jay-zhuang
      
      fbshipit-source-id: eade6394c7870005b717846af09a848be6f677ce
    • Fix periodic_task unable to re-register the same task type (#10379) · d9e71fb2
      Jay Zhuang committed
      Summary:
      Timer has a limitation: it cannot re-register a task under a name that is
      still in use, because cancellation only marks the task as invalid and
      waits for the Timer thread to clean it up later; until that cleanup
      happens, the same task name cannot be added again. This made periodic
      task option updates, which essentially cancel and re-register the same
      task, likely to fail. This change gives each periodic task a random
      unique id and stores it in periodic_task_scheduler.
      
      Also refactor `periodic_work` to `periodic_task` so that each job
      functions as a `task`.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10379
      
      Test Plan: unittests
      
      Reviewed By: ajkr
      
      Differential Revision: D38000615
      
      Pulled By: jay-zhuang
      
      fbshipit-source-id: e4135f9422e3b53aaec8eda54f4e18ce633a279e
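The unique-id workaround can be sketched as follows. This is a minimal Python sketch of the scheme, not the RocksDB Timer; the lazy-cleanup behavior is simulated, and all names are illustrative.

```python
import uuid

# Sketch: register timer tasks under a random unique id instead of a fixed
# name, so cancel-then-re-register never collides with a stale entry the
# timer thread has not cleaned up yet.
class PeriodicTaskScheduler:
    def __init__(self):
        self._timer_slots = {}  # unique id -> fn; cancelled ids linger
        self._task_ids = {}     # task type -> current unique id

    def register(self, task_type, fn):
        if task_type in self._task_ids:
            self.unregister(task_type)
        # A fresh random suffix means the new name never matches a
        # lingering cancelled entry.
        uid = f"{task_type}:{uuid.uuid4().hex}"
        self._timer_slots[uid] = fn
        self._task_ids[task_type] = uid

    def unregister(self, task_type):
        uid = self._task_ids.pop(task_type, None)
        # Like the real Timer, cancellation only marks the slot invalid;
        # the entry is cleaned up lazily, so the old id may still exist.
        if uid in self._timer_slots:
            self._timer_slots[uid] = None
```

With fixed names, the second `register` for the same task type would collide with the not-yet-cleaned-up slot; with random ids it always succeeds.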
    • Introduce a dedicated class to represent blob values (#10571) · 3f57d84a
      Levi Tamasi committed
      Summary:
      The patch introduces a new class called `BlobContents`, which represents
      a single uncompressed blob value. We currently use `std::string` for this
      purpose; `BlobContents` is somewhat smaller but the primary reason for a
      dedicated class is that it enables certain improvements and optimizations
      like eliding a copy when inserting a blob into the cache, using custom
      allocators, or more control over and better accounting of the memory usage
      of cached blobs (see https://github.com/facebook/rocksdb/issues/10484).
      (We plan to implement these in subsequent PRs.)
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10571
      
      Test Plan: `make check`
      
      Reviewed By: riversand963
      
      Differential Revision: D39000965
      
      Pulled By: ltamasi
      
      fbshipit-source-id: f296eddf9dec4fc3e11cad525b462bdf63c78f96
    • Support CompactionPri::kRoundRobin in RocksJava (#10572) · 418b36a9
      Brendan MacDonell committed
      Summary:
      Pretty trivial — this PR just adds the new compaction priority to the Java API.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10572
      
      Reviewed By: hx235
      
      Differential Revision: D39006523
      
      Pulled By: ajkr
      
      fbshipit-source-id: ea8d665817e7b05826c397afa41c3abcda81484e
    • Update the javadoc for setforceConsistencyChecks (#10574) · 9f290a5d
      Brendan MacDonell committed
      Summary:
      As of v6.14 (released in 2020), force_consistency_checks is enabled by default. However, the Java documentation does not seem to have been updated to reflect the change at the time.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10574
      
      Reviewed By: hx235
      
      Differential Revision: D39006566
      
      Pulled By: ajkr
      
      fbshipit-source-id: c7b029484d62deaa1f260ec55084049fe39eb84a
    • Ensure writes to WAL tail during `FlushWAL(true /* sync */)` will be synced (#10560) · 7ad4b386
      Andrew Kryczka committed
      Summary:
      WAL append and switch can both happen between `FlushWAL(true /* sync */)`'s sync operations and its call to `MarkLogsSynced()`. We permit this since locks need to be released for the sync operations. Such an appended/switched WAL is both inactive and incompletely synced at the time `MarkLogsSynced()` processes it.
      
      Prior to this PR, `MarkLogsSynced()` assumed all inactive WALs were fully synced and removed them from consideration for future syncs. That was wrong in the scenario described above and led to the latest append(s) never being synced. This PR changes `MarkLogsSynced()` to only remove inactive WALs from consideration for which all flushed data has been synced.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10560
      
      Test Plan: repro unit test for the scenario described above. Without this PR, it fails on "key2" not found
      
      Reviewed By: riversand963
      
      Differential Revision: D38957391
      
      Pulled By: ajkr
      
      fbshipit-source-id: da77175eba97ff251a4219b227b3bb2d4843ed26
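The corrected `MarkLogsSynced()` condition can be sketched as follows. This is a simplified Python sketch of the bookkeeping, not the DBImpl code; field names are illustrative.

```python
# Sketch: track per WAL how much flushed data has actually been synced.
# MarkLogsSynced-style retirement is allowed only for inactive WALs whose
# synced size has caught up with their flushed size.
class WalState:
    def __init__(self, flushed_size=0, synced_size=0, active=True):
        self.flushed_size = flushed_size
        self.synced_size = synced_size
        self.active = active

def mark_logs_synced(wals):
    retired = []
    for number, wal in sorted(wals.items()):
        # Before the fix: every inactive WAL was retired here, even if an
        # append/switch raced in between the sync and this call, leaving
        # flushed-but-unsynced bytes behind forever.
        if not wal.active and wal.synced_size == wal.flushed_size:
            retired.append(number)
            del wals[number]
    return retired
```

A WAL that took a racing append (flushed 100 bytes, synced only 80) stays under consideration for a future sync instead of being dropped.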
    • CI benchmarks refine configuration (#10514) · 7fbee01f
      Alan Paxton committed
      Summary:
      Refine the CI benchmark configuration:
      
      - Run only "essential" benchmarks, but for longer.
      - Reduce NUM_KEYS to ensure cached behaviour.
      - Reduce the level size to try to ensure more levels.
      
      Test durations are refined again: more time per test, but fewer tests.
      In CI benchmark mode, the only read test is readrandom; there are still
      3 mostly-read tests.
      
      The goal is to squeeze the complete run just inside 1 hour so it doesn't clash with the next run (cron scheduled for the main branch), while running as long as possible so that results are as credible as possible.
      
      Reduce the thread count to physical capacity, in an attempt to reduce throughput variance for write-heavy tests. See Mark Callaghan's comments in the related documentation.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10514
      
      Reviewed By: ajkr
      
      Differential Revision: D38952469
      
      Pulled By: jay-zhuang
      
      fbshipit-source-id: 72fa6bba897cc47066ced65facd1fd36e28f30a8
  5. 25 Aug 2022 (4 commits)
  6. 24 Aug 2022 (10 commits)
  7. 23 Aug 2022 (1 commit)
    • Add missing synchronization in TestFSWritableFile (#10544) · b16655a5
      Hui Xiao committed
      Summary:
      **Context:**
      ajkr's command revealed an existing TSAN data race between `TestFSWritableFile::Append` and `TestFSWritableFile::Sync` on `TestFSWritableFile::state_`
      
      ```
      $ make clean && COMPILE_WITH_TSAN=1 make -j56 db_stress
      $ python3 tools/db_crashtest.py blackbox --simple --duration=3600 --interval=10 --sync_fault_injection=1 --disable_wal=0 --max_key=10000 --checkpoint_one_in=1000
      ```
      
      The race is due to unsynchronized concurrent access to the `state_` of the same WAL's `TestFSWritableFile` from [checkpoint's WAL sync](https://github.com/facebook/rocksdb/blob/7.4.fb/utilities/fault_injection_fs.cc#L324) and [db put's WAL write when `sync_fault_injection=1`](https://github.com/facebook/rocksdb/blob/7.4.fb/utilities/fault_injection_fs.cc#L208).
      
      ```
      WARNING: ThreadSanitizer: data race (pid=11275)
      Write of size 8 at 0x7b480003d850 by thread T23 (mutexes: write M69230):
      #0 rocksdb::TestFSWritableFile::Sync(rocksdb::IOOptions const&, rocksdb::IODebugContext*) internal_repo_rocksdb/repo/utilities/fault_injection_fs.cc:297 (db_stress+0x716004)
      #1 rocksdb::(anonymous namespace)::CompositeWritableFileWrapper::Sync() internal_repo_rocksdb/repo/env/composite_env.cc:154 (db_stress+0x4dfa78)
      #2 rocksdb::(anonymous namespace)::LegacyWritableFileWrapper::Sync(rocksdb::IOOptions const&, rocksdb::IODebugContext*) internal_repo_rocksdb/repo/env/env.cc:280 (db_stress+0x6dfd24)
      #3 rocksdb::WritableFileWriter::SyncInternal(bool) internal_repo_rocksdb/repo/file/writable_file_writer.cc:460 (db_stress+0xa1b98c)
      #4 rocksdb::WritableFileWriter::SyncWithoutFlush(bool) internal_repo_rocksdb/repo/file/writable_file_writer.cc:435 (db_stress+0xa1e441)
      #5 rocksdb::DBImpl::SyncWAL() internal_repo_rocksdb/repo/db/db_impl/db_impl.cc:1385 (db_stress+0x529458)
      #6 rocksdb::DBImpl::FlushWAL(bool) internal_repo_rocksdb/repo/db/db_impl/db_impl.cc:1339 (db_stress+0x54f82a)
      #7 rocksdb::DBImpl::GetLiveFilesStorageInfo(rocksdb::LiveFilesStorageInfoOptions const&, std::vector<rocksdb::LiveFileStorageInfo, std::allocator<rocksdb::LiveFileStorageInfo> >*) internal_repo_rocksdb/repo/db/db_filesnapshot.cc:387 (db_stress+0x5c831d)
      #8 rocksdb::CheckpointImpl::CreateCustomCheckpoint(std::function<rocksdb::Status (std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rocksdb::FileType)>, std::function<rocksdb::Status (std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned long, rocksdb::FileType, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rocksdb::Temperature)>, std::function<rocksdb::Status (std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rocksdb::FileType)>, unsigned long*, unsigned long, bool) internal_repo_rocksdb/repo/utilities/checkpoint/checkpoint_impl.cc:214 (db_stress+0x4c0343)
      #9 rocksdb::CheckpointImpl::CreateCheckpoint(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned long, unsigned long*) internal_repo_rocksdb/repo/utilities/checkpoint/checkpoint_impl.cc:123 (db_stress+0x4c237e)
      #10 rocksdb::StressTest::TestCheckpoint(rocksdb::ThreadState*, std::vector<int, std::allocator<int> > const&, std::vector<long, std::allocator<long> > const&) internal_repo_rocksdb/repo/db_stress_tool/db_stress_test_base.cc:1699 (db_stress+0x328340)
      #11 rocksdb::StressTest::OperateDb(rocksdb::ThreadState*) internal_repo_rocksdb/repo/db_stress_tool/db_stress_test_base.cc:825 (db_stress+0x33921f)
      #12 rocksdb::ThreadBody(void*) internal_repo_rocksdb/repo/db_stress_tool/db_stress_driver.cc:33 (db_stress+0x354857)
      #13 rocksdb::(anonymous namespace)::StartThreadWrapper(void*) internal_repo_rocksdb/repo/env/env_posix.cc:447 (db_stress+0x6eb2ad)
      
      Previous read of size 8 at 0x7b480003d850 by thread T64 (mutexes: write M980798978697532600, write M253744503184415024, write M1262):
      #0 memcpy <null> (db_stress+0xbc9696)
      #1 operator= internal_repo_rocksdb/repo/utilities/fault_injection_fs.h:35 (db_stress+0x70d5f1)
      #2 rocksdb::FaultInjectionTestFS::WritableFileAppended(rocksdb::FSFileState const&) internal_repo_rocksdb/repo/utilities/fault_injection_fs.cc:827 (db_stress+0x70d5f1)
      #3 rocksdb::TestFSWritableFile::Append(rocksdb::Slice const&, rocksdb::IOOptions const&, rocksdb::IODebugContext*) internal_repo_rocksdb/repo/utilities/fault_injection_fs.cc:173 (db_stress+0x7143af)
      #4 rocksdb::(anonymous namespace)::CompositeWritableFileWrapper::Append(rocksdb::Slice const&) internal_repo_rocksdb/repo/env/composite_env.cc:115 (db_stress+0x4de3ab)
      #5 rocksdb::(anonymous namespace)::LegacyWritableFileWrapper::Append(rocksdb::Slice const&, rocksdb::IOOptions const&, rocksdb::IODebugContext*) internal_repo_rocksdb/repo/env/env.cc:248 (db_stress+0x6df44b)
      #6 rocksdb::WritableFileWriter::WriteBuffered(char const*, unsigned long, rocksdb::Env::IOPriority) internal_repo_rocksdb/repo/file/writable_file_writer.cc:551 (db_stress+0xa1a953)
      #7 rocksdb::WritableFileWriter::Flush(rocksdb::Env::IOPriority) internal_repo_rocksdb/repo/file/writable_file_writer.cc:327 (db_stress+0xa16ee8)
      #8 rocksdb::log::Writer::AddRecord(rocksdb::Slice const&, rocksdb::Env::IOPriority) internal_repo_rocksdb/repo/db/log_writer.cc:147 (db_stress+0x7f121f)
      #9 rocksdb::DBImpl::WriteToWAL(rocksdb::WriteBatch const&, rocksdb::log::Writer*, unsigned long*, unsigned long*, rocksdb::Env::IOPriority, rocksdb::DBImpl::LogFileNumberSize&) internal_repo_rocksdb/repo/db/db_impl/db_impl_write.cc:1285 (db_stress+0x695042)
      #10 rocksdb::DBImpl::WriteToWAL(rocksdb::WriteThread::WriteGroup const&, rocksdb::log::Writer*, unsigned long*, bool, bool, unsigned long, rocksdb::DBImpl::LogFileNumberSize&) internal_repo_rocksdb/repo/db/db_impl/db_impl_write.cc:1328 (db_stress+0x6907e8)
      #11 rocksdb::DBImpl::PipelinedWriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned long, bool, unsigned long*) internal_repo_rocksdb/repo/db/db_impl/db_impl_write.cc:731 (db_stress+0x68e8a7)
      #12 rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned long, bool, unsigned long*, unsigned long, rocksdb::PreReleaseCallback*, rocksdb::PostMemTableCallback*) internal_repo_rocksdb/repo/db/db_impl/db_impl_write.cc:283 (db_stress+0x688370)
      #13 rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*) internal_repo_rocksdb/repo/db/db_impl/db_impl_write.cc:126 (db_stress+0x69a7b5)
      #14 rocksdb::DB::Put(rocksdb::WriteOptions const&, rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, rocksdb::Slice const&, rocksdb::Slice const&) internal_repo_rocksdb/repo/db/db_impl/db_impl_write.cc:2247 (db_stress+0x698634)
      #15 rocksdb::DBImpl::Put(rocksdb::WriteOptions const&, rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, rocksdb::Slice const&, rocksdb::Slice const&) internal_repo_rocksdb/repo/db/db_impl/db_impl_write.cc:37 (db_stress+0x699868)
      #16 rocksdb::NonBatchedOpsStressTest::TestPut(rocksdb::ThreadState*, rocksdb::WriteOptions&, rocksdb::ReadOptions const&, std::vector<int, std::allocator<int> > const&, std::vector<long, std::allocator<long> > const&, char (&) [100], std::unique_ptr<rocksdb::MutexLock, std::default_delete<rocksdb::MutexLock> >&) internal_repo_rocksdb/repo/db_stress_tool/no_batched_ops_stress.cc:681 (db_stress+0x38d20c)
      #17 rocksdb::StressTest::OperateDb(rocksdb::ThreadState*) internal_repo_rocksdb/repo/db_stress_tool/db_stress_test_base.cc:897 (db_stress+0x3399ec)
      #18 rocksdb::ThreadBody(void*) internal_repo_rocksdb/repo/db_stress_tool/db_stress_driver.cc:33 (db_stress+0x354857)
      #19 rocksdb::(anonymous namespace)::StartThreadWrapper(void*) internal_repo_rocksdb/repo/env/env_posix.cc:447 (db_stress+0x6eb2ad)
      
      Location is heap block of size 352 at 0x7b480003d800 allocated by thread T23:
      #0 operator new(unsigned long) <null> (db_stress+0xb685dc)
      #1 rocksdb::FaultInjectionTestFS::NewWritableFile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rocksdb::FileOptions const&, std::unique_ptr<rocksdb::FSWritableFile, std::default_delete<rocksdb::FSWritableFile> >*, rocksdb::IODebugContext*) internal_repo_rocksdb/repo/utilities/fault_injection_fs.cc:506 (db_stress+0x711192)
      #2 rocksdb::CompositeEnv::NewWritableFile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unique_ptr<rocksdb::WritableFile, std::default_delete<rocksdb::WritableFile> >*, rocksdb::EnvOptions const&) internal_repo_rocksdb/repo/env/composite_env.cc:329 (db_stress+0x4d33fa)
      #3 rocksdb::EnvWrapper::NewWritableFile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unique_ptr<rocksdb::WritableFile, std::default_delete<rocksdb::WritableFile> >*, rocksdb::EnvOptions const&) internal_repo_rocksdb/repo/include/rocksdb/env.h:1425 (db_stress+0x300662)
      ...
      ```
      
      **Summary:**
      - Added the missing lock in the functions mentioned above, along with three other functions in TestFSWritableFile with a similar need
      - Added a clarifying comment
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10544
      
      Test Plan: Passed the above race condition repro
      
      Reviewed By: ajkr
      
      Differential Revision: D38886634
      
      Pulled By: hx235
      
      fbshipit-source-id: 0571bae9615f35b16fbd8168204607e306b1b486
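The fix pattern is the classic one: every accessor of the shared per-file state must take the same mutex. A minimal Python sketch (mirroring `TestFSWritableFile::state_` conceptually; names are illustrative, not the RocksDB code):

```python
import threading

# Sketch: Append and Sync both touch the shared per-file state, so both
# must hold the same mutex. Before the fix, one of these paths mutated the
# state without the lock, which TSAN flagged as a data race.
class TestWritableFile:
    def __init__(self):
        self._mutex = threading.Lock()
        self._buffer = b""   # unsynced, appended bytes
        self._synced = b""   # bytes persisted by sync

    def append(self, data):
        with self._mutex:    # previously missing on this path
            self._buffer += data

    def sync(self):
        with self._mutex:
            self._synced += self._buffer
            self._buffer = b""
```

With the lock held on both paths, no appended byte can be lost or double-counted even when a checkpoint's WAL sync races with a concurrent put's WAL write.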
  8. 22 Aug 2022 (1 commit)
    • Post 7.6 branch cut changes (#10546) · b0048b67
      Bo Wang committed
      Summary:
      After the 7.6.fb branch was cut, following the release process: upgrade the version number to 7.7 and add 7.6.fb to the format compatibility check.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10546
      
      Test Plan: Watch CI
      
      Reviewed By: ajkr
      
      Differential Revision: D38892023
      
      Pulled By: gitbw95
      
      fbshipit-source-id: 94e96dedbd973f5f9713e73d3bed336e4678565b
  9. 21 Aug 2022 (1 commit)
  10. 20 Aug 2022 (3 commits)
    • MultiGet async IO across multiple levels (#10535) · 35cdd3e7
      anand76 committed
      Summary:
      This PR exploits parallelism in MultiGet across levels. It applies only to the coroutine version of MultiGet. Previously, MultiGet file reads from SST files in the same level were parallelized. With this PR, MultiGet batches with keys distributed across multiple levels are read in parallel. This is accomplished by splitting the keys not present in a level (determined by bloom filtering) into a separate batch, and processing the new batch in parallel with the original batch.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10535
      
      Test Plan:
      1. Ensure existing MultiGet unit tests pass, updating them as necessary
      2. New unit tests - TODO
      3. Run stress test - TODO
      
      No noticeable regression (<1%) without async IO -
      Without PR: `multireadrandom :       7.261 micros/op 1101724 ops/sec 60.007 seconds 66110936 operations;  571.6 MB/s (8168992 of 8168992 found)`
      With PR: `multireadrandom :       7.305 micros/op 1095167 ops/sec 60.007 seconds 65717936 operations;  568.2 MB/s (8271992 of 8271992 found)`
      
      For a fully cached DB, but with async IO option on, no regression observed (<1%) -
      Without PR: `multireadrandom :       5.201 micros/op 1538027 ops/sec 60.005 seconds 92288936 operations;  797.9 MB/s (11540992 of 11540992 found) `
      With PR: `multireadrandom :       5.249 micros/op 1524097 ops/sec 60.005 seconds 91452936 operations;  790.7 MB/s (11649992 of 11649992 found) `
      
      Reviewed By: akankshamahajan15
      
      Differential Revision: D38774009
      
      Pulled By: anand1976
      
      fbshipit-source-id: c955e259749f1c091590ade73105b3ee46cd0007
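The batch-splitting idea can be sketched as follows. This is an illustrative Python sketch under simplified assumptions: each level is modeled as a `(maybe_contains, data)` pair where `maybe_contains` plays the role of the bloom filter; the real code runs the split-off batch concurrently via coroutines, which is shown sequentially here for clarity.

```python
# Sketch: at each level, keys the filter rules out are split into a new
# batch destined for the next level; in the real implementation that new
# batch is processed in parallel with the original one.
def multiget(levels, keys):
    results = {k: None for k in keys}
    batch = list(keys)
    for maybe_contains, data in levels:
        hits = [k for k in batch if maybe_contains(k)]
        misses = [k for k in batch if not maybe_contains(k)]
        for k in hits:
            results[k] = data.get(k)  # filter false positives fall through
        # The 'misses' batch would be handed to the next level concurrently
        # with reading 'hits' at this level.
        batch = misses + [k for k in hits if results[k] is None]
        if not batch:
            break
    return results
```

The key property is that a bloom-filter miss is definitive, so the split batch can safely skip the level without reading any SST file there.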
    • Add support for wide-column point lookups (#10540) · 81388b36
      Levi Tamasi committed
      Summary:
      The patch adds a new API `GetEntity` that can be used to perform
      wide-column point lookups. It also extends the `Get` code path and
      the `MemTable` / `MemTableList` and `Version` / `GetContext` logic
      accordingly so that wide-column entities can be served from both
      memtables and SSTs. If the result of a lookup is a wide-column entity
      (`kTypeWideColumnEntity`), it is passed to the application in deserialized
      form; if it is a plain old key-value (`kTypeValue`), it is presented as a
      wide-column entity with a single default (anonymous) column.
      (In contrast, regular `Get` returns plain old key-values as-is, and
      returns the value of the default column for wide-column entities, see
      https://github.com/facebook/rocksdb/issues/10483 .)
      
      The result of `GetEntity` is a self-contained `PinnableWideColumns` object.
      `PinnableWideColumns` contains a `PinnableSlice`, which either stores the
      underlying data in its own buffer or holds on to a cache handle. It also contains
      a `WideColumns` instance, which indexes the contents of the `PinnableSlice`,
      so applications can access the values of columns efficiently.
      
      There are several pieces of functionality which are currently not supported
      for wide-column entities: there is currently no `MultiGetEntity` or wide-column
      iterator; also, `Merge` and `GetMergeOperands` are not supported, and there
      is no `GetEntity` implementation for read-only and secondary instances.
      We plan to implement these in future PRs.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10540
      
      Test Plan: `make check`
      
      Reviewed By: akankshamahajan15
      
      Differential Revision: D38847474
      
      Pulled By: ltamasi
      
      fbshipit-source-id: 42311a34ccdfe88b3775e847a5e2a5296e002b5b
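The result shapes described above can be sketched as follows. This is an illustrative Python sketch, not the RocksDB API: stored values are modeled as tagged tuples, and an empty string stands in for the anonymous default column name.

```python
# Sketch: GetEntity returns wide-column entities as-is and wraps a plain
# key-value as an entity with a single anonymous default column, while a
# regular Get returns plain values as-is and the default column's value
# for entities.
DEFAULT_COLUMN = ""  # stand-in for the anonymous default column name

def get_entity(stored):
    kind, payload = stored  # ("entity", {col: val}) or ("value", val)
    if kind == "entity":
        return payload
    return {DEFAULT_COLUMN: payload}

def get(stored):
    kind, payload = stored
    if kind == "value":
        return payload
    return payload.get(DEFAULT_COLUMN)
```

This mirrors the symmetry in the commit message: both APIs accept both on-disk types, and each normalizes to its own result shape.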
    • Revert "Avoid dynamic memory allocation on read path (#10453)" (#10541) · 2553d1ef
      anand76 committed
      Summary:
      This reverts commit 0d885e80. The original commit causes an ASAN stack-use-after-return failure because the `CreateCallback` is allocated on the stack and then used in another thread when a secondary cache object is promoted to the primary cache.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10541
      
      Reviewed By: gitbw95
      
      Differential Revision: D38850039
      
      Pulled By: anand1976
      
      fbshipit-source-id: 810c592b7de2523693f5bb267159b23b0ee9132c
  11. 19 Aug 2022 (2 commits)
    • Fix the memory leak in db_stress tests that are caused by `FaultInjectionSecondaryCache` and add `CompressedSecondaryCache` into stress tests. (#10523) · 13cb7a84
      Bo Wang committed
      
      Summary:
      1. Fix the memory leak in db_stress tests that is caused by `FaultInjectionSecondaryCache`. To address the test requirements for both CompressedSecondaryCache and CachelibWrapper, a new class variable `base_is_compressed_sec_cache_` is added to determine the different behaviors in `Lookup()` and `WaitAll()`.
      2. Add `CompressedSecondaryCache` into stress tests.
      
      Before this PR, a memory leak was reported during crash tests whenever `CompressedSecondaryCache` was included in stress tests. One example is shown as follows:
      ```
      ==70722==ERROR: LeakSanitizer: detected memory leaks
      
      Direct leak of 6648240 byte(s) in 83103 object(s) allocated from:
          #0 0x13de9d7 in operator new(unsigned long) (/data/sandcastle/boxes/eden-trunk-hg-fbcode-fbsource/fbcode/buck-out/dbgo/gen/aab7ed39/internal_repo_rocksdb/repo/db_stress+0x13de9d7)
          #1 0x9084c7 in rocksdb::BlocklikeTraits<rocksdb::Block>::Create(rocksdb::BlockContents&&, unsigned long, rocksdb::Statistics*, bool, rocksdb::FilterPolicy const*) internal_repo_rocksdb/repo/table/block_based/block_like_traits.h:128
          #2 0x9084c7 in std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> rocksdb::GetCreateCallback<rocksdb::Block>(unsigned long, rocksdb::Statistics*, bool, rocksdb::FilterPolicy const*)::'lambda'(void const*, unsigned long, void**, unsigned long*)::operator()(void const*, unsigned long, void**, unsigned long*) const internal_repo_rocksdb/repo/table/block_based/block_like_traits.h:34
          #3 0x9082c9 in rocksdb::Block std::__invoke_impl<rocksdb::Status, std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> rocksdb::GetCreateCallback<rocksdb::Block>(unsigned long, rocksdb::Statistics*, bool, rocksdb::FilterPolicy const*)::'lambda'(void const*, unsigned long, void**, unsigned long*)&, void const*, unsigned long, void**, unsigned long*>(std::__invoke_other, std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> rocksdb::GetCreateCallback<rocksdb::Block>(unsigned long, rocksdb::Statistics*, bool, rocksdb::FilterPolicy const*)::'lambda'(void const*, unsigned long, void**, unsigned long*)&, void const*&&, unsigned long&&, void**&&, unsigned long*&&) third-party-buck/platform010/build/libgcc/include/c++/trunk/bits/invoke.h:61
          #4 0x90825d in std::enable_if<is_invocable_r_v<rocksdb::Block, std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> rocksdb::GetCreateCallback<rocksdb::Block>(unsigned long, rocksdb::Statistics*, bool, rocksdb::FilterPolicy const*)::'lambda'(void const*, unsigned long, void**, unsigned long*)&, void const*, unsigned long, void**, unsigned long*>, rocksdb::Block>::type std::__invoke_r<rocksdb::Status, std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> rocksdb::GetCreateCallback<rocksdb::Block>(unsigned long, rocksdb::Statistics*, bool, rocksdb::FilterPolicy const*)::'lambda'(void const*, unsigned long, void**, unsigned long*)&, void const*, unsigned long, void**, unsigned long*>(std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> rocksdb::GetCreateCallback<rocksdb::Block>(unsigned long, rocksdb::Statistics*, bool, rocksdb::FilterPolicy const*)::'lambda'(void const*, unsigned long, void**, unsigned long*)&, void const*&&, unsigned long&&, void**&&, unsigned long*&&) third-party-buck/platform010/build/libgcc/include/c++/trunk/bits/invoke.h:114
          #5 0x9081b0 in std::_Function_handler<rocksdb::Status (void const*, unsigned long, void**, unsigned long*), std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> rocksdb::GetCreateCallback<rocksdb::Block>(unsigned long, rocksdb::Statistics*, bool, rocksdb::FilterPolicy const*)::'lambda'(void const*, unsigned long, void**, unsigned long*)>::_M_invoke(std::_Any_data const&, void const*&&, unsigned long&&, void**&&, unsigned long*&&) third-party-buck/platform010/build/libgcc/include/c++/trunk/bits/std_function.h:291
          #6 0x991f2c in std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)>::operator()(void const*, unsigned long, void**, unsigned long*) const third-party-buck/platform010/build/libgcc/include/c++/trunk/bits/std_function.h:560
          #7 0x990277 in rocksdb::CompressedSecondaryCache::Lookup(rocksdb::Slice const&, std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> const&, bool, bool&) internal_repo_rocksdb/repo/cache/compressed_secondary_cache.cc:77
          #8 0xd3aa4d in rocksdb::FaultInjectionSecondaryCache::Lookup(rocksdb::Slice const&, std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> const&, bool, bool&) internal_repo_rocksdb/repo/utilities/fault_injection_secondary_cache.cc:92
          #9 0xeadaab in rocksdb::lru_cache::LRUCacheShard::Lookup(rocksdb::Slice const&, unsigned int, rocksdb::Cache::CacheItemHelper const*, std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> const&, rocksdb::Cache::Priority, bool, rocksdb::Statistics*) internal_repo_rocksdb/repo/cache/lru_cache.cc:445
          #10 0x1064573 in rocksdb::ShardedCache::Lookup(rocksdb::Slice const&, rocksdb::Cache::CacheItemHelper const*, std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> const&, rocksdb::Cache::Priority, bool, rocksdb::Statistics*) internal_repo_rocksdb/repo/cache/sharded_cache.cc:89
          https://github.com/facebook/rocksdb/issues/11 0x8be0df in rocksdb::BlockBasedTable::GetEntryFromCache(rocksdb::CacheTier const&, rocksdb::Cache*, rocksdb::Slice const&, rocksdb::BlockType, bool, rocksdb::GetContext*, rocksdb::Cache::CacheItemHelper const*, std::function<rocksdb::Status (void const*, unsigned long, void**, unsigned long*)> const&, rocksdb::Cache::Priority) const internal_repo_rocksdb/repo/table/block_based/block_based_table_reader.cc:389
          https://github.com/facebook/rocksdb/issues/12 0x905790 in rocksdb::Status rocksdb::BlockBasedTable::GetDataBlockFromCache<rocksdb::Block>(rocksdb::Slice const&, rocksdb::Cache*, rocksdb::Cache*, rocksdb::ReadOptions const&, rocksdb::CachableEntry<rocksdb::Block>*, rocksdb::UncompressionDict const&, rocksdb::BlockType, bool, rocksdb::GetContext*) const internal_repo_rocksdb/repo/table/block_based/block_based_table_reader.cc:1263
          https://github.com/facebook/rocksdb/issues/13 0x8b9259 in rocksdb::Status rocksdb::BlockBasedTable::MaybeReadBlockAndLoadToCache<rocksdb::Block>(rocksdb::FilePrefetchBuffer*, rocksdb::ReadOptions const&, rocksdb::BlockHandle const&, rocksdb::UncompressionDict const&, bool, bool, rocksdb::CachableEntry<rocksdb::Block>*, rocksdb::BlockType, rocksdb::GetContext*, rocksdb::BlockCacheLookupContext*, rocksdb::BlockContents*, bool) const internal_repo_rocksdb/repo/table/block_based/block_based_table_reader.cc:1559
          https://github.com/facebook/rocksdb/issues/14 0x8b710c in rocksdb::Status rocksdb::BlockBasedTable::RetrieveBlock<rocksdb::Block>(rocksdb::FilePrefetchBuffer*, rocksdb::ReadOptions const&, rocksdb::BlockHandle const&, rocksdb::UncompressionDict const&, rocksdb::CachableEntry<rocksdb::Block>*, rocksdb::BlockType, rocksdb::GetContext*, rocksdb::BlockCacheLookupContext*, bool, bool, bool, bool) const internal_repo_rocksdb/repo/table/block_based/block_based_table_reader.cc:1726
          https://github.com/facebook/rocksdb/issues/15 0x8c329f in rocksdb::DataBlockIter* rocksdb::BlockBasedTable::NewDataBlockIterator<rocksdb::DataBlockIter>(rocksdb::ReadOptions const&, rocksdb::BlockHandle const&, rocksdb::DataBlockIter*, rocksdb::BlockType, rocksdb::GetContext*, rocksdb::BlockCacheLookupContext*, rocksdb::FilePrefetchBuffer*, bool, bool, rocksdb::Status&) const internal_repo_rocksdb/repo/table/block_based/block_based_table_reader_impl.h:58
          https://github.com/facebook/rocksdb/issues/16 0x920117 in rocksdb::BlockBasedTableIterator::InitDataBlock() internal_repo_rocksdb/repo/table/block_based/block_based_table_iterator.cc:262
          https://github.com/facebook/rocksdb/issues/17 0x920d42 in rocksdb::BlockBasedTableIterator::MaterializeCurrentBlock() internal_repo_rocksdb/repo/table/block_based/block_based_table_iterator.cc:332
          https://github.com/facebook/rocksdb/issues/18 0xc6a201 in rocksdb::IteratorWrapperBase<rocksdb::Slice>::PrepareValue() internal_repo_rocksdb/repo/table/iterator_wrapper.h:78
          https://github.com/facebook/rocksdb/issues/19 0xc6a201 in rocksdb::IteratorWrapperBase<rocksdb::Slice>::PrepareValue() internal_repo_rocksdb/repo/table/iterator_wrapper.h:78
          https://github.com/facebook/rocksdb/issues/20 0xef9f6c in rocksdb::MergingIterator::PrepareValue() internal_repo_rocksdb/repo/table/merging_iterator.cc:260
          https://github.com/facebook/rocksdb/issues/21 0xc6a201 in rocksdb::IteratorWrapperBase<rocksdb::Slice>::PrepareValue() internal_repo_rocksdb/repo/table/iterator_wrapper.h:78
          https://github.com/facebook/rocksdb/issues/22 0xc67bcd in rocksdb::DBIter::FindNextUserEntryInternal(bool, rocksdb::Slice const*) internal_repo_rocksdb/repo/db/db_iter.cc:326
          https://github.com/facebook/rocksdb/issues/23 0xc66d36 in rocksdb::DBIter::FindNextUserEntry(bool, rocksdb::Slice const*) internal_repo_rocksdb/repo/db/db_iter.cc:234
          https://github.com/facebook/rocksdb/issues/24 0xc7ab47 in rocksdb::DBIter::Next() internal_repo_rocksdb/repo/db/db_iter.cc:161
          https://github.com/facebook/rocksdb/issues/25 0x70d938 in rocksdb::BatchedOpsStressTest::TestPrefixScan(rocksdb::ThreadState*, rocksdb::ReadOptions const&, std::vector<int, std::allocator<int> > const&, std::vector<long, std::allocator<long> > const&) internal_repo_rocksdb/repo/db_stress_tool/batched_ops_stress.cc:320
          https://github.com/facebook/rocksdb/issues/26 0x6dc6a8 in rocksdb::StressTest::OperateDb(rocksdb::ThreadState*) internal_repo_rocksdb/repo/db_stress_tool/db_stress_test_base.cc:907
          https://github.com/facebook/rocksdb/issues/27 0x6867de in rocksdb::ThreadBody(void*) internal_repo_rocksdb/repo/db_stress_tool/db_stress_driver.cc:33
          https://github.com/facebook/rocksdb/issues/28 0xce4cc2 in rocksdb::(anonymous namespace)::StartThreadWrapper(void*) internal_repo_rocksdb/repo/env/env_posix.cc:461
          https://github.com/facebook/rocksdb/issues/29 0x7f23f9068c0e in start_thread /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/nptl/pthread_create.c:434:8
      ```
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10523
      
      Test Plan:
      ```
      $ COMPILE_WITH_ASAN=1 make -j 24
      $ db_stress J=40 crash_test_with_txn
      ```
      
      Reviewed By: anand1976
      
      Differential Revision: D38646839
      
      Pulled By: gitbw95
      
      fbshipit-source-id: 9452895c7dc95481a9d7afe83b15193cf5b1c43e
      13cb7a84
    • A
      Add initial_auto_readahead_size and max_auto_readahead_size to db_bench (#10539) · 5956ef00
      Committed by Akanksha Mahajan
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/10539
      
      Reviewed By: anand1976
      
      Differential Revision: D38837111
      
      Pulled By: akankshamahajan15
      
      fbshipit-source-id: eb845c6e15a3c823ff6113395817388ff15a20b1
      5956ef00
  12. 18 Aug 2022, 2 commits
    • A
      Prevent a case of WriteBufferManager flush thrashing (#6364) · 91166012
      Committed by Andrew Kryczka
      Summary:
      Previously, the flushes triggered by `WriteBufferManager` could affect
      the same CF repeatedly if it happens to get consecutive writes. Such
      flushes are not particularly useful for reducing memory usage since
      they switch nearly-empty memtables to immutable while they've just begun
      filling their first arena block. In fact they may not even reduce the
      mutable memory count if they involve replacing one mutable memtable containing
      one arena block with a new mutable memtable containing one arena block.
      Further, if such switches happen even a few times before a flush finishes,
      the immutable memtable limit will be reached and writes will stall.
      
      This PR adds a heuristic to not switch memtables to immutable for CFs
      that already have one or more immutable memtables awaiting flush. There
      is a memory usage regression if the user continues writing to the same
      CF, the DB does not have any CFs eligible for switching, flushes
      are not finishing, and the `WriteBufferManager` was constructed with
      `allow_stall=false`. Before, memory usage would grow by switching nearly
      empty memtables until writes stall. Now, it grows by filling memtables
      until writes stall. This feels like an acceptable behavior change because
      users who prefer stalling over violating the memory limit should be using
      `allow_stall=true`, which is unaffected by this PR.
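
      The heuristic can be sketched as follows. This is a simplified model with hypothetical names (`ColumnFamily`, `EligibleForSwitch`, `PickCfToFlush`), not the actual RocksDB code:

      ```cpp
      #include <cassert>
      #include <cstddef>
      #include <vector>

      // Simplified model of the memtable-switching decision described above.
      // A CF that already has immutable memtables awaiting flush is skipped,
      // so WriteBufferManager-triggered flushes spread across CFs instead of
      // repeatedly switching nearly-empty memtables in the same CF.
      struct ColumnFamily {
        std::size_t mutable_memtable_size;  // bytes in the active memtable
        int num_immutable_memtables;        // memtables already awaiting flush
      };

      // True if this CF may have its memtable switched to immutable in
      // response to WriteBufferManager memory pressure.
      bool EligibleForSwitch(const ColumnFamily& cf) {
        return cf.num_immutable_memtables == 0 && cf.mutable_memtable_size > 0;
      }

      // Pick the eligible CF using the most mutable memory, or -1 if none is
      // eligible (writes then keep filling memtables until a stall, as the
      // summary explains).
      int PickCfToFlush(const std::vector<ColumnFamily>& cfs) {
        int best = -1;
        std::size_t best_size = 0;
        for (int i = 0; i < static_cast<int>(cfs.size()); ++i) {
          if (EligibleForSwitch(cfs[i]) && cfs[i].mutable_memtable_size > best_size) {
            best = i;
            best_size = cfs[i].mutable_memtable_size;
          }
        }
        return best;
      }
      ```

      For example, a CF with one immutable memtable already queued is passed over even if its active memtable is the largest, which is exactly what prevents the thrashing described above.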
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6364
      
      Test Plan:
      - Command:
      
      `rm -rf /dev/shm/dbbench/ && TEST_TMPDIR=/dev/shm ./db_bench -benchmarks=fillrandom -num_multi_db=8 -num_column_families=2 -write_buffer_size=4194304 -db_write_buffer_size=16777216 -compression_type=none -statistics=true -target_file_size_base=4194304 -max_bytes_for_level_base=16777216`
      
      - `rocksdb.db.write.stall` count before this PR: 175
      - `rocksdb.db.write.stall` count after this PR: 0
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D20167197
      
      Pulled By: ajkr
      
      fbshipit-source-id: 4a64064e9bc33d57c0a35f15547542d0191d0cb7
      91166012
    • A
      Fix range deletion handling in async MultiGet (#10534) · 65814a4a
      Committed by anand76
      Summary:
      The fix in https://github.com/facebook/rocksdb/issues/10513 was not complete w.r.t range deletion handling. It didn't handle the case where a file with a range tombstone covering a key also overlapped another key in the batch. In that case, ```mget_range``` would be non-empty. However, ```mget_range``` would only have the second key and, therefore, the first key would be skipped when iterating through the range tombstones in ```TableCache::MultiGet```.
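
      To make the failure mode concrete, here is a toy model (illustrative names only, not the real `TableCache`/`MultiGet` types) of what the fix ensures: a file's range tombstone must be checked against every batch key overlapping the file, not only the keys remaining in ```mget_range```:

      ```cpp
      #include <cassert>
      #include <string>
      #include <vector>

      // A file-level range tombstone deleting keys in [start, end).
      struct RangeTombstone {
        std::string start, end;
      };

      bool Covers(const RangeTombstone& t, const std::string& key) {
        return key >= t.start && key < t.end;
      }

      // Post-fix behavior: every key of the batch that overlaps the file is
      // tested against the tombstone. The pre-fix bug effectively skipped the
      // first key when another batch key also overlapped the same file.
      std::vector<bool> ApplyTombstone(const RangeTombstone& t,
                                       const std::vector<std::string>& batch) {
        std::vector<bool> deleted;
        for (const auto& k : batch) {
          deleted.push_back(Covers(t, k));
        }
        return deleted;
      }
      ```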
      
      Test plan -
      1. Add a unit test
      2. Run stress tests
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10534
      
      Reviewed By: akankshamahajan15
      
      Differential Revision: D38773880
      
      Pulled By: anand1976
      
      fbshipit-source-id: dae491dbe52e18bbce5179b77b63f20771a66c00
      65814a4a
  13. 13 Aug 2022, 3 commits
    • G
      Add a blob-specific cache priority (#10461) · 275cd80c
      Committed by Gang Liao
      Summary:
      RocksDB's `Cache` abstraction currently supports two priority levels for items: high (used for frequently accessed/highly valuable SST metablocks like index/filter blocks) and low (used for SST data blocks). Blobs are typically lower-value targets for caching than data blocks, since 1) with BlobDB, data blocks containing blob references conceptually form an index structure which has to be consulted before we can read the blob value, and 2) cached blobs represent only a single key-value, while cached data blocks generally contain multiple KVs. Since we would like to make it possible to use the same backing cache for the block cache and the blob cache, it would make sense to add a new, lower-than-low cache priority level (bottom level) for blobs so data blocks are prioritized over them.
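
      The resulting three-level scheme can be pictured with this simplified stand-in (illustrative names, not the actual RocksDB `Cache` API):

      ```cpp
      #include <cassert>
      #include <string>

      // Illustrative three-level priority mirroring the proposal above:
      // HIGH for index/filter blocks, LOW for data blocks, and a new
      // lower-than-low BOTTOM level for blobs.
      enum class CachePriority { kBottom = 0, kLow = 1, kHigh = 2 };

      CachePriority PriorityFor(const std::string& item_type) {
        if (item_type == "index" || item_type == "filter") {
          return CachePriority::kHigh;
        }
        if (item_type == "data_block") {
          return CachePriority::kLow;
        }
        return CachePriority::kBottom;  // blobs and other low-value items
      }

      // Under memory pressure, the lower-priority entry goes first, so data
      // blocks are retained in preference to blobs when both compete for the
      // same backing cache.
      bool EvictFirst(CachePriority a, CachePriority b) {
        return static_cast<int>(a) < static_cast<int>(b);  // true: evict a before b
      }
      ```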
      
      This task is a part of https://github.com/facebook/rocksdb/issues/10156
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10461
      
      Reviewed By: siying
      
      Differential Revision: D38672823
      
      Pulled By: ltamasi
      
      fbshipit-source-id: 90cf7362036563d79891f47be2cc24b827482743
      275cd80c
    • S
      Fix two extra headers (#10525) · bc575c61
      Committed by sdong
      Summary:
      Fix copyright for two more extra headers to make internal tool happy.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10525
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D38661390
      
      fbshipit-source-id: ab2d055bfd145dfe82b5bae7a6c25cc338c8de94
      bc575c61
    • C
      Add memtable per key-value checksum (#10281) · fd165c86
      Committed by Changyu Bi
      Summary:
      Append a per key-value checksum to each internal key. These checksums are verified on read paths, including Get and Iterator, as well as during Flush. Get and Iterator will return a `Corruption` status if there is a checksum verification failure. Flush will make the DB read-only upon a memtable entry checksum verification failure.
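
      As a rough illustration of the mechanism (a toy checksum and hypothetical helper names, not the PR's actual on-entry format or hash):

      ```cpp
      #include <cassert>
      #include <cstdint>
      #include <string>

      // Sketch of the scheme above: derive a small checksum per key-value
      // pair when the entry is written, store it alongside the entry, and
      // re-derive it on the read path. The additive checksum here is purely
      // illustrative; the real format is defined by the PR.
      uint16_t KvChecksum(const std::string& key, const std::string& value) {
        uint32_t sum = 0;
        for (unsigned char c : key) sum += c;
        for (unsigned char c : value) sum += c;
        return static_cast<uint16_t>(sum);  // 2 bytes, matching the benchmarked size
      }

      // Read-path verification, mirroring the statuses described in the
      // summary ("OK" vs. "Corruption").
      std::string VerifyEntry(const std::string& key, const std::string& value,
                              uint16_t stored_checksum) {
        return KvChecksum(key, value) == stored_checksum ? "OK" : "Corruption";
      }
      ```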
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10281
      
      Test Plan:
      - Added new unit test cases: `make check`
      - Benchmark on memtable insert
      ```
      TEST_TMPDIR=/dev/shm/memtable_write ./db_bench -benchmarks=fillseq -disable_wal=true -max_write_buffer_number=100 -num=10000000 -min_write_buffer_number_to_merge=100
      
      # avg over 10 runs
      Baseline: 1166936 ops/sec
      memtable 2 bytes kv checksum : 1.11674e+06 ops/sec (-4%)
      memtable 2 bytes kv checksum + write batch 8 bytes kv checksum: 1.08579e+06 ops/sec (-6.95%)
      write batch 8 bytes kv checksum: 1.17979e+06 ops/sec (+1.1%)
      ```
      - Benchmark on memtable-only reads: ops/sec dropped 31% for `readseq` due to time spent verifying checksums.
      ops/sec for `readrandom` dropped ~6.8%.
      ```
      # Readseq
      sudo TEST_TMPDIR=/dev/shm/memtable_read ./db_bench -benchmarks=fillseq,readseq"[-X20]" -disable_wal=true -max_write_buffer_number=100 -num=10000000 -min_write_buffer_number_to_merge=100
      
      readseq [AVG    20 runs] : 7432840 (± 212005) ops/sec;  822.3 (± 23.5) MB/sec
      readseq [MEDIAN 20 runs] : 7573878 ops/sec;  837.9 MB/sec
      
      With -memtable_protection_bytes_per_key=2:
      
      readseq [AVG    20 runs] : 5134607 (± 119596) ops/sec;  568.0 (± 13.2) MB/sec
      readseq [MEDIAN 20 runs] : 5232946 ops/sec;  578.9 MB/sec
      
      # Readrandom
      sudo TEST_TMPDIR=/dev/shm/memtable_read ./db_bench -benchmarks=fillrandom,readrandom"[-X10]" -disable_wal=true -max_write_buffer_number=100 -num=1000000 -min_write_buffer_number_to_merge=100
      readrandom [AVG    10 runs] : 140236 (± 3938) ops/sec;    9.8 (± 0.3) MB/sec
      readrandom [MEDIAN 10 runs] : 140545 ops/sec;    9.8 MB/sec
      
      With -memtable_protection_bytes_per_key=2:
      readrandom [AVG    10 runs] : 130632 (± 2738) ops/sec;    9.1 (± 0.2) MB/sec
      readrandom [MEDIAN 10 runs] : 130341 ops/sec;    9.1 MB/sec
      ```
      
      - Stress test: `python3 -u tools/db_crashtest.py whitebox --duration=1800`
      
      Reviewed By: ajkr
      
      Differential Revision: D37607896
      
      Pulled By: cbi42
      
      fbshipit-source-id: fdaefb475629d2471780d4a5f5bf81b44ee56113
      fd165c86