1. 15 Feb, 2022 1 commit
    • A
      Transaction multiGet convert to list-based (#9522) · eed71dfa
      Alan Paxton committed
      Summary:
      Transaction multiGet convert to list-based.
      
      RocksDB Java (non-transactional) has multiGetAsList() methods to expose multiGet(). These return a list of results. These methods replaced multiGet() methods returning an array of results, which were deprecated in Rocks 6 and are being removed in Rocks 7.
      
      The transactional API still presents multiGet() methods returning arrays, so in Rocks 7 we replace these with multiGetAsList() methods and deprecate the multiGet() methods.
      
      This does not require any changes to the supporting JNI/C++ code, only to the wrappers which present the Java API.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9522
      
      Reviewed By: mrambacher
      
      Differential Revision: D34114373
      
      Pulled By: jay-zhuang
      
      fbshipit-source-id: cb22d6095934d951b6aee4aed3e07923d3c18007
  2. 12 Feb, 2022 6 commits
    • P
      Hide deprecated, inefficient block-based filter from public API (#9535) · 479eb1aa
      Peter Dillinger committed
      Summary:
      This change removes the ability to configure the deprecated,
      inefficient block-based filter in the public API. Options that would
      have enabled it now use "full" (and optionally partitioned) filters.
      Existing block-based filters can still be read and used, and a "back
      door" way to build them still exists, for testing and in case of trouble.
      
      About the only way this removal would cause an issue for users is if
      temporary memory for filter construction greatly increases. In
      HISTORY.md we suggest a few possible mitigations: partitioned filters,
      smaller SST files, or setting reserve_table_builder_memory=true.
      
      Or users who have customized a FilterPolicy using the
      CreateFilter/KeyMayMatch mechanism removed in https://github.com/facebook/rocksdb/issues/9501 will have to upgrade
      their code. (It's long past time for people to move to the new
      builder/reader customization interface.)
      
      This change also introduces some internal-use-only configuration strings
      for testing specific filter implementations while bypassing some
      compatibility / intelligence logic. This is intended to hint at a path
      toward making FilterPolicy Customizable, but it also gives us a "back
      door" way to configure block-based filter.
      
      Aside: updated db_bench so that -readonly implies -use_existing_db
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9535
      
      Test Plan:
      Unit tests updated. Specifically,
      
      * BlockBasedTableTest.BlockReadCountTest is tweaked to validate the back
      door configuration interface and ignoring of `use_block_based_builder`.
      * BlockBasedTableTest.TracingGetTest is migrated from testing
      block-based filter access pattern to full filter access pattern, by
      re-ordering some things.
      * Options test (pretty self-explanatory)
      
      Performance test - create with `./db_bench -db=/dev/shm/rocksdb1 -bloom_bits=10 -cache_index_and_filter_blocks=1 -benchmarks=fillrandom -num=10000000 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=0` with and without `-use_block_based_filter`, which creates a DB with 21 SST files in L0. Read with `./db_bench -db=/dev/shm/rocksdb1 -readonly -bloom_bits=10 -cache_index_and_filter_blocks=1 -benchmarks=readrandom -num=10000000 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=0 -duration=30`
      
      Without -use_block_based_filter: readrandom 464 ops/sec, 689280 KB DB
      With -use_block_based_filter: readrandom 169 ops/sec, 690996 KB DB
      No consistent difference with fillrandom
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D34153871
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 31f4a933c542f8f09aca47fa64aec67832a69738
    • Y
      Add commit_timestamp and read_timestamp to Pessimistic transaction (#9537) · d6e1e6f3
      Yanqin Jin committed
      Summary:
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9537
      
      Add `Transaction::SetReadTimestampForValidation()` and
      `Transaction::SetCommitTimestamp()` APIs with default implementation
      returning `Status::NotSupported()`. Currently, calling these two APIs has no
      effect.
      
      Also add checks to `PessimisticTransactionDB`
      to enforce that column families in the same db either
      - disable user-defined timestamp
      - enable 64-bit timestamp
      
      Just to clarify, a `PessimisticTransactionDB` can have some column families without
      timestamps as well as column families that enable timestamp.
      
      Each `PessimisticTransaction` can have two optional timestamps, `read_timestamp_`
      used for additional validation and `commit_timestamp_` which denotes when the transaction commits.
      For now, we are going to support `WriteCommittedTxn` (in a series of subsequent PRs).
      
      Once set, we do not allow decreasing `read_timestamp_`. The `commit_timestamp_` must be
       greater than `read_timestamp_` for each transaction and must be set before commit, unless
      the transaction does not involve any column family that enables user-defined timestamp.
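
      The timestamp rules above can be sketched as a small validation helper. This is only a toy model under stated assumptions: `TxnTimestamps` and its methods are hypothetical names, not RocksDB APIs, and the sketch ignores the case where the read timestamp is raised after a commit timestamp has been set.

      ```cpp
      #include <cstdint>

      // Toy model of the per-transaction timestamp rules described above.
      // TxnTimestamps and its methods are hypothetical, not RocksDB APIs.
      class TxnTimestamps {
       public:
        // Rejects any attempt to move the read timestamp backwards.
        bool SetReadTimestamp(uint64_t ts) {
          if (has_read_ts_ && ts < read_ts_) return false;
          read_ts_ = ts;
          has_read_ts_ = true;
          return true;
        }

        // The commit timestamp must be strictly greater than the read
        // timestamp when a read timestamp has been set.
        bool SetCommitTimestamp(uint64_t ts) {
          if (has_read_ts_ && ts <= read_ts_) return false;
          commit_ts_ = ts;
          has_commit_ts_ = true;
          return true;
        }

        // A commit without a commit timestamp is acceptable only when the
        // transaction touches no column family with user-defined timestamps.
        bool CanCommit(bool touches_timestamped_cf) const {
          return has_commit_ts_ || !touches_timestamped_cf;
        }

       private:
        uint64_t read_ts_ = 0;
        uint64_t commit_ts_ = 0;
        bool has_read_ts_ = false;
        bool has_commit_ts_ = false;
      };
      ```

      For example, after `SetReadTimestamp(10)`, a call to `SetReadTimestamp(5)` or `SetCommitTimestamp(10)` is rejected, while `SetCommitTimestamp(11)` is accepted.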
      
      TransactionDB builds on top of RocksDB core `DB` layer. Though `DB` layer assumes
      that user-defined timestamps are byte arrays, `TransactionDB` uses uint64_t to store
      timestamps. When they are passed down, they are still interpreted as
      byte-arrays by `DB`.
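
      As an illustration of a uint64_t timestamp travelling as a byte array, here is a minimal sketch. The helper names are hypothetical, and the fixed-width little-endian layout is an assumption of this sketch; the commit message does not state the encoding.

      ```cpp
      #include <cstddef>
      #include <cstdint>
      #include <string>

      // Hypothetical helper: encodes a 64-bit timestamp as 8 bytes.
      // Little-endian byte order is an assumption of this sketch.
      std::string EncodeTimestamp(uint64_t ts) {
        std::string out(sizeof(uint64_t), '\0');
        for (std::size_t i = 0; i < sizeof(uint64_t); ++i) {
          out[i] = static_cast<char>((ts >> (8 * i)) & 0xff);
        }
        return out;
      }

      // Hypothetical helper: decodes up to 8 bytes back into a timestamp.
      uint64_t DecodeTimestamp(const std::string& bytes) {
        uint64_t ts = 0;
        for (std::size_t i = 0; i < sizeof(uint64_t) && i < bytes.size(); ++i) {
          ts |= static_cast<uint64_t>(static_cast<unsigned char>(bytes[i]))
                << (8 * i);
        }
        return ts;
      }
      ```

      The transaction layer would work with the integer form and hand only the encoded bytes to the lower layer, which treats them as opaque.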
      
      Reviewed By: ltamasi
      
      Differential Revision: D31567959
      
      fbshipit-source-id: b0b6b69acab5d8e340cf174f33e8b09f1c3d3502
    • M
      Add STATIC_AVOID_DESTRUCTION for ObjectLibrary/Registry (#9464) · 81ada95b
      mrambacher committed
      Summary:
      This change should guarantee that the default ObjectLibrary/Registry are long-lived and not destroyed while the process is running. This prevents them from being referenced after they have been destroyed during static destruction.
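
      One common way to obtain this kind of guarantee is a function-local static reference to a heap object that is never deleted. The sketch below shows that general pattern only; `Registry` and `DefaultRegistry` are hypothetical names, and whether the macro expands to exactly this form is an assumption.

      ```cpp
      #include <string>
      #include <vector>

      // Hypothetical stand-in for a registry type; not the RocksDB class.
      struct Registry {
        std::vector<std::string> names;
      };

      // Returns a process-wide default registry that is never destroyed.
      // The object is heap-allocated on first use and intentionally leaked,
      // so no destructor runs for it during static destruction at exit.
      Registry& DefaultRegistry() {
        static Registry& instance = *new Registry();
        return instance;
      }
      ```

      Every call returns the same object, and code that runs late in process shutdown can still use it safely, at the cost of one deliberate leak.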
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9464
      
      Reviewed By: pdillinger
      
      Differential Revision: D33849876
      
      Pulled By: mrambacher
      
      fbshipit-source-id: 7a69177d7c58c81be293fc7ef8e600d47ddbc14b
    • A
      Fix failure in c_test (#9547) · 5c53b900
      Akanksha Mahajan committed
      Summary:
      When tests are run with TEST_TMPDIR set, c_test may fail because the
      directory is not created by the test. This results in IO error: No such file
      or directory: While mkdir if missing:
      /tmp/rocksdb_test_tmp/rocksdb_c_test-0: No such file or directory
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9547
      
      Test Plan:
      make -j32 c_test;
       TEST_TMPDIR=/tmp/rocksdb_test  ./c_test
      
      Reviewed By: riversand963
      
      Differential Revision: D34173298
      
      Pulled By: akankshamahajan15
      
      fbshipit-source-id: 5b5a01f5b842c2487b05b0708c8e9532241db7f8
    • E
      Avoid unnecessary copy of sample_slice map (#9551) · 95d9cb83
      Ezgi Çiçek committed
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/9551
      
      Reviewed By: riversand963
      
      Differential Revision: D34169574
      
      Pulled By: ezgicicek
      
      fbshipit-source-id: 2e88db59b65bda269917a9b0bed17181a4afd281
    • L
      Rework VersionStorageInfo::ComputeFilesMarkedForForcedBlobGC a bit (#9548) · a1203edc
      Levi Tamasi committed
      Summary:
      We had a bug in `VersionStorageInfo::ComputeFilesMarkedForForcedBlobGC`
      related to the edge case where all blob files are part of the "oldest batch",
      i.e. where only the very oldest file has any linked SSTs. (See https://github.com/facebook/rocksdb/issues/9542)
      This PR tries to make the logic in this method clearer and also adds a unit test
      for the problematic case.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9548
      
      Test Plan: `make check`
      
      Reviewed By: akankshamahajan15
      
      Differential Revision: D34158959
      
      Pulled By: ltamasi
      
      fbshipit-source-id: fbab6d749c569728382aa04f7b7c60c92cca7650
  3. 11 Feb, 2022 4 commits
    • M
      Return different Status based on ObjectRegistry::NewObject calls (#9333) · fe9d4951
      mrambacher committed
      Summary:
      This fix addresses https://github.com/facebook/rocksdb/issues/9299.
      
      If attempting to create a new object via the ObjectRegistry and a factory is not found, the ObjectRegistry will return a "NotSupported" status.  This is the same behavior as previously.
      
      If the factory is found but could not successfully create the object, an "InvalidArgument" status is returned.  If the factory returned a reason why (in the errmsg), this message will be in the returned status.
      
      In practice, there are two options in the ConfigOptions that control how these errors are propagated:
      - If "ignore_unknown_options=true", then both InvalidArgument and NotSupported status codes will be swallowed internally. Both cases will return success.
      - If "ignore_unsupported_options=true", then having no factory will return success but a failing factory will return an error.
      - If both options are false, both cases (no and failing factory) will return errors.
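
      The propagation rules in the list above can be sketched with a toy model. `Code`, `IgnoreFlags`, and `Propagate` are hypothetical names for illustration; this is not the RocksDB implementation, only a restatement of the three cases.

      ```cpp
      // Toy status codes standing in for rocksdb::Status codes.
      enum class Code { kOk, kNotSupported, kInvalidArgument };

      // Hypothetical mirror of the two ConfigOptions flags named above.
      struct IgnoreFlags {
        bool ignore_unknown_options = false;
        bool ignore_unsupported_options = false;
      };

      // Maps the raw result of an object-creation attempt to what the
      // caller sees: kNotSupported means no factory was found,
      // kInvalidArgument means a factory was found but failed.
      Code Propagate(Code raw, const IgnoreFlags& flags) {
        if (raw == Code::kOk) return Code::kOk;
        // ignore_unknown_options swallows both error kinds.
        if (flags.ignore_unknown_options) return Code::kOk;
        // ignore_unsupported_options swallows only the missing-factory case.
        if (flags.ignore_unsupported_options && raw == Code::kNotSupported) {
          return Code::kOk;
        }
        return raw;
      }
      ```

      With both flags false, both `kNotSupported` and `kInvalidArgument` pass through unchanged.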
      
      In practice this likely only changes Customizable objects that may be partially available.  For example, the JEMallocMemoryAllocator is a built-in allocator that is registered with the system but may not be compiled in.  In this case, the status code for this allocator changed from NotSupported("JEMalloc not available") to InvalidArgument("JEMalloc not available").  Other Customizable builtins/plugins would have the same semantics.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9333
      
      Reviewed By: pdillinger
      
      Differential Revision: D33517681
      
      Pulled By: mrambacher
      
      fbshipit-source-id: 8033052d4a4a7b88c2d9f90147b1b4467e51f6fd
    • L
      Log blob file space amp and expose it via the rocksdb.blob-stats DB property (#9538) · 073ac547
      Levi Tamasi committed
      Summary:
      Extend the periodic statistics in the info log with the total amount of garbage
      in blob files and the space amplification pertaining to blob files, where the
      latter is defined as `total_blob_file_size / (total_blob_file_size - total_blob_garbage_size)`.
      Also expose the space amp via the `rocksdb.blob-stats` DB property.
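
      A worked instance of the formula above, as a minimal sketch. `BlobSpaceAmp` is a hypothetical helper name, and returning infinity when no live data remains is an assumption of this sketch rather than documented behavior.

      ```cpp
      #include <cstdint>
      #include <limits>

      // Computes total / (total - garbage), following the definition above.
      // Hypothetical helper; not part of the RocksDB API.
      double BlobSpaceAmp(uint64_t total_blob_file_size,
                          uint64_t total_blob_garbage_size) {
        if (total_blob_garbage_size >= total_blob_file_size) {
          // No live data left, so the ratio is undefined. Returning
          // infinity here is an assumption of this sketch.
          return std::numeric_limits<double>::infinity();
        }
        return static_cast<double>(total_blob_file_size) /
               static_cast<double>(total_blob_file_size -
                                   total_blob_garbage_size);
      }
      ```

      For instance, 100 units of blob files containing 20 units of garbage give 100 / 80 = 1.25, and blob files with no garbage give exactly 1.0.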
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9538
      
      Test Plan: `make check`
      
      Reviewed By: riversand963
      
      Differential Revision: D34126855
      
      Pulled By: ltamasi
      
      fbshipit-source-id: 3153e7a0fe0eca440322db273f4deaabaccc51b2
    • L
      Fix off-by-one bug in VersionStorageInfo::ComputeFilesMarkedForForcedBlobGC (#9542) · b2423f8d
      Levi Tamasi committed
      Summary:
      Fixes a bug introduced in https://github.com/facebook/rocksdb/issues/9526 where we index one position past the
      end of a `vector`.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9542
      
      Test Plan:
      `make asan_check`
      
      Will add a unit test in a separate PR.
      
      Reviewed By: akankshamahajan15
      
      Differential Revision: D34145825
      
      Pulled By: ltamasi
      
      fbshipit-source-id: 4e87c948407dee489d669a3e41f59e2fcc1228d8
    • H
      Fix TSAN data race in EventListenerTest.MultiCF (#9528) · c5cd31c1
      Hui Xiao committed
      Summary:
      **Context:**
      `EventListenerTest.MultiCF` occasionally failed on TSAN data race as below:
      ```
      WARNING: ThreadSanitizer: data race (pid=2047633)
        Read of size 8 at 0x7b6000001440 by main thread:
          #0 std::vector<rocksdb::DB*, std::allocator<rocksdb::DB*> >::size() const /usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/stl_vector.h:916:40 (listener_test+0x52337c)
          #1 rocksdb::EventListenerTest_MultiCF_Test::TestBody() /home/circleci/project/db/listener_test.cc:384:7 (listener_test+0x52337c)
      
        Previous write of size 8 at 0x7b6000001440 by thread T2:
          #0 void std::vector<rocksdb::DB*, std::allocator<rocksdb::DB*> >::_M_realloc_insert<rocksdb::DB* const&>(__gnu_cxx::__normal_iterator<rocksdb::DB**, std::vector<rocksdb::DB*, std::allocator<rocksdb::DB*> > >, rocksdb::DB* const&) /usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/vector.tcc:503:31 (listener_test+0x550654)
          #1 std::vector<rocksdb::DB*, std::allocator<rocksdb::DB*> >::push_back(rocksdb::DB* const&) /usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/stl_vector.h:1195:4 (listener_test+0x550654)
          #2 rocksdb::TestFlushListener::OnFlushCompleted(rocksdb::DB*, rocksdb::FlushJobInfo const&) /home/circleci/project/db/listener_test.cc:255:18 (listener_test+0x550654)
      ```
      
      After investigation, it is due to the following:
      (1) `ASSERT_OK(Flush(i));` before the read `std::vector::size()` is supposed to be [blocked on `DB::Impl::bg_cv_` for memtable flush to finish](https://github.com/facebook/rocksdb/blob/320d9a8e8a1b6998f92934f87fc71ad8bd6d4596/db/db_impl/db_impl_compaction_flush.cc#L2319) and get signaled [at the end of background flush ](https://github.com/facebook/rocksdb/blob/320d9a8e8a1b6998f92934f87fc71ad8bd6d4596/db/db_impl/db_impl_compaction_flush.cc#L2830), which happens after the write `std::vector::push_back()` . So the sequence of execution should have been synchronized as `call flush() -> write -> return from flush() -> read` and would not cause any TSAN data race.
      - The subsequent `ASSERT_OK(dbfull()->TEST_WaitForFlushMemTable());` serves a similar purpose based on [the previous attempt to deflake the test.](https://github.com/facebook/rocksdb/pull/9084)
      
      (2) However, there are multiple places in the code that can signal this `DB::Impl::bg_cv_` and mistakenly wake up `ASSERT_OK(Flush(i));` (or `ASSERT_OK(dbfull()->TEST_WaitForFlushMemTable());`) too early (and with the lock available to them), resulting in an unsynchronized read and write and thus a TSAN data race.
      - Reproduced by the following, suggested by ajkr:
      ```
       diff --git a/db/db_impl/db_impl_compaction_flush.cc b/db/db_impl/db_impl_compaction_flush.cc
      index 4ff87c1e4..52492e9cf 100644
       --- a/db/db_impl/db_impl_compaction_flush.cc
      +++ b/db/db_impl/db_impl_compaction_flush.cc
      @@ -22,7 +22,7 @@
       #include "test_util/sync_point.h"
       #include "util/cast_util.h"
       #include "util/concurrent_task_limiter_impl.h"
       namespace ROCKSDB_NAMESPACE {
      
       bool DBImpl::EnoughRoomForCompaction(
      @@ -855,6 +855,7 @@ void DBImpl::NotifyOnFlushCompleted(
              mutable_cf_options.level0_stop_writes_trigger);
         // release lock while notifying events
         mutex_.Unlock();
      +  bg_cv_.SignalAll();
      ```
      
      **Summary:**
      - Added synchronization between read and write via the `ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->LoadDependency()` mechanism
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9528
      
      Test Plan:
      `./listener_test --gtest_filter=EventListenerTest.MultiCF --gtest_repeat=10`
      - pre-fix:
      ```
      Repeating all tests (iteration 3)
      Note: Google Test filter = EventListenerTest.MultiCF
      [==========] Running 1 test from 1 test case.
      [----------] Global test environment set-up.
      [----------] 1 test from EventListenerTest
      [ RUN      ] EventListenerTest.MultiCF
      ==================
      WARNING: ThreadSanitizer: data race (pid=3377137)
        Read of size 8 at 0x7b6000000840 by main thread:
          #0 std::vector<rocksdb::DB*, std::allocator<rocksdb::DB*> >::size()
          #1 rocksdb::EventListenerTest_MultiCF_Test::TestBody() db/listener_test.cc:384 (listener_test+0x4bb300)
      
        Previous write of size 8 at 0x7b6000000840 by thread T2:
          #0 void std::vector<rocksdb::DB*, std::allocator<rocksdb::DB*> >::_M_realloc_insert<rocksdb::DB* const&>(__gnu_cxx::__normal_iterator<rocksdb::DB**, std::vector<rocksdb::DB*, std::allocator<rocksdb::DB*> > >, rocksdb::DB* const&)
          #1 std::vector<rocksdb::DB*, std::allocator<rocksdb::DB*> >::push_back(rocksdb::DB* const&)
          #2 rocksdb::TestFlushListener::OnFlushCompleted(rocksdb::DB*, rocksdb::FlushJobInfo const&) db/listener_test.cc:255 (listener_test+0x4e820f)
      ```
      - post-fix: `All passed`
      
      Reviewed By: ajkr
      
      Differential Revision: D34085791
      
      Pulled By: hx235
      
      fbshipit-source-id: f877aa687ea1d5cb6f31ef8c4772625d22868e8b
  4. 10 Feb, 2022 3 commits
    • L
      Use a sorted vector instead of a map to store blob file metadata (#9526) · 320d9a8e
      Levi Tamasi committed
      Summary:
      The patch replaces `std::map` with a sorted `std::vector` for
      `VersionStorageInfo::blob_files_` and preallocates the space
      for the `vector` before saving the `BlobFileMetaData` into the
      new `VersionStorageInfo` in `VersionBuilder::Rep::SaveBlobFilesTo`.
      These changes reduce the time the DB mutex is held while
      saving new `Version`s, and using a sorted `vector` also makes
      lookups faster thanks to better memory locality.
      
      In addition, the patch introduces helper methods
      `VersionStorageInfo::GetBlobFileMetaData` and
      `VersionStorageInfo::GetBlobFileMetaDataLB` that can be used by
      clients to perform lookups in the `vector`, and does some general
      cleanup in the parts of code where blob file metadata are used.
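
      The sorted-vector lookup idea can be sketched as follows. `BlobMeta`, `FindLowerBound`, and `Find` are hypothetical, simplified names; they only illustrate binary search over a vector kept sorted by file number, not the actual helper signatures.

      ```cpp
      #include <algorithm>
      #include <cstdint>
      #include <vector>

      // Hypothetical, simplified stand-in for blob file metadata.
      struct BlobMeta {
        uint64_t file_number;
        uint64_t total_bytes;
      };

      // Lower-bound lookup: first entry whose file number is >= the
      // requested one. The vector must be sorted by file_number.
      std::vector<BlobMeta>::const_iterator FindLowerBound(
          const std::vector<BlobMeta>& files, uint64_t file_number) {
        return std::lower_bound(files.begin(), files.end(), file_number,
                                [](const BlobMeta& meta, uint64_t number) {
                                  return meta.file_number < number;
                                });
      }

      // Exact-match lookup: returns nullptr when the file number is absent.
      const BlobMeta* Find(const std::vector<BlobMeta>& files,
                           uint64_t file_number) {
        auto it = FindLowerBound(files, file_number);
        if (it != files.end() && it->file_number == file_number) return &*it;
        return nullptr;
      }
      ```

      Because the elements sit contiguously in memory, the binary search touches fewer cache lines than walking a node-based `std::map`, and calling `reserve()` before filling the vector avoids reallocation while it is built.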
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9526
      
      Test Plan:
      Ran `make check` and the crash test script for a while.
      
      Performance was tested using a load-optimized benchmark (`fillseq` with vector memtable, no WAL) and small file sizes so that a significant number of files are produced:
      
      ```
      numactl --interleave=all ./db_bench --benchmarks=fillseq --allow_concurrent_memtable_write=false --level0_file_num_compaction_trigger=4 --level0_slowdown_writes_trigger=20 --level0_stop_writes_trigger=30 --max_background_jobs=8 --max_write_buffer_number=8 --db=/data/ltamasi-dbbench --wal_dir=/data/ltamasi-dbbench --num=800000000 --num_levels=8 --key_size=20 --value_size=400 --block_size=8192 --cache_size=51539607552 --cache_numshardbits=6 --compression_max_dict_bytes=0 --compression_ratio=0.5 --compression_type=lz4 --bytes_per_sync=8388608 --cache_index_and_filter_blocks=1 --cache_high_pri_pool_ratio=0.5 --benchmark_write_rate_limit=0 --write_buffer_size=16777216 --target_file_size_base=16777216 --max_bytes_for_level_base=67108864 --verify_checksum=1 --delete_obsolete_files_period_micros=62914560 --max_bytes_for_level_multiplier=8 --statistics=0 --stats_per_interval=1 --stats_interval_seconds=20 --histogram=1 --memtablerep=skip_list --bloom_bits=10 --open_files=-1 --subcompactions=1 --compaction_style=0 --min_level_to_compress=3 --level_compaction_dynamic_level_bytes=true --pin_l0_filter_and_index_blocks_in_cache=1 --soft_pending_compaction_bytes_limit=167503724544 --hard_pending_compaction_bytes_limit=335007449088 --min_level_to_compress=0 --use_existing_db=0 --sync=0 --threads=1 --memtablerep=vector --allow_concurrent_memtable_write=false --disable_wal=1 --enable_blob_files=1 --blob_file_size=16777216 --min_blob_size=0 --blob_compression_type=lz4 --enable_blob_garbage_collection=1 --seed=<some value>
      ```
      
      Final statistics before the patch:
      
      ```
      Cumulative writes: 0 writes, 700M keys, 0 commit groups, 0.0 writes per commit group, ingest: 284.62 GB, 121.27 MB/s
      Interval writes: 0 writes, 334K keys, 0 commit groups, 0.0 writes per commit group, ingest: 139.28 MB, 72.46 MB/s
      ```
      
      With the patch:
      
      ```
      Cumulative writes: 0 writes, 760M keys, 0 commit groups, 0.0 writes per commit group, ingest: 308.66 GB, 131.52 MB/s
      Interval writes: 0 writes, 445K keys, 0 commit groups, 0.0 writes per commit group, ingest: 185.35 MB, 93.15 MB/s
      ```
      
      Total time to complete the benchmark is 2611 seconds with the patch, down from 2986 secs.
      
      Reviewed By: riversand963
      
      Differential Revision: D34082728
      
      Pulled By: ltamasi
      
      fbshipit-source-id: fc598abf676dce436734d06bb9d2d99a26a004fc
    • A
      remove deprecated dispose() for Rocks JNI interface Java objects. (#9523) · 99d86252
      Alan Paxton committed
      Summary:
      For RocksDB 7: remove the deprecated dispose() and, as a consequence, remove finalize(), which is good modern Java hygiene.
      
      It is extremely non-deterministic when `finalize()` is called on an object, and resource closure/recovery of underlying native/C++ objects and/or non-memory resources cannot be adequately controlled through GC finalization. The RocksDB Java/JNI interface provides and encourages the use of AutoCloseable objects with close() methods, allowing predictable disposal of resources at exit from try-with-resources blocks.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9523
      
      Reviewed By: mrambacher
      
      Differential Revision: D34079843
      
      Pulled By: jay-zhuang
      
      fbshipit-source-id: d1f0463a89a548b5d57bfaa50154379e722d189a
    • Y
      Remove timestamp from key in expected state (#9525) · 685044df
      Yanqin Jin committed
      Summary:
      Keys in a write batch read from a trace file can contain trailing timestamps.
      This PR removes them before calling `ExpectedState`.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9525
      
      Test Plan:
      make check
      make crash_test_with_ts
      
      Reviewed By: ajkr
      
      Differential Revision: D34082358
      
      Pulled By: riversand963
      
      fbshipit-source-id: 78c925659e2a19e4a8278fb4a8ddf5070e265c04
  5. 09 Feb, 2022 5 commits
    • A
      Remove deprecated option new_table_reader_for_compaction_inputs (#9443) · 9745c68e
      Akanksha Mahajan committed
      Summary:
      In RocksDB, the option new_table_reader_for_compaction_inputs has
      no effect on compaction or on the behavior of the RocksDB library.
      Therefore, we are removing it in the upcoming 7.0 release.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9443
      
      Test Plan: CircleCI
      
      Reviewed By: ajkr
      
      Differential Revision: D33788508
      
      Pulled By: akankshamahajan15
      
      fbshipit-source-id: 324ca6f12bfd019e9bd5e1b0cdac39be5c3cec7d
    • L
      Remove cat_ignore_eagain (#9531) · 2ee25e88
      Levi Tamasi committed
      Summary:
      ... since it was only necessary to work around a bug on certain Ubuntu
      16.04 images (and we now use 20.04 across the board).
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9531
      
      Test Plan: Watch CI.
      
      Reviewed By: ajkr
      
      Differential Revision: D34089424
      
      Pulled By: ltamasi
      
      fbshipit-source-id: f15f86332c119099f61b9bdc74604657fc5d964e
    • P
      FilterPolicy API changes for 7.0 (#9501) · 68a9c186
      Peter Dillinger committed
      Summary:
      * Inefficient block-based filter is no longer customizable in the public
      API, though (for now) can still be enabled.
        * Removed deprecated FilterPolicy::CreateFilter() and
        FilterPolicy::KeyMayMatch()
        * Removed `rocksdb_filterpolicy_create()` from C API
      * Change meaning of nullptr return from GetBuilderWithContext() from "use
      block-based filter" to "generate no filter in this case." This is a
      cleaner solution to the proposal in https://github.com/facebook/rocksdb/issues/8250.
        * Also, when user specifies bits_per_key < 0.5, we now round this down
        to "no filter" because we expect a filter with >= 80% FP rate is
        unlikely to be worth the CPU cost of accessing it (esp with
        cache_index_and_filter_blocks=1 or partition_filters=1).
        * bits_per_key >= 0.5 and < 1.0 is still rounded up to 1.0 (for 62% FP
        rate)
        * This also gives us some support for configuring filters from OPTIONS
        file as currently saved: `filter_policy=rocksdb.BuiltinBloomFilter`.
        Opening from such an options file will enable reading filters (an
        improvement) but not writing new ones. (See Customizable follow-up
        below.)
      * Also removed deprecated functions
        * FilterBitsBuilder::CalculateNumEntry()
        * FilterPolicy::GetFilterBitsBuilder()
        * NewExperimentalRibbonFilterPolicy()
      * Remove default implementations of
        * FilterBitsBuilder::EstimateEntriesAdded()
        * FilterBitsBuilder::ApproximateNumEntries()
        * FilterPolicy::GetBuilderWithContext()
      * Remove support for "filter_policy=experimental_ribbon" configuration
      string.
      * Allow "filter_policy=bloomfilter:n" without bool to discourage use of
      block-based filter.
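
      The bits_per_key rounding rules listed above can be sketched as a small function. `FilterChoice` and `InterpretBitsPerKey` are hypothetical names, and the sketch covers only the two thresholds stated in the list, not any further rounding the real implementation may apply.

      ```cpp
      // Result of interpreting a requested bits_per_key value.
      // Hypothetical type for this sketch.
      struct FilterChoice {
        bool build_filter;    // false means "no filter"
        double bits_per_key;  // meaningful only when build_filter is true
      };

      // Applies the two thresholds described above:
      //   requested < 0.5         -> no filter
      //   0.5 <= requested < 1.0  -> rounded up to 1.0
      //   otherwise               -> used as requested
      FilterChoice InterpretBitsPerKey(double requested) {
        if (requested < 0.5) return {false, 0.0};
        if (requested < 1.0) return {true, 1.0};
        return {true, requested};
      }
      ```

      Under these rules, a request of 0.3 yields no filter, 0.5 is rounded up to 1.0, and 10.0 is kept unchanged.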
      
      Some pieces for https://github.com/facebook/rocksdb/issues/9389
      
      Likely follow-up (later PRs):
      * Refactoring toward FilterPolicy Customizable, so that we can generate
      filters with same configuration as before when configuring from options
      file.
      * Remove support for user enabling block-based filter (ignore `bool
      use_block_based_builder`)
        * Some months after this change, we could even remove read support for
        block-based filter, because it is not critical to DB data
        preservation.
      * Make FilterBitsBuilder::FinishV2 to avoid `using
      FilterBitsBuilder::Finish` mess and add support for specifying a
      MemoryAllocator (for cache warming)
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9501
      
      Test Plan:
      A number of obsolete tests deleted and new tests or test
      cases added or updated.
      
      Reviewed By: hx235
      
      Differential Revision: D34008011
      
      Pulled By: pdillinger
      
      fbshipit-source-id: a39a720457c354e00d5b59166b686f7f59e392aa
    • A
      Add releases till 6.29.fb to compatibility check (#9529) · ddce0c3f
      Akanksha Mahajan committed
      Summary:
      Add releases till 6.29.fb to compatibility check for forward and backward compatibility
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9529
      
      Test Plan: run locally
      
      Reviewed By: hx235
      
      Differential Revision: D34086063
      
      Pulled By: akankshamahajan15
      
      fbshipit-source-id: 4ccff513c99cf2d0e41da0b76ab27ffcfdffe7df
    • S
      Use the comparator from the sst file table properties in sst_dump_tool (#9491) · 036bbab6
      satyajanga committed
      Summary:
      We introduced a new Comparator for timestamps in user keys. By default, sst_dump_tool uses BytewiseComparator to read sst files. This change allows us to read comparator_name from the table properties in the metadata block and use the matching comparator to read the file.
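
      The selection step can be sketched as a lookup from the stored name to a comparator. Everything here is illustrative: `ComparatorKind` and `PickComparator` are hypothetical, the two name strings are assumed to be the names reported by the built-in bytewise and reverse-bytewise comparators, and falling back to bytewise for an empty name is an assumption of this sketch.

      ```cpp
      #include <string>

      // Hypothetical comparator kinds; real code would return a
      // Comparator object rather than an enum value.
      enum class ComparatorKind { kBytewise, kReverseBytewise, kUnknown };

      // Chooses a comparator from the name stored in the table properties.
      ComparatorKind PickComparator(const std::string& comparator_name) {
        // Assumption: an absent name falls back to the default comparator.
        if (comparator_name.empty()) return ComparatorKind::kBytewise;
        // Assumed name strings of the built-in comparators.
        if (comparator_name == "leveldb.BytewiseComparator") {
          return ComparatorKind::kBytewise;
        }
        if (comparator_name == "rocksdb.ReverseBytewiseComparator") {
          return ComparatorKind::kReverseBytewise;
        }
        // Unrecognized names are reported instead of silently misread.
        return ComparatorKind::kUnknown;
      }
      ```

      Reporting an unknown name, rather than defaulting to bytewise order, avoids iterating a file with an ordering it was not written with.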
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9491
      
      Test Plan:
      added unittests for new functionality.
      make check
      ![image](https://user-images.githubusercontent.com/4923556/152915444-28b88a1f-7b4e-47d0-815f-7011552bd9a2.png)
      ![image](https://user-images.githubusercontent.com/4923556/152916196-bea3d2a1-a3d5-4362-b911-036131b83e8d.png)
      
      Reviewed By: riversand963
      
      Differential Revision: D33993614
      
      Pulled By: satyajanga
      
      fbshipit-source-id: 4b5cf938e6d2cb3931d763bef5baccc900b8c536
  6. 08 Feb, 2022 9 commits
    • P
      Work around snappy linker issue with newer compilers (#9517) · d7c868b0
      Peter Dillinger committed
      Summary:
      After https://github.com/facebook/rocksdb/issues/9481, we are using a newer default compiler for
      the build-format-compatible CircleCI nightly job, which fails when building the
      2.2.fb.branch branch because it tries to use a pre-compiled libsnappy.a
      that is checked into the repo (!). This works around that by setting
      SNAPPY_LDFLAGS=-lsnappy, which is only understood by such old versions.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9517
      
      Test Plan:
      Run check_format_compatible.sh on Ubuntu 20 AWS machine,
      watch nightly run
      
      Reviewed By: hx235
      
      Differential Revision: D34055561
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 45f9d428dd082f026773bfa8d9dd4dad66fc9378
    • P
      Work around some new clang-analyze failures (#9515) · 5cb137a8
      Peter Dillinger committed
      Summary:
      ... seen only in internal clang-analyze runs after https://github.com/facebook/rocksdb/issues/9481
      
      * Mostly, this works around falsely reported leaks by using
      std::unique_ptr in some places where clang-analyze was getting
      confused. (I didn't see any changes in C++17 that could make our Status
      implementation leak memory.)
      * Also fixed SetBGError returning address of a stack variable.
      * Also fixed another false null deref report by adding an assert.
      
      Also, use SKIP_LINK=1 to speed up `make analyze`
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9515
      
      Test Plan:
      Was able to reproduce the reported errors locally and verify
      they're fixed (except SetBGError). Otherwise, existing tests
      
      Reviewed By: hx235
      
      Differential Revision: D34054630
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 38600ef3da75ddca307dff96b7a1a523c2885c2e
    • A
      Remove Deprecated overloads of DB::GetApproximateSizes (#9458) · bbe4763e
      Akanksha Mahajan committed
      Summary:
      In RocksDB, a few overloads of DB::GetApproximateSizes are marked as
      DEPRECATED_FUNC, and we are removing them in the upcoming 7.0 release.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9458
      
      Test Plan: CircleCI
      
      Reviewed By: riversand963
      
      Differential Revision: D34043791
      
      Pulled By: akankshamahajan15
      
      fbshipit-source-id: 815c0ad283a6627c4b241479c7d40ce03a758493
    • P
      Add GetTemperature on existing files (#9498) · bd083741
      Peter Dillinger committed
      Summary:
      For tiered storage
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9498
      
      Test Plan: Just API placeholders for now
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D33993094
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 3cf19a450c7232e05306e94018559b26e9fd35db
      bd083741
    • L
      Update HISTORY for PR 9504 (#9513) · 98942a29
      Levi Tamasi committed
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/9513
      
      Reviewed By: riversand963
      
      Differential Revision: D34046181
      
      Pulled By: ltamasi
      
      fbshipit-source-id: a5d8d3bf84e5c13bdc6cbd5ba1b4216bad9adfc5
      98942a29
    • H
      Clarify Google benchmark < 1.6.0 in INSTALL.md (#9505) · c234ac9a
      Hui Xiao committed
      Summary:
      **Context:**
      Google benchmark [v1.6.0](https://github.com/google/benchmark/releases/tag/v1.6.0) introduced a breaking change, "`introduce accessors for public data members (https://github.com/google/benchmark/pull/1208)`", that breaks the RocksDB microbench build, which was written against the earlier API. For an example, see https://github.com/facebook/rocksdb/issues/9489.
      
      **Summary:**
      Clarify the maximum version of Google benchmark needed.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9505
      
      Test Plan: CI
      
      Reviewed By: ajkr
      
      Differential Revision: D34023447
      
      Pulled By: hx235
      
      fbshipit-source-id: 0128ffc31485f2d752ab2116771f6ae53231fcd7
      c234ac9a
    • P
      Temporary disable Travis s390x Makefile build (#9512) · c0d2d26b
      Peter Dillinger committed
      Summary:
      Due to some unexplained errors with gcc-7
      
      ```
      Assembler messages:
      Error: invalid switch -march=z14
      Error: unrecognized option -march=z14
      ```
      
      Relevant to https://github.com/facebook/rocksdb/issues/9388
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9512
      
      Test Plan: CI
      
      Reviewed By: hx235
      
      Differential Revision: D34044989
      
      Pulled By: pdillinger
      
      fbshipit-source-id: a5406e8f30b2b187949f75c8cee4e2a0eb976670
      c0d2d26b
    • L
      Mitigate the overhead of building the hash of file locations (#9504) · 0cc05438
      Levi Tamasi committed
      Summary:
      The patch builds on the refactoring done in https://github.com/facebook/rocksdb/issues/9494
      and improves the performance of building the hash of file
      locations in `VersionStorageInfo` in two ways. First, the hash
      building is moved from `AddFile` (which is called under the DB mutex)
      to a separate post-processing step done as part of `PrepareForVersionAppend`
      (during which the mutex is *not* held). Second, the space necessary
      for the hash is preallocated to prevent costly reallocation/rehashing
      operations. These changes mitigate the overhead of the file location hash,
      which can be significant with certain workloads where the baseline CPU usage
      is low (see https://github.com/facebook/rocksdb/issues/9351,
      which is a workload where keys are sorted, WAL is turned
      off, the vector memtable implementation is used, and there are lots of small
      SST files).
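
The two changes described above (deferring the hash build to a post-processing step and preallocating its space) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual `VersionStorageInfo` code: `MiniStorage`, `FileLocation`, `BuildFileLocationIndex`, and `Find` are hypothetical names, and no real mutex is involved.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins; not the real RocksDB types.
struct FileLocation {
  int level;
  std::size_t position;
};

class MiniStorage {
 public:
  explicit MiniStorage(int num_levels)
      : files_(static_cast<std::size_t>(num_levels)) {}

  // Cheap step, imagined to run while a mutex is held:
  // only record the file, do not touch the hash.
  void AddFile(int level, std::uint64_t file_number) {
    assert(level >= 0 && static_cast<std::size_t>(level) < files_.size());
    files_[static_cast<std::size_t>(level)].push_back(file_number);
  }

  // Post-processing step, imagined to run without the mutex:
  // build the whole hash in one pass.
  void BuildFileLocationIndex() {
    std::size_t total = 0;
    for (const auto& level_files : files_) total += level_files.size();

    locations_.clear();
    locations_.reserve(total);  // preallocate so inserts do not trigger rehashing

    for (std::size_t level = 0; level < files_.size(); ++level) {
      for (std::size_t pos = 0; pos < files_[level].size(); ++pos) {
        locations_.emplace(files_[level][pos],
                           FileLocation{static_cast<int>(level), pos});
      }
    }
  }

  // Returns true and fills *loc if the file number is known.
  bool Find(std::uint64_t file_number, FileLocation* loc) const {
    auto it = locations_.find(file_number);
    if (it == locations_.end()) return false;
    *loc = it->second;
    return true;
  }

 private:
  std::vector<std::vector<std::uint64_t>> files_;
  std::unordered_map<std::uint64_t, FileLocation> locations_;
};
```

In this sketch the per-file cost inside the imagined critical section is a single `push_back`; all hashing work happens once, afterwards, on a table sized up front.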
      
      Fixes https://github.com/facebook/rocksdb/issues/9351
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9504
      
      Test Plan:
      `make check`
      
      ```
      numactl --interleave=all ./db_bench --benchmarks=fillseq --allow_concurrent_memtable_write=false --level0_file_num_compaction_trigger=4 --level0_slowdown_writes_trigger=20 --level0_stop_writes_trigger=30 --max_background_jobs=8 --max_write_buffer_number=8 --db=/data/ltamasi-dbbench --wal_dir=/data/ltamasi-dbbench --num=800000000 --num_levels=8 --key_size=20 --value_size=400 --block_size=8192 --cache_size=51539607552 --cache_numshardbits=6 --compression_max_dict_bytes=0 --compression_ratio=0.5 --compression_type=lz4 --bytes_per_sync=8388608 --cache_index_and_filter_blocks=1 --cache_high_pri_pool_ratio=0.5 --benchmark_write_rate_limit=0 --write_buffer_size=16777216 --target_file_size_base=16777216 --max_bytes_for_level_base=67108864 --verify_checksum=1 --delete_obsolete_files_period_micros=62914560 --max_bytes_for_level_multiplier=8 --statistics=0 --stats_per_interval=1 --stats_interval_seconds=20 --histogram=1 --bloom_bits=10 --open_files=-1 --subcompactions=1 --compaction_style=0 --level_compaction_dynamic_level_bytes=true --pin_l0_filter_and_index_blocks_in_cache=1 --soft_pending_compaction_bytes_limit=167503724544 --hard_pending_compaction_bytes_limit=335007449088 --min_level_to_compress=0 --use_existing_db=0 --sync=0 --threads=1 --memtablerep=vector --disable_wal=1 --seed=<some_seed>
      ```
      
      Final statistics before this patch:
      ```
      Cumulative writes: 0 writes, 697M keys, 0 commit groups, 0.0 writes per commit group, ingest: 283.25 GB, 241.08 MB/s
      Interval writes: 0 writes, 1264K keys, 0 commit groups, 0.0 writes per commit group, ingest: 525.69 MB, 176.67 MB/s
      ```
      
      With the patch:
      ```
      Cumulative writes: 0 writes, 759M keys, 0 commit groups, 0.0 writes per commit group, ingest: 308.57 GB, 262.63 MB/s
      Interval writes: 0 writes, 1555K keys, 0 commit groups, 0.0 writes per commit group, ingest: 646.61 MB, 215.11 MB/s
      ```
      
      Reviewed By: riversand963
      
      Differential Revision: D34014734
      
      Pulled By: ltamasi
      
      fbshipit-source-id: acb2703677451d5ccaa7e9d950844b33d240695b
      0cc05438
    • J
      Fix flaky test EnvPosixTestWithParam.RunMany (#9502) · b69f4360
      Jay Zhuang committed
      Summary:
      The thread pool pops a thread function off its queue and then runs it,
      so the pool's queue can be empty while the last function is still
      running.
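
The window described above can be modeled with a small single-threaded sketch. It is not the RocksDB thread pool and does not show the fix made in this PR; `TinyPoolModel` and its methods are hypothetical, and the two worker steps are exposed separately only so the intermediate state can be observed.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical single-threaded model of one worker's interaction with a pool.
class TinyPoolModel {
 public:
  void Schedule(std::function<void()> fn) { queue_.push_back(std::move(fn)); }

  // Worker step 1: take the next job off the queue.
  // After this the queue may be empty although the job has not run yet.
  bool PopNext() {
    if (queue_.empty()) return false;
    current_ = std::move(queue_.front());
    queue_.pop_front();
    ++running_;
    return true;
  }

  // Worker step 2: run the job that was popped and mark it finished.
  void RunPopped() {
    assert(running_ > 0);
    current_();
    current_ = nullptr;
    --running_;
  }

  // "Queue is empty" alone does not mean all work has finished.
  bool QueueEmpty() const { return queue_.empty(); }

  // All work is done only when nothing is queued and nothing is running.
  bool Idle() const { return queue_.empty() && running_ == 0; }

 private:
  std::deque<std::function<void()>> queue_;
  std::function<void()> current_;
  std::size_t running_ = 0;
};
```

A test that waits only on `QueueEmpty()` can observe the state between the two steps; waiting on a predicate like `Idle()`, or on a counter incremented by the jobs themselves, avoids that.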
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9502
      
      Test Plan:
      `gtest-parallel ./env_test
      --gtest_filter=DefaultEnvWithoutDirectIO/EnvPosixTestWithParam.RunMany/0
      -r 10000 -w 1000`
      
      Reviewed By: ajkr
      
      Differential Revision: D34011184
      
      Pulled By: jay-zhuang
      
      fbshipit-source-id: 8c38bef155205bef96fd1c988dcc643a6b2ac270
      b69f4360
  7. 07 Feb, 2022 1 commit
  8. 05 Feb, 2022 6 commits
    • P
      Require C++17 (#9481) · fd3e0f43
      Peter Dillinger committed
      Summary:
      Drop support for some old compilers by requiring C++17 standard
      (or higher). See https://github.com/facebook/rocksdb/issues/9388
      
      The first modification based on this removes some conditional compilation in slice.h
      (which is also better for ODR).
      
      Also in this PR:
      * Fix some Makefile formatting that seems to affect ASSERT_STATUS_CHECKED config in
      some cases
      * Add c_test to NON_PARALLEL_TEST in Makefile
      * Fix a clang-analyze reported "potential leak" in lru_cache_test
      * Better "compatibility" definition of DEFINE_uint32 for old versions of gflags
      * Fix a linking problem with shared libraries in Makefile (`./random_test: error while loading shared libraries: librocksdb.so.6.29: cannot open shared object file: No such file or directory`)
      * Always set ROCKSDB_SUPPORT_THREAD_LOCAL and use thread_local (from C++11)
        * TODO in later PR: clean up that obsolete flag
      * Fix a cosmetic typo in c.h (https://github.com/facebook/rocksdb/issues/9488)
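
As an illustration of what requiring C++17 allows, the sketch below shows a slice-like type that offers `std::string_view` support unconditionally, with no preprocessor guard. That the conditional compilation removed from slice.h concerned `std::string_view` is an assumption here (the summary does not say), and `MiniSlice` is a hypothetical type, not the real `rocksdb::Slice`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical minimal slice type; not the real rocksdb::Slice.
class MiniSlice {
 public:
  MiniSlice(const char* data, std::size_t size) : data_(data), size_(size) {
    assert(data != nullptr || size == 0);
  }

  // With C++17 guaranteed, this constructor needs no #if/#ifdef guard,
  // so every translation unit sees the same class definition.
  explicit MiniSlice(std::string_view sv)
      : data_(sv.data()), size_(sv.size()) {}

  // Non-owning view over the same bytes; valid only while they stay alive.
  std::string_view ToStringView() const {
    return std::string_view(data_, size_);
  }

  std::size_t size() const { return size_; }

 private:
  const char* data_;
  std::size_t size_;
};
```

Having a single, unguarded definition is also what makes this kind of change "better for ODR": translation units compiled with different language standards can no longer disagree about the members of the class.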
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9481
      
      Test Plan:
      CircleCI config substantially updated.
      
      * Upgrade to latest Ubuntu images for each release
      * Generally prefer Ubuntu 20, but keep a couple Ubuntu 16 builds with oldest supported
      compilers, to ensure compatibility
      * Remove .circleci/cat_ignore_eagain except for Ubuntu 16 builds, because this is to work
      around a kernel bug that should not affect anything but Ubuntu 16.
      * Remove designated gcc-9 build, because the default linux build now uses GCC 9 from
      Ubuntu 20.
      * Add some `apt-key add` to fix some apt "couldn't be verified" errors
      * Generally drop SKIP_LINK=1; work-around no longer needed
      * Generally `add-apt-repository` before `apt-get update` as manual testing indicated the
      reverse might not work.
      
      Travis:
      * Use gcc-7 by default (remove specific gcc-7 and gcc-4.8 builds)
      * TODO in later PR: fix s390x "Assembler messages: Error: invalid switch -march=z14" failure
      
      AppVeyor:
      * Completely dropped because we are dropping VS2015 support and CircleCI covers
      VS >= 2017
      
      Also local testing with old gflags (out of necessity when using ROCKSDB_NO_FBCODE=1).
      
      Reviewed By: mrambacher
      
      Differential Revision: D33946377
      
      Pulled By: pdillinger
      
      fbshipit-source-id: ae077c823905b45370a26c0103ada119459da6c1
      fd3e0f43
    • R
      WriteOptions - add missing java API. (#9295) · 42c8afd8
      Radek Hubner committed
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/9295
      
      Reviewed By: riversand963
      
      Differential Revision: D33672440
      
      Pulled By: ajkr
      
      fbshipit-source-id: 85f73a9297888b00255b636e7826b37186aba45c
      42c8afd8
    • S
      Fixed all RocksJava test failures in Centos and Alpine (#9395) · 2c3a7809
      Si Ke committed
      Summary:
      Fixed all RocksJava test failures on 32-bit and 64-bit CentOS and Alpine.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9395
      
      Reviewed By: mrambacher
      
      Differential Revision: D33771987
      
      Pulled By: ajkr
      
      fbshipit-source-id: fed91033b8df08f191ad65e1fb745a9264bbfa70
      2c3a7809
    • J
      jni: expose memtable_whole_key_filtering option (#9394) · 83ff350f
      Jermy Li committed
      Summary:
      See https://github.com/facebook/rocksdb/wiki/Prefix-Seek#configure-prefix-bloom-filter
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9394
      
      Reviewed By: mrambacher
      
      Differential Revision: D33671533
      
      Pulled By: ajkr
      
      fbshipit-source-id: d90db1712efdd5dd65020329867381d6b3cf2626
      83ff350f
    • P
      Enhance new cache key testing & comments (#9329) · afc280fd
      Peter Dillinger committed
      Summary:
      Follow-up to https://github.com/facebook/rocksdb/issues/9126
      
      Added new unit tests to validate some of the claims of guaranteed uniqueness
      within certain large bounds.
      
      Also cleaned up the cache_bench -stress-cache-key tool with better comments
      and description.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9329
      
      Test Plan: no changes to production code
      
      Reviewed By: mrambacher
      
      Differential Revision: D33269328
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 3a2b684a6b2b15f79dc872e563e3d16563be26de
      afc280fd
    • L
      Clean up VersionStorageInfo a bit (#9494) · 42e0751b
      Levi Tamasi committed
      Summary:
      The patch does some cleanup in and around `VersionStorageInfo`:
      * Renames the method `PrepareApply` to `PrepareAppend` in `Version`
      to make it clear that it is to be called before appending the `Version` to
      `VersionSet` (via `AppendVersion`), not before applying any `VersionEdit`s.
      * Introduces a helper method `VersionStorageInfo::PrepareForVersionAppend`
      (called by `Version::PrepareAppend`) that encapsulates the population of the
      various derived data structures in `VersionStorageInfo`, and turns the
      methods computing the derived structures (`UpdateNumNonEmptyLevels`,
      `CalculateBaseBytes` etc.) into private helpers.
      * Changes `Version::PrepareAppend` so it only calls `UpdateAccumulatedStats`
      if the `update_stats` flag is set. (Earlier, this was checked by the callee.)
      Related to this, it also moves the call to `ComputeCompensatedSizes` to
      `VersionStorageInfo::PrepareForVersionAppend`.
      * Updates and cleans up `version_builder_test`, `version_set_test`, and
      `compaction_picker_test` so `PrepareForVersionAppend` is called anytime
      a new `VersionStorageInfo` is set up or saved. This cleanup also involves
      splitting `VersionStorageInfoTest.MaxBytesForLevelDynamic`
      into multiple smaller test cases.
      * Fixes up a bunch of comments that were outdated or just plain incorrect.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9494
      
      Test Plan: Ran `make check` and the crash test script for a while.
      
      Reviewed By: riversand963
      
      Differential Revision: D33971666
      
      Pulled By: ltamasi
      
      fbshipit-source-id: fda52faac7783041126e4f8dec0fe01bdcadf65a
      42e0751b
  9. 04 Feb, 2022 4 commits
  10. 03 Feb, 2022 1 commit