1. 07 2月, 2022 1 次提交
  2. 05 2月, 2022 6 次提交
    • P
      Require C++17 (#9481) · fd3e0f43
      Peter Dillinger 提交于
      Summary:
      Drop support for some old compilers by requiring C++17 standard
      (or higher). See https://github.com/facebook/rocksdb/issues/9388
      
      First modification based on this is to remove some conditional compilation in slice.h (also
      better for ODR)
      
      Also in this PR:
      * Fix some Makefile formatting that seems to affect ASSERT_STATUS_CHECKED config in
      some cases
      * Add c_test to NON_PARALLEL_TEST in Makefile
      * Fix a clang-analyze reported "potential leak" in lru_cache_test
      * Better "compatibility" definition of DEFINE_uint32 for old versions of gflags
      * Fix a linking problem with shared libraries in Makefile (`./random_test: error while loading shared libraries: librocksdb.so.6.29: cannot open shared object file: No such file or directory`)
      * Always set ROCKSDB_SUPPORT_THREAD_LOCAL and use thread_local (from C++11)
        * TODO in later PR: clean up that obsolete flag
      * Fix a cosmetic typo in c.h (https://github.com/facebook/rocksdb/issues/9488)
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9481
      
      Test Plan:
      CircleCI config substantially updated.
      
      * Upgrade to latest Ubuntu images for each release
      * Generally prefer Ubuntu 20, but keep a couple Ubuntu 16 builds with oldest supported
      compilers, to ensure compatibility
      * Remove .circleci/cat_ignore_eagain except for Ubuntu 16 builds, because this is to work
      around a kernel bug that should not affect anything but Ubuntu 16.
      * Remove designated gcc-9 build, because the default linux build now uses GCC 9 from
      Ubuntu 20.
      * Add some `apt-key add` to fix some apt "couldn't be verified" errors
      * Generally drop SKIP_LINK=1; work-around no longer needed
      * Generally `add-apt-repository` before `apt-get update` as manual testing indicated the
      reverse might not work.
      
      Travis:
      * Use gcc-7 by default (remove specific gcc-7 and gcc-4.8 builds)
      * TODO in later PR: fix s390x "Assembler messages: Error: invalid switch -march=z14" failure
      
      AppVeyor:
      * Completely dropped because we are dropping VS2015 support and CircleCI covers
      VS >= 2017
      
      Also local testing with old gflags (out of necessity when using ROCKSDB_NO_FBCODE=1).
      
      Reviewed By: mrambacher
      
      Differential Revision: D33946377
      
      Pulled By: pdillinger
      
      fbshipit-source-id: ae077c823905b45370a26c0103ada119459da6c1
      fd3e0f43
    • R
      WriteOptions - add missing java API. (#9295) · 42c8afd8
      Radek Hubner 提交于
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/9295
      
      Reviewed By: riversand963
      
      Differential Revision: D33672440
      
      Pulled By: ajkr
      
      fbshipit-source-id: 85f73a9297888b00255b636e7826b37186aba45c
      42c8afd8
    • S
      Fixed all RocksJava test failures in Centos and Alpine (#9395) · 2c3a7809
      Si Ke 提交于
      Summary:
      Fixed all RocksJava test failures in Centos and Alpine 32 bit and 64 bit OSes
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9395
      
      Reviewed By: mrambacher
      
      Differential Revision: D33771987
      
      Pulled By: ajkr
      
      fbshipit-source-id: fed91033b8df08f191ad65e1fb745a9264bbfa70
      2c3a7809
    • J
      jni: expose memtable_whole_key_filtering option (#9394) · 83ff350f
      Jermy Li 提交于
      Summary:
      refer to: https://github.com/facebook/rocksdb/wiki/Prefix-Seek#configure-prefix-bloom-filter
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9394
      
      Reviewed By: mrambacher
      
      Differential Revision: D33671533
      
      Pulled By: ajkr
      
      fbshipit-source-id: d90db1712efdd5dd65020329867381d6b3cf2626
      83ff350f
    • P
      Enhance new cache key testing & comments (#9329) · afc280fd
      Peter Dillinger 提交于
      Summary:
      Follow-up to https://github.com/facebook/rocksdb/issues/9126
      
      Added new unit tests to validate some of the claims of guaranteed uniqueness
      within certain large bounds.
      
      Also cleaned up the cache_bench -stress-cache-key tool with better comments
      and description.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9329
      
      Test Plan: no changes to production code
      
      Reviewed By: mrambacher
      
      Differential Revision: D33269328
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 3a2b684a6b2b15f79dc872e563e3d16563be26de
      afc280fd
    • L
      Clean up VersionStorageInfo a bit (#9494) · 42e0751b
      Levi Tamasi 提交于
      Summary:
      The patch does some cleanup in and around `VersionStorageInfo`:
      * Renames the method `PrepareApply` to `PrepareAppend` in `Version`
      to make it clear that it is to be called before appending the `Version` to
      `VersionSet` (via `AppendVersion`), not before applying any `VersionEdit`s.
      * Introduces a helper method `VersionStorageInfo::PrepareForVersionAppend`
      (called by `Version::PrepareAppend`) that encapsulates the population of the
      various derived data structures in `VersionStorageInfo`, and turns the
      methods computing the derived structures (`UpdateNumNonEmptyLevels`,
      `CalculateBaseBytes` etc.) into private helpers.
      * Changes `Version::PrepareAppend` so it only calls `UpdateAccumulatedStats`
      if the `update_stats` flag is set. (Earlier, this was checked by the callee.)
      Related to this, it also moves the call to `ComputeCompensatedSizes` to
      `VersionStorageInfo::PrepareForVersionAppend`.
      * Updates and cleans up `version_builder_test`, `version_set_test`, and
      `compaction_picker_test` so `PrepareForVersionAppend` is called anytime
      a new `VersionStorageInfo` is set up or saved. This cleanup also involves
      splitting `VersionStorageInfoTest.MaxBytesForLevelDynamic`
      into multiple smaller test cases.
      * Fixes up a bunch of comments that were outdated or just plain incorrect.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9494
      
      Test Plan: Ran `make check` and the crash test script for a while.
      
      Reviewed By: riversand963
      
      Differential Revision: D33971666
      
      Pulled By: ltamasi
      
      fbshipit-source-id: fda52faac7783041126e4f8dec0fe01bdcadf65a
      42e0751b
  3. 04 2月, 2022 4 次提交
  4. 03 2月, 2022 2 次提交
  5. 02 2月, 2022 8 次提交
    • Y
      Revise APIs related to user-defined timestamp (#8946) · 3122cb43
      Yanqin Jin 提交于
      Summary:
      ajkr reminded me that we have a rule of not including per-kv related data in `WriteOptions`.
      Namely, `WriteOptions` should not include information about "what-to-write", but should just
      include information about "how-to-write".
      
      According to this rule, `WriteOptions::timestamp` (experimental) is clearly a violation. Therefore,
      this PR removes `WriteOptions::timestamp` for compliance.
      After the removal, we need to pass timestamp info via another set of APIs. This PR proposes a set
      of overloaded functions `Put(write_opts, key, value, ts)`, `Delete(write_opts, key, ts)`, and
      `SingleDelete(write_opts, key, ts)`. Planned to add `Write(write_opts, batch, ts)`, but its complexity
      made me reconsider doing it in another PR (maybe).
      
      For better checking and returning error early, we also add a new set of APIs to `WriteBatch` that take
      extra `timestamp` information when writing to `WriteBatch`es.
      These set of APIs in `WriteBatchWithIndex` are currently not supported, and are on our TODO list.
      
      Removed `WriteBatch::AssignTimestamps()` and renamed `WriteBatch::AssignTimestamp()` to
      `WriteBatch::UpdateTimestamps()` since this method require that all keys have space for timestamps
      allocated already and multiple timestamps can be updated.
      
      The constructor of `WriteBatch` now takes a fourth argument `default_cf_ts_sz` which is the timestamp
      size of the default column family. This will be used to allocate space when calling APIs that do not
      specify a column family handle.
      
      Also, updated `DB::Get()`, `DB::MultiGet()`, `DB::NewIterator()`, `DB::NewIterators()` methods, replacing
      some assertions about timestamp to returning Status code.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8946
      
      Test Plan:
      make check
      ./db_bench -benchmarks=fillseq,fillrandom,readrandom,readseq,deleterandom -user_timestamp_size=8
      ./db_stress --user_timestamp_size=8 -nooverwritepercent=0 -test_secondary=0 -secondary_catch_up_one_in=0 -continuous_verification_interval=0
      
      Make sure there is no perf regression by running the following
      ```
      ./db_bench_opt -db=/dev/shm/rocksdb -use_existing_db=0 -level0_stop_writes_trigger=256 -level0_slowdown_writes_trigger=256 -level0_file_num_compaction_trigger=256 -disable_wal=1 -duration=10 -benchmarks=fillrandom
      ```
      
      Before this PR
      ```
      DB path: [/dev/shm/rocksdb]
      fillrandom   :       1.831 micros/op 546235 ops/sec;   60.4 MB/s
      ```
      After this PR
      ```
      DB path: [/dev/shm/rocksdb]
      fillrandom   :       1.820 micros/op 549404 ops/sec;   60.8 MB/s
      ```
      
      Reviewed By: ltamasi
      
      Differential Revision: D33721359
      
      Pulled By: riversand963
      
      fbshipit-source-id: c131561534272c120ffb80711d42748d21badf09
      3122cb43
    • H
      Detect (new) Bloom/Ribbon Filter construction corruption (#9342) · 920386f2
      Hui Xiao 提交于
      Summary:
      Note: rebase on and merge after https://github.com/facebook/rocksdb/pull/9349, https://github.com/facebook/rocksdb/pull/9345, (optional) https://github.com/facebook/rocksdb/pull/9393
      **Context:**
      (Quoted from pdillinger) Layers of information during new Bloom/Ribbon Filter construction in building block-based tables includes the following:
      a) set of keys to add to filter
      b) set of hashes to add to filter (64-bit hash applied to each key)
      c) set of Bloom indices to set in filter, with duplicates
      d) set of Bloom indices to set in filter, deduplicated
      e) final filter and its checksum
      
      This PR aims to detect corruption (e.g, unexpected hardware/software corruption on data structures residing in the memory for a long time) from b) to e) and leave a) as future works for application level.
      - b)'s corruption is detected by verifying the xor checksum of the hash entries calculated as the entries accumulate before being added to the filter. (i.e, `XXPH3FilterBitsBuilder::MaybeVerifyHashEntriesChecksum()`)
      - c) - e)'s corruption is detected by verifying the hash entries indeed exists in the constructed filter by re-querying these hash entries in the filter (i.e, `FilterBitsBuilder::MaybePostVerify()`) after computing the block checksum (except for PartitionFilter, which is done right after each `FilterBitsBuilder::Finish` for impl simplicity - see code comment for more). For this stage of detection, we assume hash entries are not corrupted after checking on b) since the time interval from b) to c) is relatively short IMO.
      
      Option to enable this feature of detection is `BlockBasedTableOptions::detect_filter_construct_corruption` which is false by default.
      
      **Summary:**
      - Implemented new functions `XXPH3FilterBitsBuilder::MaybeVerifyHashEntriesChecksum()` and `FilterBitsBuilder::MaybePostVerify()`
      - Ensured hash entries, final filter and banding and their [cache reservation ](https://github.com/facebook/rocksdb/issues/9073) are released properly despite corruption
         - See [Filter.construction.artifacts.release.point.pdf ](https://github.com/facebook/rocksdb/files/7923487/Design.Filter.construction.artifacts.release.point.pdf) for high-level design
         -  Bundled and refactored hash entries's related artifact in XXPH3FilterBitsBuilder into `HashEntriesInfo` for better control on lifetime of these artifact during `SwapEntires`, `ResetEntries`
      - Ensured RocksDB block-based table builder calls `FilterBitsBuilder::MaybePostVerify()` after constructing the filter by `FilterBitsBuilder::Finish()`
      - When encountering such filter construction corruption, stop writing the filter content to files and mark such a block-based table building non-ok by storing the corruption status in the builder.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9342
      
      Test Plan:
      - Added new unit test `DBFilterConstructionCorruptionTestWithParam.DetectCorruption`
      - Included this new feature in `DBFilterConstructionReserveMemoryTestWithParam.ReserveMemory` as this feature heavily touch ReserveMemory's impl
         - For fallback case, I run `./filter_bench -impl=3 -detect_filter_construct_corruption=true -reserve_table_builder_memory=true -strict_capacity_limit=true  -quick -runs 10 | grep 'Build avg'` to make sure nothing break.
      - Added to `filter_bench`: increased filter construction time by **30%**, mostly by `MaybePostVerify()`
         -  FastLocalBloom
             - Before change: `./filter_bench -impl=2 -quick -runs 10 | grep 'Build avg'`: **28.86643s**
             - After change:
                -  `./filter_bench -impl=2 -detect_filter_construct_corruption=false -quick -runs 10 | grep 'Build avg'` (expect a tiny increase due to MaybePostVerify is always called regardless): **27.6644s (-4% perf improvement might be due to now we don't drop bloom hash entry in `AddAllEntries` along iteration but in bulk later, same with the bypassing-MaybePostVerify case below)**
                - `./filter_bench -impl=2 -detect_filter_construct_corruption=true -quick -runs 10 | grep 'Build avg'` (expect acceptable increase): **34.41159s (+20%)**
                - `./filter_bench -impl=2 -detect_filter_construct_corruption=true -quick -runs 10 | grep 'Build avg'` (by-passing MaybePostVerify, expect minor increase): **27.13431s (-6%)**
          -  Standard128Ribbon
             - Before change: `./filter_bench -impl=3 -quick -runs 10 | grep 'Build avg'`: **122.5384s**
             - After change:
                - `./filter_bench -impl=3 -detect_filter_construct_corruption=false -quick -runs 10 | grep 'Build avg'` (expect a tiny increase due to MaybePostVerify is always called regardless - verified by removing MaybePostVerify under this case and found only +-1ns difference): **124.3588s (+2%)**
                - `./filter_bench -impl=3 -detect_filter_construct_corruption=true -quick -runs 10 | grep 'Build avg'`(expect acceptable increase): **159.4946s (+30%)**
                - `./filter_bench -impl=3 -detect_filter_construct_corruption=true -quick -runs 10 | grep 'Build avg'`(by-passing MaybePostVerify, expect minor increase) : **125.258s (+2%)**
      - Added to `db_stress`: `make crash_test`, `./db_stress --detect_filter_construct_corruption=true`
      - Manually smoke-tested: manually corrupted the filter construction in some db level tests with basic PUT and background flush. As expected, the error did get returned to users in subsequent PUT and Flush status.
      
      Reviewed By: pdillinger
      
      Differential Revision: D33746928
      
      Pulled By: hx235
      
      fbshipit-source-id: cb056426be5a7debc1cd16f23bc250f36a08ca57
      920386f2
    • L
      Fix a copy-paste bug related to background threads in db_stress (#9485) · 7cd57632
      Levi Tamasi 提交于
      Summary:
      Fixes a typo introduced in https://github.com/facebook/rocksdb/pull/9466.
      
      Fixes https://github.com/facebook/rocksdb/issues/9482
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9485
      
      Test Plan:
      ```
      COMPILE_WITH_TSAN=1 make db_stress -j24
      ./db_stress --ops_per_thread=1000 --reopen=5
      ```
      
      Reviewed By: ajkr
      
      Differential Revision: D33928601
      
      Pulled By: ltamasi
      
      fbshipit-source-id: 3e01a0ca5fffb56c268c811cbe045413b225059a
      7cd57632
    • A
      remove unused instance variable in GenericRateLimiter (#9484) · 272ce445
      Andrew Kryczka 提交于
      Summary:
      As reported in
      https://github.com/facebook/rocksdb/pull/2899#issuecomment-1001467021,
      `prev_num_drains_` is confusing as we never set it to nonzero. So this
      PR removes it.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9484
      
      Test Plan: `make check -j24`
      
      Reviewed By: hx235
      
      Differential Revision: D33923203
      
      Pulled By: ajkr
      
      fbshipit-source-id: 6277d50a198b90646583ee8094c2e6a1bbdadc7b
      272ce445
    • A
      Optimize db_stress setup phase (#9475) · ed75dddc
      Andrew Kryczka 提交于
      Summary:
      It is too slow that our `db_crashtest.py` often kills `db_stress` before
      the setup phase completes. Profiled it and found a few ways to optimize.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9475
      
      Test Plan:
      Measured setup phase time reduced 22% (36 -> 28 seconds) for first run, and
      36% (38 -> 24 seconds) for non-first run on empty-ish DB.
      
      - first run benchmark command: `rm -rf /dev/shm/dbstress*/ && mkdir -p /dev/shm/dbstress_expected/ && ./db_stress -max_key=100000000 -destroy_db_initially=1 -expected_values_dir=/dev/shm/dbstress_expected/ -db=/dev/shm/dbstress/ --clear_column_family_one_in=0 --reopen=0 --nooverwritepercent=1`
      
      output before this PR:
      
      ```
      2022/01/31-11:14:05  Initializing db_stress
      ...
      2022/01/31-11:14:41  Starting database operations
      ```
      
      output after this PR:
      
      ```
      ...
      2022/01/31-11:12:23  Initializing db_stress
      ...
      2022/01/31-11:12:51  Starting database operations
      ```
      
      - non-first run benchmark command: `./db_stress -max_key=100000000 -destroy_db_initially=0 -expected_values_dir=/dev/shm/dbstress_expected/ -db=/dev/shm/dbstress/ --clear_column_family_one_in=0 --reopen=0 --nooverwritepercent=1`
      
      output before this PR:
      
      ```
      2022/01/31-11:20:45  Initializing db_stress
      ...
      2022/01/31-11:21:23  Starting database operations
      ```
      
      output after this PR:
      
      ```
      2022/01/31-11:22:02  Initializing db_stress
      ...
      2022/01/31-11:22:26  Starting database operations
      ```
      
      - ran minified crash test a while: `DEBUG_LEVEL=0 TEST_TMPDIR=/dev/shm python3 tools/db_crashtest.py blackbox --simple --interval=10 --max_key=1000000 --write_buffer_size=1048576 --target_file_size_base=1048576 --max_bytes_for_level_base=4194304 --value_size_mult=33`
      
      Reviewed By: anand1976
      
      Differential Revision: D33897793
      
      Pulled By: ajkr
      
      fbshipit-source-id: 0d7b2c93e1e2a9f8a878e87632c2455406313087
      ed75dddc
    • P
      Revisit #9118 for compaction outputs (#9480) · a495448e
      Peter Dillinger 提交于
      Summary:
      Crash test recently started showing failures as in https://github.com/facebook/rocksdb/issues/9118 but
      for files created by compaction. This change applies a similar fix.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9480
      
      Test Plan:
      Updated / extended unit test. (Some re-arranging to do the
      simpler compaction testing before this special case.)
      
      Reviewed By: ltamasi
      
      Differential Revision: D33909835
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 58e4b44e4ecc2d21e4df2c2d8440ec0633aa1f6c
      a495448e
    • Y
      Fix compilation errors and add fuzzers to CircleCI (#9420) · c58c5596
      Yanqin Jin 提交于
      Summary:
      This PR does the following:
      - Fix compilation and linking errors when building fuzzer
      - Add the above to CircleCI
      - Update documentation
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9420
      
      Test Plan: CI
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D33849452
      
      Pulled By: riversand963
      
      fbshipit-source-id: 0794e5d04a3f53bfd2216fe2b3cd827ca2083ac3
      c58c5596
    • J
      Add more micro-benchmark tests (#9436) · 980b9ff3
      Jay Zhuang 提交于
      Summary:
      * Add more micro-benchmark tests
      * Expose an API in DBImpl for waiting for compactions (still not visible to the user)
      * Add argument name for ribbon_bench
      * remove benchmark run from CI, as it runs too long.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9436
      
      Test Plan: CI
      
      Reviewed By: riversand963
      
      Differential Revision: D33777836
      
      Pulled By: jay-zhuang
      
      fbshipit-source-id: c05de3bc082cc05b5d019f00b324e774bf4bbd96
      980b9ff3
  6. 01 2月, 2022 4 次提交
    • P
      Ignore `total_order_seek` in DB::Get (#9427) · f6d7ec1d
      Peter Dillinger 提交于
      Summary:
      Apparently setting total_order_seek=true for DB::Get was
      intended to allow accurate read semantics if the current prefix
      extractor doesn't match what was used to generate SST files on
      disk. But since prefix_extractor was made a mutable option in 5.14.0, we
      have been able to detect this case and provide the correct semantics
      regardless of the total_order_seek option. Since that time, the option
      has only made Get() slower in a reasonably common case: prefix_extractor
      unchanged and whole_key_filtering=false.
      
      So this change primarily removes unnecessary effect of
      total_order_seek on Get. Also cleans up some related comments.
      
      Also adds a -total_order_seek option to db_bench and canonicalizes
      handling of ReadOptions in db_bench so that command line options have
      the expected association with library features. (There is potential
      for change in regression test behavior, but the old behavior is likely
      indefensible, or some other inconsistency would need to be fixed.)
      
      TODO in follow-up work: there should be no reason for Get() to depend on
      current prefix extractor at all.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9427
      
      Test Plan:
      Unit tests updated.
      
      Performance (using db_bench update)
      
      Create DB with `TEST_TMPDIR=/dev/shm/rocksdb ./db_bench -benchmarks=fillrandom -num=10000000 -disable_wal=1 -write_buffer_size=10000000 -bloom_bits=16 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=0 -prefix_size=12 -whole_key_filtering=0`
      
      Test with and without `-total_order_seek` on `TEST_TMPDIR=/dev/shm/rocksdb ./db_bench -use_existing_db -readonly -benchmarks=readrandom -num=10000000 -duration=40 -disable_wal=1 -bloom_bits=16 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=0 -prefix_size=12`
      
      Before this change, total_order_seek=false: 25188 ops/sec
      Before this change, total_order_seek=true:   1222 ops/sec (~20x slower)
      
      After this change, total_order_seek=false: 24570 ops/sec
      After this change, total_order_seek=true:  25012 ops/sec (indistinguishable)
      
      Reviewed By: siying
      
      Differential Revision: D33753458
      
      Pulled By: pdillinger
      
      fbshipit-source-id: bf892f34907a5e407d9c40bd4d42f0adbcbe0014
      f6d7ec1d
    • A
      db_stress begin tracking expected state after verification (#9470) · c7ce03dc
      Andrew Kryczka 提交于
      Summary:
      Previously we enabled tracking expected state changes during
      `FinishInitDb()`, as soon as the DB was opened. This meant tracing was
      enabled during `VerifyDb()`. This cost extra CPU by requiring
      `DBImpl::trace_mutex_` to be acquired on each read operation. It was
      unnecessary since we know there are no expected state changes during the
      `VerifyDb()` phase. So, this PR delays tracking expected state changes
      until after the `VerifyDb()` phase has completed.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9470
      
      Test Plan:
      Measured this PR reduced `VerifyDb()` 76% (387 -> 92 seconds) with
      `-disable_wal=1` (i.e., expected state tracking enabled).
      
      - benchmark command: `./db_stress -max_key=100000000 -ops_per_thread=1 -destroy_db_initially=1 -expected_values_dir=/dev/shm/dbstress_expected/ -db=/dev/shm/dbstress/ --clear_column_family_one_in=0 --disable_wal=1 --reopen=0`
      - without this PR, `VerifyDb()` takes 387 seconds:
      
      ```
      2022/01/30-21:43:04  Initializing worker threads
      Crash-recovery verification passed :)
      2022/01/30-21:49:31  Starting database operations
      ```
      
      - with this PR, `VerifyDb()` takes 92 seconds
      
      ```
      2022/01/30-21:59:06  Initializing worker threads
      Crash-recovery verification passed :)
      2022/01/30-22:00:38  Starting database operations
      ```
      
      Reviewed By: riversand963
      
      Differential Revision: D33884596
      
      Pulled By: ajkr
      
      fbshipit-source-id: 5f259de8087de5b0531f088e11297f37ed2f7685
      c7ce03dc
    • A
      db_crashtest.py use cheaper settings (#9476) · 8dbd0bd1
      Andrew Kryczka 提交于
      Summary:
      Despite attempts to optimize `db_stress` setup phase (i.e.,
      pre-`OperateDb()`) latency in https://github.com/facebook/rocksdb/issues/9470 and https://github.com/facebook/rocksdb/issues/9475, it still always took tens
      of seconds. Since we still aren't able to setup a 100M key `db_stress`
      quickly, we should reduce the number of keys. This PR reduces it 4x
      while increasing `value_size_mult` 4x (from its default value of 8) so
      that memtables and SST files fill at a similar rate compared to before this PR.
      
      Also disabled bzip2 compression since we'll probably never use it and
      I noticed many CI runs spending majority of CPU on bzip2 decompression.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9476
      
      Reviewed By: siying
      
      Differential Revision: D33898520
      
      Pulled By: ajkr
      
      fbshipit-source-id: 855021784ad9664f2be5bce21f0339a1cf93230d
      8dbd0bd1
    • H
      Mark virtual ~Env() override (#9467) · a3de7ae4
      Hui Xiao 提交于
      Summary:
      **Context:**
      
      Compiling RocksDB with -Winconsistent-missing-destructor-override reveals the following :
      
      ```
      ./include/rocksdb/env.h:174:11: error: '~Env' overrides a destructor but is not marked 'override' [-Werror,-Winconsistent-missing-destructor-override]
        virtual ~Env();
                ^
      ./include/rocksdb/customizable.h:58:3: note: overridden virtual function is here
        ~Customizable() override {}
      ```
      
      The need of overriding the Env's destructor seems to be introduced by https://github.com/facebook/rocksdb/pull/9293 and surfaced by -Winconsistent-missing-destructor-override, which is not turned on by default.
      
      **Summary:**
      Mark  ~Env() override
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9467
      
      Test Plan: - Turn on -Winconsistent-missing-destructor-override and USE_CLANG=1 make -jN env/env.o to see whether the error shows up
      
      Reviewed By: jay-zhuang, riversand963, george-reynya
      
      Differential Revision: D33864985
      
      Pulled By: hx235
      
      fbshipit-source-id: 4a78bd161ff153902b2676829723e9a1c33dd749
      a3de7ae4
  7. 30 1月, 2022 1 次提交
    • L
      Set the number of threads up front in db_stress (#9466) · f07c5692
      Levi Tamasi 提交于
      Summary:
      With the code on main, `RunStressTest` increments the number of threads
      one by one as the threads are created and started. This results in a
      data race with `NonBatchedOpsStressTest::VerifyDb`, which reads this
      value without synchronization, and is also not correct in the sense
      that `VerifyDb` assumes that the number of threads already has its final
      value set (e.g. it's checking whether the current thread is the last
      one). The patch fixes this by setting the number of threads before
      creating/starting any threads. This also eliminates the need for locking
      the mutex during thread startup.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9466
      
      Test Plan: Ran the blackbox crash test under TSAN for a while.
      
      Reviewed By: ajkr
      
      Differential Revision: D33858856
      
      Pulled By: ltamasi
      
      fbshipit-source-id: 8a6515a83fd1808b8b8dca61978777c4404f04cc
      f07c5692
  8. 29 1月, 2022 3 次提交
  9. 28 1月, 2022 11 次提交
    • A
      Remove deprecated option DBOptions::skip_log_error_on_recovery (#9434) · 74ccd193
      Akanksha Mahajan 提交于
      Summary:
      In  RocksDB DBOptions::skip_log_error_on_recovery is marked as
      "NOT SUPPORTED" for a long time, and setting this option does not have
      any effect on the behavior of RocksDB library. Therefore, we are removing it
      in the upcoming 7.0 release.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9434
      
      Test Plan: CircleCI
      
      Reviewed By: ajkr
      
      Differential Revision: D33763015
      
      Pulled By: akankshamahajan15
      
      fbshipit-source-id: 11f09643298da6c02d3dcdb090b996f4c3cfdd76
      74ccd193
    • A
      Remove deprecated overloads of DB::CompactRange (#9444) · ed86cd5e
      Akanksha Mahajan 提交于
      Summary:
      In RocksDB few overloads of DB::CompactRange() are marked as DEPRECATED_FUNC, and
      we are removing it in the upcoming 7.0 release.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9444
      
      Test Plan: CircleCI
      
      Reviewed By: ajkr
      
      Differential Revision: D33788520
      
      Pulled By: akankshamahajan15
      
      fbshipit-source-id: 716e0d5f227f791605d4d91626c0cbf5b4571630
      ed86cd5e
    • P
      Fix^2 prefix extractor testing in crash test (#9463) · c11fe940
      Peter Dillinger 提交于
      Summary:
      Even after https://github.com/facebook/rocksdb/issues/9461 could see
      ```
      Error: please specify prefix_size for test_batches_snapshots test!
      ```
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9463
      
      Test Plan:
      run `make blackbox_crashtest` for a long time. (Unfortunately,
      it's taking a long time to reproduce these failures)
      
      Reviewed By: akankshamahajan15
      
      Differential Revision: D33838152
      
      Pulled By: pdillinger
      
      fbshipit-source-id: b9a73c5bbb68df53f14c22b9b52f61d1f7ef38af
      c11fe940
    • J
      Remove unused API base_background_compactions (#9462) · 22321e10
      Jay Zhuang 提交于
      Summary:
      The API is deprecated long time ago. Clean up the codebase by
      removing it.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9462
      
      Test Plan: CI, fake release: D33835220
      
      Reviewed By: riversand963
      
      Differential Revision: D33835103
      
      Pulled By: jay-zhuang
      
      fbshipit-source-id: 6d2dc12c8e7fdbe2700865a3e61f0e3f78bd8184
      22321e10
    • Y
      Disallow a combination of options (#9348) · dd203ed6
      Yanqin Jin 提交于
      Summary:
      Disallow `immutable_db_opts.use_direct_io_for_flush_and_compaction == true` and
      `mutable_db_opts.writable_file_max_buffer_size == 0`, since it causes `WritableFileWriter::Append()`
      to loop forever and does not make much sense in direct IO.
      
      This combination of options itself does not make much sense: asking RocksDB to do direct IO but not allowing
      RocksDB to allocate a buffer. We should detect this false combination and warn user early, no matter whether
      the application is running on a platform that supports direct IO or not. In the case of platform **not** supporting
      direct IO, it's ok if the user learns about this and then finds that direct IO is not supported.
      
      One tricky thing: the constructor of `WritableFileWriter` is being used in our unit tests, and it's impossible
      to return status code from constructor. Since we do not throw, I put an assertion for now. Fortunately,
      the constructor is not exposed to external applications.
      
      Closing https://github.com/facebook/rocksdb/issues/7109
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9348
      
      Test Plan: make check
      
      Reviewed By: ajkr
      
      Differential Revision: D33371924
      
      Pulled By: riversand963
      
      fbshipit-source-id: 2a3701ab541cee23bffda8a36cdf37b2d235edfa
      dd203ed6
    • M
      Fix LITE build for SliceTransform::AsString (#9460) · 7d7085c4
      mrambacher 提交于
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/9460
      
      Reviewed By: pdillinger
      
      Differential Revision: D33830275
      
      Pulled By: mrambacher
      
      fbshipit-source-id: 65dd1496e0291013085fdc3cce6ae3bf6dc955b5
      7d7085c4
    • P
      Fix/expand prefix extractor testing in crash test (#9461) · 981e8c62
      Peter Dillinger 提交于
      Summary:
      Changes in https://github.com/facebook/rocksdb/issues/9453 could trigger
      ```
      stderr:
      Error: prefixpercent is non-zero while prefix_size is not positive!
      ```
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9461
      
      Test Plan: run `make blackbox_crashtest` for a long time
      
      Reviewed By: ajkr
      
      Differential Revision: D33830751
      
      Pulled By: pdillinger
      
      fbshipit-source-id: be88377dcaa47e4bb7adb0347762639eff8f1476
      981e8c62
    • P
      Remove obsolete backupable_db.h, utility_db.h (#9438) · 78aee6fe
      Peter Dillinger 提交于
      Summary:
      This also removes the obsolete names BackupableDBOptions
      and UtilityDB. API users must now use BackupEngineOptions and
      DBWithTTL::Open. In C API, `rocksdb_backupable_db_*` is replaced
      `rocksdb_backup_engine_*`. Similar renaming in Java API.
      
      In reference to https://github.com/facebook/rocksdb/issues/9389
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9438
      
      Test Plan: CI
      
      Reviewed By: mrambacher
      
      Differential Revision: D33780269
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 4a6cfc5c1b4c78bcad790b9d3dd13c5fdf4a1fac
      78aee6fe
    • P
      Fix major bug with MultiGet, DeleteRange, and memtable Bloom (#9453) · ea89c77f
      Peter Dillinger 提交于
      Summary:
      MemTable::MultiGet was not considering range tombstones before
      querying Bloom filter. This means range tombstones would be skipped for
      keys (or prefixes) with no other entries in the memtable. This could cause
      old values for a key (in SST files) to still show up until the range tombstone
      covering it has been flushed.
      
      This is fixed by essentially disabling the memtable Bloom filter when there
      are any range tombstones. (This could be better optimized in the future, but
      good enough for now.)
      
      Did some other cleanup/optimization in the same code to (more than) offset
      the cost of checking on range tombstones in more cases. There is now
      notable improvement when memtable_whole_key_filtering and prefix_extractor
      are used together (unusual), and this makes MultiGet closer to the Get
      implementation.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9453
      
      Test Plan:
      new unit test added. Added memtable Bloom to crash test.
      
      Performance testing
      --------------------
      
      Build WAL-only DB (recovers to memtable):
      ```
      TEST_TMPDIR=/dev/shm/rocksdb ./db_bench -benchmarks=fillrandom -num=1000000 -write_buffer_size=250000000
      ```
      
      Query test command, to maximize sensitivity to the changed code:
      ```
      TEST_TMPDIR=/dev/shm/rocksdb ./db_bench -use_existing_db -readonly -benchmarks=multireadrandom -num=10000000 -write_buffer_size=250000000 -memtable_bloom_size_ratio=0.015 -multiread_batched -batch_size=24 -threads=8 -memtable_whole_key_filtering=$MWKF -prefix_size=$PXS
      ```
      (Note -num here is 10x larger for mostly memtable misses)
      
      Before & after run simultaneously, average over 10 iterations per data point, ops/sec.
      
      MWKF=0 PXS=0 (Bloom disabled)
      Before: 5724844
      After: 6722066
      
      MWKF=0 PXS=7 (prefixes hardly unique; Bloom not useful)
      Before: 9981319
      After: 10237990
      
      MWKF=0 PXS=8 (prefixes unique; Bloom useful)
      Before:  12081715
      After: 12117603
      
      MWKF=1 PXS=0 (whole key Bloom useful)
      Before: 11944354
      After: 12096085
      
      MWKF=1 PXS=7 (whole key Bloom useful in new version; prefixes not useful in old version)
      Before: 9444299
      After: 11826029
      
      MWKF=1 PXS=7 (whole key Bloom useful in new version; prefixes useful in old version)
      Before: 11784465
      After: 11778591
      
      Only in this last case is the 'before' *slightly* faster, perhaps because hashing prefixes is slightly faster than hashing whole keys. Otherwise, 'after' is faster.
      
      Reviewed By: ajkr
      
      Differential Revision: D33805025
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 597523cae4f4eafdf6ae6bb2bc6cb46f83b017bf
      ea89c77f
    • H
      Remove deprecated API AdvancedColumnFamilyOptions::soft_rate_limit/hard_rate_limit (#9452) · 1e0e883c
      Hui Xiao 提交于
      Summary:
      **Context/Summary:**
      AdvancedColumnFamilyOptions::soft_rate_limit/hard_rate_limit have been marked as deprecated and it's time to actually remove the code.
      - Keep `soft_rate_limit`/`hard_rate_limit` in `cf_mutable_options_type_info` to prevent throwing `InvalidArgument` in `GetColumnFamilyOptionsFromMap` when reading an option file still with these options (e.g, old option file generated from RocksDB before the deprecation)
      - Keep `soft_rate_limit`/`hard_rate_limit` in under `OptionsOldApiTest.GetOptionsFromMapTest` to test the case mentioned above.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9452
      
      Test Plan: Rely on my eyeball and CI
      
      Reviewed By: ajkr
      
      Differential Revision: D33804938
      
      Pulled By: hx235
      
      fbshipit-source-id: 133d49f7ec5238d7efceeb0a3122a5792a2b9945
      1e0e883c
    • Y
      Using back to get the last element (#9415) · 7fb723f5
      yaphet 提交于
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/9415
      
      Reviewed By: ajkr
      
      Differential Revision: D33773673
      
      Pulled By: riversand963
      
      fbshipit-source-id: 52b59ec5a6b01a91d3f990b7f2b0f16320afb49b
      7fb723f5