1. 14 Jul, 2023 1 commit
  2. 13 Jul, 2023 1 commit
  3. 11 Jul, 2023 3 commits
    • Improve error message when an SST file in MANIFEST is not found (#11573) · 854eb76a
      Changyu Bi committed
      Summary:
      I got the following error message when an SST file is recorded in MANIFEST but is missing from the db folder.
      It's confusing in two ways:
      1. It names the file "./074837.ldb", which RocksDB attempts to open only after ./074837.sst is not found.
      2. The last part, "No such file or directory in file ./MANIFEST-074507", reads as if `074837.ldb` were not found in the manifest.
      
      ```
      ldb --hex --db=. get some_key
      
      Failed: Corruption: Corruption: IO error: No such file or directory: While open a file for random read: ./074837.ldb: No such file or directory in file ./MANIFEST-074507
      ```
      
      This PR improves the error message a little.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11573
      
      Test Plan:
      run the same command after this PR
      ```
      Failed: Corruption: Corruption: IO error: No such file or directory: While open a file for random read: ./074837.sst: No such file or directory  The file ./MANIFEST-074507 may be corrupted.
      ```
      
      Reviewed By: ajkr
      
      Differential Revision: D47192056
      
      Pulled By: cbi42
      
      fbshipit-source-id: 06863f376cc4455803cffb2250c41399b4c39467
    • fix: std::optional value() build error on older macOS SDK (#11574) · 1a7c7419
      weedge committed
      Summary:
      Building with `PORTABLE=1 USE_SSE=1 USE_PCLMUL=1 WITH_JEMALLOC_FLAG=1 JEMALLOC=1 make static_lib` on macOS fails.
      
      clang --version:
      
      Apple clang version 12.0.0 (clang-1200.0.32.29)
      Target: x86_64-apple-darwin22.4.0
      Thread model: posix
      InstalledDir: /Library/Developer/CommandLineTools/usr/bin
      
      The compile error looks like this:
      
      util/udt_util.cc:39:39: error: 'value' is unavailable: introduced in macOS 10.14
        if (running_ts_sz != recorded_ts_sz.value()) {
                                            ^
      /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/optional:944:33: note: 'value' has been explicitly marked
            unavailable here
          constexpr value_type const& value() const&
                                      ^
      util/udt_util.cc:217:62: error: 'value' is unavailable: introduced in macOS 10.14
            *new_key = StripTimestampFromUserKey(key, record_ts_sz.value());
                                                                   ^
      /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/optional:953:27: note: 'value' has been explicitly marked
            unavailable here
          constexpr value_type& value() &
                                ^
      2 errors generated.
      make: *** [util/udt_util.o] Error 1
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11574
      
      Reviewed By: ajkr
      
      Differential Revision: D47269519
      
      Pulled By: cbi42
      
      fbshipit-source-id: da49d90cdf00a0af519f91c0cf7d257401eb395f
    • Handle file boundaries when timestamps should not be persisted (#11578) · f7452634
      Yu Zhang committed
      Summary:
      Handle the file boundaries `FileMetaData.smallest` and `FileMetaData.largest` when `persist_user_defined_timestamps` is false:
          1) On the manifest write path, the original user-defined timestamps in file boundaries are stripped. This stripping is done during `VersionEdit::Encode` to limit its effect to only the persisted version of the file boundaries.
          2) On the manifest read path during DB open, a min timestamp is padded to the file boundaries. Ideally, this padding should happen during `VersionEdit::Decode` so that all in-memory file boundaries have a user key format compatible with the running user comparator. However, because the user-defined timestamp size information is not available at that time, this change is added to `VersionEditHandler::OnNonCfOperation`.
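
      The two paths above can be sketched as follows. This is only an illustration on plain user keys, assuming the timestamp occupies the last `ts_sz` bytes of the key and the min timestamp is all zero bytes; `StripTimestamp` and `PadMinTimestamp` are hypothetical helpers, not the functions this PR touches, and real file boundaries are internal keys with additional trailing bytes that this sketch ignores.

      ```cpp
      #include <cassert>
      #include <cstddef>
      #include <string>

      // Hypothetical helper: drop the trailing ts_sz bytes (the user-defined
      // timestamp) from a user key, as the write path conceptually does.
      std::string StripTimestamp(const std::string& user_key, std::size_t ts_sz) {
        assert(user_key.size() >= ts_sz);
        return user_key.substr(0, user_key.size() - ts_sz);
      }

      // Hypothetical helper: append a min timestamp (assumed to be ts_sz zero
      // bytes) so the key has the length the running comparator expects.
      std::string PadMinTimestamp(const std::string& stripped_key,
                                  std::size_t ts_sz) {
        return stripped_key + std::string(ts_sz, '\0');
      }
      ```

      Note that the round trip does not restore the original timestamp: the padded key has the original length but carries the min timestamp.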
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11578
      
      Test Plan:
      ```
      make all check
      ./version_edit_test --gtest_filter="*EncodeDecodeNewFile4HandleFileBoundary*"
      ./db_with_timestamp_basic_test --gtest_filter="*HandleFileBoundariesTest*"
      ```
      
      Reviewed By: pdillinger
      
      Differential Revision: D47309399
      
      Pulled By: jowlyzhang
      
      fbshipit-source-id: 21b4d54d2089a62826b31d779094a39cb2bbbd51
  4. 08 Jul, 2023 2 commits
    • Fix a unit test hole for recovering UDTs with WAL files (#11577) · baf37a0e
      Yu Zhang committed
      Summary:
      Thanks pdillinger for pointing out this test hole. The test `DBWALTestWithTimestamp.Recover`, which is intended to test recovery from WAL including user-defined timestamps, doesn't achieve its promised coverage. Specifically, after https://github.com/facebook/rocksdb/issues/11557, timestamps are removed during flush, and RocksDB by default flushes memtables during recovery because `avoid_flush_during_recovery` defaults to false. This test would not fail even if all the timestamps were lost due to the default flush behavior.
      
      This PR renamed test `Recover` to `RecoverAndNoFlush`, and updated it to verify timestamps are successfully recovered from WAL with some time-travel reads. `avoid_flush_during_recovery` is set to true to help do this verification.
      
      For test `DBWALTestWithTimestamp.RecoverAndFlush`, on the other hand, flush on reopen is the DB's default behavior. The flags `max_write_buffer` and `arena_block_size` are not really the factors that enforce the flush, so these flags are removed.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11577
      
      Test Plan: ./db_wal_test
      
      Reviewed By: pdillinger
      
      Differential Revision: D47142892
      
      Pulled By: jowlyzhang
      
      fbshipit-source-id: 9465e278806faa5885b541b4e32d99e698edef7d
    • Make `rocksdb_options_add_compact_on_deletion_collector_factory` backward compatible (#11593) · 1f410ff9
      Changyu Bi committed
      Summary:
      https://github.com/facebook/rocksdb/issues/11542 added a parameter to the C API `rocksdb_options_add_compact_on_deletion_collector_factory`, which causes some internal builds to fail. External users of this API would also need code changes. This PR makes the API backward compatible by restoring the old C API and adding the parameter to a new C API `rocksdb_options_add_compact_on_deletion_collector_factory_del_ratio`.
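
      The compatibility pattern described above can be sketched with stand-in names. Everything below is hypothetical (`demo_options_t`, `demo_add_collector`, `demo_add_collector_del_ratio`), and forwarding with a ratio of 0.0 is an assumption made for the sketch, not a statement about the real C API.

      ```cpp
      #include <cassert>
      #include <cstddef>

      // Hypothetical stand-in for an options object; not the real C API type.
      struct demo_options_t {
        std::size_t window_size = 0;
        std::size_t num_dels_trigger = 0;
        double deletion_ratio = 0.0;
      };

      // New entry point: carries the additional parameter.
      void demo_add_collector_del_ratio(demo_options_t* opt,
                                        std::size_t window_size,
                                        std::size_t num_dels_trigger,
                                        double deletion_ratio) {
        assert(opt != nullptr);
        opt->window_size = window_size;
        opt->num_dels_trigger = num_dels_trigger;
        opt->deletion_ratio = deletion_ratio;
      }

      // Old entry point: keeps its original signature so existing callers still
      // compile, and forwards with an assumed default for the new parameter.
      void demo_add_collector(demo_options_t* opt, std::size_t window_size,
                              std::size_t num_dels_trigger) {
        demo_add_collector_del_ratio(opt, window_size, num_dels_trigger, 0.0);
      }
      ```

      Existing callers keep using the three-argument function unchanged, and only callers that need the extra knob switch to the new name.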
      
      Also updated change log for 8.4 and will backport this change to 8.4 branch once landed.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11593
      
      Test Plan: `make c_test && ./c_test`
      
      Reviewed By: akankshamahajan15
      
      Differential Revision: D47299555
      
      Pulled By: cbi42
      
      fbshipit-source-id: 517dc093ef4cf02cac2fe4af4f1af13754bbda63
  5. 06 Jul, 2023 2 commits
    • Deprecate option `periodic_compaction_seconds` for FIFO compaction (#11550) · df082c8d
      Changyu Bi committed
      Summary:
      Both options `ttl` and `periodic_compaction_seconds` have the same meaning for FIFO compaction, which is redundant and can be confusing to use. For example, setting TTL to 0 does not disable TTL: the user also needs to set `periodic_compaction_seconds` to 0. Another example is that dynamically setting `periodic_compaction_seconds` (surprisingly) has no effect on TTL compaction. This is because the FIFO compaction picker internally only looks at the value of `ttl`. The value of `ttl` is set in `SanitizeOptions()`, which takes into account the value of `periodic_compaction_seconds`, but dynamically setting an option does not invoke this method.
      
      This PR clarifies the usage of both options for FIFO compaction: only `ttl` should be used, `periodic_compaction_seconds` will not have any effect on FIFO compaction.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11550
      
      Test Plan:
      - updated existing unit test `DBOptionsTest.SanitizeFIFOPeriodicCompaction`
      - checked existing values of both options in feature matrix: https://fburl.com/daiquery/xxd0gs9w. All current use cases either have `periodic_compaction_seconds = 0` or have `periodic_compaction_seconds > ttl`, so this should not cause a change of behavior.
      
      Reviewed By: ajkr
      
      Differential Revision: D46902959
      
      Pulled By: cbi42
      
      fbshipit-source-id: a9ede235b276783b4906aaec443551fa62ceff4c
    • `sst_dump --command=verify` should verify block checksums (#11576) · c53d604f
      Changyu Bi committed
      Summary:
      `sst_dump --command=verify` did not set `read_options.verify_checksums` to true, so it was not verifying block checksums.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11576
      
      Test Plan:
      ran the same command on an SST file with bad checksum:
      ```
      sst_dump --command=verify --file=...sst_file_with_bad_block_checksum
      
      Before this PR:
      options.env is 0x6ba048
      Process ...sst_file_with_bad_block_checksum
      Sst file format: block-based
      The file is ok
      
      After this PR:
      options.env is 0x7f43f6690000
      Process ...sst_file_with_bad_block_checksum
      Sst file format: block-based
      ... is corrupted: Corruption: block checksum mismatch: stored = 2170109798, computed = 2170097510, type = 4  ...
      ```
      
      Reviewed By: ajkr
      
      Differential Revision: D47136284
      
      Pulled By: cbi42
      
      fbshipit-source-id: 07d68db715c00347145e5b83d649aef2c3f2acd9
  6. 04 Jul, 2023 2 commits
  7. 30 Jun, 2023 1 commit
    • Logically strip timestamp during flush (#11557) · 15053f3a
      Yu Zhang committed
      Summary:
      Logically strip the user-defined timestamp when L0 files are created during flush and `AdvancedColumnFamilyOptions.persist_user_defined_timestamps` is false. Logically stripping the timestamp here means replacing the original user-defined timestamp with a minimum timestamp, which for now is hard-coded to be all zero bytes.
      
      While working on this, I caught a missing piece at the `BlockBuilder` level for this feature. The current quick path `std::min(buffer_size, last_key_size)` needs a bit of tweaking to work for this feature. When the user-defined timestamp is stripped during block building, on writing the first entry or right after resetting, `buffer` is empty and `buffer_size` is zero as usual. However, in follow-up writes, depending on the size of the stripped user-defined timestamp and the size of the value, what's in `buffer` can sometimes be smaller than `last_key_size`, leading `std::min(buffer_size, last_key_size)` to truncate the `last_key`. Previous tests did not catch the bug because in those tests the size of the stripped user-defined timestamp bytes is smaller than the length of the value. In order to avoid a conditional operation, this PR changed the original trivial `std::min` operation into an arithmetic operation. Since this is a change in a hot and performance-critical path, I ran the following benchmark to check that no observable regression is introduced.
      ```
      TEST_TMPDIR=/dev/shm/rocksdb1 ./db_bench -benchmarks=fillseq -memtablerep=vector -allow_concurrent_memtable_write=false -num=50000000
      ```
      Compiled with DEBUG_LEVEL=0
      Test and control runs were simultaneous for better accuracy; units = ops/sec
                             PR  vs base:
      Round 1: 350652 vs 349055
      Round 2: 365733 vs 364308
      Round 3: 355681 vs 354475
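
      The `std::min` pitfall from the summary can be illustrated with two small functions. This is a sketch under assumptions: the exact expression used in the PR is not quoted in this message, so the arithmetic form below is one plausible branch-free variant, and both function names are hypothetical.

      ```cpp
      #include <algorithm>
      #include <cassert>
      #include <cstddef>

      // Quick path from the summary: yields 0 for an empty buffer, but also
      // shrinks the result whenever the buffer is shorter than the last key.
      std::size_t LastKeySizeWithMin(std::size_t buffer_size,
                                     std::size_t last_key_size) {
        return std::min(buffer_size, last_key_size);
      }

      // Assumed arithmetic alternative: multiply by 0 when the buffer is empty
      // and by 1 otherwise, so a non-empty buffer never truncates the key.
      std::size_t LastKeySizeArithmetic(std::size_t buffer_size,
                                        std::size_t last_key_size) {
        return last_key_size * static_cast<std::size_t>(buffer_size > 0);
      }
      ```

      With a 10-byte buffer and a 12-byte last key, the first function returns 10 (a truncated key) while the second returns 12; with an empty buffer both return 0.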
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11557
      
      Test Plan:
      New timestamp specific test added or existing tests augmented, both are parameterized with `UserDefinedTimestampTestMode`:
      `UserDefinedTimestampTestMode::kNormal` -> UDT feature enabled, write / read with min timestamp
      `UserDefinedTimestampTestMode::kStripUserDefinedTimestamps` -> UDT feature enabled, write / read with min timestamp, set Options.persist_user_defined_timestamps to false.
      
      ```
      make all check
      ./db_wal_test --gtest_filter="*WithTimestamp*"
      ./flush_job_test --gtest_filter="*WithTimestamp*"
      ./repair_test --gtest_filter="*WithTimestamp*"
      ./block_based_table_reader_test
      ```
      
      Reviewed By: pdillinger
      
      Differential Revision: D47027664
      
      Pulled By: jowlyzhang
      
      fbshipit-source-id: e729193b6334dfc63aaa736d684d907a022571f5
  8. 28 Jun, 2023 6 commits
  9. 27 Jun, 2023 2 commits
  10. 24 Jun, 2023 2 commits
    • Add CreateColumnFamilyWithImport to `StackableDB` and `DBImplReadOnly` (#11556) · ca50ccc7
      Changyu Bi committed
      Summary:
      https://github.com/facebook/rocksdb/issues/11378 added a new overloaded `CreateColumnFamilyWithImport` API and updated the virtual function in `StackableDB` and `DBImplReadOnly` to the newly overloaded one. This caused an internal error when a derived class tried to override the original `CreateColumnFamilyWithImport` function. This PR adds the original `CreateColumnFamilyWithImport` function back to `StackableDB` and `DBImplReadOnly`.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11556
      
      Test Plan: check if this fixes an internal build
      
      Reviewed By: akankshamahajan15
      
      Differential Revision: D46980506
      
      Pulled By: cbi42
      
      fbshipit-source-id: 975a6c5748bf9481499a62ee5997ca59e542e3bc
    • Add an interface to provide support for underlying FS to pass their own buffer during reads (#11324) · fbd2f563
      akankshamahajan committed
      Summary:
      1. Public API change: Replace the `use_async_io` API in file_system with the `SupportedOps` API, which the underlying FileSystem uses to indicate to upper layers whether it supports the different operations introduced in `enum FSSupportedOps`. Right now the operations are `async_io` and whether the FS will provide its own buffer during reads. The API is changed so that it can be extended to various FileSystem operations in one API rather than creating a separate API for each operation.
      
      2. Provide support for the underlying FS to pass its own buffer during reads (async and sync reads) instead of using the RocksDB-provided `scratch` (buffer) in `FSReadRequest`. Currently only MultiRead supports it; it will later be extended to other reads as well (point lookup, scan, etc.). See file_system.h for details on how to enable it.
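
      The idea of reporting several capabilities through one call can be sketched with a bitmask. The types below (`DemoSupportedOps`, `DemoFileSystem`, `BufferProvidingFS`, `IsSupported`) are hypothetical stand-ins, and the bit layout and default are assumptions for illustration; see file_system.h for the actual interface.

      ```cpp
      #include <cassert>
      #include <cstdint>

      // Hypothetical mirror of the two operations named in the summary.
      enum DemoSupportedOps : int { kAsyncIO = 0, kFSBuffer = 1 };

      class DemoFileSystem {
       public:
        virtual ~DemoFileSystem() = default;
        // One call reports all capabilities as bits; the default here
        // (async IO only) is an assumption made for this sketch.
        virtual void SupportedOps(std::int64_t& supported_ops) const {
          supported_ops = std::int64_t{1} << kAsyncIO;
        }
      };

      // A file system that also hands out its own read buffers.
      class BufferProvidingFS : public DemoFileSystem {
       public:
        void SupportedOps(std::int64_t& supported_ops) const override {
          supported_ops =
              (std::int64_t{1} << kAsyncIO) | (std::int64_t{1} << kFSBuffer);
        }
      };

      // Upper layers test a single bit instead of calling one API per feature.
      bool IsSupported(const DemoFileSystem& fs, DemoSupportedOps op) {
        std::int64_t ops = 0;
        fs.SupportedOps(ops);
        return (ops & (std::int64_t{1} << op)) != 0;
      }
      ```

      Adding a further operation then only needs a new enumerator and a new bit, not a new virtual method.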
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11324
      
      Test Plan: Tested locally
      
      Reviewed By: anand1976
      
      Differential Revision: D44465322
      
      Pulled By: akankshamahajan15
      
      fbshipit-source-id: 9ec9e08f839b5cc815e75d5dade6cd549998d0ec
  11. 23 Jun, 2023 1 commit
  12. 22 Jun, 2023 3 commits
    • Use FaultInjectionTestFS in transaction_test, clarify Close() APIs (#11499) · 05a1d52e
      Peter Dillinger committed
      Summary:
      ... instead of race-condition-laden FaultInjectionTestEnv. See https://app.circleci.com/pipelines/github/facebook/rocksdb/27912/workflows/4c63e5a8-597e-439d-8c7e-82308056af02/jobs/609648 and similar PR https://github.com/facebook/rocksdb/issues/11271
      
      Had to fix the semantics of FaultInjectionTestFS Close() operations to allow a non-OK Close() to fulfill the obligation to close before destruction. To me, this is the obvious choice of Close contract, because what is the caller supposed to do if Close() fails and they still have an obligation to successfully close before object destruction? Call Close() in an infinite loop? Leak the object? I have added API comments to the Env and Filesystem Close() functions to clarify the contracts.
      
      Note that `DB::Close()` has one exception to this kind of Close contract, but it is clearly described in API comments and it is really only for catching programming mistakes, not for dealing with exogenous errors.
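
      The Close contract argued for above can be sketched with a toy handle. `DemoWritableFile` is hypothetical and models only the contract (a failed `Close()` still fulfills the obligation to close); it is not the FaultInjectionTestFS implementation.

      ```cpp
      #include <cassert>

      // Hypothetical handle: any call to Close(), successful or not, fulfills
      // the obligation to close before destruction.
      class DemoWritableFile {
       public:
        explicit DemoWritableFile(bool fail_on_close)
            : fail_on_close_(fail_on_close) {}

        // The destructor check only catches a caller that never called Close().
        ~DemoWritableFile() { assert(closed_); }

        // Returns true on success. The handle counts as closed either way, so
        // the caller never has to retry Close() just to be allowed to destroy it.
        bool Close() {
          closed_ = true;
          return !fail_on_close_;
        }

        bool closed() const { return closed_; }

       private:
        bool fail_on_close_;
        bool closed_ = false;
      };
      ```

      A caller can report the failed `Close()` and then let the object go out of scope without tripping the destructor check.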
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11499
      
      Test Plan: watch CI
      
      Reviewed By: jowlyzhang
      
      Differential Revision: D46375708
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 03d4d8251e5df50a82ecd139f7e83f613015fe40
    • Record the `persist_user_defined_timestamps` flag in manifest (#11515) · 7521478b
      Yu Zhang committed
      Summary:
      Start recording the value of the flag `AdvancedColumnFamilyOptions.persist_user_defined_timestamps` in the Manifest and in the table properties of an SST file when it is created, and use the recorded flag when creating a table reader for the SST file. This flag's default value is true; it is only explicitly recorded if it's false.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11515
      
      Test Plan:
      ```
      make all check
      ./version_edit_test
      ```
      
      Reviewed By: ltamasi
      
      Differential Revision: D46920386
      
      Pulled By: jowlyzhang
      
      fbshipit-source-id: 075c20363d3d2cc1368422ecc805617ed135cc26
    • Internal API for generating semi-random salt (#11331) · 98c6d7fd
      Peter Dillinger committed
      Summary:
      ... so that a non-cryptographic whole file checksum would be highly resistant
      to manipulation by a user able to manipulate key-value data (e.g. a user whose data is
      stored in RocksDB) and able to predict SST metadata such as DB session id and file
      number based on read access to logs or DB files. The adversary would also need to predict
      the salt in order to influence the checksum result toward collision with another file's
      checksum.
      
      This change is just internal code to support such a future feature. I think this should be a
      passive feature, not option-controlled, because you probably won't think about needing it
      until you discover you do need it, and it should be low cost, in space (16 bytes per SST
      file) and CPU.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11331
      
      Test Plan: Unit tests added to verify at least pseudorandom behavior. (Actually caught a bug in first draft!) The new "stress" style tests run in ~3ms each on my system.
      
      Reviewed By: ajkr
      
      Differential Revision: D46129415
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 7972dc74487e062b29b1fd9c227425e922c98796
  13. 21 Jun, 2023 1 commit
  14. 20 Jun, 2023 2 commits
    • Attempt to deflake DBWALTestWithEnrichedEnv.SkipDeletedWALs (#11537) · 022d8954
      Levi Tamasi committed
      Summary:
      Calling `Flush` (even with `wait==true`) does not guarantee that obsolete WAL files are physically deleted before the call returns. The patch attempts to fix the resulting flakiness by using `SyncPoint`s to make sure `PurgeObsoleteFiles` finishes before checking for WAL deletions.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11537
      
      Test Plan:
      ```
      gtest-parallel --repeat=1000 ./db_wal_test --gtest_filter="*SkipDeletedWALs*"
      ```
      
      Reviewed By: pdillinger
      
      Differential Revision: D46736050
      
      Pulled By: ltamasi
      
      fbshipit-source-id: 47a931b7a3a03ef681fbf4adb5a0b223d452703e
    • Initialize StressTest::optimistic_txn_db_ in ctor (#11547) · b3edb873
      Levi Tamasi committed
      Summary:
      `StressTest::optimistic_txn_db_` is currently not initialized by the constructor, which
      can lead to assertion failures down the line in `StressTest::Open`.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11547
      
      Reviewed By: cbi42
      
      Differential Revision: D46845658
      
      Pulled By: ltamasi
      
      fbshipit-source-id: 578b0f24fc00e3e97f24221fcdd003cc529439c2
  15. 18 Jun, 2023 1 commit
    • Stress/Crash Test for OptimisticTransactionDB (#11513) · 17d52005
      Jay Huh committed
      Summary:
      Context:
      OptimisticTransactionDB has not been covered by db_stress (including crash test) like TransactionDB.
      1. Adding the following gflag options to test OptimisticTransactionDB
      - `use_optimistic_txn`: When true, open OptimisticTransactionDB to test
      - `occ_validation_policy`: `OccValidationPolicy::kValidateParallel = 1` by default.
      - `share_occ_lock_buckets`: Use shared occ locks
      - `occ_lock_bucket_count`: 500 by default. Number of buckets to use for shared occ lock.
      2. Opening OptimisticTransactionDB and NewTxn/Commit added per `use_optimistic_txn` flag in `db_stress_test_base.cc`
      3. OptimisticTransactionDB blackbox/whitebox test added in crash_test.mk
      
      Please note that the existing flag `use_txn` is being used here. When `use_txn == true` and `use_optimistic_txn == false`, we use `TransactionDB` (a.k.a. pessimistic transaction db). When both `use_txn` and `use_optimistic_txn` are true, we use `OptimisticTransactionDB`. If `use_txn == false` but `use_optimistic_txn == true`, an error is thrown with the message _"You cannot set use_optimistic_txn true while use_txn is false. Please set use_txn true if you want to use OptimisticTransactionDB"_.
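
      The flag combinations above amount to a small decision table, sketched here. `DbKind` and `SelectDbKind` are hypothetical names, and treating "both flags false" as a plain non-transactional DB is an assumption; this is not the code in `db_stress_test_base.cc`.

      ```cpp
      #include <cassert>
      #include <string>

      // Hypothetical result type for the sketch.
      enum class DbKind {
        kPlainDB,
        kTransactionDB,
        kOptimisticTransactionDB,
        kInvalid
      };

      // Maps the two flags to the kind of DB to open. The only invalid
      // combination is use_optimistic_txn without use_txn.
      DbKind SelectDbKind(bool use_txn, bool use_optimistic_txn,
                          std::string* error) {
        assert(error != nullptr);
        if (use_optimistic_txn && !use_txn) {
          *error =
              "You cannot set use_optimistic_txn true while use_txn is false. "
              "Please set use_txn true if you want to use "
              "OptimisticTransactionDB";
          return DbKind::kInvalid;
        }
        if (!use_txn) {
          return DbKind::kPlainDB;  // assumed: no transactions requested
        }
        return use_optimistic_txn ? DbKind::kOptimisticTransactionDB
                                  : DbKind::kTransactionDB;
      }
      ```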
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11513
      
      Test Plan:
      **Crash Test**
      Serial Validation
      ```
      export CRASH_TEST_EXT_ARGS="--use_optimistic_txn=1 --use_txn=1 --use_put_entity_one_in=0 --occ_validation_policy=0"
      make crash_test -j
      ```
      Parallel Validation (no share bucket)
      ```
      export CRASH_TEST_EXT_ARGS="--use_optimistic_txn=1 --use_txn=1 --use_put_entity_one_in=0 --occ_validation_policy=1 --share_occ_lock_buckets=0"
      make crash_test -j
      ```
      Parallel Validation (share bucket)
      ```
      export CRASH_TEST_EXT_ARGS="--use_optimistic_txn=1 --use_txn=1 --use_put_entity_one_in=0 --occ_validation_policy=1 --share_occ_lock_buckets=1 --occ_lock_bucket_count=500"
      make crash_test -j
      ```
      
      **Stress Test**
      ```
      ./db_stress -use_optimistic_txn -threads=32
      ```
      
      Reviewed By: pdillinger
      
      Differential Revision: D46547387
      
      Pulled By: jaykorean
      
      fbshipit-source-id: ca19819ca6e0281694966998014b40d95d4e5960
  16. 17 Jun, 2023 3 commits
    • Add UT to test BG read qps behavior during upgrade for pr11406 (#11522) · 1da9ac23
      Hui Xiao committed
      Summary:
      **Context/Summary:**
      When a DB is upgrading to adopt [pr11406](https://github.com/facebook/rocksdb/pull/11406/), it's possible for RocksDB to infer a small tail size to prefetch for pre-upgrade files. Such a small tail size would have caused 1 file read per index or filter partition if a partitioned index or filter is used. This PR provides a UT to show this would not happen.
      
      Misc: refactor the related UTs a bit to make this new UT more readable.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11522
      
      Test Plan:
      - New UT
      If the upgrade logic is wrong, e.g.
      ```
       --- a/table/block_based/partitioned_index_reader.cc
      +++ b/table/block_based/partitioned_index_reader.cc
      @@ -166,7 +166,8 @@ Status PartitionIndexReader::CacheDependencies(
         uint64_t prefetch_len = last_off - prefetch_off;
         std::unique_ptr<FilePrefetchBuffer> prefetch_buffer;
         if (tail_prefetch_buffer == nullptr || !tail_prefetch_buffer->Enabled() ||
      -      tail_prefetch_buffer->GetPrefetchOffset() > prefetch_off) {
      +      (false && tail_prefetch_buffer->GetPrefetchOffset() > prefetch_off)) {
      ```
      , then the UT will fail like below
      ```
      [ RUN      ] PrefetchTailTest/PrefetchTailTest.UpgradeToTailSizeInManifest/0
      file/prefetch_test.cc:461: Failure
      Expected: (db_open_file_read.count) < (num_index_partition), actual: 38 vs 33
      Received signal 11 (Segmentation fault)
      ```
      
      Reviewed By: pdillinger
      
      Differential Revision: D46546707
      
      Pulled By: hx235
      
      fbshipit-source-id: 9897b0a975e9055963edac5451fd1cd9d6c45d0e
    • Fix error case memory bug in GetHostName() (#11544) · 66499780
      Yu Zhang committed
      Summary:
      Fix the error handling in `GetHostName` for non-EFAULT, non-EINVAL errors. The current handling causes a stack buffer overflow when `name` holds a C-style string that is not null-terminated, e.g. on ENAMETOOLONG, when the `name` buffer is not big enough and the host name is truncated.
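
      One common way to avoid this class of bug is to bound every read of the buffer by its length. The helper below is a sketch of that technique using POSIX `strnlen`; `BoundedToString` is a hypothetical name, and this is not necessarily how the PR fixes `GetHostName`.

      ```cpp
      #include <cassert>
      #include <cstddef>
      #include <string.h>
      #include <string>

      // Hypothetical helper: copy at most len bytes out of a buffer that may
      // lack a terminating '\0', stopping early if a terminator is present.
      std::string BoundedToString(const char* name, std::size_t len) {
        assert(name != nullptr);
        return std::string(name, ::strnlen(name, len));
      }
      ```

      Constructing `std::string(name)` directly would instead keep scanning past the end of the array until it happened to find a zero byte.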
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11544
      
      Test Plan:
      ```
      COMPILE_WITH_ASAN=1 make all check
      ```
      
      Reviewed By: pdillinger
      
      Differential Revision: D46775799
      
      Pulled By: jowlyzhang
      
      fbshipit-source-id: e0fc9400c50fe38bc1fd888b4fea5fe8706165bf
    • Add a ticker to track number of trash files deleted in background thread (#11540) · b421a8c2
      Yu Zhang committed
      Summary:
      This ticker combined with `rocksdb.files.marked.trash` can help give a better picture of how DeleteScheduler is keeping up.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11540
      
      Test Plan:
      ```
      ./delete_scheduler_test
      ```
      
      Reviewed By: ajkr
      
      Differential Revision: D46746401
      
      Pulled By: jowlyzhang
      
      fbshipit-source-id: f3daa622aa3ddefe7d673e0cc257a47699d506df
  17. 16 Jun, 2023 4 commits
  18. 15 Jun, 2023 2 commits
    • Avoid destroying default PosixEnv, safely (#11538) · 70bf5ef0
      Peter Dillinger committed
      Summary:
      Use another static object to join threads instead.
      
      This change is motivated by a case in which some code using NewLRUCache() -> ShardedCacheBase -> SemiStructuredUniqueIdGen -> GenerateRawUniqueId() -> Env::Default() was happening
      during static destruction.
      
      I didn't see anything else in PosixEnv or base classes that would cause a problem by not
      destroying. (WinEnv is already not destroyed; see env_default.cc)
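
      The approach described above (never destroy the default object, and let a separate static object do the cleanup) can be sketched as follows. All names are hypothetical, and a counter stands in for background threads so the example stays self-contained; this is not the PosixEnv code.

      ```cpp
      #include <cassert>

      // Hypothetical environment; the counter stands in for background threads.
      class DemoEnv {
       public:
        void StartBackgroundWork() { ++pending_; }
        void JoinBackgroundWork() { pending_ = 0; }  // stand-in for joining
        int pending() const { return pending_; }

       private:
        int pending_ = 0;
      };

      // Separate static object whose destructor performs the cleanup at exit.
      class JoinOnExit {
       public:
        explicit JoinOnExit(DemoEnv& env) : env_(env) {}
        ~JoinOnExit() { env_.JoinBackgroundWork(); }

       private:
        DemoEnv& env_;
      };

      DemoEnv* DefaultDemoEnv() {
        // Intentionally never deleted, so code that runs during static
        // destruction can still use the pointer safely.
        static DemoEnv* env = new DemoEnv;
        // Destroyed at exit; it only joins work and leaves *env alive.
        static JoinOnExit joiner(*env);
        return env;
      }
      ```

      The trade-off is that the destructor of the default object never runs, which is acceptable only if it has nothing else that must be released at exit.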
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11538
      
      Test Plan:
      test added, which would previously fail with UBSAN:
      
      ```
      $ ./env_test --gtest_filter=*Destruct*
      Note: Google Test filter = *Destruct*
      [==========] Running 1 test from 1 test case.
      [----------] Global test environment set-up.
      [----------] 1 test from EnvTestMisc
      [ RUN      ] EnvTestMisc.StaticDestruction
      [       OK ] EnvTestMisc.StaticDestruction (0 ms)
      [----------] 1 test from EnvTestMisc (0 ms total)
      
      [----------] Global test environment tear-down
      [==========] 1 test from 1 test case ran. (0 ms total)
      [  PASSED  ] 1 test.
      env/env_test.cc:3561:23: runtime error: member call on address 0x7f7b96671ca8 which does not point to an object of type 'rocksdb::Env'
      0x7f7b96671ca8: note: object is of type 'N7rocksdb12ConfigurableE'
       00 00 00 00  90 a7 f7 95 7b 7f 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00
                    ^~~~~~~~~~~~~~~~~~~~~~~
                    vptr for 'N7rocksdb12ConfigurableE'
      UndefinedBehaviorSanitizer: undefined-behavior env/env_test.cc:3561:23 in
      $
      ```
      
      Reviewed By: jowlyzhang
      
      Differential Revision: D46737389
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 0f80a443bf799ffc5641e898cf3a75f7d10a987b
    • Do not include last level in compaction when `allow_ingest_behind=true` (#11489) · 15e8a843
      Changyu Bi committed
      Summary:
      When a DB is configured with `allow_ingest_behind = true`, the last level should be reserved for ingested files, and these files should not be included in any compaction. Currently, a major compaction can compact these files to smaller levels. This can cause future files to be rejected for ingest behind (see `ExternalSstFileIngestionJob::CheckLevelForIngestedBehindFile()`). This PR fixes the issue so that files in the last level are not included in any compaction.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11489
      
      Test Plan: Updated unit test `ExternalSSTFileTest.IngestBehind` to test that the last level is not included in manual and auto-compaction.
      
      Reviewed By: ajkr
      
      Differential Revision: D46455711
      
      Pulled By: cbi42
      
      fbshipit-source-id: 5e2142c2a709ef932ad797897795021c06c4ac8c
  19. 14 Jun, 2023 1 commit