1. 02 Nov 2022, 1 commit
    • Fix async_io failures in case there is error in reading data (#10890) · ff9ad2c3
      Committed by akankshamahajan
      Summary:
      Fix a memory corruption error in scans when async_io is enabled. The corruption happened when data overlapped between the two buffers: if an IOError occurred while reading the data, it left one buffer empty, and the other buffer, whose async read was already in progress, was submitted for reading again, causing the error.
      Fix: added a check to abort the IO in the second buffer if `curr_` became empty.
      
      This PR also fixes db_stress failures which happened when buffers are not aligned.
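
      A heavily simplified, hypothetical sketch of the kind of guard described above (the names are invented and do not mirror RocksDB's actual prefetch/async-read internals):
      ```cpp
      #include <cstddef>

      // Hypothetical two-buffer prefetch state.
      struct PrefetchBufSketch {
        std::size_t valid_len = 0;        // bytes of usable data
        bool async_read_in_flight = false;
      };

      void OnFirstBufferReadDone(bool io_error, PrefetchBufSketch& curr,
                                 PrefetchBufSketch& next,
                                 void (*abort_io)(PrefetchBufSketch&)) {
        if (io_error) {
          curr.valid_len = 0;  // the failed read leaves curr empty
        }
        // Added check: if curr ended up empty, abort the outstanding async read
        // on the second buffer instead of letting it be read into again.
        if (curr.valid_len == 0 && next.async_read_in_flight) {
          abort_io(next);
          next.async_read_in_flight = false;
        }
      }
      ```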
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10890
      
      Test Plan:
      - Ran make crash_test -j32 with async_io enabled.
      -  Ran benchmarks to make sure there is no regression.
      
      Reviewed By: anand1976
      
      Differential Revision: D40881731
      
      Pulled By: akankshamahajan15
      
      fbshipit-source-id: 39fcf2134c7b1bbb08415ede3e1ef261ac2dbc58
  2. 01 Nov 2022, 5 commits
    • Basic Support for Merge with user-defined timestamp (#10819) · 7d26e4c5
      Committed by Yanqin Jin
      Summary:
      This PR implements the originally disabled `Merge()` APIs when user-defined timestamp is enabled.
      
      Simplest usage:
      ```cpp
      // assume string append merge op is used with '.' as delimiter.
      // ts1 < ts2
      db->Put(WriteOptions(), "key", ts1, "v0");
      db->Merge(WriteOptions(), "key", ts2, "1");
      ReadOptions ro;
      ro.timestamp = &ts2;
      db->Get(ro, "key", &value);
      ASSERT_EQ("v0.1", value);
      ```
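
      For reference, a hedged sketch of the column family setup the snippet above presumes: user-defined timestamps via the public `BytewiseComparatorWithU64Ts()` comparator plus a string-append merge operator. The timestamp-encoding helper is hand-rolled here, and the merge operator factory is left to the caller, since exact helper names vary:
      ```cpp
      #include <cstdint>
      #include <memory>
      #include <string>
      #include <utility>

      #include "rocksdb/comparator.h"
      #include "rocksdb/merge_operator.h"
      #include "rocksdb/options.h"

      // Hand-rolled fixed64 little-endian timestamp encoding (assumption: this
      // matches what the U64Ts comparator expects; adjust if your setup differs).
      std::string EncodeU64Timestamp(uint64_t ts) {
        std::string out;
        for (int i = 0; i < 8; ++i) {
          out.push_back(static_cast<char>((ts >> (8 * i)) & 0xff));
        }
        return out;
      }

      rocksdb::Options MakeMergeWithTimestampOptions(
          std::shared_ptr<rocksdb::MergeOperator> string_append_op) {
        rocksdb::Options options;
        options.create_if_missing = true;
        // 8-byte user-defined timestamps via the built-in comparator wrapper.
        options.comparator = rocksdb::BytewiseComparatorWithU64Ts();
        // e.g. a string-append merge operator using '.' as the delimiter.
        options.merge_operator = std::move(string_append_op);
        return options;
      }
      ```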
      
      Some code comments are added for clarity.
      
      Note: support for timestamp in `DB::GetMergeOperands()` will be done in a follow-up PR.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10819
      
      Test Plan: make check
      
      Reviewed By: ltamasi
      
      Differential Revision: D40603195
      
      Pulled By: riversand963
      
      fbshipit-source-id: f96d6f183258f3392d80377025529f7660503013
    • Fix compilation errors, clang++-15 (#10907) · 9f3475ec
      Committed by Denis Hananein
      Summary:
      I tried to compile the main branch, but there are two minor things that cause compilation errors.
      I'm not sure about the second one (`num_empty_non_l0_level`); probably an additional assert should be added.
      
      ```
      -c ../cache/clock_cache.cc
      [build] ../cache/clock_cache.cc:855:15: error: variable 'i' set but not used [-Werror,-Wunused-but-set-variable]
      [build]   for (size_t i = 0; &array_[current] != h; i++) {
      [build]               ^
      ```
      
      ```
      [build] ../db/version_set.cc:3665:7: error: variable 'num_empty_non_l0_level' set but not used [-Werror,-Wunused-but-set-variable]
      [build]   int num_empty_non_l0_level = 0;
      [build]       ^
      [build] 1 error generated.
      ```
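
      The usual fixes for `-Wunused-but-set-variable` under `-Werror` are to drop the variable, actually use it (for example in the assert the author suggests), or explicitly discard it. A generic, self-contained illustration (not the actual patch):
      ```cpp
      #include <cassert>

      int CountPositive(const int* values, int n) {
        int num_positive = 0;  // clang-15 warns if this is only written, never read
        for (int i = 0; i < n; ++i) {
          if (values[i] > 0) ++num_positive;
        }
        // Either actually use the value, e.g. via an assert ...
        assert(num_positive <= n);
        // ... or explicitly discard it so -Werror builds stay clean.
        (void)num_positive;
        return n;
      }
      ```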
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10907
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D40866667
      
      Pulled By: ajkr
      
      fbshipit-source-id: 963b7bd56859d0b3b2779cd36fad229425cb7b17
    • Move move wrong history entry out of 7.8 release (#10898) · 7f5e438a
      Committed by Hui Xiao
      Summary:
      **Context/Summary:**
      
      https://github.com/facebook/rocksdb/pull/10777 mistakenly added a history entry under the 7.8 release, but the PR is not included in 7.8. This happened because the rebase-and-merge did not flag a conflict when "## Unreleased" was changed to "## 7.8.0 (10/22/2022)".
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10898
      
      Test Plan: Make check
      
      Reviewed By: akankshamahajan15
      
      Differential Revision: D40861001
      
      Pulled By: hx235
      
      fbshipit-source-id: b2310c95490f6ebb90834a210c965a74c9560b51
    • Add missing copyright headers to a couple of Java test files (#10900) · ea1982d0
      Committed by Levi Tamasi
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/10900
      
      Reviewed By: akankshamahajan15
      
      Differential Revision: D40825886
      
      Pulled By: ltamasi
      
      fbshipit-source-id: e60f74aa8a622c3c71e1fee420fd586728fb2b7b
    • Avoid repeat periodic stats printing when there is no change (#10891) · d989300a
      Committed by sdong
      Summary:
      When there is a column family that doesn't get any traffic, its stats are still dumped every time options.stats_dump_period_sec triggers, which sometimes spams the information logs. With this change, we skip the printing if there is no change, for up to 8 periods.
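
      For reference, the dump interval in question is the regular `stats_dump_period_sec` DB option; a minimal sketch (the interval value is arbitrary):
      ```cpp
      #include "rocksdb/options.h"

      rocksdb::Options MakeStatsDumpOptions() {
        rocksdb::Options options;
        // Dump DB/CF stats to the info log every 10 minutes. With this change,
        // an idle column family's unchanged stats are skipped for several
        // periods instead of being reprinted every time.
        options.stats_dump_period_sec = 600;
        return options;
      }
      ```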
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10891
      
      Test Plan: Manually test the behavior with hacked db_bench setups.
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D40777183
      
      fbshipit-source-id: ef0b9a793e4f6282df099b464f01d1fb4c5a2cab
  3. 29 Oct 2022, 6 commits
    • Fix deletion counting in memtable stats (#10886) · 9079895a
      Committed by Yanqin Jin
      Summary:
      Currently, a memtable's `num_deletes_` stat is incremented only if the entry is a regular delete (kTypeDeletion). We need to fix it by also accounting for kTypeSingleDeletion and kTypeDeletionWithTimestamp.
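
      A hedged sketch of how the affected counter can be observed through a DB property (the property name is assumed to be the one backed by `num_deletes_`; the expectation reflects the fixed behavior):
      ```cpp
      #include <cassert>
      #include <string>

      #include "rocksdb/db.h"

      // Assumes `db` was opened normally.
      void CheckMemtableDeleteCount(rocksdb::DB* db) {
        db->Delete(rocksdb::WriteOptions(), "k1");        // kTypeDeletion
        db->SingleDelete(rocksdb::WriteOptions(), "k2");  // kTypeSingleDeletion
        std::string val;
        db->GetProperty("rocksdb.num-deletes-active-mem-table", &val);
        // With the fix, the single delete above is counted as well.
        assert(std::stoull(val) >= 2);
      }
      ```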
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10886
      
      Test Plan: make check
      
      Reviewed By: ltamasi
      
      Differential Revision: D40740754
      
      Pulled By: riversand963
      
      fbshipit-source-id: 7bde62cd6df136585bc5bfb1c426c7a8276c08e1
    • Fix a Windows build error (#10897) · 36f5e19e
      Committed by Jay Zhuang
      Summary:
      The for loop is flagged as unreachable code because its increment can never be executed. Switch it to `if`.
      
      ```
      \table\merging_iterator.cc(823): error C2220: the following warning is treated as an error
      \table\merging_iterator.cc(823): warning C4702: unreachable code
      \table\merging_iterator.cc(1030): error C2220: the following warning is treated as an error
      \table\merging_iterator.cc(1030): warning C4702: unreachable code
      ```
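
      A generic, self-contained illustration of the pattern behind warning C4702 (not the actual merging_iterator code): a `for` whose body always exits makes the increment unreachable, and rewriting it as `if` expresses the same single pass cleanly:
      ```cpp
      #include <cstddef>

      void Process(std::size_t /*index*/) {}

      // Before: the body always exits on the first pass, so MSVC flags the ++i
      // increment as unreachable code (C4702).
      void SingleStepLoop(std::size_t lo, std::size_t hi) {
        for (std::size_t i = lo; i < hi; ++i) {
          Process(i);
          return;
        }
      }

      // After: the single-pass intent is expressed with `if`; no unreachable code.
      void SingleStepIf(std::size_t lo, std::size_t hi) {
        if (lo < hi) {
          Process(lo);
        }
      }
      ```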
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10897
      
      Reviewed By: cbi42
      
      Differential Revision: D40811790
      
      Pulled By: jay-zhuang
      
      fbshipit-source-id: fe8fd3e7cf3d6f710360c402b79763854d5120df
    • Pass `const LockInfo&` to AcquireLocked() and AcquireWithTimeout (#10874) · 900f7912
      Committed by Yanqin Jin
      Summary:
      The motivation for and benefit of the current behavior of passing `LockInfo&&` as an argument to AcquireLocked() and AcquireWithTimeout() are not clear to me. Furthermore, in AcquireWithTimeout(), we access members of the `LockInfo&&` after it has been passed to AcquireLocked() as an rvalue reference. In addition, we may call `AcquireLocked()` with `std::move(lock_info)` multiple times.
      
      This leads to a linter warning about use-after-move. If a future implementation of AcquireLocked() does something like move-constructing a new `LockInfo` from the passed-in `LockInfo&&`, then the caller cannot use it, because `LockInfo` has a member of type `autovector`.
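
      A generic, self-contained illustration of the use-after-move hazard being removed (the struct below is a hypothetical stand-in, not the real `LockInfo`):
      ```cpp
      #include <cstdint>
      #include <string>
      #include <utility>
      #include <vector>

      struct FakeLockInfo {  // hypothetical stand-in for LockInfo
        std::vector<std::string> txn_ids;
        uint64_t expiration_time = 0;
      };

      // Risky: the callee is free to move from `info`, so the caller must treat
      // its object as unusable afterwards, and a second std::move is a bug.
      void AcquireLockedRisky(FakeLockInfo&& info) {
        FakeLockInfo stored = std::move(info);
        (void)stored;
      }

      // Safer: the callee copies what it needs; the caller's object stays intact,
      // so retrying the call (e.g. after a timeout) remains well defined.
      void AcquireLockedSafe(const FakeLockInfo& info) {
        FakeLockInfo stored = info;
        (void)stored;
      }
      ```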
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10874
      
      Test Plan: make check
      
      Reviewed By: ltamasi
      
      Differential Revision: D40704210
      
      Pulled By: riversand963
      
      fbshipit-source-id: 20091df65b4fc63b072bcec9809efc49955d6d35
    • Run clang format against files under example/, memory/ and memtable/ folders (#10893) · 08a63ad1
      Committed by Hui Xiao
      Summary:
      **Context/Summary:**
      Run the following to format
      ```
      find ./examples -iname *.h -o -iname *.cc | xargs clang-format -i
      find ./memory -iname *.h -o -iname *.cc | xargs clang-format -i
      find ./memtable -iname *.h -o -iname *.cc | xargs clang-format -i
      ```
      
      **Test**
      - Manual inspection to ensure changes are cosmetic only
      - CI
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10893
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D40779187
      
      Pulled By: hx235
      
      fbshipit-source-id: 529cbb0f0fbd698d95817e8c42fe3ce32254d9b0
    • Handle Merges correctly in GetEntity (#10894) · 7867a111
      Committed by Levi Tamasi
      Summary:
      The PR fixes the handling of `Merge`s in `GetEntity`. Note that `Merge` is not yet
      supported for wide-column entities written using `PutEntity`; this change is
      about returning correct (i.e. consistent with `Get`) results in cases like when the
      base value is a plain old key-value written using `Put` or when there is no real base
      value because we hit either a tombstone or the beginning of history.
      
      Implementation-wise, the patch introduces a new wrapper around the existing
      `MergeHelper::TimedFullMerge` that can store the merge result in either a string
      (for the purposes of `Get`) or a `PinnableWideColumns` instance (for `GetEntity`).
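
      A hedged usage sketch of the scenario being fixed, using the public `GetEntity` API (assuming the DB was opened with a merge operator configured, e.g. string append):
      ```cpp
      #include <string>

      #include "rocksdb/db.h"
      #include "rocksdb/wide_columns.h"

      void ReadMergedEntity(rocksdb::DB* db) {
        db->Put(rocksdb::WriteOptions(), "key", "base");     // plain base value
        db->Merge(rocksdb::WriteOptions(), "key", "delta");  // merge on top

        rocksdb::PinnableWideColumns columns;
        rocksdb::Status s = db->GetEntity(rocksdb::ReadOptions(),
                                          db->DefaultColumnFamily(), "key",
                                          &columns);
        // With the fix, `columns` holds the merge result in the default
        // (anonymous) column, consistent with what Get("key") would return.
        (void)s;
      }
      ```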
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10894
      
      Test Plan: `make check`
      
      Reviewed By: riversand963
      
      Differential Revision: D40782708
      
      Pulled By: ltamasi
      
      fbshipit-source-id: 3d700d56b2ef81f02ba1e2d93f6481bf13abcc90
    • Upgrade CircleCI Windows Build (#10090) · 1e6f1ef8
      Committed by Jay Zhuang
      Summary:
      * Upgrade CircleCI orb from 2.4 to 5.0
      * Setup vs2022 build
      * Use images with built-in vs2019 and vs2022
      * Remove vs2017
      * Remove CMAKE_CXX_STANDARD=20
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10090
      
      Reviewed By: ajkr
      
      Differential Revision: D40787942
      
      Pulled By: jay-zhuang
      
      fbshipit-source-id: cc74c02a9f28dd784a0ba5502c4bfc9ff1a26d3e
  4. 28 Oct 2022, 4 commits
    • Allow a custom DB cleanup command to be passed to db_crashtest.py (#10883) · bf497e91
      Committed by anand76
      Summary:
      This option allows a custom cleanup command line for a non-Posix file system to be used by db_crashtest.py to clean up between runs.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10883
      
      Test Plan: Run the whitebox crash test
      
      Reviewed By: pdillinger
      
      Differential Revision: D40726424
      
      Pulled By: anand1976
      
      fbshipit-source-id: b827f6b583ff78f9ca75ced2d96f7e58f5200432
    • Use malloc/free for LRUHandle instead of new[]/delete[] (#10884) · 22ff8c5a
      Committed by Levi Tamasi
      Summary:
      It's unsafe to call `malloc_usable_size` with an address not returned by a function from the `malloc` family (see https://github.com/facebook/rocksdb/issues/10798). The patch switches from using `new[]` / `delete[]` for `LRUHandle` to `malloc` / `free`.
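
      A small standalone illustration of the underlying constraint (generic C++ on a glibc-style allocator, not the RocksDB patch itself): `malloc_usable_size` is only defined for pointers obtained from the malloc family, which is exactly what the switch guarantees for `LRUHandle`:
      ```cpp
      #include <cstdlib>

      #include <malloc.h>  // malloc_usable_size (glibc/jemalloc-compatible)

      int main() {
        // Well defined: the pointer comes from malloc.
        void* from_malloc = std::malloc(128);
        std::size_t usable = malloc_usable_size(from_malloc);
        std::free(from_malloc);

        // Undefined behavior: new[] storage is not guaranteed to come from
        // malloc (operator new may be replaced), so querying it is unsafe.
        // char* from_new = new char[128];
        // malloc_usable_size(from_new);  // don't
        // delete[] from_new;
        return usable >= 128 ? 0 : 1;
      }
      ```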
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10884
      
      Test Plan: `make check`
      
      Reviewed By: pdillinger
      
      Differential Revision: D40738089
      
      Pulled By: ltamasi
      
      fbshipit-source-id: ac5583f88125fee49c314639be6b6df85937fbee
    • Reduce heap operations for range tombstone keys in iterator (#10877) · 56715350
      Committed by Changyu Bi
      Summary:
      Right now in MergingIterator, for each range tombstone start and end key, we pop one end from the heap and push the other end into the heap. This involves extra downheap and upheap cost. In the likely case where a range tombstone iterator emits relatively adjacent keys, these keys should have a similar order within all keys in the heap. This can happen when there is a burst of consecutive range tombstones and most of the keys covered by them have already been dropped. This PR uses `replace_top()` when inserting new range tombstone keys, which is more efficient in these common cases.
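
      A generic sketch of the heap trick being used (hand-rolled over `std::vector`, not RocksDB's actual `BinaryHeap` API): replacing the top and sifting down once does the work of a pop followed by a push, and finishes quickly when the replacement key stays near the top:
      ```cpp
      #include <algorithm>
      #include <cstddef>
      #include <functional>
      #include <vector>

      // Min-heap helpers; both functions replace the current minimum with new_key.

      // Pop-then-push: one full sift-down (pop_heap) plus one full sift-up.
      void PopThenPush(std::vector<int>& heap, int new_key) {
        std::pop_heap(heap.begin(), heap.end(), std::greater<int>());
        heap.back() = new_key;
        std::push_heap(heap.begin(), heap.end(), std::greater<int>());
      }

      // replace_top-style: overwrite the root and sift down once. When the new
      // key still belongs near the top (the common case described above), the
      // sift terminates after very few swaps.
      void ReplaceTop(std::vector<int>& heap, int new_key) {
        heap.front() = new_key;
        std::size_t i = 0;
        const std::size_t n = heap.size();
        while (true) {
          std::size_t l = 2 * i + 1, r = l + 1, smallest = i;
          if (l < n && heap[l] < heap[smallest]) smallest = l;
          if (r < n && heap[r] < heap[smallest]) smallest = r;
          if (smallest == i) break;
          std::swap(heap[i], heap[smallest]);
          i = smallest;
        }
      }
      ```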
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10877
      
      Test Plan:
      - existing UT
      - ran all flavors of stress test through sandcastle
      - benchmark:
      ```
      # Set up: --writes_per_range_tombstone=1 means one point write and one delete range
      
      TEST_TMPDIR=/tmp/rocksdb-rangedel-test-all-tombstone ./db_bench --benchmarks=fillseq,levelstats --writes_per_range_tombstone=1 --max_num_range_tombstones=1000000 --range_tombstone_width=2 --num=100000000 --writes=800000 --max_bytes_for_level_base=4194304 --disable_auto_compactions --write_buffer_size=33554432 --key_size=64
      
      Level Files Size(MB)
      --------------------
        0        8      152
        1        0        0
        2        0        0
        3        0        0
        4        0        0
        5        0        0
        6        0        0
      
      # Benchmark
      TEST_TMPDIR=/tmp/rocksdb-rangedel-test-all-tombstone/ ./db_bench --benchmarks=readseq[-W1][-X5],levelstats --use_existing_db=true --cache_size=3221225472 --num=100000000 --reads=1000000 --disable_auto_compactions=true --avoid_flush_during_recovery=true
      
      # Pre PR
      readseq [AVG    5 runs] : 1432116 (± 59664) ops/sec;  224.0 (± 9.3) MB/sec
      readseq [MEDIAN 5 runs] : 1454886 ops/sec;  227.5 MB/sec
      
      # Post PR
      readseq [AVG    5 runs] : 1944425 (± 29521) ops/sec;  304.1 (± 4.6) MB/sec
      readseq [MEDIAN 5 runs] : 1959430 ops/sec;  306.5 MB/sec
      ```
      
      Reviewed By: ajkr
      
      Differential Revision: D40710936
      
      Pulled By: cbi42
      
      fbshipit-source-id: cb782fb9cdcd26c0c3eb9443215a4ef4d2f79022
    • sst_dump --command=raw to add index offset information (#10873) · 3e686c7c
      Committed by sdong
      Summary:
      Add some extra information to the output of "sst_dump --command=raw" to help debug some issues. Right now, the encoded block handle is printed out; it is more useful to directly print out the offset and size.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10873
      
      Test Plan: Manually run it against a file and check the output.
      
      Reviewed By: anand1976
      
      Differential Revision: D40742289
      
      fbshipit-source-id: 04d7de26e7f27e1595a7cc3ac1c1082e4e835b93
  5. 27 Oct 2022, 7 commits
  6. 26 Oct 2022, 9 commits
    • Fix ChecksumType::kXXH3 in the Java API (#10862) · 5f915b44
      Committed by Brendan MacDonell
      Summary:
      While PR#9749 nominally added support for XXH3 in the Java API, it did not update the `toCppChecksumType` method. As a result, setting the checksum type to XXH3 actually set it to CRC32c instead.
      
      This commit adds the missing entry to portal.h, and also updates the tests so that they verify the options passed to RocksDB, instead of simply checking that the getter returns the value set by the setter.
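
      For reference, on the C++ side this corresponds to the block-based table checksum option that the Java enum is mapped onto via portal.h; a minimal sketch of the intended value:
      ```cpp
      #include "rocksdb/options.h"
      #include "rocksdb/table.h"

      rocksdb::Options MakeXxh3Options() {
        rocksdb::BlockBasedTableOptions table_options;
        table_options.checksum = rocksdb::kXXH3;  // the value the Java setter
                                                  // is supposed to map to
        rocksdb::Options options;
        options.table_factory.reset(
            rocksdb::NewBlockBasedTableFactory(table_options));
        return options;
      }
      ```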
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10862
      
      Reviewed By: pdillinger
      
      Differential Revision: D40665031
      
      Pulled By: ajkr
      
      fbshipit-source-id: 2834419b3361a4bac47db3b858951fb451b5bdc8
    • Adjust value generation in batched ops stress tests (#10872) · d4842752
      Committed by Levi Tamasi
      Summary:
      The patch adjusts the generation of values in batched ops stress tests so that the digits 0..9 are appended (instead of prepended) to the values written. This has the advantage of aligning the encoding of the "value base" into the value string across non-batched, batched, and CF consistency stress tests.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10872
      
      Test Plan: Tested using some black box stress test runs.
      
      Reviewed By: riversand963
      
      Differential Revision: D40692847
      
      Pulled By: ltamasi
      
      fbshipit-source-id: 26bf8adff2944cbe416665f09c3bab89d80416b3
    • Run clang format against files under tools/ and db_stress_tool/ (#10868) · 48fe9217
      Committed by sdong
      Summary:
      Some lines of .h and .cc files are not properly formatted. Clean them up with clang-format.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10868
      
      Test Plan: Watch existing CI to pass
      
      Reviewed By: ajkr
      
      Differential Revision: D40683485
      
      fbshipit-source-id: 491fbb78b2cdcb948164f306829909ad816d5d0b
    • Run clang-format on utilities/transactions (#10871) · 95a1935c
      Committed by Yanqin Jin
      Summary:
      This PR is the result of running the following command
      ```
      find ./utilities/transactions/ -name '*.cc' -o -name '*.h' -o -name '*.c' -o -name '*.hpp' -o -name '*.cpp' | xargs clang-format -i
      ```
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10871
      
      Test Plan: make check
      
      Reviewed By: cbi42
      
      Differential Revision: D40686871
      
      Pulled By: riversand963
      
      fbshipit-source-id: 613738d667ec8f8e13cce4802e0e166d6be52211
    • Run clang-format on some files in db/db_impl directory (#10869) · 84563a27
      Committed by Yanqin Jin
      Summary:
      Run clang-format on some files in db/db_impl/ directory
      
      ```
      clang-format -i <file>
      ```
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10869
      
      Test Plan: make check
      
      Reviewed By: ltamasi
      
      Differential Revision: D40685390
      
      Pulled By: riversand963
      
      fbshipit-source-id: 64449ccb21b0d61c5142eb2bcbff828acb45c154
    • Format files under table/ by clang-format (#10852) · 727bad78
      Committed by anand76
      Summary:
      Run clang-format on files under the `table` directory.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10852
      
      Reviewed By: ajkr
      
      Differential Revision: D40650732
      
      Pulled By: anand1976
      
      fbshipit-source-id: 2023a958e37fd6274040c5181130284600c9e0ef
    • Improve FragmentTombstones() speed by lazily initializing `seq_set_` (#10848) · 7a959388
      Committed by Changyu Bi
      Summary:
      FragmentedRangeTombstoneList has a member variable `seq_set_`, a set that contains the sequence numbers of all range tombstones in the list. The set is constructed in `FragmentTombstones()` but is used only in `FragmentedRangeTombstoneList::ContainsRange()`, which only happens during compaction. This PR moves the initialization of `seq_set_` to `FragmentedRangeTombstoneList::ContainsRange()`. This should speed up `FragmentTombstones()` when the range tombstone list is used for read/scan requests. A microbenchmark shows the speed improvement to be ~45%.
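
      A generic sketch of the lazy-initialization pattern described (hypothetical names; the real class builds the set inside `ContainsRange()`):
      ```cpp
      #include <cstdint>
      #include <mutex>
      #include <set>
      #include <utility>
      #include <vector>

      class TombstoneSeqnosSketch {
       public:
        explicit TombstoneSeqnosSketch(std::vector<uint64_t> seqnos)
            : seqnos_(std::move(seqnos)) {}

        // Only callers that actually need the set (compaction-time checks)
        // pay the cost of building it, and only once.
        bool ContainsSeqno(uint64_t s) const {
          std::call_once(seq_set_once_, [this] {
            seq_set_.insert(seqnos_.begin(), seqnos_.end());
          });
          return seq_set_.count(s) > 0;
        }

       private:
        std::vector<uint64_t> seqnos_;
        mutable std::once_flag seq_set_once_;
        mutable std::set<uint64_t> seq_set_;  // built lazily on first use
      };
      ```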
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10848
      
      Test Plan:
      - Existing tests and stress test: `python3 tools/db_crashtest.py whitebox --simple  --verify_iterator_with_expected_state_one_in=5`.
      - Microbench: update `range_del_aggregator_bench` to benchmark speed of `FragmentTombstones()`:
      ```
      ./range_del_aggregator_bench --num_range_tombstones=1000 --tombstone_start_upper_bound=50000000 --num_runs=10000 --tombstone_width_mean=200 --should_deletes_per_run=100 --use_compaction_range_del_aggregator=true
      
      Before this PR:
      =========================
      Fragment Tombstones:     270.286 us
      AddTombstones:           1.28933 us
      ShouldDelete (first):    0.525528 us
      ShouldDelete (rest):     0.0797519 us
      
      After this PR: time to fragment tombstones is pushed to AddTombstones() which only happen during compaction.
      =========================
      Fragment Tombstones:     149.879 us
      AddTombstones:           102.131 us
      ShouldDelete (first):    0.565871 us
      ShouldDelete (rest):     0.0729444 us
      ```
      - db_bench: this should improve speed for fragmenting range tombstones for mutable memtable:
      ```
      ./db_bench --benchmarks=readwhilewriting --writes_per_range_tombstone=100 --max_write_buffer_number=100 --min_write_buffer_number_to_merge=100 --writes=500000 --reads=250000 --disable_auto_compactions --max_num_range_tombstones=100000 --finish_after_writes --write_buffer_size=1073741824 --threads=25
      
      Before this PR:
      readwhilewriting :      18.301 micros/op 1310445 ops/sec 4.769 seconds 6250000 operations;   28.1 MB/s (41001 of 250000 found)
      After this PR:
      readwhilewriting :      16.943 micros/op 1439376 ops/sec 4.342 seconds 6250000 operations;   23.8 MB/s (28977 of 250000 found)
      ```
      
      Reviewed By: ajkr
      
      Differential Revision: D40646227
      
      Pulled By: cbi42
      
      fbshipit-source-id: ea471667edb258f67d01cfd828588e80a89e4083
    • Fix FIFO causing overlapping seqnos in L0 files due to overlapped seqnos between ingested files and memtable's (#10777) · fc74abb4
      Committed by Hui Xiao
      
      Summary:
      **Context:**
      Same as https://github.com/facebook/rocksdb/pull/5958#issue-511150930, but applying the fix to the FIFO compaction case.
      Repro:
      ```
      COERCE_CONTEXT_SWICH=1 make -j56 db_stress
      
      ./db_stress --acquire_snapshot_one_in=0 --adaptive_readahead=0 --allow_data_in_errors=True --async_io=1 --avoid_flush_during_recovery=0 --avoid_unnecessary_blocking_io=0 --backup_max_size=104857600 --backup_one_in=0 --batch_protection_bytes_per_key=0 --block_size=16384 --bloom_bits=18 --bottommost_compression_type=disable --bytes_per_sync=262144 --cache_index_and_filter_blocks=0 --cache_size=8388608 --cache_type=lru_cache --charge_compression_dictionary_building_buffer=0 --charge_file_metadata=1 --charge_filter_construction=1 --charge_table_reader=1 --checkpoint_one_in=0 --checksum_type=kCRC32c --clear_column_family_one_in=0 --column_families=1 --compact_files_one_in=0 --compact_range_one_in=1000 --compaction_pri=3 --open_files=-1 --compaction_style=2 --fifo_allow_compaction=1 --compaction_ttl=0 --compression_max_dict_buffer_bytes=8388607 --compression_max_dict_bytes=16384 --compression_parallel_threads=1 --compression_type=zlib --compression_use_zstd_dict_trainer=1 --compression_zstd_max_train_bytes=0 --continuous_verification_interval=0 --data_block_index_type=0 --db=/dev/shm/rocksdb_test0/rocksdb_crashtest_whitebox --db_write_buffer_size=8388608 --delpercent=4 --delrangepercent=1 --destroy_db_initially=1 --detect_filter_construct_corruption=0 --disable_wal=0 --enable_compaction_filter=0 --enable_pipelined_write=1 --fail_if_options_file_error=1 --file_checksum_impl=none --flush_one_in=1000 --format_version=5 --get_current_wal_file_one_in=0 --get_live_files_one_in=0 --get_property_one_in=0 --get_sorted_wal_files_one_in=0 --index_block_restart_interval=15 --index_type=3 --ingest_external_file_one_in=100 --initial_auto_readahead_size=0 --iterpercent=10 --key_len_percent_dist=1,30,69 --level_compaction_dynamic_level_bytes=True --log2_keys_per_lock=10 --long_running_snapshots=0 --mark_for_compaction_one_file_in=10 --max_auto_readahead_size=16384 --max_background_compactions=20 --max_bytes_for_level_base=10485760 --max_key=100000 --max_key_len=3 --max_manifest_file_size=1073741824 --max_write_batch_group_size_bytes=1048576 --max_write_buffer_number=3 --max_write_buffer_size_to_maintain=4194304 --memtable_prefix_bloom_size_ratio=0.5 --memtable_protection_bytes_per_key=1 --memtable_whole_key_filtering=1 --memtablerep=skip_list --mmap_read=1 --mock_direct_io=False --nooverwritepercent=1 --num_file_reads_for_auto_readahead=0 --num_levels=1 --open_metadata_write_fault_one_in=0 --open_read_fault_one_in=32 --open_write_fault_one_in=0 --ops_per_thread=200000 --optimize_filters_for_memory=0 --paranoid_file_checks=1 --partition_filters=0 --partition_pinning=1 --pause_background_one_in=0 --periodic_compaction_seconds=0 --prefix_size=8 --prefixpercent=5 --prepopulate_block_cache=0 --progress_reports=0 --read_fault_one_in=0 --readahead_size=16384 --readpercent=45 --recycle_log_file_num=1 --reopen=20 --ribbon_starting_level=999 --snapshot_hold_ops=1000 --sst_file_manager_bytes_per_sec=0 --sst_file_manager_bytes_per_truncate=0 --subcompactions=2 --sync=0 --sync_fault_injection=0 --target_file_size_base=524288 --target_file_size_multiplier=2 --test_batches_snapshots=0 --top_level_index_pinning=3 --unpartitioned_pinning=0 --use_direct_io_for_flush_and_compaction=0 --use_direct_reads=0 --use_full_merge_v1=1 --use_merge=0 --use_multiget=1 --user_timestamp_size=0 --value_size_mult=32 --verify_checksum=1 --verify_checksum_one_in=0 --verify_db_one_in=1000 --verify_sst_unique_id_in_manifest=1 --wal_bytes_per_sync=0 --wal_compression=zstd --write_buffer_size=524288 --write_dbid_to_manifest=0 --writepercent=35
      
      put or merge error: Corruption: force_consistency_checks(DEBUG): VersionBuilder: L0 file https://github.com/facebook/rocksdb/issues/479 with seqno 23711 29070 vs. file https://github.com/facebook/rocksdb/issues/482 with seqno 27138 29049
      ```
      
      **Summary:**
      FIFO only does intra-L0 compaction in the following four cases. For other cases, FIFO drops data instead of compacting it, which is irrelevant to the overlapping seqno issue we are solving.
      -  [FIFOCompactionPicker::PickSizeCompaction](https://github.com/facebook/rocksdb/blob/7.6.fb/db/compaction/compaction_picker_fifo.cc#L155) when `total size < compaction_options_fifo.max_table_files_size` and `compaction_options_fifo.allow_compaction == true`
         - For this path, we simply reuse the fix in `FindIntraL0Compaction` https://github.com/facebook/rocksdb/pull/5958/files#diff-c261f77d6dd2134333c4a955c311cf4a196a08d3c2bb6ce24fd6801407877c89R56
         - This path was not stress-tested at all. Therefore we covered `fifo.allow_compaction` in stress test to surface the overlapping seqno issue we are fixing here.
      - [FIFOCompactionPicker::PickCompactionToWarm](https://github.com/facebook/rocksdb/blob/7.6.fb/db/compaction/compaction_picker_fifo.cc#L313) when `compaction_options_fifo.age_for_warm > 0`
         - For this path, we simply replicate the idea in https://github.com/facebook/rocksdb/pull/5958#issue-511150930 and skip files whose largest seqno is greater than `earliest_mem_seqno` (a generic version of this skip is sketched after this list)
         - This path was not stress-tested at all. However, covering the `age_for_warm` option is worth a separate PR to deal with db_stress compatibility. Therefore we manually tested this path for this PR
      - [FIFOCompactionPicker::CompactRange](https://github.com/facebook/rocksdb/blob/7.6.fb/db/compaction/compaction_picker_fifo.cc#L365) that ends up picking one of the above two compactions
      - [CompactionPicker::CompactFiles](https://github.com/facebook/rocksdb/blob/7.6.fb/db/compaction/compaction_picker.cc#L378)
          - Since `SanitizeCompactionInputFiles()` will be called [before](https://github.com/facebook/rocksdb/blob/7.6.fb/db/compaction/compaction_picker.h#L111-L113) `CompactionPicker::CompactFiles`, we simply replicate the idea in https://github.com/facebook/rocksdb/pull/5958#issue-511150930 in `SanitizeCompactionInputFiles()`. To simplify the implementation, we return `Stats::Abort()` on encountering a seqno-overlapped file when doing compaction to L0, instead of skipping the file and proceeding with the compaction.
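
      A generic version of the skip-by-seqno idea referenced in the list above (hypothetical types and function, not the actual compaction picker code):
      ```cpp
      #include <cstdint>
      #include <vector>

      struct L0FileSketch {
        uint64_t largest_seqno;
      };

      // Exclude L0 files whose largest seqno is not strictly below the earliest
      // memtable seqno, so the compaction output cannot overlap seqnos that are
      // still in the memtable.
      std::vector<L0FileSketch> PickIntraL0Inputs(
          const std::vector<L0FileSketch>& level0_files,
          uint64_t earliest_mem_seqno) {
        std::vector<L0FileSketch> inputs;
        for (const L0FileSketch& f : level0_files) {
          if (f.largest_seqno >= earliest_mem_seqno) {
            continue;  // potential overlap with memtable seqnos: skip this file
          }
          inputs.push_back(f);
        }
        return inputs;
      }
      ```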
      
      Some additional clean-up included in this PR:
      - Renamed `earliest_memtable_seqno` to `earliest_mem_seqno` for consistent naming
      - Added comment about `earliest_memtable_seqno` in related APIs
      - Made parameter `earliest_memtable_seqno` constant and required
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10777
      
      Test Plan:
      - make check
      - New unit test `TEST_P(DBCompactionTestFIFOCheckConsistencyWithParam, FlushAfterIntraL0CompactionWithIngestedFile)` corresponding to the above 4 cases, which will fail accordingly without the fix
      - Regular CI stress run on this PR + stress test with aggressive value https://github.com/facebook/rocksdb/pull/10761  and on FIFO compaction only
      
      Reviewed By: ajkr
      
      Differential Revision: D40090485
      
      Pulled By: hx235
      
      fbshipit-source-id: 52624186952ee7109117788741aeeac86b624a4f
    • Run format check for *.h and *.cc files under java/ (#10851) · 2a551976
      Committed by sdong
      Summary:
      Run a format check for .h and .cc files to clean up the formatting.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10851
      
      Test Plan: Watch CI tests to pass
      
      Reviewed By: ajkr
      
      Differential Revision: D40649723
      
      fbshipit-source-id: 62d32cead0b3b8e6540e86d25451bd72642109eb
  7. 25 Oct 2022, 8 commits