1. Mar 17, 2023 (1 commit)
  2. Mar 16, 2023 (4 commits)
    • Simplify tracking entries already in SecondaryCache (#11299) · ccaa3225
      Committed by Peter Dillinger
      Summary:
In preparation for factoring secondary cache support out of individual Cache implementations, we can get rid of the "in secondary cache" flag on entries through a workable hack: when an entry is promoted from the secondary cache, it is inserted into the primary cache using a helper that lacks secondary cache support, which prevents re-insertion into the secondary cache through the existing logic.
      
      This adds to the complexity of building CacheItemHelpers, because you always have to be able to get to an equivalent helper without secondary cache support, but that complexity is reasonably isolated within RocksDB typed_cache.h and test code.
      
gcc-7 seems to have problems with a constexpr constructor referencing `this`, so constexpr support was removed from CacheItemHelper.
      
      Also refactored some related test code to share common code / functionality.
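
      A hypothetical C++ sketch of the hack (the real CacheItemHelper and its callbacks differ): each helper exposes an equivalent helper with secondary-cache support stripped, and promoted entries are re-inserted with the stripped one.
      ```
      // Hypothetical sketch, not RocksDB's actual definitions: a helper carries a
      // pointer to an equivalent helper without secondary-cache support; inserting
      // a promoted entry with the stripped helper prevents it from ever being
      // written back to the secondary cache by the existing logic.
      struct HelperSketch {
        using SaveToSecondaryCb = void (*)(const void* entry);
        SaveToSecondaryCb save_cb;              // nullptr => no secondary-cache support
        const HelperSketch* without_secondary;  // equivalent stripped helper
      };

      inline void SaveEntry(const void*) { /* write to secondary cache */ }

      // The stripped variant points to itself.
      inline const HelperSketch kPlainHelper{nullptr, &kPlainHelper};
      // The full variant keeps its callback but shares the stripped variant,
      // which is what promotion uses for re-insertion.
      inline const HelperSketch kFullHelper{&SaveEntry, &kPlainHelper};
      ```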
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11299
      
      Test Plan: existing tests
      
      Reviewed By: anand1976
      
      Differential Revision: D44101453
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 7a59d0a3938ee40159c90c3e65d7004f6a272345
    • Add Microsoft Bing as a user (#11270) · 664dabda
      Committed by nccx
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/11270
      
      Reviewed By: pdillinger
      
      Differential Revision: D43811584
      
      Pulled By: ajkr
      
      fbshipit-source-id: f27e55395644a469840785685646456f6b1452fc
    • Add new stat rocksdb.table.open.prefetch.tail.read.bytes,... · bab5f9a6
      Committed by Hui Xiao
      Add new stat rocksdb.table.open.prefetch.tail.read.bytes, rocksdb.table.open.prefetch.tail.{miss|hit} (#11265)
      
      Summary:
      **Context/Summary:**
We are adding new stats to measure the prefetched tail size and the lookups into this buffer.
      
Stat collection is done in FilePrefetchBuffer, but for now only for the prefetched tail buffer during table open, distinguished via a FilePrefetchBuffer enum. This is cleaner than the alternative of implementing it at the upper-level call sites of FilePrefetchBuffer for table open, and has the benefit of being extensible to other FilePrefetchBuffer usage types if needed. See the db_bench results below for the perf regression concern.
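
      A hedged sketch of reading the new stats from application code, assuming the ticker enum names mirror the stat strings above (check statistics.h for the exact spellings):
      ```
      #include <iostream>
      #include <string>

      #include "rocksdb/db.h"
      #include "rocksdb/options.h"
      #include "rocksdb/statistics.h"

      int main() {
        rocksdb::Options options;
        options.create_if_missing = true;
        options.statistics = rocksdb::CreateDBStatistics();  // enable stats collection

        rocksdb::DB* db = nullptr;
        if (!rocksdb::DB::Open(options, "/tmp/testdb", &db).ok()) return 1;

        std::string value;
        // NotFound is fine here; we only want table opens (and thus tail
        // prefetch buffer lookups) to be exercised.
        db->Get(rocksdb::ReadOptions(), "some_key", &value).PermitUncheckedError();

        // Assumed ticker enum names (TABLE_OPEN_PREFETCH_TAIL_{HIT,MISS}).
        auto* stats = options.statistics.get();
        std::cout << "tail hit:  "
                  << stats->getTickerCount(rocksdb::TABLE_OPEN_PREFETCH_TAIL_HIT)
                  << "\ntail miss: "
                  << stats->getTickerCount(rocksdb::TABLE_OPEN_PREFETCH_TAIL_MISS)
                  << std::endl;
        delete db;
        return 0;
      }
      ```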
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11265
      
      Test Plan:
      **- Piggyback on existing test**
**- rocksdb.table.open.prefetch.tail.miss is harder to unit-test, so I manually set the prefetch tail read bytes to be small and ran db_bench.**
      ```
      ./db_bench -db=/tmp/testdb -statistics=true -benchmarks="fillseq" -key_size=32 -value_size=512 -num=5000 -write_buffer_size=655 -target_file_size_base=655 -disable_auto_compactions=false -compression_type=none -bloom_bits=3  -use_direct_reads=true
      ```
      ```
      rocksdb.table.open.prefetch.tail.read.bytes P50 : 4096.000000 P95 : 4096.000000 P99 : 4096.000000 P100 : 4096.000000 COUNT : 225 SUM : 921600
      rocksdb.table.open.prefetch.tail.miss COUNT : 91
      rocksdb.table.open.prefetch.tail.hit COUNT : 1034
      ```
      **- No perf regression observed in db_bench**
      
SETUP command (creates the same DB with ~900 files for both pre-change and post-change runs):
      ```
      ./db_bench -db=/tmp/testdb -benchmarks="fillseq" -key_size=32 -value_size=512 -num=500000 -write_buffer_size=655360  -disable_auto_compactions=true -target_file_size_base=16777216 -compression_type=none
      ```
TEST command (60 runs or until convergence; as suggested by anand1976 and akankshamahajan15, `seek_nexts` and `async_io` are varied):
      ```
      ./db_bench -use_existing_db=true -db=/tmp/testdb -statistics=false -cache_size=0 -cache_index_and_filter_blocks=false -benchmarks=seekrandom[-X60] -num=50000 -seek_nexts={10, 500, 1000} -async_io={0|1} -use_direct_reads=true
      ```
      async io = 0, direct io read = true
      
      | | seek_nexts = 10, 30 runs | seek_nexts = 500, 12 runs | seek_nexts = 1000, 6 runs |
      | -- | -- | -- | -- |
      | pre-change | 4776 (± 28) ops/sec; 24.8 (± 0.1) MB/sec | 288 (± 1) ops/sec; 74.8 (± 0.4) MB/sec | 145 (± 4) ops/sec; 75.6 (± 2.2) MB/sec |
      | post-change | 4790 (± 32) ops/sec; 24.9 (± 0.2) MB/sec | 288 (± 3) ops/sec; 74.7 (± 0.8) MB/sec | 143 (± 3) ops/sec; 74.5 (± 1.6) MB/sec |
      
      async io = 1, direct io read = true
      | | seek_nexts = 10, 54 runs | seek_nexts = 500, 6 runs | seek_nexts = 1000, 4 runs |
      | -- | -- | -- | -- |
      | pre-change | 3350 (± 36) ops/sec; 17.4 (± 0.2) MB/sec | 264 (± 0) ops/sec; 68.7 (± 0.2) MB/sec | 138 (± 1) ops/sec; 71.8 (± 1.0) MB/sec |
      | post-change | 3358 (± 27) ops/sec; 17.4 (± 0.1) MB/sec | 263 (± 2) ops/sec; 68.3 (± 0.8) MB/sec | 139 (± 1) ops/sec; 72.6 (± 0.6) MB/sec |
      
      Reviewed By: ajkr
      
      Differential Revision: D43781467
      
      Pulled By: hx235
      
      fbshipit-source-id: a706a18472a8edb2b952bac3af40eec803537f2a
    • Misc cleanup of block cache code (#11291) · 601efe3c
      Committed by Peter Dillinger
      Summary:
      ... ahead of a larger change.
* Rename the confusingly named `is_in_sec_cache` to `kept_in_sec_cache`
      * Unify naming of "standalone" block cache entries (was "detached" in clock_cache)
      * Remove some unused definitions in clock_cache.h (leftover from a previous revision)
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11291
      
      Test Plan: usual tests and CI, no behavior changes
      
      Reviewed By: anand1976
      
      Differential Revision: D43984642
      
      Pulled By: pdillinger
      
      fbshipit-source-id: b8bf0c5b90a932a88bcbdb413b2f256834aedf97
  3. Mar 15, 2023 (1 commit)
    • Fix bug of prematurely excluded CF in atomic flush contains unflushed data... · 11cb6af6
      Committed by Hui Xiao
      Fix bug of prematurely excluded CF in atomic flush contains unflushed data that should've been included in the atomic flush (#11148)
      
      Summary:
      **Context:**
Atomic flush should guarantee recoverability of all data of seqno up to the max seqno of the flush. It achieves this by ensuring all such data are flushed by the time the atomic flush finishes, through `SelectColumnFamiliesForAtomicFlush()`. However, our crash test exposed the following case, where a CF excluded from an atomic flush contains unflushed data of seqno less than the max seqno of that atomic flush and loses that data with `WriteOptions::DisableWAL=true` in the face of a crash right after the atomic flush finishes.
      ```
      ./db_stress --preserve_unverified_changes=1 --reopen=0 --acquire_snapshot_one_in=0 --adaptive_readahead=1 --allow_data_in_errors=True --async_io=1 --atomic_flush=1 --avoid_flush_during_recovery=0 --avoid_unnecessary_blocking_io=0 --backup_max_size=104857600 --backup_one_in=0 --batch_protection_bytes_per_key=0 --block_size=16384 --bloom_bits=15 --bottommost_compression_type=none --bytes_per_sync=262144 --cache_index_and_filter_blocks=0 --cache_size=8388608 --cache_type=lru_cache --charge_compression_dictionary_building_buffer=0 --charge_file_metadata=1 --charge_filter_construction=0 --charge_table_reader=0 --checkpoint_one_in=0 --checksum_type=kXXH3 --clear_column_family_one_in=0 --compact_files_one_in=0 --compact_range_one_in=0 --compaction_pri=1 --compaction_ttl=100 --compression_max_dict_buffer_bytes=134217727 --compression_max_dict_bytes=16384 --compression_parallel_threads=1 --compression_type=lz4hc --compression_use_zstd_dict_trainer=0 --compression_zstd_max_train_bytes=0 --continuous_verification_interval=0 --data_block_index_type=0 --db=$db --db_write_buffer_size=1048576 --delpercent=4 --delrangepercent=1 --destroy_db_initially=0 --detect_filter_construct_corruption=0 --disable_wal=1 --enable_compaction_filter=0 --enable_pipelined_write=0 --expected_values_dir=$exp --fail_if_options_file_error=0 --fifo_allow_compaction=0 --file_checksum_impl=none --flush_one_in=0 --format_version=5 --get_current_wal_file_one_in=0 --get_live_files_one_in=100 --get_property_one_in=0 --get_sorted_wal_files_one_in=0 --index_block_restart_interval=2 --index_type=0 --ingest_external_file_one_in=0 --initial_auto_readahead_size=524288 --iterpercent=10 --key_len_percent_dist=1,30,69 --level_compaction_dynamic_level_bytes=True --long_running_snapshots=1 --manual_wal_flush_one_in=100 --mark_for_compaction_one_file_in=0 --max_auto_readahead_size=0 --max_background_compactions=20 --max_bytes_for_level_base=10485760 --max_key=10000 --max_key_len=3 --max_manifest_file_size=1073741824 --max_write_batch_group_size_bytes=64 --max_write_buffer_number=3 --max_write_buffer_size_to_maintain=0 --memtable_prefix_bloom_size_ratio=0.01 --memtable_protection_bytes_per_key=4 --memtable_whole_key_filtering=0 --memtablerep=skip_list --min_write_buffer_number_to_merge=2 --mmap_read=1 --mock_direct_io=False --nooverwritepercent=1 --num_file_reads_for_auto_readahead=0 --open_files=-1 --open_metadata_write_fault_one_in=0 --open_read_fault_one_in=0 --open_write_fault_one_in=0 --ops_per_thread=100000000 --optimize_filters_for_memory=1 --paranoid_file_checks=1 --partition_filters=0 --partition_pinning=3 --pause_background_one_in=0 --periodic_compaction_seconds=100 --prefix_size=8 --prefixpercent=5 --prepopulate_block_cache=0 --preserve_internal_time_seconds=3600 --progress_reports=0 --read_fault_one_in=32 --readahead_size=16384 --readpercent=50 --recycle_log_file_num=0 --ribbon_starting_level=6 --secondary_cache_fault_one_in=0 --set_options_one_in=10000 --snapshot_hold_ops=100000 --sst_file_manager_bytes_per_sec=104857600 --sst_file_manager_bytes_per_truncate=1048576 --stats_dump_period_sec=10 --subcompactions=1 --sync=0 --sync_fault_injection=0 --target_file_size_base=524288 --target_file_size_multiplier=2 --test_batches_snapshots=0 --top_level_index_pinning=0 --unpartitioned_pinning=1 --use_direct_io_for_flush_and_compaction=0 --use_direct_reads=0 --use_full_merge_v1=0 --use_merge=0 --use_multiget=1 --use_put_entity_one_in=0 --user_timestamp_size=0 --value_size_mult=32 --verify_checksum=1 --verify_checksum_one_in=0 
--verify_db_one_in=1000 --verify_sst_unique_id_in_manifest=1 --wal_bytes_per_sync=524288 --wal_compression=none --write_buffer_size=524288 --write_dbid_to_manifest=1 --write_fault_one_in=0 --writepercent=30 &
          pid=$!
          sleep 0.2
          sleep 10
          kill $pid
          sleep 0.2
      ./db_stress --ops_per_thread=1 --preserve_unverified_changes=1 --reopen=0 --acquire_snapshot_one_in=0 --adaptive_readahead=1 --allow_data_in_errors=True --async_io=1 --atomic_flush=1 --avoid_flush_during_recovery=0 --avoid_unnecessary_blocking_io=0 --backup_max_size=104857600 --backup_one_in=0 --batch_protection_bytes_per_key=0 --block_size=16384 --bloom_bits=15 --bottommost_compression_type=none --bytes_per_sync=262144 --cache_index_and_filter_blocks=0 --cache_size=8388608 --cache_type=lru_cache --charge_compression_dictionary_building_buffer=0 --charge_file_metadata=1 --charge_filter_construction=0 --charge_table_reader=0 --checkpoint_one_in=0 --checksum_type=kXXH3 --clear_column_family_one_in=0 --compact_files_one_in=0 --compact_range_one_in=0 --compaction_pri=1 --compaction_ttl=100 --compression_max_dict_buffer_bytes=134217727 --compression_max_dict_bytes=16384 --compression_parallel_threads=1 --compression_type=lz4hc --compression_use_zstd_dict_trainer=0 --compression_zstd_max_train_bytes=0 --continuous_verification_interval=0 --data_block_index_type=0 --db=$db --db_write_buffer_size=1048576 --delpercent=4 --delrangepercent=1 --destroy_db_initially=0 --detect_filter_construct_corruption=0 --disable_wal=1 --enable_compaction_filter=0 --enable_pipelined_write=0 --expected_values_dir=$exp --fail_if_options_file_error=0 --fifo_allow_compaction=0 --file_checksum_impl=none --flush_one_in=0 --format_version=5 --get_current_wal_file_one_in=0 --get_live_files_one_in=100 --get_property_one_in=0 --get_sorted_wal_files_one_in=0 --index_block_restart_interval=2 --index_type=0 --ingest_external_file_one_in=0 --initial_auto_readahead_size=524288 --iterpercent=10 --key_len_percent_dist=1,30,69 --level_compaction_dynamic_level_bytes=True --long_running_snapshots=1 --manual_wal_flush_one_in=100 --mark_for_compaction_one_file_in=0 --max_auto_readahead_size=0 --max_background_compactions=20 --max_bytes_for_level_base=10485760 --max_key=10000 --max_key_len=3 --max_manifest_file_size=1073741824 --max_write_batch_group_size_bytes=64 --max_write_buffer_number=3 --max_write_buffer_size_to_maintain=0 --memtable_prefix_bloom_size_ratio=0.01 --memtable_protection_bytes_per_key=4 --memtable_whole_key_filtering=0 --memtablerep=skip_list --min_write_buffer_number_to_merge=2 --mmap_read=1 --mock_direct_io=False --nooverwritepercent=1 --num_file_reads_for_auto_readahead=0 --open_files=-1 --open_metadata_write_fault_one_in=0 --open_read_fault_one_in=0 --open_write_fault_one_in=0 --ops_per_thread=100000000 --optimize_filters_for_memory=1 --paranoid_file_checks=1 --partition_filters=0 --partition_pinning=3 --pause_background_one_in=0 --periodic_compaction_seconds=100 --prefix_size=8 --prefixpercent=5 --prepopulate_block_cache=0 --preserve_internal_time_seconds=3600 --progress_reports=0 --read_fault_one_in=32 --readahead_size=16384 --readpercent=50 --recycle_log_file_num=0 --ribbon_starting_level=6 --secondary_cache_fault_one_in=0 --set_options_one_in=10000 --snapshot_hold_ops=100000 --sst_file_manager_bytes_per_sec=104857600 --sst_file_manager_bytes_per_truncate=1048576 --stats_dump_period_sec=10 --subcompactions=1 --sync=0 --sync_fault_injection=0 --target_file_size_base=524288 --target_file_size_multiplier=2 --test_batches_snapshots=0 --top_level_index_pinning=0 --unpartitioned_pinning=1 --use_direct_io_for_flush_and_compaction=0 --use_direct_reads=0 --use_full_merge_v1=0 --use_merge=0 --use_multiget=1 --use_put_entity_one_in=0 --user_timestamp_size=0 --value_size_mult=32 --verify_checksum=1 
--verify_checksum_one_in=0 --verify_db_one_in=1000 --verify_sst_unique_id_in_manifest=1 --wal_bytes_per_sync=524288 --wal_compression=none --write_buffer_size=524288 --write_dbid_to_manifest=1 --write_fault_one_in=0 --writepercent=30 &
          pid=$!
          sleep 0.2
          sleep 40
          kill $pid
          sleep 0.2
      
      Verification failed for column family 6 key 0000000000000239000000000000012B0000000000000138 (56622): value_from_db: , value_from_expected: 4A6331754E4F4C4D42434041464744455A5B58595E5F5C5D5253505156575455, msg: Value not found: NotFound:
      Crash-recovery verification failed :(
      No writes or ops?
      Verification failed :(
      ```
      
      The bug is due to the following:
      - When atomic flush is used, an empty CF is legally [excluded](https://github.com/facebook/rocksdb/blob/7.10.fb/db/db_filesnapshot.cc#L39) in `SelectColumnFamiliesForAtomicFlush` as the first step of `DBImpl::FlushForGetLiveFiles` before [passing](https://github.com/facebook/rocksdb/blob/7.10.fb/db/db_filesnapshot.cc#L42) the included CFDs to `AtomicFlushMemTables`.
      - But [later](https://github.com/facebook/rocksdb/blob/7.10.fb/db/db_impl/db_impl_compaction_flush.cc#L2133) in `AtomicFlushMemTables`, `WaitUntilFlushWouldNotStallWrites` will [release the db mutex](https://github.com/facebook/rocksdb/blob/7.10.fb/db/db_impl/db_impl_compaction_flush.cc#L2403), during which data@seqno N can be inserted into the excluded CF and data@seqno M can be inserted into one of the included CFs, where M > N.
- However, data@seqno N in the already-excluded CF is thus excluded from this atomic flush, even though seqno N is less than seqno M.
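
      A toy model of the race in code (entirely illustrative, not DBImpl's types), showing how the excluded CF ends up holding data below the flush's max seqno:
      ```
      #include <cassert>
      #include <cstdint>
      #include <vector>

      // Toy model of the bug: CF selection happens first, the mutex is released
      // during the stall wait, and concurrent writes land in both an excluded CF
      // (seqno N) and an included CF (seqno M > N). The atomic flush then covers
      // M but not N, breaking the "everything up to max seqno" guarantee.
      struct ToyCF {
        bool selected = false;
        std::vector<uint64_t> unflushed;
      };

      int main() {
        uint64_t seqno = 0;
        ToyCF excluded, included;
        // Selection: the excluded CF is empty at this point, so it is skipped.
        excluded.selected = false;
        included.selected = true;
        // Stall wait releases the mutex; concurrent writes arrive:
        excluded.unflushed.push_back(++seqno);  // seqno N into the excluded CF
        included.unflushed.push_back(++seqno);  // seqno M > N into an included CF
        // The flush persists only selected CFs, up to max seqno M...
        uint64_t max_flushed_seqno = included.unflushed.back();
        // ...but the excluded CF still holds data below M: guarantee broken.
        assert(excluded.unflushed.front() < max_flushed_seqno);
        return 0;
      }
      ```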
      
      **Summary:**
- Replace `SelectColumnFamiliesForAtomicFlush()`-before-`AtomicFlushMemTables()` with `SelectColumnFamiliesForAtomicFlush()`-after-wait-within-`AtomicFlushMemTables()`, so that no write affecting the recoverability of this atomic flush job (i.e., a change to the max seqno of this atomic flush, or insertion of data with a smaller seqno than that max seqno into an excluded CF) can happen after calling `SelectColumnFamiliesForAtomicFlush()`.
- To support the above, refactored and clarified the comments on `SelectColumnFamiliesForAtomicFlush()` and `AtomicFlushMemTables()` for clearer semantics of the CFDs passed to atomic flush.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11148
      
      Test Plan:
      - New unit test failed before the fix and passes after
      - Make check
      - Rehearsal stress test
      
      Reviewed By: ajkr
      
      Differential Revision: D42799871
      
      Pulled By: hx235
      
      fbshipit-source-id: 13636b63e9c25c5895857afc36ea580d57f6d644
  4. Mar 14, 2023 (4 commits)
    • Use CacheWrapper in more places (#11295) · 2a23bee9
      Committed by Peter Dillinger
      Summary:
      ... to simplify code and make it less prone to needless updates on refactoring.
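
      A generic sketch of the delegating-wrapper pattern being applied (stand-in interface, not RocksDB's actual Cache/CacheWrapper): subclasses override only what they need and inherit forwarding for everything else.
      ```
      #include <memory>
      #include <string>

      // Stand-in cache interface for illustration.
      struct CacheIface {
        virtual ~CacheIface() = default;
        virtual bool Insert(const std::string& key) = 0;
        virtual const char* Name() const = 0;
      };

      // Wrapper that forwards everything to a target cache by default, so a
      // subclass stays correct even when the interface later grows.
      class WrapperBase : public CacheIface {
       public:
        explicit WrapperBase(std::shared_ptr<CacheIface> target)
            : target_(std::move(target)) {}
        bool Insert(const std::string& key) override { return target_->Insert(key); }
        const char* Name() const override { return target_->Name(); }

       protected:
        std::shared_ptr<CacheIface> target_;
      };

      // A test cache now only overrides the one method it instruments.
      class CountingCache : public WrapperBase {
       public:
        using WrapperBase::WrapperBase;
        bool Insert(const std::string& key) override {
          ++inserts_;  // count, then delegate
          return WrapperBase::Insert(key);
        }
        int inserts_ = 0;
      };
      ```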
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11295
      
      Test Plan: existing tests (no functional changes intended)
      
      Reviewed By: hx235
      
      Differential Revision: D44040260
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 1b6badb5c8ca673db0903bfaba3cfbc986f386be
    • Rename a recently added PerfContext counter (#11294) · 49881921
      Committed by Levi Tamasi
      Summary:
      The patch renames the counter added in https://github.com/facebook/rocksdb/issues/11284 for better consistency with the existing naming scheme.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11294
      
      Test Plan: `make check`
      
      Reviewed By: jowlyzhang
      
      Differential Revision: D44035964
      
      Pulled By: ltamasi
      
      fbshipit-source-id: 8b1a2a03ee728148365367e0ecc1fcf462f62191
    • Document DB::Resume(), fix LockWALInEffect test (#11290) · 648e972f
      Committed by Peter Dillinger
      Summary:
In rare cases, we are seeing failures like this
      
      ```
      [ RUN      ] DBWriteTestInstance/DBWriteTest.LockWALInEffect/2
      db/db_write_test.cc:653: Failure
      Put("key3", "value")
      Corruption: Not active
      ```
      
      in a test with no explicit threading. This is likely because of the unpredictability of background auto-resume. I didn't really know this feature, in part because DB::Resume() was undocumented. So I believe I have fixed the test and documented the API function.
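
      A short usage sketch of the now-documented API (hypothetical helper; whether `Resume()` succeeds depends on the underlying error, and background auto-resume may race with it, as in this test):
      ```
      #include "rocksdb/db.h"

      // Retry a write once after attempting manual recovery via DB::Resume().
      rocksdb::Status PutWithResume(rocksdb::DB* db) {
        rocksdb::Status s = db->Put(rocksdb::WriteOptions(), "key3", "value");
        if (!s.ok()) {
          // A background error may have frozen writes; Resume() attempts to
          // clear it and reinstate normal operation.
          rocksdb::Status r = db->Resume();
          if (r.ok()) {
            s = db->Put(rocksdb::WriteOptions(), "key3", "value");  // retry
          }
        }
        return s;
      }
      ```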
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11290
      
      Test Plan: 1000s of stress runs of the test with gtest-parallel
      
      Reviewed By: anand1976
      
      Differential Revision: D43984583
      
      Pulled By: pdillinger
      
      fbshipit-source-id: d30dec120b4864e193751b2e33ff16834d313db3
    • Support range deletion tombstones in `CreateColumnFamilyWithImport` (#11252) · 9aa3b6f9
      Committed by Changyu Bi
      Summary:
CreateColumnFamilyWithImport() did not support range tombstones for two reasons:
1. it uses the point keys of an input file to determine its boundary (smallest and largest internal key), which means range tombstones outside of the point-key range will be effectively dropped.
2. it does not handle files with no point keys.
      
      Also included a fix in external_sst_file_ingestion_job.cc where the blocks read in `GetIngestedFileInfo()` can be added to block cache now (issue fixed in https://github.com/facebook/rocksdb/pull/6429).
      
      This PR adds support for exporting and importing column family with range tombstones. The main change is to add smallest internal key and largest internal key to `SstFileMetaData` that will be part of the output of `ExportColumnFamily()`. Then during `CreateColumnFamilyWithImport(...,const ExportImportFilesMetaData& metadata,...)`, file boundaries can be set from `metadata` directly. This is needed since when file boundaries are extended by range tombstones, sometimes they cannot be deduced from a file's content alone.
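
      A usage sketch of the round trip this enables, via `Checkpoint::ExportColumnFamily` and `DB::CreateColumnFamilyWithImport` (paths and names illustrative, error handling simplified):
      ```
      #include "rocksdb/db.h"
      #include "rocksdb/metadata.h"
      #include "rocksdb/utilities/checkpoint.h"

      // Export a CF from `src_db`, then import it into `dst_db`. The metadata
      // now carries each file's smallest/largest internal key, so boundaries
      // extended by range tombstones survive the round trip.
      rocksdb::Status ExportThenImport(rocksdb::DB* src_db,
                                       rocksdb::ColumnFamilyHandle* src_cf,
                                       rocksdb::DB* dst_db) {
        rocksdb::Checkpoint* checkpoint = nullptr;
        rocksdb::Status s = rocksdb::Checkpoint::Create(src_db, &checkpoint);
        if (!s.ok()) return s;

        rocksdb::ExportImportFilesMetaData* metadata = nullptr;
        s = checkpoint->ExportColumnFamily(src_cf, "/tmp/cf_export", &metadata);
        if (s.ok()) {
          rocksdb::ColumnFamilyHandle* imported = nullptr;
          s = dst_db->CreateColumnFamilyWithImport(
              rocksdb::ColumnFamilyOptions(), "imported_cf",
              rocksdb::ImportColumnFamilyOptions(), *metadata, &imported);
        }
        delete metadata;
        delete checkpoint;
        return s;
      }
      ```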
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11252
      
      Test Plan:
- added unit tests that fail before this change
      
      Closes https://github.com/facebook/rocksdb/issues/11245
      
      Reviewed By: ajkr
      
      Differential Revision: D43577443
      
      Pulled By: cbi42
      
      fbshipit-source-id: 6bff78e583cc50c44854994dea0a8dd519398f2f
  5. Mar 11, 2023 (1 commit)
    • Reverse wrong order of parameter names for Java WriteBatchWithIndex#iteratorWithBase (#11280) · fbd603d0
      Committed by Alan Paxton
      Summary:
      Fix for https://github.com/facebook/rocksdb/issues/11008
      
`Java_org_rocksdb_WriteBatchWithIndex_iteratorWithBase` takes parameters `(… jlong jwbwi_handle, jlong jcf_handle, jlong jbase_iterator_handle, jlong jread_opts_handle)`, while `WriteBatchWithIndex.java` declares `private native long iteratorWithBase(final long handle, final long baseIteratorHandle, final long cfHandle, final long readOptionsHandle)`.

Luckily, the only call to `iteratorWithBase` passes the parameters in the correct order for the implementation `(… cfHandle, baseIteratorHandle …)`. This type-checks because the types are the same (long words).

The code is currently used correctly; it is just extremely misleading. Swap the names of the two parameters in the Java method so that the correct usage is clear.
      
      There already exist test methods which call the API correctly and only succeed because of that. These continue to work.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11280
      
      Reviewed By: cbi42
      
      Differential Revision: D43874798
      
      Pulled By: ajkr
      
      fbshipit-source-id: b59bc930bf579f4e0804f0effd4fb17f4225d60c
  6. Mar 10, 2023 (4 commits)
    • Fix compile errors in Clang due to unused variables depending on the build configuration (#11234) · 969d4e1d
      Committed by Jaepil Jeong
      Summary:
      This PR fixes compilation errors in Clang due to unused variables like the below:
      ```
      [109/329] Building CXX object CMakeFiles/rocksdb.dir/db/version_edit_handler.cc.o
      FAILED: CMakeFiles/rocksdb.dir/db/version_edit_handler.cc.o
      ccache /opt/homebrew/opt/llvm/bin/clang++ -DGFLAGS=1 -DGFLAGS_IS_A_DLL=0 -DHAVE_FULLFSYNC -DJEMALLOC_NO_DEMANGLE -DLZ4 -DOS_MACOSX -DROCKSDB_JEMALLOC -DROCKSDB_LIB_IO_POSIX -DROCKSDB_NO_DYNAMIC_EXTENSION -DROCKSDB_PLATFORM_POSIX -DSNAPPY -DTBB -DZLIB -DZSTD -I/Users/jaepil/work/deepsearch/deps/cpp/rocksdb -I/Users/jaepil/work/deepsearch/deps/cpp/rocksdb/include -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk -I/Users/jaepil/app/include -I/opt/homebrew/include -I/opt/homebrew/opt/llvm/include -I/opt/homebrew/opt/llvm/include/c++/v1 -W -Wextra -Wall -pthread -Wsign-compare -Wshadow -Wno-unused-parameter -Wno-unused-variable -Woverloaded-virtual -Wnon-virtual-dtor -Wno-missing-field-initializers -Wno-strict-aliasing -Wno-invalid-offsetof -fno-omit-frame-pointer -momit-leaf-frame-pointer -march=armv8-a+crc+crypto -Wno-unused-function -Werror -O2 -g -DNDEBUG -arch arm64 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX13.1.sdk -std=gnu++20 -MD -MT CMakeFiles/rocksdb.dir/db/version_edit_handler.cc.o -MF CMakeFiles/rocksdb.dir/db/version_edit_handler.cc.o.d -o CMakeFiles/rocksdb.dir/db/version_edit_handler.cc.o -c /Users/jaepil/work/deepsearch/deps/cpp/rocksdb/db/version_edit_handler.cc
      /Users/jaepil/work/deepsearch/deps/cpp/rocksdb/db/version_edit_handler.cc:30:10: error: variable 'recovered_edits' set but not used [-Werror,-Wunused-but-set-variable]
        size_t recovered_edits = 0;
               ^
      1 error generated.
      ```
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11234
      
      Reviewed By: cbi42
      
      Differential Revision: D43458604
      
      Pulled By: ajkr
      
      fbshipit-source-id: d8c50e1a108887b037a120cd9f19374ddaeee817
    • DBWithTTLImpl::IsStale overflow when ttl is 15 years (#11279) · 7a07afe8
      Committed by zhangliangkai1992
      Summary:
Fix DBWithTTLImpl::IsStale overflow
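
      A minimal sketch of the overflow class being fixed here, with illustrative arithmetic (not the actual DBWithTTLImpl code):
      ```
      #include <cstdint>
      #include <iostream>
      #include <limits>

      int main() {
        // TTL timestamps are stored as 32-bit seconds. With a ~15-year TTL,
        // timestamp + ttl exceeds INT32_MAX, so a naive 32-bit sum would wrap
        // and break the staleness comparison. Widening before adding avoids it.
        int32_t timestamp = 1678406400;          // March 2023, in seconds
        int32_t ttl = 15 * 365 * 24 * 60 * 60;   // ~473 million seconds
        int64_t sum = int64_t{timestamp} + ttl;  // 2,151,446,400
        std::cout << "overflows int32: "
                  << (sum > std::numeric_limits<int32_t>::max()) << "\n";  // 1
        return 0;
      }
      ```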
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11279
      
      Reviewed By: cbi42
      
      Differential Revision: D43875039
      
      Pulled By: ajkr
      
      fbshipit-source-id: 3e5feb8c4c4480bf1421b0763ade3d2e459ec028
    • Add instructions for installing googlebenchmark (#11282) · daeec505
      Committed by Alan Paxton
      Summary:
      Per the discussion in https://groups.google.com/g/rocksdb/c/JqhlvSs6ZEs/m/bnXZ7Q--AAAJ
      It seems non-obvious that googlebenchmark must be installed manually before microbenchmarks can be run. I have added more detail to the installation instructions to make it clearer.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11282
      
      Reviewed By: cbi42
      
      Differential Revision: D43874724
      
      Pulled By: ajkr
      
      fbshipit-source-id: f64a4ac4914cb057955d1ca965885f8822ca7764
    • Fix hang in async_io benchmarks in regression script (#11285) · 1de69762
      Committed by akankshamahajan
      Summary:
Fix hang in async_io benchmarks in the regression script. I changed the order of the benchmarks, and that somehow fixed the hang.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11285
      
      Test Plan: Ran it manually
      
      Reviewed By: pdillinger
      
      Differential Revision: D43937431
      
      Pulled By: akankshamahajan15
      
      fbshipit-source-id: 7c43075d3be6b8f41d08e845664012768b769661
  7. Mar 09, 2023 (1 commit)
    • Add a PerfContext counter for merge operands applied in point lookups (#11284) · 1d524385
      Committed by Levi Tamasi
      Summary:
      The existing PerfContext counter `internal_merge_count` only tracks the
      Merge operands applied during range scans. The patch adds a new counter
      called `internal_merge_count_point_lookups` to track the same metric
      for point lookups (`Get` / `MultiGet` / `GetEntity` / `MultiGetEntity`), and
      also fixes a couple of cases in the iterator where the existing counter wasn't
      updated.
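
      A hedged usage sketch; the counter name below is the one added by this patch and was renamed shortly after in #11294 (see above), so check perf_context.h for the current spelling.
      ```
      #include <string>

      #include "rocksdb/db.h"
      #include "rocksdb/perf_context.h"
      #include "rocksdb/perf_level.h"

      // Count merge operands applied during a single point lookup.
      uint64_t CountMergesInGet(rocksdb::DB* db, const std::string& key) {
        rocksdb::SetPerfLevel(rocksdb::PerfLevel::kEnableCount);
        rocksdb::get_perf_context()->Reset();
        std::string value;
        db->Get(rocksdb::ReadOptions(), key, &value).PermitUncheckedError();
        return rocksdb::get_perf_context()->internal_merge_count_point_lookups;
      }
      ```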
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11284
      
      Test Plan: `make check`
      
      Reviewed By: jowlyzhang
      
      Differential Revision: D43926082
      
      Pulled By: ltamasi
      
      fbshipit-source-id: 321566d8b4cf0a3b6c9b73b7a5c984fb9bb492e9
  8. Mar 08, 2023 (1 commit)
  9. Mar 07, 2023 (3 commits)
    • Tests verifying non-zero checksums of zero bytes (#11260) · e0107325
      Committed by Peter Dillinger
      Summary:
Adds unit tests verifying that a block payload and checksum of all zeros is not falsely considered valid data. The test exhaustively checks that blocks of all zeros, up to some length (default 20K; more exhaustively, 10M), do not produce a block checksum of all zeros.
      
Also a small refactoring of an existing checksum test to use a parameterized test. (Suggest hiding whitespace changes for review.)
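
      A minimal sketch of the property under test, using RocksDB's internal crc32c helper (internal header, shown for illustration only; the actual test exercises every supported checksum type):
      ```
      #include <cassert>
      #include <string>

      #include "util/crc32c.h"  // internal RocksDB header; illustrative only

      // Verify that an all-zeros payload never hashes to zero, so a zeroed-out
      // "block + checksum" region can never validate as a real block.
      void VerifyNonZeroChecksumOfZeros(size_t max_len) {
        std::string zeros(max_len, '\0');
        for (size_t len = 1; len <= max_len; ++len) {
          assert(rocksdb::crc32c::Value(zeros.data(), len) != 0);
        }
      }
      ```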
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11260
      
      Test Plan:
This is the test; run manually with `ROCKSDB_THOROUGH_CHECKSUM_TEST=1` to verify up to 10M.
      
      Reviewed By: hx235
      
      Differential Revision: D43706192
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 95e721c320ca928e7fa2400c2570fb359cc30b1f
    • Add support for parameters setting related to async_io benchmarks (#11262) · 13357de0
      Committed by akankshamahajan
      Summary:
Provide support in the benchmark regression script for options used only in the async_io benchmark: "$`MAX_READAHEAD_SIZE`", "$`INITIAL_READAHEAD_SIZE`", "$`NUM_READS_FOR_READAHEAD_SIZE`".
If a user wants to set these parameters for all benchmarks, they need to be set in the OPTIONS file instead.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11262
      
      Test Plan: Ran manually
      
      Reviewed By: anand1976
      
      Differential Revision: D43725567
      
      Pulled By: akankshamahajan15
      
      fbshipit-source-id: 28c3462dd785ffd646d44560fa9c92bc6a8066e5
    • Deflake/fix BlobSourceCacheReservationTest.IncreaseCacheReservationOnFullCache (#11273) · a1a3b233
      Committed by Levi Tamasi
      Summary:
`BlobSourceCacheReservationTest.IncreaseCacheReservationOnFullCache` is flaky and also doesn't do what its name says. The patch changes this test so it actually tests increasing the cache reservation, hopefully also deflaking it in the process.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11273
      
      Test Plan: `make check`
      
      Reviewed By: akankshamahajan15
      
      Differential Revision: D43800935
      
      Pulled By: ltamasi
      
      fbshipit-source-id: 5eb54130dfbe227285b0e14f2084aa4b89f0b107
  10. Mar 06, 2023 (2 commits)
  11. Mar 04, 2023 (3 commits)
    • Avoid ColumnFamilyDescriptor copy (#10978) · ddde1e6a
      Committed by Igor Canadi
      Summary:
Hi. :) Noticed we are copying ColumnFamilyDescriptor here because my process crashed in the copy constructor (cause unrelated).
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10978
      
      Reviewed By: cbi42
      
      Differential Revision: D41473924
      
      Pulled By: ajkr
      
      fbshipit-source-id: 58a3473f2d7b24918f79d4b2726c20081c5e95b4
    • Improve documentation for MergingIterator (#11161) · d053926f
      Committed by Changyu Bi
      Summary:
Add some comments to try to explain how/why MergingIterator works. Made some small refactorings, mostly in MergingIterator::SkipNextDeleted() and MergingIterator::SeekImpl().
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11161
      
      Test Plan:
      crash test with small key range:
      ```
      python3 tools/db_crashtest.py blackbox --simple --max_key=100 --interval=6000 --write_buffer_size=262144 --target_file_size_base=256 --max_bytes_for_level_base=262144 --block_size=128 --value_size_mult=33 --subcompactions=10 --use_multiget=1 --delpercent=3 --delrangepercent=2 --verify_iterator_with_expected_state_one_in=2 --num_iterations=10
      ```
      
      Reviewed By: ajkr
      
      Differential Revision: D42860994
      
      Pulled By: cbi42
      
      fbshipit-source-id: 3f0c1c9c6481a7f468bf79d823998907a8116e9e
    • Fix/clarify/extend the API comments of CompactionFilter (#11261) · 95d67f36
      Committed by Levi Tamasi
      Summary:
      The patch makes the following changes to the API comments:
      * Some general comments about snapshots, thread safety, and user-defined timestamps are moved to a more prominent place at the top of the file.
      * Detailed descriptions are added for each `ValueType` and `Decision`, fixing and extending some existing comments (e.g. that of `kRemove`, which suggested that key-values are simply removed from the output, while in reality base values are converted to tombstones) and adding detailed comments that were missing (e.g. `kPurge` and `kChangeWideColumnEntity`).
      * Updated/extended the comments of `FilterV2/V3` and `FilterBlobByKey`.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11261
      
      Reviewed By: akankshamahajan15
      
      Differential Revision: D43714314
      
      Pulled By: ltamasi
      
      fbshipit-source-id: 835f4b1bdac1ce0e291155186095211303260729
  12. Mar 02, 2023 (1 commit)
    • Fix backward iteration issue when user defined timestamp is enabled in BlobDB (#11258) · 8dfcfd4e
      Committed by Yu Zhang
      Summary:
During backward iteration, blob verification would fail because the user key (ts included) in `saved_key_` doesn't match the blob. This happens because during `FindValueForCurrentKey`, `saved_key_` is not updated when the user key (ts not included) is the same, in all cases except when `timestamp_lb_` is specified. This breaks the blob verification logic when user-defined timestamp is enabled and `timestamp_lb_` is not specified. Fix this by always updating `saved_key_` when a smaller user key (ts included) is seen.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11258
      
      Test Plan:
      `make check`
      `./db_blob_basic_test --gtest_filter=DBBlobWithTimestampTest.IterateBlobs`
      
      Run db_bench (built with DEBUG_LEVEL=0) to demonstrate that no overhead is introduced with:
      
      `./db_bench -user_timestamp_size=8  -db=/dev/shm/rocksdb -disable_wal=1 -benchmarks=fillseq,seekrandom[-W1-X6] -reverse_iterator=1 -seek_nexts=5`
      
      Baseline:
      
      - seekrandom [AVG    6 runs] : 72188 (± 1481) ops/sec;   37.2 (± 0.8) MB/sec
      
      With this PR:
      
      - seekrandom [AVG    6 runs] : 74171 (± 1427) ops/sec;   38.2 (± 0.7) MB/sec
      
      Reviewed By: ltamasi
      
      Differential Revision: D43675642
      
      Pulled By: jowlyzhang
      
      fbshipit-source-id: 8022ae8522d1f66548821855e6eed63640c14e04
  13. Mar 01, 2023 (1 commit)
  14. Feb 28, 2023 (1 commit)
  15. Feb 25, 2023 (1 commit)
  16. Feb 24, 2023 (1 commit)
    • Fix a TestGet failure when user defined timestamp is enabled (#11249) · af7872ff
      Committed by Yu Zhang
      Summary:
Stressing a small DB with a small number of keys and user-defined timestamp enabled usually fails pretty quickly in TestGet.
      
      Example command to reproduce the failure:
      
      ` tools/db_crashtest.py blackbox --enable_ts --simple --delrangepercent=0 --delpercent=5 --max_key=100 --interval=3 --write_buffer_size=262144 --target_file_size_base=262144 --max_bytes_for_level_base=262144 --subcompactions=1`
      
      Example failure: `error : inconsistent values for key 0000000000000009000000000000000A7878: expected state has the key, Get() returns NotFound.`
      
Fixes this test failure by refreshing the read-up-to timestamp to the most up-to-date timestamp, a.k.a. now, after a key is locked. Without this, things could happen in the following order and cause a test failure (a sketch of the corrected ordering follows the table):
      
      | TestGet thread | A writing thread |
      | -- | -- |
      | read_opts.timestamp = GetNow() | |
      | | Lock key, do write |
      | Lock key, read(read_opts) returns NotFound | |
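
      A minimal C++ sketch of the corrected ordering, with stand-ins for the stress test's helpers (the lock, clock, and lookup below are hypothetical):
      ```
      #include <atomic>
      #include <cstdint>
      #include <mutex>
      #include <string>

      // Hypothetical stand-ins for the stress test's helpers.
      std::mutex key_mutex;                    // per-key lock
      std::atomic<uint64_t> logical_clock{1};  // source of "now" timestamps
      uint64_t LookupAsOf(const std::string& /*key*/, uint64_t ts) { return ts; }  // stub read

      uint64_t GetUnderLock(const std::string& key) {
        std::lock_guard<std::mutex> guard(key_mutex);  // 1) lock the key first
        uint64_t read_up_to = logical_clock.load();    // 2) then refresh "now"
        return LookupAsOf(key, read_up_to);  // 3) read while still holding the lock
      }
      ```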
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11249
      
      Reviewed By: ltamasi
      
      Differential Revision: D43551302
      
      Pulled By: jowlyzhang
      
      fbshipit-source-id: 26877ab379bdb97acd2682a2632bc29718427f38
  17. Feb 23, 2023 (2 commits)
    • Support iter_start_ts in integrated BlobDB (#11244) · f007b8fd
      Committed by Yu Zhang
      Summary:
      Fixed an issue during backward iteration when `iter_start_ts` is set in an integrated BlobDB.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11244
      
      Test Plan:
```
make check
./db_blob_basic_test --gtest_filter="DBBlobWithTimestampTest.IterateBlobs"
tools/db_crashtest.py --stress_cmd=./db_stress --cleanup_cmd='' --enable_ts whitebox --random_kill_odd 888887 --enable_blob_files=1
```
      
      Reviewed By: ltamasi
      
      Differential Revision: D43506726
      
      Pulled By: jowlyzhang
      
      fbshipit-source-id: 2cdc19ebf8da909d8d43d621353905784949a9f0
    • Refactor AddRangeDels() + consider range tombstone during compaction file cutting (#11113) · 229297d1
      Committed by Changyu Bi
      Summary:
A second attempt after https://github.com/facebook/rocksdb/issues/10802, with bug fixes and refactoring. This PR updates the compaction logic to take range tombstones into account when determining whether to cut the current compaction output file (https://github.com/facebook/rocksdb/issues/4811). Before this change, only point keys were considered, and range tombstones could cause large compactions. For example, if the current compaction output is a range tombstone [a, b) and two point keys y, z, they would be added to the same file and may overlap with too many files in the next level, causing a large compaction in the future. This PR also includes ajkr's effort to simplify the logic for adding range tombstones to compaction output files in `AddRangeDels()` ([https://github.com/facebook/rocksdb/issues/11078](https://github.com/facebook/rocksdb/pull/11078#issuecomment-1386078861)).
      
The main change is for `CompactionIterator` to emit range tombstone start keys to be processed by `CompactionOutputs`. A new class `CompactionMergingIterator` is introduced to replace `MergingIterator` under `CompactionIterator` to enable emitting range tombstone start keys. Further improvements after this PR include cutting compaction output at some grandparent boundary key (instead of the next output key) when cutting within a range tombstone, to reduce overlap with grandparents.
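
      A greatly simplified sketch of the idea, with stand-in names and limits (not the actual CompactionOutputs logic): once range tombstone start keys are emitted alongside point keys, both feed the same should-cut check against accumulated grandparent overlap.
      ```
      #include <cstdint>

      // Stand-in state for one compaction output file under construction.
      struct OutputFileState {
        uint64_t overlapping_grandparent_bytes = 0;  // overlap accumulated so far
        uint64_t max_overlap_bytes = 1 << 20;        // stand-in for the size limit
      };

      // Called for point keys AND (after this change) range tombstone start keys.
      bool ShouldCutOutputBefore(OutputFileState& state, uint64_t key_overlap) {
        if (state.overlapping_grandparent_bytes + key_overlap >
            state.max_overlap_bytes) {
          state.overlapping_grandparent_bytes = 0;  // start a new output file
          return true;
        }
        state.overlapping_grandparent_bytes += key_overlap;
        return false;
      }
      ```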
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11113
      
      Test Plan:
      * added unit test in db_range_del_test
      * crash test with a small key range: `python3 tools/db_crashtest.py blackbox --simple --max_key=100 --interval=600 --write_buffer_size=262144 --target_file_size_base=256 --max_bytes_for_level_base=262144 --block_size=128 --value_size_mult=33 --subcompactions=10 --use_multiget=1 --delpercent=3 --delrangepercent=2 --verify_iterator_with_expected_state_one_in=2 --num_iterations=10`
      
      Reviewed By: ajkr
      
      Differential Revision: D42655709
      
      Pulled By: cbi42
      
      fbshipit-source-id: 8367e36ef5640e8f21c14a3855d4a8d6e360a34c
  18. Feb 22, 2023 (8 commits)
    • fix -Wrange-loop-analysis in Apple clang version 12.0.0 (clang-1200.0.32.29) (#11240) · 9fa9becf
      Committed by ywave
      Summary:
Fix the following complaint
      ```
      db/db_impl/db_impl_compaction_flush.cc:417:19: error: loop variable 'bg_flush_arg' of type 'const rocksdb::DBImpl::BGFlushArg' creates a copy from type
            'const rocksdb::DBImpl::BGFlushArg' [-Werror,-Wrange-loop-analysis]
        for (const auto bg_flush_arg : bg_flush_args) {
                        ^
      db/db_impl/db_impl_compaction_flush.cc:417:8: note: use reference type 'const rocksdb::DBImpl::BGFlushArg &' to prevent copying
        for (const auto bg_flush_arg : bg_flush_args) {
             ^~~~~~~~~~~~~~~~~~~~~~~~~
                        &
      db/db_impl/db_impl_compaction_flush.cc:2911:21: error: loop variable 'bg_flush_arg' of type 'const rocksdb::DBImpl::BGFlushArg' creates a copy from type
            'const rocksdb::DBImpl::BGFlushArg' [-Werror,-Wrange-loop-analysis]
          for (const auto bg_flush_arg : bg_flush_args) {
                          ^
      db/db_impl/db_impl_compaction_flush.cc:2911:10: note: use reference type 'const rocksdb::DBImpl::BGFlushArg &' to prevent copying
          for (const auto bg_flush_arg : bg_flush_args) {
               ^~~~~~~~~~~~~~~~~~~~~~~~~
                          &
      ```
      from
      
      ```sh
      xxx@MacBook-Pro / % g++ -v
      Configured with: --prefix=/Library/Developer/CommandLineTools/usr --with-gxx-include-dir=/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/c++/4.2.1
      Apple clang version 12.0.0 (clang-1200.0.32.29)
      Target: x86_64-apple-darwin21.6.0
      Thread model: posix
      ```
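
      The fix is the one the diagnostic's note suggests: bind the loop variable by const reference. A minimal sketch (not the exact diff):
      ```
      #include <vector>

      struct BGFlushArg { /* heavy members elided */ };

      void Example(const std::vector<BGFlushArg>& bg_flush_args) {
        // Before: `for (const auto bg_flush_arg : bg_flush_args)` copies each
        // element and triggers -Wrange-loop-analysis under Apple clang 12.
        for (const auto& bg_flush_arg : bg_flush_args) {  // reference: no copy
          (void)bg_flush_arg;
        }
      }
      ```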
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11240
      
      Reviewed By: cbi42
      
      Differential Revision: D43458729
      
      Pulled By: ajkr
      
      fbshipit-source-id: 26e110f83451509463a1bc308f737ccb693c9f45
    • Update HISTORY.md and version.h for 8.0 release (#11238) · 28608045
      Committed by Andrew Kryczka
      Summary:
The 8.0.fb branch is cut, so changes going forward will be part of 8.1. Updated version.h and HISTORY.md accordingly.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11238
      
      Reviewed By: cbi42
      
      Differential Revision: D43428345
      
      Pulled By: ajkr
      
      fbshipit-source-id: d344b6e504c81a85563ae9d3705b11c533b1cd43
    • Revert enabling IO uring in db_stress (#11242) · 476b0157
      Committed by anand76
      Summary:
      IO uring usage is causing crash test failures due to bad cqe data being returned in the uring. Revert the change to enable IO uring in db_stress, and also re-enable async_io in CircleCI so that code path can be tested. Added the -use_io_uring flag to db_stress that, when false, will wrap the default env in db_stress to emulate async IO.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11242
      
      Reviewed By: akankshamahajan15
      
      Differential Revision: D43470569
      
      Pulled By: anand1976
      
      fbshipit-source-id: 7c69ac3f53a79ade31d37313f815f1a4b6108b75
    • Fix an assertion failure in DBIter::SeekToLast() when user-defined timestamp is enabled (#11223) · 1b48ecc2
      Committed by Changyu Bi
      Summary:
In DBIter::SeekToLast(), key() can be called when the iterator is invalid, which fails the following assertion:
      ```
      ./db/db_iter.h:153: virtual rocksdb::Slice rocksdb::DBIter::key() const: Assertion `valid_' failed.
      ```
This happens when `iterate_upper_bound` and `timestamp_lb_` are set. SeekForPrev(*iterate_upper_bound_) positions the iterator on the same user key as *iterate_upper_bound_. A subsequent PrevInternal() call makes the iterator invalid just before the call to key().
      
This PR fixes the issue by updating the seek key to have the max sequence number AND max timestamp when the seek key has the same user key as *iterate_upper_bound_.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11223
      
      Test Plan: - Added a unit test that would fail the above assertion before this fix.
      
      Reviewed By: jowlyzhang
      
      Differential Revision: D43283600
      
      Pulled By: cbi42
      
      fbshipit-source-id: 0dd3999845b722584679bbc95be2664b266005ba
      1b48ecc2
    • L
      DBIter::FindNextUserEntryInternal: do not PrepareValue for `Delete` (#11211) · ea85148b
      leipeng 提交于
      Summary:
`kTypeDeletion`/`kTypeDeletionWithTimestamp`/`kTypeSingleDeletion` entries do not need to access the iterator value, so `PrepareValue` can be omitted for them.
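
      An illustrative stand-in (not RocksDB's actual enum or iterator code) of the short-circuit: only value-carrying entry types pay for lazy value materialization.
      ```
      // Stand-in entry types for illustration.
      enum class EntryType {
        kTypeValue,
        kTypeMerge,
        kTypeDeletion,
        kTypeDeletionWithTimestamp,
        kTypeSingleDeletion,
      };

      bool NeedsPrepareValue(EntryType t) {
        switch (t) {
          case EntryType::kTypeDeletion:
          case EntryType::kTypeDeletionWithTimestamp:
          case EntryType::kTypeSingleDeletion:
            return false;  // tombstone: value is never read, skip PrepareValue()
          default:
            return true;   // put/merge: the value will be accessed
        }
      }
      ```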
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11211
      
      Reviewed By: ajkr
      
      Differential Revision: D43253068
      
      Pulled By: cbi42
      
      fbshipit-source-id: 1945c7f8a90b6909128a0553b62d9fd1078b0a08
      ea85148b
    • C
      Fix comment for option `periodic_compaction_seconds` (#11227) · ebfca2cf
      Changyu Bi 提交于
      Summary:
The comment for option `periodic_compaction_seconds` only mentions support for Leveled and FIFO compaction, while the implementation supports all compaction styles after https://github.com/facebook/rocksdb/issues/5970. This PR updates the comment to reflect this.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/11227
      
      Reviewed By: ajkr
      
      Differential Revision: D43325046
      
      Pulled By: cbi42
      
      fbshipit-source-id: 2364dcb5a01cd098ad52c818fe10d621445e2188
    • add c api to set option fail_if_not_bottommost_level (#11158) · 83bc03a9
      Committed by HuangYi
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/11158
      
      Reviewed By: cbi42
      
      Differential Revision: D42870647
      
      Pulled By: ajkr
      
      fbshipit-source-id: 1b71a1dd415c34c332cecf60c68ce37fe4393e2a
    • add c api for HyperClockCache (#11110) · cfe50f7e
      Committed by HuangYi
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/11110
      
      Reviewed By: cbi42
      
      Differential Revision: D42660941
      
      Pulled By: ajkr
      
      fbshipit-source-id: e977d9b76dfd5d8c62335f961c275f3b810503d7