1. 21 May 2022, 6 commits
    • Point libprotobuf-mutator to the latest verified commit hash (#10028) · 899db56a
      Committed by Yanqin Jin
      Summary:
      Recent updates to https://github.com/google/libprotobuf-mutator have caused link errors for the RocksDB
      CircleCI job 'build-fuzzers'. This PR points the CI to a specific, most recent verified commit hash.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10028
      
      Test Plan: watch for CI to finish.
      
      Reviewed By: pdillinger, jay-zhuang
      
      Differential Revision: D36562517
      
      Pulled By: riversand963
      
      fbshipit-source-id: ba5ef0f9ed6ea6a75aa5dd2768bd5f389ac14f46
    • Fix a bug of not setting enforce_single_del_contracts (#10027) · f648915b
      Committed by Yanqin Jin
      Summary:
      Before this PR, BuildDBOptions() did not set a newly added option,
      enforce_single_del_contracts, causing OPTIONS files to contain incorrect
      information.
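      The bug class here is a serializer that forgets one field: the OPTIONS file then silently drops a non-default setting. A minimal Python sketch (illustrative only, not RocksDB code; the real serializer is BuildDBOptions() in C++) of why a round-trip check catches this:

      ```python
      # Hypothetical options serializer. Forgetting to emit a newly added
      # field reproduces the bug class fixed in this PR.
      def build_options_map(opts, include_new_field=True):
          m = {"max_open_files": opts["max_open_files"]}
          if include_new_field:  # the fix: serialize the new option too
              m["enforce_single_del_contracts"] = opts["enforce_single_del_contracts"]
          return m

      opts = {"max_open_files": 100, "enforce_single_del_contracts": False}
      buggy = build_options_map(opts, include_new_field=False)
      fixed = build_options_map(opts, include_new_field=True)
      # The buggy serializer drops the non-default value; the fixed one keeps it.
      assert "enforce_single_del_contracts" not in buggy
      assert fixed["enforce_single_del_contracts"] is False
      ```

      This is why the test plan manually inspects the OPTIONS file in addition to `make check`.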
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10027
      
      Test Plan:
      make check
      Manually check OPTIONS file.
      
      Reviewed By: ltamasi
      
      Differential Revision: D36556125
      
      Pulled By: riversand963
      
      fbshipit-source-id: e1074715b22c328b68c19e9ad89aa5d67d864bb5
    • Seek parallelization (#9994) · 2db6a4a1
      Committed by Akanksha Mahajan
      Summary:
      The RocksDB iterator is a hierarchy of iterators. MergingIterator maintains a heap of LevelIterators, one for each L0 file and one for each non-zero level. The Seek() operation naturally lends itself to parallelization, as it involves positioning every LevelIterator on the correct data block in the correct SST file. Each LevelIterator looks up its level for the target key, finding the first key that is >= the target. This typically involves reading one data block that is likely to contain the target key and scanning forward to find the first valid key; the forward scan may read more data blocks. In order to find the right data block, the iterator may also read some metadata blocks (required for opening a file and searching the index).
      This flow can be parallelized.
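      The structure described above can be sketched in a few lines (a toy model, not RocksDB code): each "level" is a sorted list, each per-level seek finds the first key >= target, and a min-heap merges the positioned iterators.

      ```python
      import bisect
      import heapq

      def merging_seek(levels, target):
          """Position each level's iterator at the first key >= target,
          then push the positioned iterators into a min-heap."""
          heap = []
          for lvl, keys in enumerate(levels):
              i = bisect.bisect_left(keys, target)  # per-level Seek()
              if i < len(keys):
                  heapq.heappush(heap, (keys[i], lvl, i))
          return heap  # heap[0] is the merged iterator's current entry

      levels = [[10, 40, 70], [20, 50], [5, 45, 90]]
      heap = merging_seek(levels, 42)
      assert heap[0][0] == 45  # smallest key >= 42 across all levels
      ```

      Each per-level seek is independent of the others, which is exactly what makes the block reads behind them parallelizable.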
      
      Design: Under the async_io option, Seek is called twice. The first seek sends an asynchronous request to prefetch the data blocks at each level; the second seek follows the normal flow, and in FilePrefetchBuffer::TryReadFromCacheAsync it waits on Poll() for the results and adds the iterator to the min_heap.
      - Status::TryAgain is passed down from FilePrefetchBuffer::PrefetchAsync to block_iter_.Status, indicating that an asynchronous request has been submitted.
      - If submitting the asynchronous request fails for some reason, the iterator falls back to sequential reading of the blocks in one pass.
      - If the data already exists in prefetch_buffer, it is returned without further prefetching, and the seek is treated as a single pass.
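      The two-phase protocol above can be modeled with a toy state machine (names are illustrative, not RocksDB's): the first call submits the async request and surfaces TryAgain, the second call polls for the result, and an already-prefetched block short-circuits to a single pass.

      ```python
      class PrefetchBuffer:
          """Toy model of the two-phase async seek protocol."""
          def __init__(self):
              self.pending = None   # block with an outstanding async request
              self.data = {}        # blocks already prefetched

          def try_read_async(self, block_id):
              if block_id in self.data:          # already prefetched: single pass
                  return "OK", self.data[block_id]
              if self.pending is None:           # first pass: submit async request
                  self.pending = block_id
                  return "TryAgain", None
              # second pass: Poll() completes the submitted request
              self.data[self.pending] = f"contents-of-{self.pending}"
              self.pending = None
              return "OK", self.data[block_id]

      buf = PrefetchBuffer()
      status, _ = buf.try_read_async("block-7")
      assert status == "TryAgain"                # async request submitted
      status, data = buf.try_read_async("block-7")
      assert (status, data) == ("OK", "contents-of-block-7")
      ```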
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9994
      
      Test Plan:
      - **Run Regressions.**
      ```
      ./db_bench -db=/tmp/prefix_scan_prefetch_main -benchmarks="fillseq" -key_size=32 -value_size=512 -num=5000000 -use_direct_io_for_flush_and_compaction=true -target_file_size_base=16777216
      ```
      i) Previous release 7.0 run for normal prefetching with async_io disabled:
      ```
      ./db_bench -use_existing_db=true -db=/tmp/prefix_scan_prefetch_main -benchmarks="seekrandom" -key_size=32 -value_size=512 -num=5000000 -use_direct_reads=true -seek_nexts=327680 -duration=120 -ops_between_duration_checks=1
      Initializing RocksDB Options from the specified file
      Initializing RocksDB Options from command-line flags
      RocksDB:    version 7.0
      Date:       Thu Mar 17 13:11:34 2022
      CPU:        24 * Intel Core Processor (Broadwell)
      CPUCache:   16384 KB
      Keys:       32 bytes each (+ 0 bytes user-defined timestamp)
      Values:     512 bytes each (256 bytes after compression)
      Entries:    5000000
      Prefix:    0 bytes
      Keys per prefix:    0
      RawSize:    2594.0 MB (estimated)
      FileSize:   1373.3 MB (estimated)
      Write rate: 0 bytes/second
      Read rate: 0 ops/second
      Compression: Snappy
      Compression sampling rate: 0
      Memtablerep: SkipListFactory
      Perf Level: 1
      ------------------------------------------------
      DB path: [/tmp/prefix_scan_prefetch_main]
      seekrandom   :  483618.390 micros/op 2 ops/sec;  338.9 MB/s (249 of 249 found)
      ```
      
      ii) normal prefetching after the changes with async_io disabled:
      ```
      ./db_bench -use_existing_db=true -db=/tmp/prefix_scan_prefetch_main -benchmarks="seekrandom" -key_size=32 -value_size=512 -num=5000000 -use_direct_reads=true -seek_nexts=327680 -duration=120 -ops_between_duration_checks=1
      Set seed to 1652922591315307 because --seed was 0
      Initializing RocksDB Options from the specified file
      Initializing RocksDB Options from command-line flags
      RocksDB:    version 7.3
      Date:       Wed May 18 18:09:51 2022
      CPU:        32 * Intel Xeon Processor (Skylake)
      CPUCache:   16384 KB
      Keys:       32 bytes each (+ 0 bytes user-defined timestamp)
      Values:     512 bytes each (256 bytes after compression)
      Entries:    5000000
      Prefix:    0 bytes
      Keys per prefix:    0
      RawSize:    2594.0 MB (estimated)
      FileSize:   1373.3 MB (estimated)
      Write rate: 0 bytes/second
      Read rate: 0 ops/second
      Compression: Snappy
      Compression sampling rate: 0
      Memtablerep: SkipListFactory
      Perf Level: 1
      ------------------------------------------------
      DB path: [/tmp/prefix_scan_prefetch_main]
      seekrandom   :  483080.466 micros/op 2 ops/sec 120.287 seconds 249 operations;  340.8 MB/s (249 of 249 found)
      ```
      iii) db_bench with async_io enabled completed successfully
      
      ```
      ./db_bench -use_existing_db=true -db=/tmp/prefix_scan_prefetch_main -benchmarks="seekrandom" -key_size=32 -value_size=512 -num=5000000 -use_direct_reads=true -seek_nexts=327680 -duration=120 -ops_between_duration_checks=1 -async_io=1 -adaptive_readahead=1
      Set seed to 1652924062021732 because --seed was 0
      Initializing RocksDB Options from the specified file
      Initializing RocksDB Options from command-line flags
      RocksDB:    version 7.3
      Date:       Wed May 18 18:34:22 2022
      CPU:        32 * Intel Xeon Processor (Skylake)
      CPUCache:   16384 KB
      Keys:       32 bytes each (+ 0 bytes user-defined timestamp)
      Values:     512 bytes each (256 bytes after compression)
      Entries:    5000000
      Prefix:    0 bytes
      Keys per prefix:    0
      RawSize:    2594.0 MB (estimated)
      FileSize:   1373.3 MB (estimated)
      Write rate: 0 bytes/second
      Read rate: 0 ops/second
      Compression: Snappy
      Compression sampling rate: 0
      Memtablerep: SkipListFactory
      Perf Level: 1
      ------------------------------------------------
      DB path: [/tmp/prefix_scan_prefetch_main]
      seekrandom   :  553913.576 micros/op 1 ops/sec 120.199 seconds 217 operations;  293.6 MB/s (217 of 217 found)
      ```
      
      - db_stress with async_io disabled completed successfully
      ```
       export CRASH_TEST_EXT_ARGS=" --async_io=0"
       make crash_test -j
      ```
      
      **In Progress**: db_stress with async_io is failing; debugging and fixing it is in progress.
      
      Reviewed By: anand1976
      
      Differential Revision: D36459323
      
      Pulled By: akankshamahajan15
      
      fbshipit-source-id: abb1cd944abe712bae3986ae5b16704b3338917c
    • Fix crash due to MultiGet async IO and direct IO (#10024) · e015206d
      Committed by anand76
      Summary:
      MultiGet with async IO is not officially supported with Posix yet. Avoid a crash by using synchronous MultiRead when direct IO is enabled.
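      The fallback condition can be sketched in a couple of lines (illustrative names only, not the RocksDB API): when a configuration the async path does not yet support is detected, the code routes to the synchronous MultiRead path instead of crashing.

      ```python
      def multi_read(requests, use_async_io, use_direct_io):
          """Toy dispatch: direct IO forces the safe synchronous path."""
          if use_async_io and not use_direct_io:
              return "async", [f"async:{r}" for r in requests]
          return "sync", [f"sync:{r}" for r in requests]

      mode, _ = multi_read(["k1", "k2"], use_async_io=True, use_direct_io=True)
      assert mode == "sync"    # direct IO: fall back rather than crash
      mode, _ = multi_read(["k1"], use_async_io=True, use_direct_io=False)
      assert mode == "async"
      ```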
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10024
      
      Test Plan: Run db_crashtest.py manually
      
      Reviewed By: akankshamahajan15
      
      Differential Revision: D36551053
      
      Pulled By: anand1976
      
      fbshipit-source-id: 72190418fa92dd0397e87825df618b12c9bdecda
    • Support using ZDICT_finalizeDictionary to generate zstd dictionary (#9857) · cc23b46d
      Committed by Changyu Bi
      Summary:
      An untrained dictionary is currently simply the concatenation of several samples. The ZSTD API ZDICT_finalizeDictionary() can improve such a dictionary's effectiveness at low cost. This PR changes how the dictionary is created: instead of building a raw content dictionary (when max_dict_buffer_bytes > 0), it calls the ZSTD ZDICT_finalizeDictionary() API, passing in all buffered uncompressed data blocks as samples.
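      ZDICT_finalizeDictionary takes all samples concatenated into one buffer plus a parallel array of sample sizes. A small sketch of preparing that layout (the real call is C code linked against zstd; this Python model only shows the buffer preparation, and the call in the comment is quoted for orientation):

      ```python
      def prepare_samples(blocks):
          """Concatenate sample blocks and record their sizes, the layout
          that ZDICT_finalizeDictionary expects for its samples input."""
          samples_buffer = b"".join(blocks)
          sample_sizes = [len(b) for b in blocks]
          return samples_buffer, sample_sizes

      blocks = [b"datablock-aaaa", b"datablock-bb", b"datablock-cccccc"]
      buf, sizes = prepare_samples(blocks)
      assert len(buf) == sum(sizes)
      assert sizes == [14, 12, 16]
      # In C this would feed ZDICT_finalizeDictionary(dict_buf, max_dict_bytes,
      #     dict_content, dict_content_size, samples_buffer, sample_sizes,
      #     num_samples, params) to refine the raw-content dictionary cheaply.
      ```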
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9857
      
      Test Plan:
      #### db_bench test for cpu/memory of compression+decompression and space saving on synthetic data:
      Set up: change the parameter [here](https://github.com/facebook/rocksdb/blob/fb9a167a55e0970b1ef6f67c1600c8d9c4c6114f/tools/db_bench_tool.cc#L1766) to 16384 to make synthetic data more compressible.
      ```
      # linked local ZSTD with version 1.5.2
      # DEBUG_LEVEL=0 ROCKSDB_NO_FBCODE=1 ROCKSDB_DISABLE_ZSTD=1  EXTRA_CXXFLAGS="-DZSTD_STATIC_LINKING_ONLY -DZSTD -I/data/users/changyubi/install/include/" EXTRA_LDFLAGS="-L/data/users/changyubi/install/lib/ -l:libzstd.a" make -j32 db_bench
      
      dict_bytes=16384
      train_bytes=1048576
      echo "========== No Dictionary =========="
      TEST_TMPDIR=/dev/shm ./db_bench -benchmarks=filluniquerandom,compact -num=10000000 -compression_type=zstd -compression_max_dict_bytes=0 -block_size=4096 -max_background_jobs=24 -memtablerep=vector -allow_concurrent_memtable_write=false -disable_wal=true -max_write_buffer_number=8 >/dev/null 2>&1
      TEST_TMPDIR=/dev/shm /usr/bin/time ./db_bench -use_existing_db=true -benchmarks=compact -compression_type=zstd -compression_max_dict_bytes=0 -block_size=4096 2>&1 | grep elapsed
      du -hc /dev/shm/dbbench/*sst | grep total
      
      echo "========== Raw Content Dictionary =========="
      TEST_TMPDIR=/dev/shm ./db_bench_main -benchmarks=filluniquerandom,compact -num=10000000 -compression_type=zstd -compression_max_dict_bytes=$dict_bytes -block_size=4096 -max_background_jobs=24 -memtablerep=vector -allow_concurrent_memtable_write=false -disable_wal=true -max_write_buffer_number=8 >/dev/null 2>&1
      TEST_TMPDIR=/dev/shm /usr/bin/time ./db_bench_main -use_existing_db=true -benchmarks=compact -compression_type=zstd -compression_max_dict_bytes=$dict_bytes -block_size=4096 2>&1 | grep elapsed
      du -hc /dev/shm/dbbench/*sst | grep total
      
      echo "========== FinalizeDictionary =========="
      TEST_TMPDIR=/dev/shm ./db_bench -benchmarks=filluniquerandom,compact -num=10000000 -compression_type=zstd -compression_max_dict_bytes=$dict_bytes -compression_zstd_max_train_bytes=$train_bytes -compression_use_zstd_dict_trainer=false -block_size=4096 -max_background_jobs=24 -memtablerep=vector -allow_concurrent_memtable_write=false -disable_wal=true -max_write_buffer_number=8 >/dev/null 2>&1
      TEST_TMPDIR=/dev/shm /usr/bin/time ./db_bench -use_existing_db=true -benchmarks=compact -compression_type=zstd -compression_max_dict_bytes=$dict_bytes -compression_zstd_max_train_bytes=$train_bytes -compression_use_zstd_dict_trainer=false -block_size=4096 2>&1 | grep elapsed
      du -hc /dev/shm/dbbench/*sst | grep total
      
      echo "========== TrainDictionary =========="
      TEST_TMPDIR=/dev/shm ./db_bench -benchmarks=filluniquerandom,compact -num=10000000 -compression_type=zstd -compression_max_dict_bytes=$dict_bytes -compression_zstd_max_train_bytes=$train_bytes -block_size=4096 -max_background_jobs=24 -memtablerep=vector -allow_concurrent_memtable_write=false -disable_wal=true -max_write_buffer_number=8 >/dev/null 2>&1
      TEST_TMPDIR=/dev/shm /usr/bin/time ./db_bench -use_existing_db=true -benchmarks=compact -compression_type=zstd -compression_max_dict_bytes=$dict_bytes -compression_zstd_max_train_bytes=$train_bytes -block_size=4096 2>&1 | grep elapsed
      du -hc /dev/shm/dbbench/*sst | grep total
      
      # Result: TrainDictionary is much better on space saving, but FinalizeDictionary seems to use less memory.
      # before compression data size: 1.2GB
      dict_bytes=16384
      max_dict_buffer_bytes =  1048576
                          space   cpu/memory
      No Dictionary       468M    14.93user 1.00system 0:15.92elapsed 100%CPU (0avgtext+0avgdata 23904maxresident)k
      Raw Dictionary      251M    15.81user 0.80system 0:16.56elapsed 100%CPU (0avgtext+0avgdata 156808maxresident)k
      FinalizeDictionary  236M    11.93user 0.64system 0:12.56elapsed 100%CPU (0avgtext+0avgdata 89548maxresident)k
      TrainDictionary     84M     7.29user 0.45system 0:07.75elapsed 100%CPU (0avgtext+0avgdata 97288maxresident)k
      ```
      
      #### Benchmark on 10 sample SST files for space saving and CPU time on compression:
      FinalizeDictionary is comparable to TrainDictionary in terms of space saving, and takes less time in compression.
      ```
      dict_bytes=16384
      train_bytes=1048576
      
      for sst_file in `ls ../temp/myrock-sst/`
      do
        echo "********** $sst_file **********"
        echo "========== No Dictionary =========="
        ./sst_dump --file="../temp/myrock-sst/$sst_file" --command=recompress --compression_level_from=6 --compression_level_to=6 --compression_types=kZSTD
      
        echo "========== Raw Content Dictionary =========="
        ./sst_dump --file="../temp/myrock-sst/$sst_file" --command=recompress --compression_level_from=6 --compression_level_to=6 --compression_types=kZSTD --compression_max_dict_bytes=$dict_bytes
      
        echo "========== FinalizeDictionary =========="
        ./sst_dump --file="../temp/myrock-sst/$sst_file" --command=recompress --compression_level_from=6 --compression_level_to=6 --compression_types=kZSTD --compression_max_dict_bytes=$dict_bytes --compression_zstd_max_train_bytes=$train_bytes --compression_use_zstd_finalize_dict
      
        echo "========== TrainDictionary =========="
        ./sst_dump --file="../temp/myrock-sst/$sst_file" --command=recompress --compression_level_from=6 --compression_level_to=6 --compression_types=kZSTD --compression_max_dict_bytes=$dict_bytes --compression_zstd_max_train_bytes=$train_bytes
      done
      
                               010240.sst (Size/Time) 011029.sst              013184.sst              021552.sst              185054.sst              185137.sst              191666.sst              7560381.sst             7604174.sst             7635312.sst
      No Dictionary           28165569 / 2614419      32899411 / 2976832      32977848 / 3055542      31966329 / 2004590      33614351 / 1755877      33429029 / 1717042      33611933 / 1776936      33634045 / 2771417      33789721 / 2205414      33592194 / 388254
      Raw Content Dictionary  28019950 / 2697961      33748665 / 3572422      33896373 / 3534701      26418431 / 2259658      28560825 / 1839168      28455030 / 1846039      28494319 / 1861349      32391599 / 3095649      33772142 / 2407843      33592230 / 474523
      FinalizeDictionary      27896012 / 2650029      33763886 / 3719427      33904283 / 3552793      26008225 / 2198033      28111872 / 1869530      28014374 / 1789771      28047706 / 1848300      32296254 / 3204027      33698698 / 2381468      33592344 / 517433
      TrainDictionary         28046089 / 2740037      33706480 / 3679019      33885741 / 3629351      25087123 / 2204558      27194353 / 1970207      27234229 / 1896811      27166710 / 1903119      32011041 / 3322315      32730692 / 2406146      33608631 / 570593
      ```
      
      #### Decompression/Read test:
      With FinalizeDictionary/TrainDictionary, some data structures used for decompression are stored in the dictionary, so decompression/reads are expected to be faster.
      ```
      dict_bytes=16384
      train_bytes=1048576
      echo "No Dictionary"
      TEST_TMPDIR=/dev/shm/ ./db_bench -benchmarks=filluniquerandom,compact -compression_type=zstd -compression_max_dict_bytes=0 > /dev/null 2>&1
      TEST_TMPDIR=/dev/shm/ ./db_bench -use_existing_db=true -benchmarks=readrandom -cache_size=0 -compression_type=zstd -compression_max_dict_bytes=0 2>&1 | grep MB/s
      
      echo "Raw Dictionary"
      TEST_TMPDIR=/dev/shm/ ./db_bench -benchmarks=filluniquerandom,compact -compression_type=zstd -compression_max_dict_bytes=$dict_bytes > /dev/null 2>&1
      TEST_TMPDIR=/dev/shm/ ./db_bench -use_existing_db=true -benchmarks=readrandom -cache_size=0 -compression_type=zstd  -compression_max_dict_bytes=$dict_bytes 2>&1 | grep MB/s
      
      echo "FinalizeDict"
      TEST_TMPDIR=/dev/shm/ ./db_bench -benchmarks=filluniquerandom,compact -compression_type=zstd -compression_max_dict_bytes=$dict_bytes -compression_zstd_max_train_bytes=$train_bytes -compression_use_zstd_dict_trainer=false  > /dev/null 2>&1
      TEST_TMPDIR=/dev/shm/ ./db_bench -use_existing_db=true -benchmarks=readrandom -cache_size=0 -compression_type=zstd -compression_max_dict_bytes=$dict_bytes -compression_zstd_max_train_bytes=$train_bytes -compression_use_zstd_dict_trainer=false 2>&1 | grep MB/s
      
      echo "Train Dictionary"
      TEST_TMPDIR=/dev/shm/ ./db_bench -benchmarks=filluniquerandom,compact -compression_type=zstd -compression_max_dict_bytes=$dict_bytes -compression_zstd_max_train_bytes=$train_bytes > /dev/null 2>&1
      TEST_TMPDIR=/dev/shm/ ./db_bench -use_existing_db=true -benchmarks=readrandom -cache_size=0 -compression_type=zstd -compression_max_dict_bytes=$dict_bytes -compression_zstd_max_train_bytes=$train_bytes 2>&1 | grep MB/s
      
      No Dictionary
      readrandom   :      12.183 micros/op 82082 ops/sec 12.183 seconds 1000000 operations;    9.1 MB/s (1000000 of 1000000 found)
      Raw Dictionary
      readrandom   :      12.314 micros/op 81205 ops/sec 12.314 seconds 1000000 operations;    9.0 MB/s (1000000 of 1000000 found)
      FinalizeDict
      readrandom   :       9.787 micros/op 102180 ops/sec 9.787 seconds 1000000 operations;   11.3 MB/s (1000000 of 1000000 found)
      Train Dictionary
      readrandom   :       9.698 micros/op 103108 ops/sec 9.699 seconds 1000000 operations;   11.4 MB/s (1000000 of 1000000 found)
      ```
      
      Reviewed By: ajkr
      
      Differential Revision: D35720026
      
      Pulled By: cbi42
      
      fbshipit-source-id: 24d230fdff0fd28a1bb650658798f00dfcfb2a1f
    • Bump nokogiri from 1.13.4 to 1.13.6 in /docs (#10019) · 6255ac72
      Committed by dependabot[bot]
      Summary:
      Bumps [nokogiri](https://github.com/sparklemotion/nokogiri) from 1.13.4 to 1.13.6.
      <details>
      <summary>Release notes</summary>
      <p><em>Sourced from <a href="https://github.com/sparklemotion/nokogiri/releases">nokogiri's releases</a>.</em></p>
      <blockquote>
      <h2>1.13.6 / 2022-05-08</h2>
      <h3>Security</h3>
      <ul>
      <li>[CRuby] Address <a href="https://nvd.nist.gov/vuln/detail/CVE-2022-29181">CVE-2022-29181</a>, improper handling of unexpected data types, related to untrusted inputs to the SAX parsers. See <a href="https://github.com/sparklemotion/nokogiri/security/advisories/GHSA-xh29-r2w5-wx8m">GHSA-xh29-r2w5-wx8m</a> for more information.</li>
      </ul>
      <h3>Improvements</h3>
      <ul>
      <li><code>{HTML4,XML}::SAX::{Parser,ParserContext}</code> constructor methods now raise <code>TypeError</code> instead of segfaulting when an incorrect type is passed.</li>
      </ul>
      <hr />
      <p>sha256:</p>
      <pre><code>58417c7c10f78cd1c0e1984f81538300d4ea98962cfd3f46f725efee48f9757a  nokogiri-1.13.6-aarch64-linux.gem
      a2b04ec3b1b73ecc6fac619b41e9fdc70808b7a653b96ec97d04b7a23f158dbc  nokogiri-1.13.6-arm64-darwin.gem
      4437f2d03bc7da8854f4aaae89e24a98cf5c8b0212ae2bc003af7e65c7ee8e27  nokogiri-1.13.6-java.gem
      99d3e212bbd5e80aa602a1f52d583e4f6e917ec594e6aa580f6aacc253eff984  nokogiri-1.13.6-x64-mingw-ucrt.gem
      a04f6154a75b6ed4fe2d0d0ff3ac02f094b54e150b50330448f834fa5726fbba  nokogiri-1.13.6-x64-mingw32.gem
      a13f30c2863ef9e5e11240dd6d69ef114229d471018b44f2ff60bab28327de4d  nokogiri-1.13.6-x86-linux.gem
      63a2ca2f7a4f6bd9126e1695037f66c8eb72ed1e1740ef162b4480c57cc17dc6  nokogiri-1.13.6-x86-mingw32.gem
      2b266e0eb18030763277b30dc3d64337f440191e2bd157027441ac56a59d9dfe  nokogiri-1.13.6-x86_64-darwin.gem
      3fa37b0c3b5744af45f9da3e4ae9cbd89480b35e12ae36b5e87a0452e0b38335  nokogiri-1.13.6-x86_64-linux.gem
      b1512fdc0aba446e1ee30de3e0671518eb363e75fab53486e99e8891d44b8587  nokogiri-1.13.6.gem
      </code></pre>
      <h2>1.13.5 / 2022-05-04</h2>
      <h3>Security</h3>
      <ul>
      <li>[CRuby] Vendored libxml2 is updated to address <a href="https://nvd.nist.gov/vuln/detail/CVE-2022-29824">CVE-2022-29824</a>. See <a href="https://github.com/sparklemotion/nokogiri/security/advisories/GHSA-cgx6-hpwq-fhv5">GHSA-cgx6-hpwq-fhv5</a> for more information.</li>
      </ul>
      <h3>Dependencies</h3>
      <ul>
      <li>[CRuby] Vendored libxml2 is updated from v2.9.13 to <a href="https://gitlab.gnome.org/GNOME/libxml2/-/releases/v2.9.14">v2.9.14</a>.</li>
      </ul>
      <h3>Improvements</h3>
      <ul>
      <li>[CRuby] The libxml2 HTML4 parser no longer exhibits quadratic behavior when recovering some broken markup related to start-of-tag and bare <code>&lt;</code> characters.</li>
      </ul>
      <h3>Changed</h3>
      <ul>
      <li>[CRuby] The libxml2 HTML4 parser in v2.9.14 recovers from some broken markup differently. Notably, the XML CDATA escape sequence <code>&lt;![CDATA[</code> and incorrectly-opened comments will result in HTML text nodes starting with <code>&amp;lt;!</code> instead of skipping the invalid tag. This behavior is a direct result of the <a href="https://gitlab.gnome.org/GNOME/libxml2/-/commit/798bdf1">quadratic-behavior fix</a> noted above. The behavior of downstream sanitizers relying on this behavior will also change. Some tests describing the changed behavior are in <a href="https://github.com/sparklemotion/nokogiri/blob/3ed5bf2b5a367cb9dc6e329c5a1c512e1dd4565d/test/html4/test_comments.rb#L187-L204"><code>test/html4/test_comments.rb</code></a>.</li>
      </ul>
      
      </blockquote>
      <p>... (truncated)</p>
      </details>
      <details>
      <summary>Changelog</summary>
      <p><em>Sourced from <a href="https://github.com/sparklemotion/nokogiri/blob/main/CHANGELOG.md">nokogiri's changelog</a>.</em></p>
      <blockquote>
      <h2>1.13.6 / 2022-05-08</h2>
      <h3>Security</h3>
      <ul>
      <li>[CRuby] Address <a href="https://nvd.nist.gov/vuln/detail/CVE-2022-29181">CVE-2022-29181</a>, improper handling of unexpected data types, related to untrusted inputs to the SAX parsers. See <a href="https://github.com/sparklemotion/nokogiri/security/advisories/GHSA-xh29-r2w5-wx8m">GHSA-xh29-r2w5-wx8m</a> for more information.</li>
      </ul>
      <h3>Improvements</h3>
      <ul>
      <li><code>{HTML4,XML}::SAX::{Parser,ParserContext}</code> constructor methods now raise <code>TypeError</code> instead of segfaulting when an incorrect type is passed.</li>
      </ul>
      <h2>1.13.5 / 2022-05-04</h2>
      <h3>Security</h3>
      <ul>
      <li>[CRuby] Vendored libxml2 is updated to address <a href="https://nvd.nist.gov/vuln/detail/CVE-2022-29824">CVE-2022-29824</a>. See <a href="https://github.com/sparklemotion/nokogiri/security/advisories/GHSA-cgx6-hpwq-fhv5">GHSA-cgx6-hpwq-fhv5</a> for more information.</li>
      </ul>
      <h3>Dependencies</h3>
      <ul>
      <li>[CRuby] Vendored libxml2 is updated from v2.9.13 to <a href="https://gitlab.gnome.org/GNOME/libxml2/-/releases/v2.9.14">v2.9.14</a>.</li>
      </ul>
      <h3>Improvements</h3>
      <ul>
      <li>[CRuby] The libxml2 HTML parser no longer exhibits quadratic behavior when recovering some broken markup related to start-of-tag and bare <code>&lt;</code> characters.</li>
      </ul>
      <h3>Changed</h3>
      <ul>
      <li>[CRuby] The libxml2 HTML parser in v2.9.14 recovers from some broken markup differently. Notably, the XML CDATA escape sequence <code>&lt;![CDATA[</code> and incorrectly-opened comments will result in HTML text nodes starting with <code>&amp;lt;!</code> instead of skipping the invalid tag. This behavior is a direct result of the <a href="https://gitlab.gnome.org/GNOME/libxml2/-/commit/798bdf1">quadratic-behavior fix</a> noted above. The behavior of downstream sanitizers relying on this behavior will also change. Some tests describing the changed behavior are in <a href="https://github.com/sparklemotion/nokogiri/blob/3ed5bf2b5a367cb9dc6e329c5a1c512e1dd4565d/test/html4/test_comments.rb#L187-L204"><code>test/html4/test_comments.rb</code></a>.</li>
      </ul>
      </blockquote>
      </details>
      <details>
      <summary>Commits</summary>
      <ul>
      <li><a href="https://github.com/sparklemotion/nokogiri/commit/b7817b6a62ac210203a451d1a691a824288e9eab"><code>b7817b6</code></a> version bump to v1.13.6</li>
      <li><a href="https://github.com/sparklemotion/nokogiri/commit/61b1a395cd512af2e0595a8e369465415e574fe8"><code>61b1a39</code></a> Merge pull request <a href="https://github-redirect.dependabot.com/sparklemotion/nokogiri/issues/2530">https://github.com/facebook/rocksdb/issues/2530</a> from sparklemotion/flavorjones-check-parse-memory-ty...</li>
      <li><a href="https://github.com/sparklemotion/nokogiri/commit/83cc451c3f29df397caa890afc3b714eae6ab8f7"><code>83cc451</code></a> fix: {HTML4,XML}::SAX::{Parser,ParserContext} check arg types</li>
      <li><a href="https://github.com/sparklemotion/nokogiri/commit/22c9e5b300c27a377fdde37c17eb9d07dd7322d0"><code>22c9e5b</code></a> version bump to v1.13.5</li>
      <li><a href="https://github.com/sparklemotion/nokogiri/commit/615588192572f7cfcb43eabbb070a6e07bf9e731"><code>6155881</code></a> doc: update CHANGELOG for v1.13.5</li>
      <li><a href="https://github.com/sparklemotion/nokogiri/commit/c519a47ab11f5e8fce77328fcb01a7b3befc2b9e"><code>c519a47</code></a> Merge pull request <a href="https://github-redirect.dependabot.com/sparklemotion/nokogiri/issues/2527">https://github.com/facebook/rocksdb/issues/2527</a> from sparklemotion/2525-update-libxml-2_9_14-v1_13_x</li>
      <li><a href="https://github.com/sparklemotion/nokogiri/commit/66c2886e78f6801def83a549c3e6581ac48e61e8"><code>66c2886</code></a> dep: update libxml2 to v2.9.14</li>
      <li><a href="https://github.com/sparklemotion/nokogiri/commit/b7c4cc35de38fcfdde4da1203d79ae38bc4324bf"><code>b7c4cc3</code></a> test: unpend the LIBXML_LOADED_VERSION test on freebsd</li>
      <li><a href="https://github.com/sparklemotion/nokogiri/commit/eac793487183a5e72464e53cccd260971d5f29b5"><code>eac7934</code></a> dev: require yaml</li>
      <li><a href="https://github.com/sparklemotion/nokogiri/commit/f3521ba3d38922d76dd5ed59705eab3988213712"><code>f3521ba</code></a> style(rubocop): pend Style/FetchEnvVar for now</li>
      <li>Additional commits viewable in <a href="https://github.com/sparklemotion/nokogiri/compare/v1.13.4...v1.13.6">compare view</a></li>
      </ul>
      </details>
      <br />
      
      [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=nokogiri&package-manager=bundler&previous-version=1.13.4&new-version=1.13.6)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
      
      Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `dependabot rebase`.
      
      [//]: # (dependabot-automerge-start)
      [//]: # (dependabot-automerge-end)
      
       ---
      
      <details>
      <summary>Dependabot commands and options</summary>
      <br />
      
      You can trigger Dependabot actions by commenting on this PR:
      - `dependabot rebase` will rebase this PR
      - `dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
      - `dependabot merge` will merge this PR after your CI passes on it
      - `dependabot squash and merge` will squash and merge this PR after your CI passes on it
      - `dependabot cancel merge` will cancel a previously requested merge and block automerging
      - `dependabot reopen` will reopen this PR if it is closed
      - `dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
      - `dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
      - `dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
      - `dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
      - `dependabot use these labels` will set the current labels as the default for future PRs for this repo and language
      - `dependabot use these reviewers` will set the current reviewers as the default for future PRs for this repo and language
      - `dependabot use these assignees` will set the current assignees as the default for future PRs for this repo and language
      - `dependabot use this milestone` will set the current milestone as the default for future PRs for this repo and language
      
      You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/facebook/rocksdb/network/alerts).
      
      </details>
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10019
      
      Reviewed By: riversand963
      
      Differential Revision: D36536897
      
      Pulled By: ajkr
      
      fbshipit-source-id: 368c24e86d5d39f0a3adc08a397ae074b1b18b1a
  2. 20 May 2022, 5 commits
    • Add timestamp support to DBImplReadOnly (#10004) · 16bdb1f9
      Committed by Yu Zhang
      Summary:
      This PR adds timestamp support to a read only DB instance opened as `DBImplReadOnly`. A follow up PR will add the same support to `CompactedDBImpl`.
      
      With this, a read-only database has these timestamp-related APIs:
      
      `ReadOptions.timestamp` : a read returns the latest data visible at this specified timestamp
      `Iterator::timestamp()` : returns the timestamp associated with the current key, value pair
      `DB::Get(..., std::string* timestamp)` : returns the timestamp associated with the key, value pair in `timestamp`
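      The read semantics above can be modeled in a few lines (a toy model, not RocksDB code): each key stores (timestamp, value) versions, and a read at ReadOptions.timestamp returns the latest version whose timestamp is <= the read timestamp.

      ```python
      def get_at_timestamp(versions, key, read_ts):
          """Return the (timestamp, value) pair for the newest version of
          `key` visible at `read_ts`, or None, mirroring Get(..., &timestamp)."""
          best = None
          for ts, value in versions.get(key, []):
              if ts <= read_ts and (best is None or ts > best[0]):
                  best = (ts, value)
          return best

      db = {"foo": [(1, "v1"), (5, "v5"), (9, "v9")]}
      assert get_at_timestamp(db, "foo", 6) == (5, "v5")  # v9 not yet visible
      assert get_at_timestamp(db, "foo", 0) is None       # nothing visible yet
      ```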
      
      Test plan (on devserver):
      
      ```
      $COMPILE_WITH_ASAN=1 make -j24 all
      $./db_with_timestamp_basic_test --gtest_filter=DBBasicTestWithTimestamp.ReadOnlyDB*
      ```
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10004
      
      Reviewed By: riversand963
      
      Differential Revision: D36434422
      
      Pulled By: jowlyzhang
      
      fbshipit-source-id: 5d949e65b1ffb845758000e2b310fdd4aae71cfb
    • Multi file concurrency in MultiGet using coroutines and async IO (#9968) · 57997dda
      Committed by anand76
      Summary:
      This PR implements a coroutine version of batched MultiGet in order to concurrently read from multiple SST files in a level using async IO, thus reducing the latency of the MultiGet. The API from the user perspective is still synchronous and single threaded, with the RocksDB part of the processing happening in the context of the caller's thread. In Version::MultiGet, the decision is made whether to call synchronous or coroutine code.
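      The core idea can be sketched as follows (illustrative only, not the RocksDB implementation): split the sorted MultiGet batch into per-file subbatches by SST key range, so each file's lookups can then be issued concurrently (via coroutines and async IO in the PR).

      ```python
      def partition_batch(files, keys):
          """files: list of (smallest_key, largest_key) per SST file;
          keys: sorted lookup keys. Group keys by the file whose range
          covers them, so each subbatch can be read concurrently."""
          subbatches = {i: [] for i in range(len(files))}
          for k in keys:
              for i, (lo, hi) in enumerate(files):
                  if lo <= k <= hi:
                      subbatches[i].append(k)
                      break          # non-overlapping ranges: first match wins
          return subbatches

      files = [("a", "f"), ("g", "m"), ("n", "z")]
      batch = ["b", "c", "h", "p", "q"]
      assert partition_batch(files, batch) == {0: ["b", "c"], 1: ["h"], 2: ["p", "q"]}
      ```

      With roughly one key overlapping each file (as in the sparse-batch benchmark below), almost every subbatch triggers an independent IO, which is where the ~2.6X speedup comes from.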
      
      A good way to review this PR is to review the first 4 commits in order - de773b3, 70c2f70, 10b50e1, and 377a597 - before reviewing the rest.
      
      TODO:
      1. Figure out how to build it in CircleCI (requires some dependencies to be installed)
      2. Do some stress testing with coroutines enabled
      
      No regression in synchronous MultiGet between this branch and main -
      ```
      ./db_bench -use_existing_db=true --db=/data/mysql/rocksdb/prefix_scan -benchmarks="readseq,multireadrandom" -key_size=32 -value_size=512 -num=5000000 -batch_size=64 -multiread_batched=true -use_direct_reads=false -duration=60 -ops_between_duration_checks=1 -readonly=true -adaptive_readahead=true -threads=16 -cache_size=10485760000 -async_io=false -multiread_stride=40000 -statistics
      ```
      Branch - ```multireadrandom :       4.025 micros/op 3975111 ops/sec 60.001 seconds 238509056 operations; 2062.3 MB/s (14767808 of 14767808 found)```
      
      Main - ```multireadrandom :       3.987 micros/op 4013216 ops/sec 60.001 seconds 240795392 operations; 2082.1 MB/s (15231040 of 15231040 found)```
      
      More benchmarks in various scenarios are given below. The measurements were taken with ```async_io=false``` (no coroutines) and ```async_io=true``` (use coroutines). For an IO bound workload (with every key requiring an IO), the coroutines version shows a clear benefit, being ~2.6X faster. For CPU bound workloads, the coroutines version has ~6-15% higher CPU utilization, depending on how many keys overlap an SST file.
      
      1. Single thread IO bound workload on remote storage with sparse MultiGet batch keys (~1 key overlap/file) -
      No coroutines - ```multireadrandom :     831.774 micros/op 1202 ops/sec 60.001 seconds 72136 operations;    0.6 MB/s (72136 of 72136 found)```
      Using coroutines - ```multireadrandom :     318.742 micros/op 3137 ops/sec 60.003 seconds 188248 operations;    1.6 MB/s (188248 of 188248 found)```
      
      2. Single thread CPU bound workload (all data cached) with ~1 key overlap/file -
      No coroutines - ```multireadrandom :       4.127 micros/op 242322 ops/sec 60.000 seconds 14539384 operations;  125.7 MB/s (14539384 of 14539384 found)```
      Using coroutines - ```multireadrandom :       4.741 micros/op 210935 ops/sec 60.000 seconds 12656176 operations;  109.4 MB/s (12656176 of 12656176 found)```
      
      3. Single thread CPU bound workload with ~2 key overlap/file -
      No coroutines - ```multireadrandom :       3.717 micros/op 269000 ops/sec 60.000 seconds 16140024 operations;  139.6 MB/s (16140024 of 16140024 found)```
      Using coroutines - ```multireadrandom :       4.146 micros/op 241204 ops/sec 60.000 seconds 14472296 operations;  125.1 MB/s (14472296 of 14472296 found)```
      
      4. CPU bound multi-threaded (16 threads) with ~4 key overlap/file -
      No coroutines - ```multireadrandom :       4.534 micros/op 3528792 ops/sec 60.000 seconds 211728728 operations; 1830.7 MB/s (12737024 of 12737024 found) ```
      Using coroutines - ```multireadrandom :       4.872 micros/op 3283812 ops/sec 60.000 seconds 197030096 operations; 1703.6 MB/s (12548032 of 12548032 found) ```
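
      Setting the benchmarks aside, the structural idea of batched MultiGet, issuing the per-file lookups of one batch concurrently and merging the results, can be sketched in a standalone way. `std::async` stands in here for RocksDB's async IO and coroutines; all names below are invented for illustration:

      ```cpp
      #include <cassert>
      #include <future>
      #include <map>
      #include <optional>
      #include <string>
      #include <vector>

      // Greatly simplified sketch of batched MultiGet (illustration only): the
      // lookups against different "files" are independent, so they can run
      // concurrently and the results can then be merged. RocksDB uses async IO
      // and coroutines; std::async stands in for that concurrency here.
      using File = std::map<std::string, std::string>;

      std::vector<std::optional<std::string>> MultiGet(
          const std::vector<File>& files, const std::vector<std::string>& keys) {
        std::vector<std::future<std::vector<std::optional<std::string>>>> futs;
        for (const File& f : files) {
          // One concurrent lookup task per file, probing every key in the batch.
          futs.push_back(std::async(std::launch::async, [&f, &keys] {
            std::vector<std::optional<std::string>> res(keys.size());
            for (size_t i = 0; i < keys.size(); ++i) {
              auto it = f.find(keys[i]);
              if (it != f.end()) res[i] = it->second;
            }
            return res;
          }));
        }
        // Merge: files are listed newest first, so keep the first hit per key.
        std::vector<std::optional<std::string>> merged(keys.size());
        for (auto& fut : futs) {
          auto res = fut.get();
          for (size_t i = 0; i < keys.size(); ++i) {
            if (!merged[i] && res[i]) merged[i] = res[i];
          }
        }
        return merged;
      }

      int main() {
        std::vector<File> files = {{{"a", "1"}}, {{"b", "2"}, {"a", "old"}}};
        auto r = MultiGet(files, {"a", "b", "c"});
        assert(r[0].value() == "1");  // newer file wins
        assert(r[1].value() == "2");
        assert(!r[2].has_value());
        return 0;
      }
      ```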
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9968
      
      Reviewed By: akankshamahajan15
      
      Differential Revision: D36348563
      
      Pulled By: anand1976
      
      fbshipit-source-id: c0ce85a505fd26ebfbb09786cbd7f25202038696
      57997dda
    • B
      Address comments for PR #9988 and #9996 (#10020) · 5be1579e
      Committed by Bo Wang
      Summary:
      1. The latest change of DecideRateLimiterPriority in https://github.com/facebook/rocksdb/pull/9988 is reverted.
      2. For https://github.com/facebook/rocksdb/blob/main/db/builder.cc#L345-L349
        2.1. Remove `we will regrad this verification as user reads` from the comments.
        2.2. `Do not set` the read_options.rate_limiter_priority to Env::IO_USER. Flush should be a background job.
        2.3. Update db_rate_limiter_test.cc.
      3. In IOOptions, mark `prio` as deprecated for future removal.
      4. In `file_system.h`, mark `IOPriority` as deprecated for future removal.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10020
      
      Test Plan: Unit tests.
      
      Reviewed By: ajkr
      
      Differential Revision: D36525317
      
      Pulled By: gitbw95
      
      fbshipit-source-id: 011ba421822f8a124e6d25a2661c4e242df6ad36
      5be1579e
    • P
      Fix auto_prefix_mode performance with partitioned filters (#10012) · 280b9f37
      Committed by Peter Dillinger
      Summary:
      Essentially refactored the RangeMayExist implementation in
      FullFilterBlockReader to FilterBlockReaderCommon so that it applies to
      partitioned filters as well. (The function is not called for the
      block-based filter case.) RangeMayExist is essentially a series of checks
      around a possible PrefixMayExist, and I'm confident those checks should
      be the same for partitioned as for full filters. (I think it's likely
      that bugs remain in those checks, but this change is overall a simplifying
      one.)
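
      The prefix check that RangeMayExist wraps can be illustrated with a standalone sketch. A hash set stands in for the Bloom filter, and the names below are invented; this is not RocksDB's implementation:

      ```cpp
      #include <cassert>
      #include <string>
      #include <unordered_set>

      // Illustration of prefix filtering (not RocksDB code): a filter built
      // over key prefixes answers "might any key with this prefix exist?"
      // before any data block is read. A real Bloom filter may return false
      // positives but never false negatives; a hash set stands in here.
      class PrefixFilter {
       public:
        explicit PrefixFilter(size_t prefix_len) : prefix_len_(prefix_len) {}

        void AddKey(const std::string& key) {
          prefixes_.insert(key.substr(0, prefix_len_));
        }

        // Counterpart of PrefixMayExist: "false" definitively rules the prefix
        // out, letting a seek skip the data blocks entirely.
        bool PrefixMayExist(const std::string& target) const {
          return prefixes_.count(target.substr(0, prefix_len_)) > 0;
        }

       private:
        size_t prefix_len_;
        std::unordered_set<std::string> prefixes_;
      };

      int main() {
        PrefixFilter filter(/*prefix_len=*/4);  // db_bench above uses -prefix_size=8
        filter.AddKey("userA001");
        filter.AddKey("userB007");
        assert(filter.PrefixMayExist("userA999"));   // prefix "user": may exist
        assert(!filter.PrefixMayExist("postX123"));  // prefix "post": definitely absent
        return 0;
      }
      ```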
      
      Added auto_prefix_mode support to db_bench
      
      Other small fixes as well
      
      Fixes https://github.com/facebook/rocksdb/issues/10003
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10012
      
      Test Plan:
      Expanded unit test that uses statistics to check for filter
      optimization, fails without the production code changes here
      
      Performance: populate two DBs with
      ```
      TEST_TMPDIR=/dev/shm/rocksdb_nonpartitioned ./db_bench -benchmarks=fillrandom -num=10000000 -disable_wal=1 -write_buffer_size=30000000 -bloom_bits=16 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=0 -prefix_size=8
      TEST_TMPDIR=/dev/shm/rocksdb_partitioned ./db_bench -benchmarks=fillrandom -num=10000000 -disable_wal=1 -write_buffer_size=30000000 -bloom_bits=16 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=0 -prefix_size=8 -partition_index_and_filters
      ```
      
      Observe no measurable change in non-partitioned performance
      ```
      TEST_TMPDIR=/dev/shm/rocksdb_nonpartitioned ./db_bench -benchmarks=seekrandom[-X1000] -num=10000000 -readonly -bloom_bits=16 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=0 -prefix_size=8 -auto_prefix_mode -cache_index_and_filter_blocks=1 -cache_size=1000000000 -duration 20
      ```
      Before: seekrandom [AVG 15 runs] : 11798 (± 331) ops/sec
      After: seekrandom [AVG 15 runs] : 11724 (± 315) ops/sec
      
      Observe big improvement with partitioned (also supported by bloom use statistics)
      ```
      TEST_TMPDIR=/dev/shm/rocksdb_partitioned ./db_bench -benchmarks=seekrandom[-X1000] -num=10000000 -readonly -bloom_bits=16 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=0 -prefix_size=8 -partition_index_and_filters -auto_prefix_mode -cache_index_and_filter_blocks=1 -cache_size=1000000000 -duration 20
      ```
      Before: seekrandom [AVG 12 runs] : 2942 (± 57) ops/sec
      After: seekrandom [AVG 12 runs] : 7489 (± 184) ops/sec
      
      Reviewed By: siying
      
      Differential Revision: D36469796
      
      Pulled By: pdillinger
      
      fbshipit-source-id: bcf1e2a68d347b32adb2b27384f945434e7a266d
      280b9f37
    • J
      Track SST unique id in MANIFEST and verify (#9990) · c6d326d3
      Committed by Jay Zhuang
      Summary:
      Start tracking SST unique id in MANIFEST, which is used to verify with
      SST properties to make sure the SST file is not overwritten or
      misplaced. A DB option, `try_verify_sst_unique_id` (default false), is
      introduced to enable/disable the verification. If enabled, DB open reads
      the unique_id from the table properties of every SST file, so it's
      recommended to use it with `max_open_files = -1` to pre-open the files.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9990
      
      Test Plan: unittests, format-compatible test, mini-crash
      
      Reviewed By: anand1976
      
      Differential Revision: D36381863
      
      Pulled By: jay-zhuang
      
      fbshipit-source-id: 89ea2eb6b35ed3e80ead9c724eb096083eaba63f
      c6d326d3
  3. 19 May 2022, 6 commits
    • H
      Mark old reserve* option deprecated (#10016) · dde774db
      Committed by Hui Xiao
      Summary:
      **Context/Summary:**
      https://github.com/facebook/rocksdb/pull/9926 removed the inefficient `reserve*` option APIs but forgot to mark them deprecated in `block_based_table_type_info` for a compatible table format.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10016
      
      Test Plan: build-format-compatible
      
      Reviewed By: pdillinger
      
      Differential Revision: D36484247
      
      Pulled By: hx235
      
      fbshipit-source-id: c41b90cc99fb7ab7098934052f0af7290b221f98
      dde774db
    • G
      Set Read rate limiter priority dynamically and pass it to FS (#9996) · 4da34b97
      Committed by gitbw95
      Summary:
      ### Context:
      Background compactions and flushes generate large reads and writes and can be long running, especially for universal compaction. In some cases, this can impact foreground reads and writes by users.
      
      ### Solution
      User, Flush, and Compaction reads share some code path. For this task, we update the rate_limiter_priority in ReadOptions for code paths (e.g. FindTable (mainly in BlockBasedTable::Open()) and various iterators), and eventually update the rate_limiter_priority in IOOptions for FSRandomAccessFile.
      
      **This PR is for the Read path.** The dynamic **Read** priorities for the different states are listed as follows:
      
      | State | Normal | Delayed | Stalled |
      | ----- | ------ | ------- | ------- |
      |  Flush (verification read in BuildTable()) | IO_USER | IO_USER | IO_USER |
      |  Compaction | IO_LOW  | IO_USER | IO_USER |
      |  User | User provided | User provided | User provided |
      
      We respect the read_options that the user provided and do not override them.
      The only SST read for Flush is the verification read in BuildTable(), which the code regards as a user read.
      
      **Details**
      1. Set read_options.rate_limiter_priority dynamically:
      - User: Do not update the read_options. Use the read_options that the user provided.
      - Compaction: Update read_options in CompactionJob::ProcessKeyValueCompaction().
      - Flush: Update read_options in BuildTable().
      
      2. Pass the rate limiter priority to FSRandomAccessFile functions:
      - After calling the FindTable(), read_options is passed through GetTableReader(table_cache.cc), BlockBasedTableFactory::NewTableReader(block_based_table_factory.cc), and BlockBasedTable::Open(). The Open() needs some updates for the ReadOptions variable and the updates are also needed for the called functions,  including PrefetchTail(), PrepareIOOptions(), ReadFooterFromFile(), ReadMetaIndexblock(), ReadPropertiesBlock(), PrefetchIndexAndFilterBlocks(), and ReadRangeDelBlock().
      - In RandomAccessFileReader, the functions to be updated include Read(), MultiRead(), ReadAsync(), and Prefetch().
      - Update the downstream functions of NewIndexIterator(), NewDataBlockIterator(), and BlockBasedTableIterator().
      
      ### Test Plans
      Add unit tests.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9996
      
      Reviewed By: anand1976
      
      Differential Revision: D36452483
      
      Pulled By: gitbw95
      
      fbshipit-source-id: 60978204a4f849bb9261cb78d9bc1cb56d6008cf
      4da34b97
    • S
      Remove two tests from platform dependent tests (#10017) · f1303bf8
      Committed by sdong
      Summary:
      Platform dependent tests sometimes run too long and cause timeouts in Travis. Remove two tests that are less likely to be platform dependent.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10017
      
      Test Plan: Watch Travis tests.
      
      Reviewed By: pdillinger
      
      Differential Revision: D36486734
      
      fbshipit-source-id: 2a3ad1746791c893a790c2a69a3b70f81e7de260
      f1303bf8
    • Y
      Remove ROCKSDB_SUPPORT_THREAD_LOCAL define because it's a part of C++11 (#10015) · 0a43061f
      Committed by Yaroslav Stepanchuk
      Summary:
      ROCKSDB_SUPPORT_THREAD_LOCAL definition has been removed.
      `__thread`(#define) has been replaced with `thread_local`(C++ keyword) across the code base.
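
      A minimal standalone illustration of the C++11 `thread_local` keyword that replaced the compiler-specific `__thread`:

      ```cpp
      #include <cassert>
      #include <thread>

      // C++11 `thread_local` gives each thread its own copy of the variable,
      // replacing the GCC-specific `__thread` that the removed
      // ROCKSDB_SUPPORT_THREAD_LOCAL macro used to guard.
      thread_local int counter = 0;

      int main() {
        counter = 100;  // main thread's copy
        std::thread t([] {
          // This thread sees a fresh copy, unaffected by main's assignment.
          assert(counter == 0);
          counter = 5;
        });
        t.join();
        assert(counter == 100);  // main's copy is untouched by the other thread
        return 0;
      }
      ```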
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10015
      
      Reviewed By: siying
      
      Differential Revision: D36485491
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 6522d212514ee190b90b4e2750c80c7e34013c78
      0a43061f
    • Y
      Avoid overwriting options loaded from OPTIONS (#9943) · e3a3dbf2
      Committed by Yanqin Jin
      Summary:
      This is similar to https://github.com/facebook/rocksdb/issues/9862, including the following fixes/refactoring:
      
      1. If OPTIONS file is specified via `-options_file`, majority of options will be loaded from the file. We should not
      overwrite options that have been loaded from the file. Instead, we configure only fields of options which are
      shared objects and not set by the OPTIONS file. We also configure a few fields, e.g. `create_if_missing`, that are
      necessary for the stress test to run.
      
      2. Refactor options initialization into three functions, `InitializeOptionsFromFile()`, `InitializeOptionsFromFlags()`
      and `InitializeOptionsGeneral()` similar to db_bench. I hope they can be shared in the future. The high-level logic is
      as follows:
      ```cpp
      if (!InitializeOptionsFromFile(...)) {
        InitializeOptionsFromFlags(...);
      }
      InitializeOptionsGeneral(...);
      ```
      
      3. Currently, the setting for `block_cache_compressed` does not seem correct because it by default specifies a
      size of `numeric_limits<size_t>::max()` ((size_t)-1). According to code comments, `-1` indicates the default value,
      which should refer to the `num_shard_bits` argument.
      
      4. Clarify `fail_if_options_file_error`.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9943
      
      Test Plan:
      1. make check
      2. Run stress tests, and manually check generated OPTIONS file and compare them with input OPTIONS files
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D36133769
      
      Pulled By: riversand963
      
      fbshipit-source-id: 35dacdc090a0a72c922907170cd132b9ecaa073e
      e3a3dbf2
    • S
      Log error message when LinkFile() is not supported when ingesting files (#10010) · a74f14b5
      Committed by sdong
      Summary:
      Right now, it is opaque to users when moving a file is skipped because LinkFile() is not supported. Add a log message to help users debug.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10010
      
      Test Plan: Run existing test. Manual test verify the log message printed out.
      
      Reviewed By: riversand963
      
      Differential Revision: D36463237
      
      fbshipit-source-id: b00bd5041bd5c11afa4e326819c8461ee2c98a91
      a74f14b5
  4. 18 May 2022, 7 commits
    • G
      Set Write rate limiter priority dynamically and pass it to FS (#9988) · 05c678e1
      Committed by gitbw95
      Summary:
      ### Context:
      Background compactions and flushes generate large reads and writes and can be long running, especially for universal compaction. In some cases, this can impact foreground reads and writes by users.
      
      From the RocksDB perspective, there can be two kinds of rate limiters, the internal (native) one and the external one.
      - The internal (native) rate limiter is introduced in [the wiki](https://github.com/facebook/rocksdb/wiki/Rate-Limiter). Currently, only IO_LOW and IO_HIGH are used and they are set statically.
      - For the external rate limiter, in FSWritableFile functions,  IOOptions is open for end users to set and get rate_limiter_priority for their own rate limiter. Currently, RocksDB doesn’t pass the rate_limiter_priority through IOOptions to the file system.
      
      ### Solution
      During the User Read, Flush write, Compaction read/write, the WriteController is used to determine whether DB writes are stalled or slowed down. The rate limiter priority (Env::IOPriority) can be determined accordingly. We decided to always pass the priority in IOOptions. What the file system does with it should be a contract between the user and the file system. We would like to set the rate limiter priority at file level, since the Flush/Compaction job level may be too coarse with multiple files and block IO level is too granular.
      
      **This PR is for the Write path.** The dynamic **Write** priorities for the different states are listed as follows:
      
      | State | Normal | Delayed | Stalled |
      | ----- | ------ | ------- | ------- |
      |  Flush | IO_HIGH | IO_USER | IO_USER |
      |  Compaction | IO_LOW | IO_USER | IO_USER |
      
      Flush and Compaction writes share the same call path through BlockBasedTableBuilder, WritableFileWriter, and FSWritableFile. When a new FSWritableFile object is created, its io_priority_ can be set dynamically based on the state of the WriteController. In WritableFileWriter, before the call sites of FSWritableFile functions, WritableFileWriter::DecideRateLimiterPriority() determines the rate_limiter_priority. The options (IOOptions) argument of FSWritableFile functions is then updated with the rate_limiter_priority.
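
      The priority table above can be sketched as a small decision function. The enum and function names below are invented for illustration and are not RocksDB's actual API:

      ```cpp
      #include <cassert>

      // Sketch of the write-path priority decision described above
      // (illustration only; these names are invented, not RocksDB's API).
      enum class JobKind { kFlush, kCompaction };
      enum class WriteState { kNormal, kDelayed, kStalled };
      enum IOPriority { IO_LOW, IO_HIGH, IO_USER };

      IOPriority DecideWriteRateLimiterPriority(JobKind job, WriteState state) {
        if (state != WriteState::kNormal) {
          // Delayed or stalled: both flush and compaction writes get IO_USER so
          // the background work that relieves the stall is not throttled.
          return IO_USER;
        }
        return job == JobKind::kFlush ? IO_HIGH : IO_LOW;
      }

      int main() {
        assert(DecideWriteRateLimiterPriority(JobKind::kFlush, WriteState::kNormal) == IO_HIGH);
        assert(DecideWriteRateLimiterPriority(JobKind::kCompaction, WriteState::kNormal) == IO_LOW);
        assert(DecideWriteRateLimiterPriority(JobKind::kFlush, WriteState::kStalled) == IO_USER);
        assert(DecideWriteRateLimiterPriority(JobKind::kCompaction, WriteState::kDelayed) == IO_USER);
        return 0;
      }
      ```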
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9988
      
      Test Plan: Add unit tests.
      
      Reviewed By: anand1976
      
      Differential Revision: D36395159
      
      Pulled By: gitbw95
      
      fbshipit-source-id: a7c82fc29759139a1a07ec46c37dbf7e753474cf
      05c678e1
    • J
      Add table_properties_collector_factories override (#9995) · b84e3363
      Committed by Jay Zhuang
      Summary:
      Add table_properties_collector_factories override on the remote
      side.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9995
      
      Test Plan: unittest added
      
      Reviewed By: ajkr
      
      Differential Revision: D36392623
      
      Pulled By: jay-zhuang
      
      fbshipit-source-id: 3ba031294d90247ca063d7de7b43178d38e3f66a
      b84e3363
    • P
      Adjust public APIs to prefer 128-bit SST unique ID (#10009) · 0070680c
      Committed by Peter Dillinger
      Summary:
      128 bits should suffice almost always and for tracking in manifest.
      
      Note that this changes the output of sst_dump --show_properties to only show 128 bits.
      
      Also introduces InternalUniqueIdToHumanString for presenting internal IDs for debugging purposes.
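
      Formatting a 128-bit id for human consumption might look like the standalone sketch below. This is an illustration only, not RocksDB's actual InternalUniqueIdToHumanString; the id is modeled as two 64-bit halves:

      ```cpp
      #include <cassert>
      #include <cstdint>
      #include <cstdio>
      #include <string>

      // Sketch of presenting a 128-bit unique id as a human-readable hex
      // string (illustration only; RocksDB's InternalUniqueIdToHumanString may
      // format differently). The id is modeled as two 64-bit halves.
      std::string UniqueId128ToHumanString(uint64_t hi, uint64_t lo) {
        char buf[2 * 16 + 2];  // 32 hex digits, a separator, and a NUL
        std::snprintf(buf, sizeof(buf), "%016llX-%016llX",
                      static_cast<unsigned long long>(hi),
                      static_cast<unsigned long long>(lo));
        return buf;
      }

      int main() {
        assert(UniqueId128ToHumanString(0, 0xABu) ==
               "0000000000000000-00000000000000AB");
        assert(UniqueId128ToHumanString(1, 2).size() == 33);  // 32 digits + '-'
        return 0;
      }
      ```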
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10009
      
      Test Plan: unit tests updated
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D36458189
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 93ebc4a3b6f9c73ee154383a1f8b291a5d6bbef5
      0070680c
    • X
      fix: build on risc-v (#9215) · 8b1df101
      Committed by XieJiSS
      Summary:
      Patch is modified from ~~https://reviews.llvm.org/file/data/du5ol5zctyqw53ma7dwz/PHID-FILE-knherxziu4tl4erti5ab/file~~
      
      Tested on Arch Linux riscv64gc (qemu)
      
      UPDATE: Seems like the above link is broken, so I tried to search for a link pointing to the original merge request. It turns out that LLVM is cherry-picking from `google/benchmark`, and the upstream should be this:
      
      https://github.com/google/benchmark/blob/808571a52fd6cc7e9f0788e08f71f0f4175b6673/src/cycleclock.h#L190
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9215
      
      Reviewed By: siying, jay-zhuang
      
      Differential Revision: D34170586
      
      Pulled By: riversand963
      
      fbshipit-source-id: 41b16b9f7f3bb0f3e7b26bb078eb575499c0f0f4
      8b1df101
    • H
      Rewrite memory-charging feature's option API (#9926) · 3573558e
      Committed by Hui Xiao
      Summary:
      **Context:**
      Previous PR https://github.com/facebook/rocksdb/pull/9748, https://github.com/facebook/rocksdb/pull/9073, https://github.com/facebook/rocksdb/pull/8428 added separate flag for each charged memory area. Such API design is not scalable as we charge more and more memory areas. Also, we foresee an opportunity to consolidate this feature with other cache usage related features such as `cache_index_and_filter_blocks` using `CacheEntryRole`.
      
      Therefore we decided to consolidate all these flags with `CacheUsageOptions cache_usage_options` and this PR serves as the first step by consolidating memory-charging related flags.
      
      **Summary:**
      - Replaced old API reference with new ones, including making `kCompressionDictionaryBuildingBuffer` opt-out and added a unit test for that
      - Added missing db bench/stress test for some memory charging features
      - Renamed related test suite to indicate they are under the same theme of memory charging
      - Refactored a commonly used mocked cache component in memory charging related tests to reduce code duplication
      - Replaced the phrases "memory tracking" / "cache reservation" (other than CacheReservationManager-related ones) with "memory charging" for standard description of this feature.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9926
      
      Test Plan:
      - New unit test for opt-out `kCompressionDictionaryBuildingBuffer` `TEST_F(ChargeCompressionDictionaryBuildingBufferTest, Basic)`
      - New unit test for option validation/sanitization `TEST_F(CacheUsageOptionsOverridesTest, SanitizeAndValidateOptions)`
      - CI
      - db bench (in case querying new options introduces regression) **+0.5% micros/op**: `TEST_TMPDIR=/dev/shm/testdb ./db_bench -benchmarks=fillseq -db=$TEST_TMPDIR  -charge_compression_dictionary_building_buffer=1(remove this for comparison)  -compression_max_dict_bytes=10000 -disable_auto_compactions=1 -write_buffer_size=100000 -num=4000000 | egrep 'fillseq'`
      
      #-run | (pre-PR) avg micros/op | std micros/op | (post-PR)  micros/op | std micros/op | change (%)
      -- | -- | -- | -- | -- | --
      10 | 3.9711 | 0.264408 | 3.9914 | 0.254563 | 0.5111933721
      20 | 3.83905 | 0.0664488 | 3.8251 | 0.0695456 | **-0.3633711465**
      40 | 3.86625 | 0.136669 | 3.8867 | 0.143765 | **0.5289363078**
      
      - db_stress: `python3 tools/db_crashtest.py blackbox  -charge_compression_dictionary_building_buffer=1 -charge_filter_construction=1 -charge_table_reader=1 -cache_size=1` killed as normal
      
      Reviewed By: ajkr
      
      Differential Revision: D36054712
      
      Pulled By: hx235
      
      fbshipit-source-id: d406e90f5e0c5ea4dbcb585a484ad9302d4302af
      3573558e
    • H
      Clarify some SequentialFileReader::Read logic (#10002) · f6339de0
      Committed by Hui Xiao
      Summary:
      **Context/Summary:**
      The logic related to PositionedRead in SequentialFileReader::Read confused me a bit as discussed here https://github.com/facebook/rocksdb/pull/9973#discussion_r872869256. Therefore I added a drawing with help from cbi42.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/10002
      
      Test Plan: - no code change
      
      Reviewed By: anand1976, cbi42
      
      Differential Revision: D36422632
      
      Pulled By: hx235
      
      fbshipit-source-id: 9a8311d2365564f90d216c430f542fc11b2d9cde
      f6339de0
    • M
      Use STATIC_AVOID_DESTRUCTION for static objects with non-trivial destructors (#9958) · b11ff347
      Committed by mrambacher
      Summary:
      Changed the static objects that had non-trivial destructors to use the STATIC_AVOID_DESTRUCTION construct.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9958
      
      Reviewed By: pdillinger
      
      Differential Revision: D36442982
      
      Pulled By: mrambacher
      
      fbshipit-source-id: 029d47b1374d30d198bfede369a4c0ae7a4eb519
      b11ff347
  5. 17 May 2022, 3 commits
    • Y
      Add a temporary option for user to opt-out enforcement of SingleDelete contract (#9983) · 3f263ef5
      Committed by Yanqin Jin
      Summary:
      PR https://github.com/facebook/rocksdb/issues/9888 started to enforce the contract of single delete described in https://github.com/facebook/rocksdb/wiki/Single-Delete.
      
      For some of existing use cases, it is desirable to have a transition during which compaction will not fail
      if the contract is violated. Therefore, we add a temporary option `enforce_single_del_contracts` to allow
      applications to opt out of this new strict behavior. Once the transition completes, the flag can be set to `true` again.
      
      In a future release, the option will be removed.
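
      The contract being enforced can be illustrated with a standalone checker over one key's operation history. This is a simplified reading of the Single-Delete wiki with invented names, not RocksDB's actual check:

      ```cpp
      #include <cassert>
      #include <vector>

      // Simplified sketch of the SingleDelete contract (see the Single-Delete
      // wiki): a SingleDelete may only remove a key that was written exactly
      // once since it was last removed, i.e. its prior history must end with a
      // single Put. Names here are invented; this is not RocksDB's check.
      enum class Op { kPut, kDelete, kSingleDelete, kMerge };

      bool HistoryHonorsSingleDeleteContract(const std::vector<Op>& ops) {
        int puts_since_removal = 0;
        for (Op op : ops) {  // oldest to newest
          switch (op) {
            case Op::kPut:
              ++puts_since_removal;
              break;
            case Op::kMerge:
              // A merged value is no longer a single Put; poison the count.
              puts_since_removal = -1;
              break;
            case Op::kDelete:
              puts_since_removal = 0;  // regular Delete tolerates anything
              break;
            case Op::kSingleDelete:
              if (puts_since_removal != 1) return false;  // contract violated
              puts_since_removal = 0;
              break;
          }
        }
        return true;
      }

      int main() {
        // Put then SingleDelete: OK.
        assert(HistoryHonorsSingleDeleteContract({Op::kPut, Op::kSingleDelete}));
        // Overwrite (two Puts) then SingleDelete: violation.
        assert(!HistoryHonorsSingleDeleteContract({Op::kPut, Op::kPut, Op::kSingleDelete}));
        // Merge then SingleDelete: violation.
        assert(!HistoryHonorsSingleDeleteContract({Op::kPut, Op::kMerge, Op::kSingleDelete}));
        return 0;
      }
      ```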
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9983
      
      Test Plan: make check
      
      Reviewed By: ajkr
      
      Differential Revision: D36333672
      
      Pulled By: riversand963
      
      fbshipit-source-id: dcb703ea0ed08076a1422f1bfb9914afe3c2caa2
      3f263ef5
    • H
      Use SpecialEnv to speed up some slow BackupEngineRateLimitingTestWithParam (#9974) · e66e6d2f
      Committed by Hui Xiao
      Summary:
      **Context:**
      `BackupEngineRateLimitingTestWithParam.RateLimiting` and `BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup` involve creating a backup of and restoring a big database with rate limiting. Using the normal env with a normal clock requires real elapsed time (13702-19848 ms per test). As suggested in https://github.com/facebook/rocksdb/pull/8722#discussion_r703698603, this PR speeds them up with SpecialEnv (`time_elapse_only_sleep=true`), whose clock accepts fake elapsed time during rate limiting (100-600 ms per test)
      
      **Summary:**
      - Added TEST_ function to set clock of the default rate limiters in backup engine
      - Shrank the test DB by 10x while keeping it big enough for testing
      - Renamed some test variables and reorganized some if-else branch for clarity without changing the test
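
      The SpecialEnv trick boils down to a clock interface on which "sleeping" merely advances a counter instead of really waiting. A minimal standalone sketch, with invented names:

      ```cpp
      #include <cassert>
      #include <cstdint>

      // Minimal sketch of the fake-clock idea behind SpecialEnv's
      // time_elapse_only_sleep=true (names invented for illustration): the
      // component under test consults a clock interface, and the test's clock
      // advances time instantly when asked to sleep.
      class Clock {
       public:
        virtual ~Clock() = default;
        virtual uint64_t NowMicros() = 0;
        virtual void SleepMicros(uint64_t micros) = 0;
      };

      class FakeClock : public Clock {
       public:
        uint64_t NowMicros() override { return now_; }
        void SleepMicros(uint64_t micros) override { now_ += micros; }  // no real wait
       private:
        uint64_t now_ = 0;
      };

      // A toy rate limiter: paying for `bytes` at `bytes_per_sec` sleeps on the
      // clock, which under FakeClock costs no wall time.
      void ChargeBytes(Clock& clock, uint64_t bytes, uint64_t bytes_per_sec) {
        clock.SleepMicros(bytes * 1000000 / bytes_per_sec);
      }

      int main() {
        FakeClock clock;
        ChargeBytes(clock, /*bytes=*/10 << 20, /*bytes_per_sec=*/1 << 20);
        // 10 MB at 1 MB/s "took" 10 simulated seconds but returned immediately.
        assert(clock.NowMicros() == 10u * 1000000);
        return 0;
      }
      ```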
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9974
      
      Test Plan:
      - Run tests pre/post PR the same time to verify the tests are sped up by 90 - 95%
      `BackupEngineRateLimitingTestWithParam.RateLimiting`
      Pre:
      ```
      [ RUN      ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/0
      [       OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/0 (11123 ms)
      [ RUN      ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/1
      [       OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/1 (9441 ms)
      [ RUN      ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/2
      [       OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/2 (11096 ms)
      [ RUN      ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/3
      [       OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/3 (9339 ms)
      [ RUN      ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/4
      [       OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/4 (11121 ms)
      [ RUN      ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/5
      [       OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/5 (9413 ms)
      [ RUN      ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/6
      [       OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/6 (11185 ms)
      [ RUN      ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/7
      [       OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/7 (9511 ms)
      [----------] 8 tests from RateLimiting/BackupEngineRateLimitingTestWithParam (82230 ms total)
      ```
      Post:
      ```
      [ RUN      ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/0
      [       OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/0 (395 ms)
      [ RUN      ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/1
      [       OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/1 (564 ms)
      [ RUN      ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/2
      [       OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/2 (358 ms)
      [ RUN      ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/3
      [       OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/3 (567 ms)
      [ RUN      ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/4
      [       OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/4 (173 ms)
      [ RUN      ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/5
      [       OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/5 (176 ms)
      [ RUN      ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/6
      [       OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/6 (191 ms)
      [ RUN      ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/7
      [       OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimiting/7 (177 ms)
      [----------] 8 tests from RateLimiting/BackupEngineRateLimitingTestWithParam (2601 ms total)
      ```
      `BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup`
      Pre:
      ```
      [ RUN      ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/0
      [       OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/0 (7275 ms)
      [ RUN      ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/1
      [       OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/1 (3961 ms)
      [ RUN      ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/2
      [       OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/2 (7117 ms)
      [ RUN      ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/3
      [       OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/3 (3921 ms)
      [ RUN      ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/4
      [       OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/4 (19862 ms)
      [ RUN      ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/5
      [       OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/5 (10231 ms)
      [ RUN      ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/6
      [       OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/6 (19848 ms)
      [ RUN      ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/7
      [       OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/7 (10372 ms)
      [----------] 8 tests from RateLimiting/BackupEngineRateLimitingTestWithParam (82587 ms total)
      ```
      Post:
      ```
      [ RUN      ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/0
      [       OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/0 (157 ms)
      [ RUN      ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/1
      [       OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/1 (152 ms)
      [ RUN      ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/2
      [       OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/2 (160 ms)
      [ RUN      ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/3
      [       OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/3 (158 ms)
      [ RUN      ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/4
      [       OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/4 (155 ms)
      [ RUN      ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/5
      [       OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/5 (151 ms)
      [ RUN      ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/6
      [       OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/6 (146 ms)
      [ RUN      ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/7
      [       OK ] RateLimiting/BackupEngineRateLimitingTestWithParam.RateLimitingVerifyBackup/7 (153 ms)
      [----------] 8 tests from RateLimiting/BackupEngineRateLimitingTestWithParam (1232 ms total)
      ```
      
      Reviewed By: pdillinger
      
      Differential Revision: D36336345
      
      Pulled By: hx235
      
      fbshipit-source-id: 724c6ba745f95f56d4440a6d2f1e4512a2987589
      e66e6d2f
    • Added GetFactoryCount/Names/Types to ObjectRegistry (#9358) · 204a42ca
      mrambacher committed
      Summary:
      These methods allow for more thorough testing of the ObjectRegistry and Customizable infrastructure in a simpler manner.  With this change, the Customizable tests can now check what factories are registered and attempt to create each of them in a systematic fashion.
      
      With this change, I think all of the factories registered with the ObjectRegistry/CreateFromString are now tested via the customizable_test classes.
      
      Note that there were a few other minor changes.  There was a "posix://*" entry registered with the ObjectRegistry that had been missed during the PatternEntry conversion -- these changes caught that.  The nickname and default names for the FileSystem classes were also inverted.
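      The enumeration pattern these methods enable can be sketched with a minimal, self-contained registry. All names below are illustrative; the real ObjectRegistry API in RocksDB differs:

      ```cpp
      #include <cassert>
      #include <functional>
      #include <iostream>
      #include <map>
      #include <memory>
      #include <string>
      #include <vector>

      // Hypothetical sketch (not the real RocksDB ObjectRegistry): a registry
      // that can report how many factories it holds and their names, mirroring
      // the kind of introspection GetFactoryCount/GetFactoryNames enable.
      class MiniRegistry {
       public:
        using Factory = std::function<std::unique_ptr<std::string>()>;
        void Register(const std::string& name, Factory f) {
          factories_[name] = std::move(f);
        }
        size_t GetFactoryCount() const { return factories_.size(); }
        std::vector<std::string> GetFactoryNames() const {
          std::vector<std::string> names;
          for (const auto& kv : factories_) names.push_back(kv.first);
          return names;
        }
        // Systematically instantiate every registered factory, the way a
        // customizable test could iterate over everything registered.
        bool CreateAll() const {
          for (const auto& kv : factories_) {
            if (!kv.second()) return false;
          }
          return true;
        }
       private:
        std::map<std::string, Factory> factories_;
      };

      int main() {
        MiniRegistry reg;
        reg.Register("posix", [] { return std::make_unique<std::string>("posix fs"); });
        reg.Register("memory", [] { return std::make_unique<std::string>("mem fs"); });
        assert(reg.GetFactoryCount() == 2);
        assert(reg.GetFactoryNames().front() == "memory");  // map keeps names sorted
        assert(reg.CreateAll());
        std::cout << "factories=" << reg.GetFactoryCount() << "\n";
        return 0;
      }
      ```

      With such accessors, a test can assert on the expected set of registered names and attempt to create each entry in turn, which is the systematic coverage this change describes.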
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9358
      
      Reviewed By: pdillinger
      
      Differential Revision: D33433542
      
      Pulled By: mrambacher
      
      fbshipit-source-id: 9a32da74e6620745b4eeffb2712be70eeeadfa7e
      204a42ca
  6. 14 May 2022, 3 commits
  7. 13 May 2022, 4 commits
    • Option type info functions (#9411) · bfc6a8ee
      mrambacher committed
      Summary:
      Add methods to set the various functions (Parse, Serialize, Equals) on the OptionTypeInfo.  These methods reduce the number of constructors required for OptionTypeInfo and make the code a little clearer.
      
      Add functions to the OptionTypeInfo for Prepare and Validate.  These methods allow types other than Configurable and Customizable to have Prepare and Validate logic.  These methods could be used by an option to guarantee that its settings were in a range or that a value was initialized.
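      A minimal sketch of the setter-based construction style described above (hypothetical names and a fixed int value type; the real OptionTypeInfo differs in detail):

      ```cpp
      #include <cassert>
      #include <functional>
      #include <iostream>
      #include <string>

      // Hypothetical sketch: instead of one constructor per combination of
      // behaviors, each function is attached via a chained Set* call, and a
      // Validate hook can enforce a range, as the summary suggests.
      struct MiniTypeInfo {
        std::function<bool(const std::string&, int*)> parse;
        std::function<std::string(const int&)> serialize;
        std::function<bool(const int&, const int&)> equals;
        std::function<bool(const int&)> validate;

        MiniTypeInfo& SetParseFunc(decltype(parse) f) { parse = std::move(f); return *this; }
        MiniTypeInfo& SetSerializeFunc(decltype(serialize) f) { serialize = std::move(f); return *this; }
        MiniTypeInfo& SetEqualsFunc(decltype(equals) f) { equals = std::move(f); return *this; }
        MiniTypeInfo& SetValidateFunc(decltype(validate) f) { validate = std::move(f); return *this; }
      };

      int main() {
        auto info = MiniTypeInfo{}
                        .SetParseFunc([](const std::string& s, int* out) {
                          *out = std::stoi(s);
                          return true;
                        })
                        .SetSerializeFunc([](const int& v) { return std::to_string(v); })
                        .SetEqualsFunc([](const int& a, const int& b) { return a == b; })
                        .SetValidateFunc([](const int& v) { return v >= 0 && v <= 100; });
        int v = 0;
        assert(info.parse("42", &v) && v == 42);
        assert(info.equals(v, 42));
        assert(info.validate(v) && !info.validate(-1));  // range guarantee
        std::cout << "serialized=" << info.serialize(v) << "\n";
        return 0;
      }
      ```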
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9411
      
      Reviewed By: pdillinger
      
      Differential Revision: D36174849
      
      Pulled By: mrambacher
      
      fbshipit-source-id: 72517d8c6bab4723788a4c1a9e16590bff870125
      bfc6a8ee
    • Put build size checking logic in Makefile (#9989) · cdaa9576
      Peter Dillinger committed
      Summary:
      ... for better maintainability, in case of Makefile changes /
      refactoring. This is lightly modified from rocksdb-lego-determinator, and
      will be used by Meta-internal CI with a custom REPORT_BUILD_STATISTIC.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9989
      
      Test Plan: some manual stuff
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D36362362
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 52b65b6282fe839dc6d906ff95a3ed66ca1574ba
      cdaa9576
    • Add pmem-rocksdb-plugin link in PLUGINs.md (#9934) · 07c68071
      Chen Xiao committed
      Summary:
      This change adds a pmem-rocksdb-plugin link in PLUGINS.md. The link is: https://github.com/pmem/pmem-rocksdb-plugin. It provides a collection of plugins to enable Persistent Memory (PMEM) on RocksDB.
      
      The pmem-rocksdb-plugin repo contains RocksDB plugins that fit the LSM-tree based KV store onto PMEM by effectively utilizing its characteristics. The first two basic plugins are:
      1) a filesystem API wrapper that writes RocksDB's WAL (Write Ahead Log) files on PMEM to optimize write performance;
      2) the use of PMEM as a secondary cache to optimize read performance.
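      The WAL-redirection idea behind the first plugin can be sketched as a delegating filesystem wrapper. All names here are illustrative, not the plugin's actual API (the real plugin implements RocksDB's FileSystem interface):

      ```cpp
      #include <cassert>
      #include <iostream>
      #include <memory>
      #include <string>

      // Stand-in for a filesystem base class (illustrative only).
      class FileSystemLike {
       public:
        virtual ~FileSystemLike() = default;
        virtual std::string ResolvePath(const std::string& fname) const { return fname; }
      };

      // Delegating wrapper: WAL (*.log) files are rerouted to a PMEM-backed
      // mount point; everything else is forwarded to the base filesystem.
      class PmemWalFileSystem : public FileSystemLike {
       public:
        PmemWalFileSystem(std::shared_ptr<FileSystemLike> base, std::string pmem_dir)
            : base_(std::move(base)), pmem_dir_(std::move(pmem_dir)) {}
        std::string ResolvePath(const std::string& fname) const override {
          // RocksDB WAL files end in ".log"; place only those on the PMEM mount.
          if (fname.size() > 4 && fname.compare(fname.size() - 4, 4, ".log") == 0) {
            auto slash = fname.find_last_of('/');
            return pmem_dir_ + "/" + fname.substr(slash + 1);
          }
          return base_->ResolvePath(fname);
        }
       private:
        std::shared_ptr<FileSystemLike> base_;
        std::string pmem_dir_;
      };

      int main() {
        PmemWalFileSystem fs(std::make_shared<FileSystemLike>(), "/mnt/pmem0");
        assert(fs.ResolvePath("/db/000013.sst") == "/db/000013.sst");  // untouched
        std::cout << fs.ResolvePath("/db/000012.log") << "\n";
        return 0;
      }
      ```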
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9934
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D36366893
      
      Pulled By: riversand963
      
      fbshipit-source-id: d58a39365e9b5d6a3249d4e9b377c7fb2c79badb
      07c68071
    • Port the batched version of MultiGet() to RocksDB's C API (#9952) · bcb12872
      Yueh-Hsuan Chiang committed
      Summary:
      The batched version of MultiGet() is not available in RocksDB's C API.
      This PR implements rocksdb_batched_multi_get_cf, a C wrapper function
      that invokes the batched version of MultiGet() for a single column family.
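      The call pattern of a batched multi-get (many keys, one column family, one call, positional results) can be sketched in a self-contained form; this is an illustration of the shape, not the C API's actual signature:

      ```cpp
      #include <cassert>
      #include <iostream>
      #include <map>
      #include <string>
      #include <vector>

      // Stand-in for a single column family (illustrative only).
      using ColumnFamily = std::map<std::string, std::string>;

      // Look up every key in one call; results come back positionally, with
      // an empty slot for keys that are missing.
      std::vector<std::string> BatchedMultiGet(const ColumnFamily& cf,
                                               const std::vector<std::string>& keys) {
        std::vector<std::string> values;
        values.reserve(keys.size());
        for (const auto& k : keys) {
          auto it = cf.find(k);
          values.push_back(it == cf.end() ? std::string() : it->second);
        }
        return values;
      }

      int main() {
        ColumnFamily cf{{"a", "1"}, {"c", "3"}};
        auto vals = BatchedMultiGet(cf, {"a", "b", "c"});
        assert(vals.size() == 3);
        assert(vals[0] == "1" && vals[1].empty() && vals[2] == "3");
        std::cout << vals[0] << "|" << vals[2] << "\n";
        return 0;
      }
      ```

      The batched form amortizes per-call overhead across all keys, which is why it is worth exposing through the C wrapper rather than looping over single Get() calls.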
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9952
      
      Test Plan: Added a new test case under "columnfamilies" test case in c_test.cc
      
      Reviewed By: riversand963
      
      Differential Revision: D36302888
      
      Pulled By: ajkr
      
      fbshipit-source-id: fa134c4a1c8e7d72dd4ae8649a74e3797b5cf4e6
      bcb12872
  8. 12 May 2022, 4 commits
    • Update WAL corruption test so that it fails without fix (#9942) · 6442a62e
      Akanksha Mahajan committed
      Summary:
      In case of non-TransactionDB and avoid_flush_during_recovery = true, RocksDB won't
      flush the data from WAL to L0 for all column families if possible. As a
      result, not all column families can increase their log_numbers, and
      min_log_number_to_keep won't change.
      For a TransactionDB (allow_2pc), even with the flush there may be old WAL files that must not be deleted, because they can contain data of uncommitted transactions, so min_log_number_to_keep won't change.
      If we persist a new MANIFEST with
      advanced log_numbers for some column families, then during a second
      crash after persisting the MANIFEST, RocksDB will see some column
      families' log_numbers larger than the corrupted WAL, and the "column family inconsistency" error will be hit, causing recovery to fail.
      
      This PR updates the unit tests to emulate the errors; the tests fail without the fix.
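      The invariant whose violation produces the "SST file is ahead of WALs" error can be sketched as follows (the names are illustrative, not RocksDB internals):

      ```cpp
      #include <cassert>
      #include <cstdint>
      #include <iostream>
      #include <string>
      #include <vector>

      struct ColumnFamilyInfo {
        std::string name;
        uint64_t log_number;  // WALs numbered below this are not needed by this CF
      };

      // Recovery must fail if any column family's persisted log_number claims
      // its data is newer than the newest WAL actually present on disk.
      bool CheckWalConsistency(const std::vector<ColumnFamilyInfo>& cfs,
                               uint64_t max_wal_number, std::string* bad_cf) {
        for (const auto& cf : cfs) {
          if (cf.log_number > max_wal_number) {
            *bad_cf = cf.name;  // the MANIFEST is ahead of the surviving WALs
            return false;
          }
        }
        return true;
      }

      int main() {
        std::string bad;
        // test_cf's log_number was advanced to 12 in the MANIFEST, but the
        // newest surviving WAL is 8: a second crash happened after the
        // MANIFEST was persisted, so recovery must report an inconsistency.
        std::vector<ColumnFamilyInfo> cfs = {{"default", 8}, {"test_cf", 12}};
        assert(!CheckWalConsistency(cfs, 8, &bad));
        assert(CheckWalConsistency({{"default", 8}}, 8, &bad) == false ? false : true);
        std::cout << "bad_cf=" << bad << "\n";
        return 0;
      }
      ```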
      
      Error:
      ```
      [ RUN      ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.CrashDuringRecovery/0
      db/corruption_test.cc:1190: Failure
      DB::Open(options, dbname_, cf_descs, &handles, &db_)
      Corruption: SST file is ahead of WALs in CF test_cf
      [  FAILED  ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.CrashDuringRecovery/0, where GetParam() = (true, false) (91 ms)
      [ RUN      ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.CrashDuringRecovery/1
      db/corruption_test.cc:1190: Failure
      DB::Open(options, dbname_, cf_descs, &handles, &db_)
      Corruption: SST file is ahead of WALs in CF test_cf
      [  FAILED  ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.CrashDuringRecovery/1, where GetParam() = (false, false) (92 ms)
      [ RUN      ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.CrashDuringRecovery/2
      db/corruption_test.cc:1190: Failure
      DB::Open(options, dbname_, cf_descs, &handles, &db_)
      Corruption: SST file is ahead of WALs in CF test_cf
      [  FAILED  ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.CrashDuringRecovery/2, where GetParam() = (true, true) (95 ms)
      [ RUN      ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.CrashDuringRecovery/3
      db/corruption_test.cc:1190: Failure
      DB::Open(options, dbname_, cf_descs, &handles, &db_)
      Corruption: SST file is ahead of WALs in CF test_cf
      [  FAILED  ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.CrashDuringRecovery/3, where GetParam() = (false, true) (92 ms)
      [ RUN      ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.TxnDbCrashDuringRecovery/0
      db/corruption_test.cc:1354: Failure
      TransactionDB::Open(options, txn_db_opts, dbname_, cf_descs, &handles, &txn_db)
      Corruption: SST file is ahead of WALs in CF default
      [  FAILED  ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.TxnDbCrashDuringRecovery/0, where GetParam() = (true, false) (94 ms)
      [ RUN      ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.TxnDbCrashDuringRecovery/1
      db/corruption_test.cc:1354: Failure
      TransactionDB::Open(options, txn_db_opts, dbname_, cf_descs, &handles, &txn_db)
      Corruption: SST file is ahead of WALs in CF default
      [  FAILED  ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.TxnDbCrashDuringRecovery/1, where GetParam() = (false, false) (97 ms)
      [ RUN      ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.TxnDbCrashDuringRecovery/2
      db/corruption_test.cc:1354: Failure
      TransactionDB::Open(options, txn_db_opts, dbname_, cf_descs, &handles, &txn_db)
      Corruption: SST file is ahead of WALs in CF default
      [  FAILED  ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.TxnDbCrashDuringRecovery/2, where GetParam() = (true, true) (94 ms)
      [ RUN      ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.TxnDbCrashDuringRecovery/3
      db/corruption_test.cc:1354: Failure
      TransactionDB::Open(options, txn_db_opts, dbname_, cf_descs, &handles, &txn_db)
      Corruption: SST file is ahead of WALs in CF default
      [  FAILED  ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.TxnDbCrashDuringRecovery/3, where GetParam() = (false, true) (91 ms)
      [ RUN      ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.CrashDuringRecoveryWithFlush/0
      db/corruption_test.cc:1483: Failure
      DB::Open(options, dbname_, cf_descs, &handles, &db_)
      Corruption: SST file is ahead of WALs in CF default
      [  FAILED  ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.CrashDuringRecoveryWithFlush/0, where GetParam() = (true, false) (93 ms)
      [ RUN      ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.CrashDuringRecoveryWithFlush/1
      db/corruption_test.cc:1483: Failure
      DB::Open(options, dbname_, cf_descs, &handles, &db_)
      Corruption: SST file is ahead of WALs in CF default
      [  FAILED  ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.CrashDuringRecoveryWithFlush/1, where GetParam() = (false, false) (94 ms)
      [ RUN      ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.CrashDuringRecoveryWithFlush/2
      db/corruption_test.cc:1483: Failure
      DB::Open(options, dbname_, cf_descs, &handles, &db_)
      Corruption: SST file is ahead of WALs in CF default
      [  FAILED  ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.CrashDuringRecoveryWithFlush/2, where GetParam() = (true, true) (90 ms)
      [ RUN      ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.CrashDuringRecoveryWithFlush/3
      db/corruption_test.cc:1483: Failure
      DB::Open(options, dbname_, cf_descs, &handles, &db_)
      Corruption: SST file is ahead of WALs in CF default
      [  FAILED  ] CorruptionTest/CrashDuringRecoveryWithCorruptionTest.CrashDuringRecoveryWithFlush/3, where GetParam() = (false, true) (93 ms)
      [----------] 12 tests from CorruptionTest/CrashDuringRecoveryWithCorruptionTest (1116 ms total)
      
      ```
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9942
      
      Test Plan: Not needed
      
      Reviewed By: riversand963
      
      Differential Revision: D36324112
      
      Pulled By: akankshamahajan15
      
      fbshipit-source-id: cab2075ac4ebe48f5ef93a6ea162558aa4fc334d
      6442a62e
    • Remove slack CircleCI hook (#9982) · e96e8e2d
      Peter Dillinger committed
      Summary:
      Our Slack site is deprecated
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9982
      
      Test Plan: CircleCI
      
      Reviewed By: siying
      
      Differential Revision: D36322050
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 678202404d307e1547e4203d7e6bd467803ccd5e
      e96e8e2d
    • Temporarily disable sync_fault_injection (#9979) · e943bbdd
      Andrew Kryczka committed
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/9979
      
      Reviewed By: siying
      
      Differential Revision: D36301555
      
      Pulled By: ajkr
      
      fbshipit-source-id: ed298d3484b6aad3ef19746e984bf4c52be33a9f
      e943bbdd
    • Reorganize CircleCI workflows (#9981) · e8d604cf
      Peter Dillinger committed
      Summary:
      Condense down to 8 groups rather than 20+ for ease of browsing
      pages like
      https://app.circleci.com/pipelines/github/facebook/rocksdb?branch=main&filter=all
      
      Also, run nightly builds at 1AM or 2AM Pacific (depending on daylight
      time) rather than 4PM or 5PM Pacific, so that they actually use each
      day's landed changes.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9981
      
      Test Plan:
      CI
      And manually inspected
      ```
      grep -Eo 'build-[^: ]*' .circleci/config.yml | sort | uniq -c | less
      ```
      to ensure I didn't orphan anything
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D36317634
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 1c10d29d6b5d60ce3dd1364cd91f175380075ff3
      e8d604cf
  9. 11 May 2022, 2 commits
    • Support single delete in ldb (#9469) · 26768edb
      yaphet committed
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/9469
      
      Reviewed By: riversand963
      
      Differential Revision: D33953484
      
      fbshipit-source-id: f4e84a2d9865957d744c7e84ff02ffbb0a62b0a8
      26768edb
    • Avoid some warnings-as-error in CircleCI+unity+AVX512F (#9978) · 0d1613aa
      Peter Dillinger committed
      Summary:
      Example failure when compiling on sufficiently new hardware with a recent compiler's built-in headers:
      
      ```
      In file included from /usr/local/lib/gcc/x86_64-linux-gnu/12.1.0/include/immintrin.h:49,
                       from ./util/bloom_impl.h:21,
                       from table/block_based/filter_policy.cc:31,
                       from unity.cc:167:
      In function '__m512i _mm512_shuffle_epi32(__m512i, _MM_PERM_ENUM)',
          inlined from 'void XXH3_accumulate_512_avx512(void*, const void*, const void*)' at util/xxhash.h:3605:58,
          inlined from 'void XXH3_accumulate(xxh_u64*, const xxh_u8*, const xxh_u8*, size_t, XXH3_f_accumulate_512)' at util/xxhash.h:4229:17,
          inlined from 'void XXH3_hashLong_internal_loop(xxh_u64*, const xxh_u8*, size_t, const xxh_u8*, size_t, XXH3_f_accumulate_512, XXH3_f_scrambleAcc)' at util/xxhash.h:4251:24,
          inlined from 'XXH128_hash_t XXH3_hashLong_128b_internal(const void*, size_t, const xxh_u8*, size_t, XXH3_f_accumulate_512, XXH3_f_scrambleAcc)' at util/xxhash.h:5065:32,
          inlined from 'XXH128_hash_t XXH3_hashLong_128b_withSecret(const void*, size_t, XXH64_hash_t, const void*, size_t)' at util/xxhash.h:5104:39:
      /usr/local/lib/gcc/x86_64-linux-gnu/12.1.0/include/avx512fintrin.h:4459:50: error: '__Y' may be used uninitialized [-Werror=maybe-uninitialized]
      ```
      
      https://app.circleci.com/pipelines/github/facebook/rocksdb/13295/workflows/1695fb5c-40c1-423b-96b4-45107dc3012d/jobs/360416
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9978
      
      Test Plan:
      I was able to re-run in CircleCI with ssh, see the failure, ssh in and
      verify that adding -fno-avx512f fixed the failure. Will watch build-linux-unity-and-headers
      
      Reviewed By: riversand963
      
      Differential Revision: D36296028
      
      Pulled By: pdillinger
      
      fbshipit-source-id: ba5955cf2ac730f57d1d18c2f517e92f34be77a3
      0d1613aa