    Add memtable per key-value checksum (#10281) · fd165c86
    Committed by Changyu Bi
    Summary:
    Append a per key-value checksum to each internal key in the memtable. These checksums are verified on the read paths (Get and Iterator) and during Flush. Get and Iterator return a `Corruption` status if checksum verification fails. If Flush hits a checksum verification failure on a memtable entry, the DB becomes read-only.
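
To illustrate the idea in the summary, here is a minimal, self-contained sketch of appending a short checksum to an encoded entry and verifying it on read. It is not RocksDB's implementation: the entry layout, the use of a truncated CRC32, and the names `encode_entry`, `decode_entry`, and `CorruptionError` are all assumptions made for illustration. Only the 2-byte width mirrors the `-memtable_protection_bytes_per_key=2` setting benchmarked in the test plan.

```python
import zlib

PROTECTION_BYTES = 2  # mirrors the 2-byte setting benchmarked in this commit


class CorruptionError(Exception):
    """Hypothetical stand-in for a `Corruption` status."""


def checksum(key: bytes, value: bytes, nbytes: int = PROTECTION_BYTES) -> bytes:
    # Length-prefix the key so (b"ab", b"c") and (b"a", b"bc") hash differently.
    payload = len(key).to_bytes(4, "little") + key + value
    # Assumption: truncated CRC32 (so nbytes <= 4); the real hash may differ.
    crc = zlib.crc32(payload) & ((1 << (8 * nbytes)) - 1)
    return crc.to_bytes(nbytes, "little")


def encode_entry(key: bytes, value: bytes) -> bytes:
    # Toy layout: [key_len][key][value_len][value][checksum]
    return (
        len(key).to_bytes(4, "little") + key
        + len(value).to_bytes(4, "little") + value
        + checksum(key, value)
    )


def decode_entry(entry: bytes) -> tuple[bytes, bytes]:
    key_len = int.from_bytes(entry[:4], "little")
    key = entry[4:4 + key_len]
    off = 4 + key_len
    value_len = int.from_bytes(entry[off:off + 4], "little")
    value = entry[off + 4:off + 4 + value_len]
    stored = entry[off + 4 + value_len:]
    # Verify on every read, as Get and Iterator do in the summary above.
    if stored != checksum(key, value):
        raise CorruptionError("per key-value checksum mismatch")
    return key, value


entry = encode_entry(b"user:42", b"hello")
print(decode_entry(entry))
```

Each read recomputes the checksum for the entry it touches, which is consistent with the read-path overhead reported in the benchmarks in the test plan.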
    
    Pull Request resolved: https://github.com/facebook/rocksdb/pull/10281
    
    Test Plan:
    - Added new unit test cases: `make check`
    - Benchmark on memtable insert
    ```
    TEST_TMPDIR=/dev/shm/memtable_write ./db_bench -benchmarks=fillseq -disable_wal=true -max_write_buffer_number=100 -num=10000000 -min_write_buffer_number_to_merge=100
    
    # avg over 10 runs
    Baseline: 1166936 ops/sec
    memtable 2 bytes kv checksum : 1.11674e+06 ops/sec (-4%)
    memtable 2 bytes kv checksum + write batch 8 bytes kv checksum: 1.08579e+06 ops/sec (-6.95%)
    write batch 8 bytes kv checksum: 1.17979e+06 ops/sec (+1.1%)
    ```
    - Benchmark on memtable-only reads: ops/sec for `readseq` dropped ~31% due to time spent verifying checksums; ops/sec for `readrandom` dropped ~6.8%.
    ```
    # Readseq
    sudo TEST_TMPDIR=/dev/shm/memtable_read ./db_bench -benchmarks=fillseq,readseq"[-X20]" -disable_wal=true -max_write_buffer_number=100 -num=10000000 -min_write_buffer_number_to_merge=100
    
    readseq [AVG    20 runs] : 7432840 (± 212005) ops/sec;  822.3 (± 23.5) MB/sec
    readseq [MEDIAN 20 runs] : 7573878 ops/sec;  837.9 MB/sec
    
    With -memtable_protection_bytes_per_key=2:
    
    readseq [AVG    20 runs] : 5134607 (± 119596) ops/sec;  568.0 (± 13.2) MB/sec
    readseq [MEDIAN 20 runs] : 5232946 ops/sec;  578.9 MB/sec
    
    # Readrandom
    sudo TEST_TMPDIR=/dev/shm/memtable_read ./db_bench -benchmarks=fillrandom,readrandom"[-X10]" -disable_wal=true -max_write_buffer_number=100 -num=1000000 -min_write_buffer_number_to_merge=100
    readrandom [AVG    10 runs] : 140236 (± 3938) ops/sec;    9.8 (± 0.3) MB/sec
    readrandom [MEDIAN 10 runs] : 140545 ops/sec;    9.8 MB/sec
    
    With -memtable_protection_bytes_per_key=2:
    readrandom [AVG    10 runs] : 130632 (± 2738) ops/sec;    9.1 (± 0.2) MB/sec
    readrandom [MEDIAN 10 runs] : 130341 ops/sec;    9.1 MB/sec
    ```
    
    - Stress test: `python3 -u tools/db_crashtest.py whitebox --duration=1800`
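
The percentages quoted in the benchmark items above can be re-derived from the reported averages. The short script below is only an arithmetic check; the helper name `pct_change` and the labels are illustrative, while the numbers are copied from the benchmark output.

```python
def pct_change(baseline: float, measured: float) -> float:
    """Relative change of `measured` versus `baseline`, in percent."""
    return (measured - baseline) / baseline * 100.0


# (label, baseline ops/sec, measured ops/sec), taken from the output above.
runs = [
    ("fillseq, memtable 2-byte checksum", 1166936, 1.11674e06),
    ("fillseq, memtable 2-byte + write batch 8-byte", 1166936, 1.08579e06),
    ("fillseq, write batch 8-byte checksum", 1166936, 1.17979e06),
    ("readseq AVG, 2-byte checksum", 7432840, 5134607),
    ("readrandom AVG, 2-byte checksum", 140236, 130632),
]

for label, baseline, measured in runs:
    print(f"{label}: {pct_change(baseline, measured):+.2f}%")
```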
    
    Reviewed By: ajkr
    
    Differential Revision: D37607896
    
    Pulled By: cbi42
    
    fbshipit-source-id: fdaefb475629d2471780d4a5f5bf81b44ee56113