1. 23 Jun 2020, 3 commits
    • Minimize memory internal fragmentation for Bloom filters (#6427) · 5b2bbacb
      Authored by Peter Dillinger
      Summary:
      New experimental option BBTO::optimize_filters_for_memory builds
      filters that maximize their use of "usable size" from malloc_usable_size,
      which is also used to compute block cache charges.
      
      Rather than always "rounding up," we track state in the
      BloomFilterPolicy object to mix essentially "rounding down" and
      "rounding up" so that the average FP rate of all generated filters is
      the same as without the option. (YMMV as heavily accessed filters might
      be unluckily lower accuracy.)
      
      Thus, the option near-minimizes what the block cache considers as
      "memory used" for a given target Bloom filter false positive rate and
      Bloom filter implementation. There are no forward or backward
      compatibility issues with this change, though it only works on the
      format_version=5 Bloom filter.
      
      With Jemalloc, we see about 10% reduction in memory footprint (and block
      cache charge) for Bloom filters, but 1-2% increase in storage footprint,
      due to encoding efficiency losses (FP rate is non-linear with bits/key).
      
      Why not weighted random round up/down rather than state tracking? By
      only requiring malloc_usable_size, we don't actually know what the next
      larger and next smaller usable sizes for the allocator are. We pick a
      requested size, accept and use whatever usable size it has, and use the
      difference to inform our next choice. This allows us to narrow in on the
      right balance without tracking/predicting usable sizes.
      
      Why not weight history of generated filter false positive rates by
      number of keys? This could lead to excess skew in small filters after
      generating a large filter.
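      A minimal C++ sketch of this state tracking, assuming a hypothetical
      `FilterSizer` helper (an illustration of the idea, not RocksDB's actual
      code):

      ```cpp
      #include <malloc.h>  // malloc_usable_size (glibc; jemalloc provides it too)
      #include <cstdint>
      #include <cstdlib>

      // Keep a signed running balance of usable bytes received beyond what the
      // target FP rate called for, and shrink the next request while positive.
      class FilterSizer {
       public:
        // ideal_bytes: the exact size the target FP rate asks for.
        void* Allocate(size_t ideal_bytes, size_t* usable_out) {
          size_t request = ideal_bytes;
          // Earlier filters got surplus usable bytes => request less now so the
          // average across filters stays on target.
          if (balance_ > 0 && static_cast<uint64_t>(balance_) < request) {
            request -= static_cast<size_t>(balance_);
          }
          void* ptr = malloc(request);
          size_t usable = malloc_usable_size(ptr);
          // Surplus (> 0) when the allocator rounded up past ideal; deficit
          // (< 0) when we deliberately rounded down below ideal.
          balance_ +=
              static_cast<int64_t>(usable) - static_cast<int64_t>(ideal_bytes);
          *usable_out = usable;  // the filter then uses all usable bytes
          return ptr;
        }

       private:
        int64_t balance_ = 0;
      };
      ```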
      
      Results from filter_bench with jemalloc (irrelevant details omitted):
      
          (normal keys/filter, but high variance)
          $ ./filter_bench -quick -impl=2 -average_keys_per_filter=30000 -vary_key_count_ratio=0.9
          Build avg ns/key: 29.6278
          Number of filters: 5516
          Total size (MB): 200.046
          Reported total allocated memory (MB): 220.597
          Reported internal fragmentation: 10.2732%
          Bits/key stored: 10.0097
          Average FP rate %: 0.965228
          $ ./filter_bench -quick -impl=2 -average_keys_per_filter=30000 -vary_key_count_ratio=0.9 -optimize_filters_for_memory
          Build avg ns/key: 30.5104
          Number of filters: 5464
          Total size (MB): 200.015
          Reported total allocated memory (MB): 200.322
          Reported internal fragmentation: 0.153709%
          Bits/key stored: 10.1011
          Average FP rate %: 0.966313
      
          (very few keys / filter, optimization not as effective due to ~59 byte
           internal fragmentation in blocked Bloom filter representation)
          $ ./filter_bench -quick -impl=2 -average_keys_per_filter=1000 -vary_key_count_ratio=0.9
          Build avg ns/key: 29.5649
          Number of filters: 162950
          Total size (MB): 200.001
          Reported total allocated memory (MB): 224.624
          Reported internal fragmentation: 12.3117%
          Bits/key stored: 10.2951
          Average FP rate %: 0.821534
          $ ./filter_bench -quick -impl=2 -average_keys_per_filter=1000 -vary_key_count_ratio=0.9 -optimize_filters_for_memory
          Build avg ns/key: 31.8057
          Number of filters: 159849
          Total size (MB): 200
          Reported total allocated memory (MB): 208.846
          Reported internal fragmentation: 4.42297%
          Bits/key stored: 10.4948
          Average FP rate %: 0.811006
      
          (high keys/filter)
          $ ./filter_bench -quick -impl=2 -average_keys_per_filter=1000000 -vary_key_count_ratio=0.9
          Build avg ns/key: 29.7017
          Number of filters: 164
          Total size (MB): 200.352
          Reported total allocated memory (MB): 221.5
          Reported internal fragmentation: 10.5552%
          Bits/key stored: 10.0003
          Average FP rate %: 0.969358
          $ ./filter_bench -quick -impl=2 -average_keys_per_filter=1000000 -vary_key_count_ratio=0.9 -optimize_filters_for_memory
          Build avg ns/key: 30.7131
          Number of filters: 160
          Total size (MB): 200.928
          Reported total allocated memory (MB): 200.938
          Reported internal fragmentation: 0.00448054%
          Bits/key stored: 10.1852
          Average FP rate %: 0.963387
      
      And from db_bench (block cache) with jemalloc:
      
          $ ./db_bench -db=/dev/shm/dbbench.no_optimize -benchmarks=fillrandom -format_version=5 -value_size=90 -bloom_bits=10 -num=2000000 -threads=8 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=false
          $ ./db_bench -db=/dev/shm/dbbench -benchmarks=fillrandom -format_version=5 -value_size=90 -bloom_bits=10 -num=2000000 -threads=8 -optimize_filters_for_memory -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=false
          $ (for FILE in /dev/shm/dbbench.no_optimize/*.sst; do ./sst_dump --file=$FILE --show_properties | grep 'filter block' ; done) | awk '{ t += $4; } END { print t; }'
          17063835
          $ (for FILE in /dev/shm/dbbench/*.sst; do ./sst_dump --file=$FILE --show_properties | grep 'filter block' ; done) | awk '{ t += $4; } END { print t; }'
          17430747
          $ #^ 2.1% additional filter storage
          $ ./db_bench -db=/dev/shm/dbbench.no_optimize -use_existing_db -benchmarks=readrandom,stats -statistics -bloom_bits=10 -num=2000000 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=false -duration=10 -cache_index_and_filter_blocks -cache_size=1000000000
          rocksdb.block.cache.index.add COUNT : 33
          rocksdb.block.cache.index.bytes.insert COUNT : 8440400
          rocksdb.block.cache.filter.add COUNT : 33
          rocksdb.block.cache.filter.bytes.insert COUNT : 21087528
          rocksdb.bloom.filter.useful COUNT : 4963889
          rocksdb.bloom.filter.full.positive COUNT : 1214081
          rocksdb.bloom.filter.full.true.positive COUNT : 1161999
          $ #^ 1.04 % observed FP rate
          $ ./db_bench -db=/dev/shm/dbbench -use_existing_db -benchmarks=readrandom,stats -statistics -bloom_bits=10 -num=2000000 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=false -optimize_filters_for_memory -duration=10 -cache_index_and_filter_blocks -cache_size=1000000000
          rocksdb.block.cache.index.add COUNT : 33
          rocksdb.block.cache.index.bytes.insert COUNT : 8448592
          rocksdb.block.cache.filter.add COUNT : 33
          rocksdb.block.cache.filter.bytes.insert COUNT : 18220328
          rocksdb.bloom.filter.useful COUNT : 5360933
          rocksdb.bloom.filter.full.positive COUNT : 1321315
          rocksdb.bloom.filter.full.true.positive COUNT : 1262999
          $ #^ 1.08 % observed FP rate, 13.6% less memory usage for filters
      
      (Due to specific key density, this example tends to generate filters that are "worse than average" for internal fragmentation. "Better than average" cases can show little or no improvement.)
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6427
      
      Test Plan: unit test added, 'make check' with gcc, clang and valgrind
      
      Reviewed By: siying
      
      Differential Revision: D22124374
      
      Pulled By: pdillinger
      
      fbshipit-source-id: f3e3aa152f9043ddf4fae25799e76341d0d8714e
    • Make EncryptEnv inheritable (#6830) · 1092f19d
      Authored by Matthew Von-Maszewski
      Summary:
      The EncryptEnv class is both declared and defined within env_encryption.cc. This makes it really tough to derive new classes from that base.
      
      This branch moves the declaration of the class to rocksdb/env_encryption.h. The change makes new encryption modules (such as an upcoming OpenSSL AES CTR pull request) possible and straightforward to build.
      
      The only coding change was to add the EncryptEnv object to env_basic_test.cc.
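      As a sketch of what the now-public declarations enable, here is a toy
      cipher deriving from the `BlockCipher` interface in
      rocksdb/env_encryption.h (signatures as they appeared around this
      release; the XOR transform is purely illustrative, not real encryption):

      ```cpp
      #include "rocksdb/env_encryption.h"

      // Toy cipher built on the interface that is now visible in the header.
      class XorBlockCipher : public rocksdb::BlockCipher {
       public:
        size_t BlockSize() override { return 16; }
        rocksdb::Status Encrypt(char* data) override {
          for (size_t i = 0; i < BlockSize(); ++i) {
            data[i] ^= 0x5a;  // placeholder transform, not real crypto
          }
          return rocksdb::Status::OK();
        }
        rocksdb::Status Decrypt(char* data) override {
          return Encrypt(data);  // XOR is its own inverse
        }
      };
      ```

      A provider (e.g. `CTREncryptionProvider`) can then wrap such a cipher and
      be passed to `NewEncryptedEnv`, which is the pattern an AES CTR module
      would follow.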
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6830
      
      Reviewed By: riversand963
      
      Differential Revision: D21706593
      
      Pulled By: ajkr
      
      fbshipit-source-id: 64d2da95a1569ceeb9b1549c3bec5404cf4c89f0
    • Fix double define in IO_tracer (#7007) · d739318b
      Authored by Zhichao Cao
      Summary:
      Fix the following redefinition error in the unity build:

          ./trace_replay/io_tracer.h:20:20: error: redefinition of ‘const unsigned int rocksdb::{anonymous}::kCharSize’
           const unsigned int kCharSize = 1;
                              ^~~~~~~~~
          In file included from unity.cc:177:
          trace_replay/block_cache_tracer.cc:22:20: note: ‘const unsigned int rocksdb::{anonymous}::kCharSize’ previously defined here
           const unsigned int kCharSize = 1;
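      One conventional way to avoid this class of error is to give the constant
      a single definition shared by all translation units, or to scope it under
      a uniquely named namespace (a sketch of the options, not necessarily the
      fix this PR chose; the inline-variable form needs C++17):

      ```cpp
      // Option A (C++17): one definition across all TUs, safe in unity builds.
      inline constexpr unsigned int kCharSize = 1;

      // Option B: avoid the collision by giving the header's constant a
      // namespace unique to that header.
      namespace io_tracer_detail {
      constexpr unsigned int kCharSize = 1;
      }  // namespace io_tracer_detail
      ```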
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7007
      
      Reviewed By: akankshamahajan15
      
      Differential Revision: D22142618
      
      Pulled By: zhichao-cao
      
      fbshipit-source-id: e6dcd51ccc21d1f58df52cdc7a1c88e54cf4f6e8
  2. 20 Jun 2020, 5 commits
    • Remove CircleCI clang build's verbose output (#7000) · 096beb78
      Authored by sdong
      Summary:
      As CircleCI's clang build is stable, the verbose flag is less useful. On the other hand, the long output might create other problems: a non-reproducible failure, "make: write error: stdout", might be related to it.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7000
      
      Test Plan: Watch the run
      
      Reviewed By: pdillinger
      
      Differential Revision: D22118870
      
      fbshipit-source-id: a4157a4282adddcb0c55c0e9e53b2d9ce18bda66
    • Remove an assertion in FlushAfterIntraL0CompactionCheckConsistencyFail (#7003) · dea4063b
      Authored by sdong
      Summary:
      FlushAfterIntraL0CompactionCheckConsistencyFail is flaky. It sometimes fails with:

          db/db_compaction_test.cc:5186: Failure
          Expected equality of these values:
            10
            NumTableFilesAtLevel(0)
              Which is: 3

      I don't see a clear reason why the assertion would always hold, and its necessity is not clear either. Remove it.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7003
      
      Test Plan: Verified the test still builds.
      
      Reviewed By: riversand963
      
      Differential Revision: D22129753
      
      fbshipit-source-id: 42f0bb05e32b369e8d726bfd3e35c29cf52fe008
    • Fix block checksum for >=4GB, refactor (#6978) · 25a0d0ca
      Authored by Peter Dillinger
      Summary:
      Although RocksDB falls over in various other ways with KVs
      around 4GB or more, this change fixes how XXH32 and XXH64 were being
      called by the block checksum code to support >= 4GB, in case that should
      ever happen or the code is copied for other uses.
      
      This change is not a schema compatibility issue because the checksum
      verification code would checksum the first (block_size + 1) mod 2^32
      bytes, while the checksum construction code would checksum the first
      block_size mod 2^32 bytes plus the compression type byte, meaning the
      XXH32/64 checksums for >=4GB blocks would not match about 255/256 times.
      
      While touching this code, I refactored to consolidate redundant
      implementations, improving diagnostics and performance tracking in some
      cases. Also used less confusing language in those diagnostics.
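      A sketch of the safe calling pattern, using the xxHash streaming API so
      lengths stay `size_t` end to end (hypothetical helper; include path and
      details differ from the PR's actual code):

      ```cpp
      #include <cstddef>
      #include <cstdint>

      #include "xxhash.h"

      // Hash a block plus its one-byte compression-type trailer without ever
      // narrowing the length to 32 bits, so blocks >= 4 GiB are handled.
      uint64_t BlockChecksum64(const char* data, size_t block_size,
                               char type_byte) {
        XXH64_state_t* state = XXH64_createState();
        XXH64_reset(state, /*seed=*/0);
        XXH64_update(state, data, block_size);  // size_t, no mod-2^32 wrap
        XXH64_update(state, &type_byte, 1);     // compression type byte
        uint64_t h = XXH64_digest(state);
        XXH64_freeState(state);
        return h;
      }
      ```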
      
      Makes https://github.com/facebook/rocksdb/issues/6875 obsolete.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6978
      
      Test Plan:
      I was able to write a test for this using an SST file writer
      and VerifyChecksum in a reader. The test fails before the fix, though
      I'm leaving the test disabled because I don't think it's worth the
      expense of running regularly.
      
      Reviewed By: gg814
      
      Differential Revision: D22143260
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 982993d16134e8c50bea2269047f901c1783726e
    • minor fixes for stress/crash continuous runs (#7006) · d76eed48
      Authored by Andrew Kryczka
      Summary:
      Avoid using `cf_consistency` together with `enable_compaction_filter` as
      the former heavily uses snapshots while the latter is incompatible with
      snapshots.
      
      Also fix a clang-analyze error for a write to a variable that is never
      read.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7006
      
      Reviewed By: zhichao-cao
      
      Differential Revision: D22141679
      
      Pulled By: ajkr
      
      fbshipit-source-id: 1840ae238168818a9ab5973f90fd78c067399447
    • Remove racially charged terms "whitelist" and "blacklist" (#7008) · 88b42107
      Authored by Peter Dillinger
      Summary:
      We don't need them.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7008
      
      Test Plan: "make check" and ensure "make crash_test" starts
      
      Reviewed By: ajkr
      
      Differential Revision: D22143838
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 72c8e16603abc59f4954e304466bc4dc1f58f94e
  3. 19 Jun 2020, 7 commits
  4. 18 Jun 2020, 2 commits
    • Fix the bug that compressed cache is disabled in read-only DBs (#6990) · 223b57ee
      Authored by sdong
      Summary:
      Compressed block cache was disabled for read-only DBs in https://github.com/facebook/rocksdb/pull/4650 for no good reason. Re-enable it.
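      A sketch of the configuration this fix makes effective again (cache sizes
      and path are placeholders; option names per BlockBasedTableOptions of
      this era):

      ```cpp
      #include <string>

      #include "rocksdb/cache.h"
      #include "rocksdb/db.h"
      #include "rocksdb/table.h"

      // Open a read-only DB with both an uncompressed and a compressed block
      // cache; after this fix the compressed cache is no longer ignored.
      rocksdb::Status OpenReadOnlyWithCompressedCache(const std::string& path,
                                                      rocksdb::DB** db) {
        rocksdb::BlockBasedTableOptions table_options;
        table_options.block_cache = rocksdb::NewLRUCache(64 << 20);
        table_options.block_cache_compressed = rocksdb::NewLRUCache(64 << 20);

        rocksdb::Options options;
        options.table_factory.reset(
            rocksdb::NewBlockBasedTableFactory(table_options));
        return rocksdb::DB::OpenForReadOnly(options, path, db);
      }
      ```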
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6990
      
      Test Plan: Add a unit test to make sure a general function works with read-only DB + compressed block cache.
      
      Reviewed By: ltamasi
      
      Differential Revision: D22072755
      
      fbshipit-source-id: 2a55df6363de23a78979cf6c747526359e5dc7a1
    • Store DB identity and DB session ID in SST files (#6983) · 94d04529
      Authored by Zitan Chen
      Summary:
      `db_id` and `db_session_id` are now part of the table properties for all formats and stored in SST files. This adds about 99 bytes to each new SST file.
      
      The `TablePropertiesNames` for these two identifiers are `rocksdb.creating.db.identity` and `rocksdb.creating.session.identity`.
      
      In addition, SST files generated from SstFileWriter and Repairer have DB identity “SST Writer” and “DB Repairer”, respectively. Their DB session IDs are generated in the same way as `DB::GetDbSessionId`.
      
      A table property test is added.
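      A sketch of reading the new properties back from a file, assuming
      `SstFileReader` plus the `TableProperties` fields this PR adds (`db_id`,
      `db_session_id`):

      ```cpp
      #include <iostream>
      #include <memory>
      #include <string>

      #include "rocksdb/options.h"
      #include "rocksdb/sst_file_reader.h"

      void PrintSstIdentity(const std::string& sst_path) {
        rocksdb::Options options;
        rocksdb::SstFileReader reader(options);
        if (!reader.Open(sst_path).ok()) {
          return;
        }
        std::shared_ptr<const rocksdb::TableProperties> props =
            reader.GetTableProperties();
        std::cout << "db_id: " << props->db_id << "\n"
                  << "db_session_id: " << props->db_session_id << "\n";
      }
      ```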
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6983
      
      Test Plan: make check and some manual tests.
      
      Reviewed By: zhichao-cao
      
      Differential Revision: D22048826
      
      Pulled By: gg814
      
      fbshipit-source-id: afdf8c11424a6f509b5c0b06dafad584a80103c9
  5. 17 Jun 2020, 2 commits
  6. 16 Jun 2020, 3 commits
    • Let best-efforts recovery ignore CURRENT file (#6970) · 9bfd46d0
      Authored by Yanqin Jin
      Summary:
      Best-efforts recovery does not check the content of CURRENT file to determine which MANIFEST to recover from. However, it still checks the presence of CURRENT file to determine whether to create a new DB during `open()`. Therefore, we can tweak the logic in `open()` a little bit so that best-efforts recovery does not rely on CURRENT file at all.
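      A minimal open sketch, assuming the existing `best_efforts_recovery` flag
      in DBOptions:

      ```cpp
      #include <string>

      #include "rocksdb/db.h"

      rocksdb::Status OpenBestEfforts(const std::string& dbname) {
        rocksdb::Options options;
        options.best_efforts_recovery = true;  // recovery path this PR tweaks
        options.create_if_missing = false;
        rocksdb::DB* db = nullptr;
        rocksdb::Status s = rocksdb::DB::Open(options, dbname, &db);
        // With this change, s can be OK even when CURRENT is missing, as long
        // as a usable MANIFEST is found.
        delete db;
        return s;
      }
      ```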
      
      Test plan (dev server):
      make check
      ./db_basic_test --gtest_filter=DBBasicTest.RecoverWithNoCurrentFile
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6970
      
      Reviewed By: anand1976
      
      Differential Revision: D22013990
      
      Pulled By: riversand963
      
      fbshipit-source-id: db552a1868c60ed70e1f7cd252a3a076eb8ea58f
    • Fix uninitialized memory read in table_test (#6980) · aa8f1331
      Authored by Levi Tamasi
      Summary:
      When using parameterized tests, `gtest` sometimes prints the test
      parameters. If no other printing method is available, it essentially
      produces a hex dump of the object. This can cause issues with valgrind
      with types like `TestArgs` in `table_test`, where the object layout has
      gaps (with uninitialized contents) due to the members' alignment
      requirements. The patch fixes the uninitialized reads by providing an
      `operator<<` for `TestArgs` and also makes sure all members are
      initialized (in a consistent order) on all code paths.
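      A sketch of the printer pattern (the member names are stand-ins, not
      table_test's actual `TestArgs` fields):

      ```cpp
      #include <cstdint>
      #include <ostream>

      // Giving the parameter struct a printer keeps gtest from falling back to
      // a raw byte dump, which may read uninitialized padding under valgrind.
      struct TestArgs {
        bool reverse_compare = false;
        int restart_interval = 16;
        uint32_t format_version = 5;
      };

      std::ostream& operator<<(std::ostream& os, const TestArgs& args) {
        return os << "reverse_compare: " << args.reverse_compare
                  << ", restart_interval: " << args.restart_interval
                  << ", format_version: " << args.format_version;
      }
      ```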
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6980
      
      Test Plan: `valgrind --leak-check=full ./table_test`
      
      Reviewed By: siying
      
      Differential Revision: D22045536
      
      Pulled By: ltamasi
      
      fbshipit-source-id: 6f5920ac28c712d0aa88162fffb80172ed769c32
    • Add a DB Session ID (#6959) · 88db97b0
      Authored by Zitan Chen
      Summary:
      Added DB::GetDbSessionId, using the same format and machinery as DB::GetDbIdentity.
      The DB Session ID is generated (and therefore updated) each time a DB object is opened. It is written to the LOG file right after the “DB SUMMARY” line.
      A test of uniqueness, both across different openings and within the same opening, is also added.
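      A minimal usage sketch (assuming the signature mirrors `GetDbIdentity`,
      which fills an output string):

      ```cpp
      #include <string>

      #include "rocksdb/db.h"

      // The identity is stable for the DB's lifetime; the session ID is
      // regenerated each time the DB is opened.
      void LogIds(rocksdb::DB* db) {
        std::string db_id;
        std::string session_id;
        db->GetDbIdentity(db_id);
        db->GetDbSessionId(session_id);
        // e.g. emit both to application logs for correlation with LOG
      }
      ```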
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6959
      
      Test Plan: Passed make check
      
      Reviewed By: zhichao-cao
      
      Differential Revision: D21951721
      
      Pulled By: gg814
      
      fbshipit-source-id: 958a48a612db49a39998ea703cded45987d3fa8b
  7. 14 Jun 2020, 2 commits
    • Fix persistent cache on Windows (#6932) · 9c24a5cb
      Authored by Zhen Li
      Summary:
      The persistent cache feature caused a RocksDB crash on Windows. I posted an issue for it: https://github.com/facebook/rocksdb/issues/6919. This happens because no "persistent_cache_key_prefix" is generated for the persistent cache. Looking at the repo history, "GetUniqueIdFromFile" is not implemented on Windows, so the fix adds a "NewId()" function to "persistent_cache" and uses it to generate the prefix for persistent cache keys. This PR also re-enables the related test cases defined in "db_test2" and "persistent_cache_test" on Windows.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6932
      
      Test Plan:
      1. Ran the related test cases in "db_test2" and "persistent_cache_test" on Windows and saw them pass.
      2. Manually ran db_bench.exe with "read_cache_path" and verified the behavior.
      
      Reviewed By: riversand963
      
      Differential Revision: D21911608
      
      Pulled By: cheng-chang
      
      fbshipit-source-id: cdfd938d54a385edbb2836b13aaa1d39b0a6f1c2
    • Make it possible to lower CPU priority to a specific level in the thread pool (#6969) · f7613e2a
      Authored by Cheng Chang
      Summary:
      `Env::LowerThreadPoolCPUPriority` now takes a new parameter `CpuPriority`, making it possible to lower the pool to a specific priority such as `CpuPriority::kIdle`. Previously, the priority was always lowered to `CpuPriority::kLow`.
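      A usage sketch of the extended call (treat the exact enum spelling as an
      assumption):

      ```cpp
      #include "rocksdb/env.h"

      // Drop the low-priority (compaction) thread pool all the way to idle.
      void LowerCompactionPool(rocksdb::Env* env) {
        env->LowerThreadPoolCPUPriority(rocksdb::Env::LOW,
                                        rocksdb::CpuPriority::kIdle);
      }
      ```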
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6969
      
      Test Plan: unit test `EnvPosixTest::LowerThreadPoolCpuPriority` added to `env_test.cc`.
      
      Reviewed By: siying
      
      Differential Revision: D22011169
      
      Pulled By: cheng-chang
      
      fbshipit-source-id: 568878c24a924912e35cef00c552d4a63431cdf4
  8. 13 Jun 2020, 7 commits
    • Add stress test for best-efforts recovery (#6819) · 15d9f28d
      Authored by Yanqin Jin
      Summary:
      Add crash test for the case of best-efforts recovery.
      After a certain amount of time, we kill the db_stress process, randomly delete some table files, and restart db_stress. Given the randomness of the file deletion, it is difficult to verify data correctness against a reference. Therefore, we just check that the db can restart successfully.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6819
      
      Test Plan:
      ```
      ./db_stress -best_efforts_recovery=true -disable_wal=1 -reopen=0
      ./db_stress -best_efforts_recovery=true -disable_wal=0 -skip_verifydb=1 -verify_db_one_in=0 -continuous_verification_interval=0
      make crash_test_with_best_efforts_recovery
      ```
      
      Reviewed By: anand1976
      
      Differential Revision: D21436753
      
      Pulled By: riversand963
      
      fbshipit-source-id: 0b3605c922a16c37ed17d5ab6682ca4240e47926
    • Turn HarnessTest in table_test into a parameterized test (#6974) · bacd6edc
      Authored by Levi Tamasi
      Summary:
      `HarnessTest` in `table_test.cc` currently tests many parameter
      combinations sequentially in a loop. This is problematic from
      a testing perspective, since if the test fails, we have no way of
      knowing how many/which combinations have failed. It can also cause timeouts on
      our test system due to the sheer number of combinations tested.
      (Specifically, the parallel compression threads parameter added by
      https://github.com/facebook/rocksdb/pull/6262 seems to have been the last straw.)
      There is some DIY code there that splits the load among eight test cases
      but that does not appear to be sufficient anymore.
      
      Instead, the patch turns `HarnessTest` into a parameterized test, so all the
      parameter combinations can be tested separately and potentially
      concurrently. It also cleans up the tests a little, fixes
      `RandomizedLongDB`, which did not get updated when the parallel
      compression threads parameter was added, and turns `FooterTests` into a
      standalone test case (since it does not actually need a fixture class).
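      A sketch of the pattern (the parameter list is illustrative, not
      `HarnessTest`'s actual one): each combination becomes an independently
      runnable, independently failing test instance.

      ```cpp
      #include <tuple>

      #include "gtest/gtest.h"

      class HarnessTest
          : public testing::TestWithParam<
                std::tuple<int /*restart_interval*/, int /*parallel_threads*/>> {
      };

      TEST_P(HarnessTest, Randomized) {
        const int restart_interval = std::get<0>(GetParam());
        const int parallel_threads = std::get<1>(GetParam());
        // ... build a table with these settings and verify reads ...
        (void)restart_interval;
        (void)parallel_threads;
      }

      // Every combination gets its own named instance that can be run (and can
      // fail) in isolation.
      INSTANTIATE_TEST_CASE_P(TableTest, HarnessTest,
                              testing::Combine(testing::Values(1, 16, 1024),
                                               testing::Values(1, 4)));
      ```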
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6974
      
      Test Plan: `make check`
      
      Reviewed By: siying
      
      Differential Revision: D22029572
      
      Pulled By: ltamasi
      
      fbshipit-source-id: 51baea670771c33928f2eb3902bd69dcf540aa41
    • Reduce test coverage in older VS versions (#6966) · 7e2ac0c3
      Authored by sdong
      Summary:
      With Appveyor we run the same set of tests on older versions of VS as on the latest version. They create extra hangs that we don't plan to investigate. Instead, minimize the tests run there. The full tests on Windows are already covered in CircleCI.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6966
      
      Test Plan: Watch Appveyor runs.
      
      Reviewed By: pdillinger
      
      Differential Revision: D22025383
      
      fbshipit-source-id: 079dff9e8213bc750a47f4add90fdbf18de9d737
    • Fix build with 'USE_HDFS' on Windows (#6950) · d63f86e5
      Authored by Zhen Li
      Summary:
      Build with "USE_HDFS" failed with below errors on Windows. This PR is trying to fix them
      Severity	Code	Description	Project	File	Line	Suppression State
      Error (active)	E0020	identifier "ssize_t" is undefined	rocksdb	D:\Git\rocksdb\rocksdb\env\env_hdfs.cc	127
      Error (active)	E1696	cannot open source file "sys/time.h"	rocksdb	D:\Git\rocksdb\rocksdb\env\env_hdfs.cc	15
      Error	C2065	'pthread_t': undeclared identifier	rocksdb	d:\git\rocksdb\rocksdb\hdfs\env_hdfs.h	166
      Error	C3861	'pthread_self': identifier not found	rocksdb	d:\git\rocksdb\rocksdb\hdfs\env_hdfs.h	167
      Error	C1083	Cannot open include file: 'sys/time.h': No such file or directory	rocksdb	d:\git\rocksdb\rocksdb\env\env_hdfs.cc	15
      Error	C2065	'pthread_t': undeclared identifier	db_bench	d:\git\rocksdb\rocksdb\hdfs\env_hdfs.h	166
      Error	C3861	'pthread_self': identifier not found	db_bench	d:\git\rocksdb\rocksdb\hdfs\env_hdfs.h	167
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6950
      
      Test Plan:
      1. Manually tested the build with "USE_HDFS" on Windows and verified HDFS Env related functionality via db_bench.exe:
          D:\Git\rocksdb\build\Debug>db_bench.exe --hdfs="abfs://test@rdbtest2.dfs.core.windows.net" --num=100 --benchmarks="fillseq,readseq,fillseekseq" --db="abfs://test@rdbtest2.dfs.core.windows.net/test"
          2020-06-05 20:42:21,102 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
          2020-06-05 20:42:22,646 WARN utils.SSLSocketFactoryEx: Failed to load OpenSSL. Falling back to the JSSE default.
          Initializing RocksDB Options from the specified file
          Initializing RocksDB Options from command-line flags
          RocksDB:    version 6.10
          Keys:       16 bytes each
          Values:     100 bytes each (50 bytes after compression)
          Entries:    100
          Prefix:    0 bytes
          Keys per prefix:    0
          RawSize:    0.0 MB (estimated)
          FileSize:   0.0 MB (estimated)
          Write rate: 0 bytes/second
          Read rate: 0 ops/second
          Compression: Snappy
          Compression sampling rate: 0
          Memtablerep: skip_list
          Perf Level: 1
          WARNING: Assertions are enabled; benchmarks unnecessarily slow
          ------------------------------------------------
          Initializing RocksDB Options from the specified file
          Initializing RocksDB Options from command-line flags
          DB path: [abfs://test@rdbtest2.dfs.core.windows.net/test]
          fillseq      :    1138.350 micros/op 877 ops/sec;    0.1 MB/s
          DB path: [abfs://test@rdbtest2.dfs.core.windows.net/test]
          readseq      :      63.580 micros/op 15627 ops/sec;    1.7 MB/s
          DB path: [abfs://test@rdbtest2.dfs.core.windows.net/test]
          fillseekseq  :      45.615 micros/op 21762 ops/sec;
      
      Reviewed By: cheng-chang
      
      Differential Revision: D21964806
      
      Pulled By: riversand963
      
      fbshipit-source-id: 9d7413178ece0113d11bc4398583f7d0590d5dbd
    • Make CircleCI's clang build really use clang (#6965) · 9810f400
      Authored by sdong
      Summary:
      CircleCI's clang build has a bug: it doesn't actually use clang. Fix it.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6965
      
      Test Plan: See CI results.
      
      Reviewed By: pdillinger
      
      Differential Revision: D22025355
      
      fbshipit-source-id: e86922b9152e9f5732e5099d0ce41da9226ff806
    • Update HISTORY.md for 6.11 release (#6972) · af58d927
      Authored by Andrew Kryczka
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/6972
      
      Reviewed By: zhichao-cao
      
      Differential Revision: D22021953
      
      Pulled By: ajkr
      
      fbshipit-source-id: 4debbafe45b5939fd28549230eebf6006eb43440
    • Maintain the set of linked SSTs in BlobFileMetaData (#6945) · 83833637
      Authored by Levi Tamasi
      Summary:
      The `FileMetaData` objects associated with table files already contain the
      number of the oldest blob file referenced by the SST in question. This patch
      adds the inverse mapping to `BlobFileMetaData`, namely the set of table file
      numbers for which the oldest blob file link points to the given blob file (these
      are referred to as *linked SSTs*). This mapping will be used by the GC logic.
      
      Implementation-wise, the patch builds on the `BlobFileMetaDataDelta`
      functionality introduced in https://github.com/facebook/rocksdb/pull/6835: newly linked/unlinked SSTs are
      accumulated in `BlobFileMetaDataDelta`, and the changes to the linked SST set
      are applied in one shot when the new `Version` is saved. The patch also reworks
      the blob file related consistency checks in `VersionBuilder` so they validate the
      consistency of the forward table file -> blob file links and the backward blob file ->
      table file links for blob files that are part of the `Version`.
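      A sketch of just the inverse mapping (the real `BlobFileMetaData` carries
      more state, and updates are batched through `BlobFileMetaDataDelta` when
      a `Version` is saved):

      ```cpp
      #include <cstdint>
      #include <unordered_set>

      class BlobFileMetaData {
       public:
        // An SST whose oldest-blob-file link points here was added.
        void LinkSst(uint64_t sst_file_number) {
          linked_ssts_.insert(sst_file_number);
        }
        // Such an SST was deleted from the Version.
        void UnlinkSst(uint64_t sst_file_number) {
          linked_ssts_.erase(sst_file_number);
        }
        // An empty set means no live SST references this blob file, which is
        // what the GC logic will look for.
        const std::unordered_set<uint64_t>& GetLinkedSsts() const {
          return linked_ssts_;
        }

       private:
        std::unordered_set<uint64_t> linked_ssts_;
      };
      ```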
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6945
      
      Test Plan: `make check`
      
      Reviewed By: riversand963
      
      Differential Revision: D21912228
      
      Pulled By: ltamasi
      
      fbshipit-source-id: c5bc7acf6e729a8fccbb12672dd5cd00f6f000f8
  9. 12 Jun 2020, 4 commits
    • Fail point-in-time WAL recovery upon IOError reading WAL (#6963) · 717749f4
      Authored by Yanqin Jin
      Summary:
      If `options.wal_recovery_mode == WALRecoveryMode::kPointInTimeRecovery`, RocksDB stops replaying the WAL once it hits an error and discards the rest of the WAL. This can lead to data loss if the error occurs at an offset smaller than the last synced offset.
      Ideally, RocksDB point-in-time recovery should permit recovery if the error occurs after the last synced offset, and fail recovery if the error occurs before it. However, RocksDB does not track the synced offsets of WALs, so it does not know whether an error occurs before or after the last synced offset. An error can be one of the following.
      - WAL record checksum mismatch. This can result both from corruption of synced data and from dropping of unsynced data during shutdown; we cannot be sure which. In order not to defeat the original motivation to permit the latter case, we keep the original behavior of point-in-time WAL recovery.
      - IOError. This means the WAL can be bad: an indicator that the whole file may be unavailable, synced portion included. Therefore, we choose to modify the behavior of point-in-time recovery and fail the database recovery.
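      A configuration sketch of the affected mode (option names as in
      rocksdb/options.h):

      ```cpp
      #include "rocksdb/options.h"

      rocksdb::Options MakePointInTimeOptions() {
        rocksdb::Options options;
        options.wal_recovery_mode =
            rocksdb::WALRecoveryMode::kPointInTimeRecovery;
        // After this PR: a record checksum mismatch still truncates replay at
        // that offset, but an IOError while reading the WAL fails DB::Open.
        return options;
      }
      ```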
      
      Test plan (devserver):
      make check
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6963
      
      Reviewed By: ajkr
      
      Differential Revision: D22011083
      
      Pulled By: riversand963
      
      fbshipit-source-id: f9cbf29a37dc5cc40d3fa62f89eed1ad67ca1536
    • Revisit the handling of the case when a file is re-added to the same level (#6939) · d854abad
      Authored by Levi Tamasi
      Summary:
      https://github.com/facebook/rocksdb/pull/6901 subtly changed the handling of the corner case
      when a table file is deleted from a level, then re-added to the same level. (Note: this
      should be extremely rare; one scenario that comes to mind is a trivial move followed by
      a call to `ReFitLevel` that moves the file back to the original level.) Before that change,
      a new `FileMetaData` object was created as a result of this sequence; after the change,
      the original `FileMetaData` was essentially resurrected (since the deletion and the addition
      simply cancel each other out with the change). This patch restores the original behavior,
      which is more intuitive considering the interface, and in sync with how trivial moves are handled.
      (Also note that `FileMetaData` contains some mutable data members, the values of which
      might be different in the resurrected object and the freshly created one.)
      The PR also fixes a bug in this area: with the original pre-6901 code, `VersionBuilder`
      would add the same file twice to the same level in the scenario described above.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6939
      
      Test Plan: `make check`
      
      Reviewed By: ajkr
      
      Differential Revision: D21905580
      
      Pulled By: ltamasi
      
      fbshipit-source-id: da07ae45384ecf3c6c53506d106432d88a7ec9df
    • Turn DBTest2.CompressionFailures into a parameterized test (#6968) · 722ebba8
      Authored by Levi Tamasi
      Summary:
      `DBTest2.CompressionFailures` currently tests many configurations
      sequentially using nested loops, which often leads to timeouts
      in our test system. The patch turns it into a parameterized test
      instead.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6968
      
      Test Plan: `make check`
      
      Reviewed By: siying
      
      Differential Revision: D22006954
      
      Pulled By: ltamasi
      
      fbshipit-source-id: f71f2f7108086b7651ecfce3d79a7fab24620b2c
    • Ingest SST files with checksum information (#6891) · b3585a11
      Authored by Zhichao Cao
      Summary:
      Applications can ingest SST files with file checksum information, such that during ingestion the DB is able to check data integrity and the identity of the SST file. The PR introduces generate_and_verify_file_checksum to IngestExternalFileOptions to control whether the ingested checksum information should be verified against a freshly generated checksum.

          1. If the generate_and_verify_file_checksum option is *FALSE*: *1)* if the DB does not enable SST file checksums, the ingested checksum information is ignored; *2)* if the DB enables SST file checksums and the checksum function name matches the one used by the DB, we trust the ingested checksum and store it in the Manifest. If the checksum function name does not match, we treat that as an error and fail the IngestExternalFile() call.
          2. If the generate_and_verify_file_checksum option is *TRUE*: *1)* if the DB does not enable SST file checksums, the ingested checksum information is ignored; *2)* if the DB enables SST file checksums, we use the DB's checksum generator to calculate the checksum for each ingested SST file after it is copied or moved. Then we compare the result with the ingested checksum information: _A)_ if the checksum function names do not match, verification always reports true and we store the DB-generated checksum information in the Manifest; _B)_ if the names match and the checksums match, ingestion continues and the checksum information is stored in the Manifest. Otherwise, file ingestion is terminated and file corruption is reported.
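      A call-site sketch. The summary names the option
      generate_and_verify_file_checksum; the field spelling below,
      `verify_file_checksum`, is an assumption about the shipped API.

      ```cpp
      #include <string>
      #include <vector>

      #include "rocksdb/db.h"

      // Ingest one external SST and ask the DB to verify its checksum metadata.
      rocksdb::Status IngestWithChecksumVerification(rocksdb::DB* db,
                                                     const std::string& sst_path) {
        rocksdb::IngestExternalFileOptions ifo;
        ifo.verify_file_checksum = true;  // assumed field name; see summary
        return db->IngestExternalFile(db->DefaultColumnFamily(), {sst_path},
                                      ifo);
      }
      ```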
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6891
      
      Test Plan: added unit test, pass make asan_check
      
      Reviewed By: pdillinger
      
      Differential Revision: D21935988
      
      Pulled By: zhichao-cao
      
      fbshipit-source-id: 7b55f486632db467e76d72602218d0658aa7f6ed
  10. 11 Jun 2020, 2 commits
    • Use a per-thread path for the export directory in import_column_family_test (#6962) · fbe2d259
      Authored by Levi Tamasi
      Summary:
      This is required so that the test cases can safely be run in parallel.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6962
      
      Test Plan: `make check`
      
      Reviewed By: zhichao-cao
      
      Differential Revision: D21980060
      
      Pulled By: ltamasi
      
      fbshipit-source-id: 616b7a0b686155d3874848b9098c67ad3f47efcc
    • Save a key comparison in block seeks (#6646) · e6be168a
      Authored by Andrew Kryczka
      Summary:
      This saves up to two key comparisons in block seeks. The first key
      comparison saved is a redundant key comparison against the restart key
      where the linear scan starts. This comparison is saved in all cases
      except when the found key is in the first restart interval. The
      second key comparison saved is a redundant key comparison against the
      restart key where the linear scan ends. This is only saved in cases
      where all keys in the restart interval are less than the target
      (probability roughly `1/restart_interval`).
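      A minimal sketch of the idea on a sorted array standing in for one
      restart interval (not the actual block iterator code):

      ```cpp
      #include <string>
      #include <vector>

      // Binary search over restart points has already established
      // keys[lo] < target <= keys[hi], so the linear scan can start at lo + 1
      // and stop before hi without repeating either comparison.
      int SeekInInterval(const std::vector<std::string>& keys, int lo, int hi,
                         const std::string& target) {
        for (int i = lo + 1; i < hi; ++i) {
          if (keys[i] >= target) {
            return i;
          }
        }
        return hi;
      }
      ```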
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6646
      
      Test Plan:
      ran a benchmark with mostly default settings and counted key comparisons
      
      before: `user_key_comparison_count = 19399529`
      after: `user_key_comparison_count = 18431498`
      
      setup command:
      
      ```
      $ TEST_TMPDIR=/dev/shm/dbbench ./db_bench -benchmarks=fillrandom,compact -write_buffer_size=1048576 -target_file_size_base=1048576 -max_bytes_for_level_base=4194304 -max_background_jobs=12 -level_compaction_dynamic_level_bytes=true -num=10000000
      ```
      
      benchmark command:
      
      ```
      $ TEST_TMPDIR=/dev/shm/dbbench/ ./db_bench -use_existing_db=true -benchmarks=readrandom -disable_auto_compactions=true -num=10000000 -compression_type=none -reads=1000000 -perf_level=3
      ```
      
      Reviewed By: pdillinger
      
      Differential Revision: D20849707
      
      Pulled By: ajkr
      
      fbshipit-source-id: 1f01c5cd99ea771fd27974046e37b194f1cdcfac
  11. 10 Jun 2020, 3 commits