1. 23 Dec, 2020 3 commits
    • S
      Apply the changes from: PS-5501 : Re-license PerconaFT 'locktree' to Apache V2 (#7801) · 10220909
      Sergei Petrunia committed
      Summary:
      commit d5178f513c0b4144a5ac9358ec0f6a3b54a28e76
      Author: George O. Lorch III <george.lorch@percona.com>
      Date:   Tue Mar 19 12:18:40 2019 -0700
      
          PS-5501 : Re-license PerconaFT 'locktree' to Apache V2
      
          - Fixed some incomplete relicensed files from previous round.
      
          - Added missing license text to some.
      
          - Relicensed more files to Apache V2 that locktree depends on.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7801
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D25682430
      
      Pulled By: cheng-chang
      
      fbshipit-source-id: deb8a0de3e76f3638672997bfbd300e2fffbe5f5
    • S
      Minimize Timing Issue in test WALTrashCleanupOnOpen (#7796) · 9057d0a0
      sdong committed
      Summary:
      We saw DBWALTestWithParam/DBWALTestWithParam.WALTrashCleanupOnOpen sometimes fail with:
      
      db/db_sst_test.cc:575: Failure
      Expected: (trash_log_count) >= (1), actual: 0 vs 1
      
      The suspicion is that delete scheduling actually deleted all the trash files based on the rate, which the test does not expect. This can be reproduced by manually adding a sleep of several seconds after the DB is closed. Minimize the chance by setting the delete rate to the lowest possible value.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7796
      
      Test Plan: The test doesn't fail with the manual sleeping anymore
      
      Reviewed By: anand1976
      
      Differential Revision: D25675000
      
      fbshipit-source-id: a39fd05e1a83719c41014e48843792e752368e22
    • A
      Add tests in ASSERT_STATUS_CHECKED (#7793) · fbac1b3f
      Akanksha Mahajan committed
      Summary:
      add io_tracer_parser_test and prefetch_test under
      ASSERT_STATUS_CHECKED
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7793
      
      Test Plan: ASSERT_STATUS_CHECKED=1 make check -j64
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D25673464
      
      Pulled By: akankshamahajan15
      
      fbshipit-source-id: 50e0b6f17160ddda206a521a7b47ee33e699a2d4
  2. 22 Dec, 2020 3 commits
  3. 20 Dec, 2020 2 commits
    • J
      Update release version to 6.16 (#7782) · a8aeefd0
      Jay Zhuang committed
      Summary:
      Update release version to 6.16
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7782
      
      Reviewed By: siying
      
      Differential Revision: D25648579
      
      Pulled By: jay-zhuang
      
      fbshipit-source-id: c536d606868b95c5fb2ae8f19c17eb259d67bc51
    • P
      aggregated-table-properties with GetMapProperty (#7779) · 4d1ac19e
      Peter Dillinger committed
      Summary:
      So that we can more easily get aggregate live table data such
      as total filter, index, and data sizes.
      
      Also adds ldb support for getting properties
      
      Also fixed some missing/inaccurate related comments in db.h
      
      For example:
      
          $ ./ldb --db=testdb get_property rocksdb.aggregated-table-properties
          rocksdb.aggregated-table-properties.data_size: 102871
          rocksdb.aggregated-table-properties.filter_size: 0
          rocksdb.aggregated-table-properties.index_partitions: 0
          rocksdb.aggregated-table-properties.index_size: 2232
          rocksdb.aggregated-table-properties.num_data_blocks: 100
          rocksdb.aggregated-table-properties.num_deletions: 0
          rocksdb.aggregated-table-properties.num_entries: 15000
          rocksdb.aggregated-table-properties.num_merge_operands: 0
          rocksdb.aggregated-table-properties.num_range_deletions: 0
          rocksdb.aggregated-table-properties.raw_key_size: 288890
          rocksdb.aggregated-table-properties.raw_value_size: 198890
          rocksdb.aggregated-table-properties.top_level_index_size: 0
          $ ./ldb --db=testdb get_property rocksdb.aggregated-table-properties-at-level1
          rocksdb.aggregated-table-properties-at-level1.data_size: 80909
          rocksdb.aggregated-table-properties-at-level1.filter_size: 0
          rocksdb.aggregated-table-properties-at-level1.index_partitions: 0
          rocksdb.aggregated-table-properties-at-level1.index_size: 1787
          rocksdb.aggregated-table-properties-at-level1.num_data_blocks: 81
          rocksdb.aggregated-table-properties-at-level1.num_deletions: 0
          rocksdb.aggregated-table-properties-at-level1.num_entries: 12466
          rocksdb.aggregated-table-properties-at-level1.num_merge_operands: 0
          rocksdb.aggregated-table-properties-at-level1.num_range_deletions: 0
          rocksdb.aggregated-table-properties-at-level1.raw_key_size: 238210
          rocksdb.aggregated-table-properties-at-level1.raw_value_size: 163414
          rocksdb.aggregated-table-properties-at-level1.top_level_index_size: 0
          $
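
The `name: value` lines printed above are easy to post-process. The following is a minimal sketch, assuming that output format; the helper name `parse_properties` and the chosen subset of lines are illustrative and not part of ldb or RocksDB.

```python
def parse_properties(text):
    """Parse `name: value` lines into a dict, skipping blanks and shell prompts."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("$"):
            continue
        name, sep, value = line.rpartition(": ")
        if not sep:
            continue  # not a property line
        props[name] = int(value)
    return props


# A few lines copied from the example output above.
sample = """\
rocksdb.aggregated-table-properties.data_size: 102871
rocksdb.aggregated-table-properties.filter_size: 0
rocksdb.aggregated-table-properties.index_size: 2232
rocksdb.aggregated-table-properties.num_entries: 15000
"""

props = parse_properties(sample)
prefix = "rocksdb.aggregated-table-properties."
# Sum of data, filter, and index block sizes across live tables.
total_block_bytes = sum(
    props[prefix + key] for key in ("data_size", "filter_size", "index_size")
)
print(total_block_bytes)
```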
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7779
      
      Test Plan: Added a test to ldb_test.py
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D25653103
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 2905469a08a64dd6b5510cbd7be2e64d3234d6d3
  4. 19 Dec, 2020 4 commits
    • C
      Track WAL obsoletion when updating empty CF's log number (#7781) · fbce7a38
      Cheng Chang committed
      Summary:
      In the write path, there is an optimization: when a new WAL is created during SwitchMemtable, we update the internal log number of the empty column families to the new WAL. `FindObsoleteFiles` marks a WAL as obsolete if the WAL's log number is less than `VersionSet::MinLogNumberWithUnflushedData`. After updating the empty column families' internal log number, `VersionSet::MinLogNumberWithUnflushedData` might change, so some WALs might become obsolete and eligible to be purged from disk.
      
      For example, consider there are 3 column families: 0, 1, 2:
      1. initially, all the column families' log number is 1;
      2. write some data to cf0, and flush cf0, but the flush is pending;
      3. now a new WAL 2 is created;
      4. write data to cf1 and WAL 2, now cf0's log number is 1, cf1's log number is 2, cf2's log number is 2 (because cf1 and cf2 are empty, so their log numbers will be set to the highest log number);
      5. now cf0's flush hasn't finished, flush cf1, a new WAL 3 is created, and cf1's flush finishes, now cf0's log number is 1, cf1's log number is 3, cf2's log number is 3, since WAL 1 still contains data for the unflushed cf0, no WAL can be deleted from disk;
      6. now cf0's flush finishes, cf0's log number is 2 (because when cf0 was switching memtable, WAL 3 does not exist yet), cf1's log number is 3, cf2's log number is 3, so WAL 1 can be purged from disk now, but WAL 2 still cannot because `MinLogNumberToKeep()` is 2;
      7. write data to cf2 and WAL 3, because cf0 is empty, its log number is updated to 3, so now cf0's log number is 3, cf1's log number is 3, cf2's log number is 3;
      8. now if the background threads want to purge obsolete files from disk, WAL 2 can be purged because `MinLogNumberToKeep()` is 3. But there are only two flush results written to MANIFEST: the first is for flushing cf1, and the `MinLogNumberToKeep` is 1, the second is for flushing cf0, and the `MinLogNumberToKeep` is 2. So without this PR, if the DB crashes at this point and tries to recover, `WalSet` will still expect WAL 2 to exist.
      
      When WAL tracking is enabled, we assume WALs will only become obsolete after a flush result is written to MANIFEST in `MemtableList::TryInstallMemtableFlushResults` (or its atomic flush counterpart). The above situation breaks this assumption.
      
      This PR tracks WAL obsoletion if necessary before updating the empty column families' log numbers.
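
The scenario above can be replayed with a toy model. This is only a sketch under simplifying assumptions: the minimum log number to keep is modelled as the minimum over the column families' log numbers, and the "MANIFEST" is a single variable that is updated only when a flush result is recorded. All names are hypothetical and none of this is RocksDB code.

```python
def min_log_number(cf_log_numbers):
    """Smallest log number still needed by any column family (toy model)."""
    return min(cf_log_numbers.values())


wals_on_disk = {1, 2, 3}

# Step 5: cf1's flush finishes while cf0's flush is still pending.
cf = {0: 1, 1: 3, 2: 3}
manifest_min_log = min_log_number(cf)  # recorded with cf1's flush result: 1

# Step 6: cf0's flush finishes.
cf[0] = 2
manifest_min_log = min_log_number(cf)  # recorded with cf0's flush result: 2

# Step 7: cf0 is empty, so its log number is bumped to 3 with no flush,
# and therefore nothing new is recorded in the model MANIFEST.
cf[0] = 3
in_memory_min_log = min_log_number(cf)  # 3

# WALs the background purge may delete, based on in-memory state.
purgeable = sorted(w for w in wals_on_disk if w < in_memory_min_log)
# WALs that recovery would expect, based on what the MANIFEST recorded.
expected_on_recovery = sorted(w for w in wals_on_disk if w >= manifest_min_log)
# WALs that can be purged but are still expected after a crash.
missing_after_crash = [w for w in expected_on_recovery if w in purgeable]
print(missing_after_crash)
```

In this model the mismatch is exactly WAL 2, which matches step 8 of the walk-through.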
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7781
      
      Test Plan:
      watch existing tests and stress tests to pass.
      `make -j48 blackbox_crash_test` on devserver
      
      Reviewed By: ltamasi
      
      Differential Revision: D25631695
      
      Pulled By: cheng-chang
      
      fbshipit-source-id: ca7fff967bdb42204b84226063d909893bc0a4ec
    • A
      Fix various small build issues, Java API naming (#7776) · 62afa968
      Adam Retter committed
      Summary:
      * Compatibility with older GCC.
      * Compatibility with older jemalloc libraries.
      * Remove Docker warning when building i686 binaries.
      * Fix case inconsistency in Java API naming (potential update to HISTORY.md deferred)
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7776
      
      Reviewed By: akankshamahajan15
      
      Differential Revision: D25607235
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 7ab0fb7fa7a34e97ed0bec991f5081acb095777d
    • S
      Update code comment for options.ttl (#7775) · 75e4af14
      sdong committed
      Summary:
      The behavior of options.ttl has been updated long ago but we didn't update the code comments.
      Also update the periodic compaction's comment.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7775
      
      Test Plan: See it can still build through CI.
      
      Reviewed By: ajkr
      
      Differential Revision: D25592015
      
      fbshipit-source-id: b1db18b6787e7048ce6aedcbc3bb44493c9fc49b
    • P
      Support optimize_filters_for_memory for Ribbon filter (#7774) · 239d17a1
      Peter Dillinger committed
      Summary:
      Primarily this change refactors the optimize_filters_for_memory
      code for Bloom filters, based on malloc_usable_size, to also work for
      Ribbon filters.
      
      This change also replaces the somewhat slow but general
      BuiltinFilterBitsBuilder::ApproximateNumEntries with
      implementation-specific versions for Ribbon (new) and Legacy Bloom
      (based on a recently deleted version). The reason is to emphasize
      speed in ApproximateNumEntries rather than 100% accuracy.
      
      Justification: ApproximateNumEntries (formerly CalculateNumEntry) is
      only used by RocksDB for range-partitioned filters, called each time we
      start to construct one. (In theory, it should be possible to reuse the
      estimate, but the abstractions provided by FilterPolicy don't really
      make that workable.) But this is only used as a heuristic estimate for
      hitting a desired partitioned filter size because of alignment to data
      blocks, which have various numbers of unique keys or prefixes. The two
      factors lead us to prioritize reasonable speed over 100% accuracy.
      
      optimize_filters_for_memory adds extra complication, because precisely
      calculating num_entries for some allowed number of bytes depends on state
      with optimize_filters_for_memory enabled. And the allocator-agnostic
      implementation of optimize_filters_for_memory, using malloc_usable_size,
      means we would have to actually allocate memory, many times, just to
      precisely determine how many entries (keys) could be added and stay below
      some size budget, for the current state. (In a draft, I got this
      working, and then realized the balance of speed vs. accuracy was all
      wrong.)
      
      So related to that, I have made CalculateSpace, an internal-only API
      only used for testing, non-authoritative also if
      optimize_filters_for_memory is enabled. This simplifies some code.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7774
      
      Test Plan:
      unit test updated, and for FilterSize test, range of tested
      values is greatly expanded (still super fast)
      
      Also tested `db_bench -benchmarks=fillrandom,stats -bloom_bits=10 -num=1000000 -partition_index_and_filters -format_version=5 [-optimize_filters_for_memory] [-use_ribbon_filter]` with temporary debug output of generated filter sizes.
      
      Bloom+optimize_filters_for_memory:
      
            1 Filter size: 197 (224 in memory)
          134 Filter size: 3525 (3584 in memory)
          107 Filter size: 4037 (4096 in memory)
          Total on disk: 904,506
          Total in memory: 918,752
      
      Ribbon+optimize_filters_for_memory:
      
            1 Filter size: 3061 (3072 in memory)
          110 Filter size: 3573 (3584 in memory)
           58 Filter size: 4085 (4096 in memory)
          Total on disk: 633,021 (-30.0%)
          Total in memory: 634,880 (-30.9%)
      
      Bloom (no offm):
      
            1 Filter size: 261 (320 in memory)
            1 Filter size: 3333 (3584 in memory)
          240 Filter size: 3717 (4096 in memory)
          Total on disk: 895,674 (-1% on disk vs. +offm; known tolerable overhead of offm)
          Total in memory: 986,944 (+7.4% vs. +offm)
      
      Ribbon (no offm):
      
            1 Filter size: 2949 (3072 in memory)
            1 Filter size: 3381 (3584 in memory)
          167 Filter size: 3701 (4096 in memory)
          Total on disk: 624,397 (-30.3% vs. Bloom)
          Total in memory: 690,688 (-30.0% vs. Bloom)
      
      Note that optimize_filters_for_memory is even more effective for Ribbon filter than for cache-local Bloom, because it can close the unused memory gap even tighter than Bloom filter, because of 16 byte increments for Ribbon vs. 64 byte increments for Bloom.
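
The totals and percentages reported above follow directly from the per-size counts. A short sketch that recomputes them for the two optimize_filters_for_memory runs (the tuple layout and the helper names are illustrative choices, and the numbers are copied from the debug output above):

```python
def totals(histogram):
    """histogram: list of (count, on_disk_bytes, in_memory_bytes) rows."""
    on_disk = sum(count * disk for count, disk, _ in histogram)
    in_memory = sum(count * mem for count, _, mem in histogram)
    return on_disk, in_memory


def percent_change(new, old):
    return round(100.0 * (new - old) / old, 1)


bloom_offm = [(1, 197, 224), (134, 3525, 3584), (107, 4037, 4096)]
ribbon_offm = [(1, 3061, 3072), (110, 3573, 3584), (58, 4085, 4096)]

bloom_disk, bloom_mem = totals(bloom_offm)
ribbon_disk, ribbon_mem = totals(ribbon_offm)

print(bloom_disk, bloom_mem)    # Bloom totals: on disk, in memory
print(ribbon_disk, ribbon_mem)  # Ribbon totals: on disk, in memory
print(percent_change(ribbon_disk, bloom_disk))  # Ribbon vs. Bloom, on disk
print(percent_change(ribbon_mem, bloom_mem))    # Ribbon vs. Bloom, in memory
```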
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D25592970
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 606fdaa025bb790d7e9c21601e8ea86e10541912
  5. 18 Dec, 2020 1 commit
  6. 17 Dec, 2020 2 commits
    • A
      Fix clang_analyze error (#7777) · 99f5a800
      Akanksha Mahajan committed
      Summary:
      Fix clang_analyze error
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7777
      
      Test Plan:
      USE_CLANG=1 TEST_TMPDIR=/dev/shm/rocksdb OPT=-g make -j64
      analyze
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D25601675
      
      Pulled By: akankshamahajan15
      
      fbshipit-source-id: 30f58cf4d575a2d546c455fb43e856455eb72a07
    • A
      Fix failing RocksJava test compilation and add CI (#7769) · 29d12748
      Adam Retter committed
      Summary:
      * Fixes a Java test compilation issue on macOS
      * Cleans up CircleCI RocksDBJava build config
      * Adds CircleCI for RocksDBJava on macOS
      * Ensures backwards compatibility with older macOS via CircleCI
      * Fixes RocksJava static builds ordering
      * Adds missing RocksJava static builds to CircleCI for Mac and Linux
      * Improves parallelism in RocksJava builds
      * Reduces the size of the machines used for RocksJava CircleCI as they don't need to be so large (Saves credits)
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7769
      
      Reviewed By: akankshamahajan15
      
      Differential Revision: D25601293
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 0a0bb9906f65438fe143487d78e37e1947364d08
  7. 16 Dec, 2020 1 commit
    • B
      Do not full scan obsolete files on compaction busy (#7739) · 2021392e
      Burton Li committed
      Summary:
      When ConcurrentTaskLimiter is enabled and there are too many outstanding compactions, BackgroundCompaction returns Status::Busy(), which shouldn't be treated as a compaction failure.
      This caused a performance issue when outstanding compactions reached the limit.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7739
      
      Reviewed By: cheng-chang
      
      Differential Revision: D25508319
      
      Pulled By: ltamasi
      
      fbshipit-source-id: 3b181b16ada0ca3393cfa3a7412985764e79c719
  8. 15 Dec, 2020 2 commits
    • J
      Log sst number in Corruption status (#7767) · a0e4421e
      Jay Zhuang committed
      Summary:
      sst file number in corruption error would be very useful for debugging
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7767
      
      Reviewed By: zhichao-cao
      
      Differential Revision: D25485872
      
      Pulled By: jay-zhuang
      
      fbshipit-source-id: 67315b582cedeefbce6676015303ebe5bf6526a3
    • L
      Add initial blob support to batched MultiGet (#7766) · 1afbd194
      Levi Tamasi committed
      Summary:
      The patch adds initial support for reading blobs to the batched `MultiGet` API.
      The current implementation simply retrieves the blob values as the blob indexes
      are encountered; that is, reads from blob files are currently not batched. (This
      will be optimized in a separate phase.) In addition, the patch removes some dead
      code related to BlobDB from the batched `MultiGet` implementation, namely the
      `is_blob` / `is_blob_index` flags that are passed around in `DBImpl` and `MemTable` /
      `MemTableListVersion`. These were never hooked up to anything and wouldn't
      work anyways, since a single flag is not sufficient to communicate the "blobness"
      of multiple key-values.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7766
      
      Test Plan: `make check`
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D25479290
      
      Pulled By: ltamasi
      
      fbshipit-source-id: 7aba2d290e31876ee592bcf1adfd1018713a8000
  9. 12 Dec, 2020 3 commits
    • P
      Use size_t for filter APIs, protect against overflow (#7726) · 003e72b2
      Peter Dillinger committed
      Summary:
      Deprecate CalculateNumEntry and replace with
      ApproximateNumEntries (better name) using size_t instead of int and
      uint32_t, to minimize confusing casts and bad overflow behavior
      (possible though probably not realistic). Bloom sizes are now explicitly
      capped at max size supported by implementations: just under 4GiB for
      fv=5 Bloom, and just under 512MiB for fv<5 Legacy Bloom. This
      hardening could help to set up for fuzzing.
      
      Also, since RocksDB only uses this information as an approximation
      for trying to hit certain sizes for partitioned filters, it's more important
      that the function be reasonably fast than for it to be completely
      accurate. It's hard enough to be 100% accurate for Ribbon (currently
      reversing CalculateSpace) that adding optimize_filters_for_memory
      into the mix is just not worth trying to be 100% accurate for num
      entries for bytes.
      
      Also:
      - Cleaned up filter_policy.h to remove MSVC warning handling and
      potentially unsafe use of exception for "not implemented"
      - Correct the number of entries limit beyond which current Ribbon
      implementation falls back on Bloom instead.
      - Consistently use "num_entries" rather than "num_entry"
      - Remove LegacyBloomBitsBuilder::CalculateNumEntry as it's essentially
      obsolete from general implementation
      BuiltinFilterBitsBuilder::CalculateNumEntries.
      - Fix filter_bench to skip some tests that don't make sense when only
      one or a small number of filters has been generated.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7726
      
      Test Plan:
      expanded existing unit tests for CalculateSpace /
      ApproximateNumEntries. Also manually used filter_bench to verify Legacy and
      fv=5 Bloom size caps work (much too expensive for unit test). Note that
      the actual bits per key is below requested due to space cap.
      
          $ ./filter_bench -impl=0 -bits_per_key=20 -average_keys_per_filter=256000000 -vary_key_count_ratio=0 -m_keys_total_max=256 -allow_bad_fp_rate
          ...
          Total size (MB): 511.992
          Bits/key stored: 16.777
          ...
          $ ./filter_bench -impl=2 -bits_per_key=20 -average_keys_per_filter=2000000000 -vary_key_count_ratio=0 -m_keys_total_max=2000
          ...
          Total size (MB): 4096
          Bits/key stored: 17.1799
          ...
          $
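
The "Bits/key stored" figures above can be reproduced from the reported total size and the key counts on the command lines. A small sketch, assuming that "MB" in the filter_bench output means 2^20 bytes (an assumption that is consistent with the printed numbers):

```python
MIB = 1 << 20  # assumed meaning of "MB" in the filter_bench output


def bits_per_key(total_size_mib, num_keys):
    """Average filter bits stored per key for a given total filter size."""
    return total_size_mib * MIB * 8 / num_keys


# Legacy Bloom (-impl=0): capped just under 512 MiB for 256,000,000 keys.
legacy = bits_per_key(511.992, 256_000_000)
# fv=5 Bloom (-impl=2): capped just under 4 GiB for 2,000,000,000 keys.
fv5 = bits_per_key(4096, 2_000_000_000)

print(f"{legacy:.3f}")
print(f"{fv5:.4f}")
```

Both values come out below the requested 20 bits per key, which is the effect of the size cap noted in the test plan.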
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D25239800
      
      Pulled By: pdillinger
      
      fbshipit-source-id: f94e6d065efd31e05ec630ae1a82e6400d8390c4
    • C
      Update SstFileWriter fuzzer to iterate and check all key-value pairs (#7761) · 49177951
      Cheng Chang committed
      Summary:
      as title
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7761
      
      Test Plan: cd fuzz && make sst_file_writer_fuzzer && ./sst_file_writer_fuzzer
      
      Reviewed By: pdillinger
      
      Differential Revision: D25430802
      
      Pulled By: cheng-chang
      
      fbshipit-source-id: 01436307df6f4c434bb608f44e1c8e4a1119f94f
    • P
      Fix memory leak for ColumnFamily drop with live iterator (#7749) · b1ee1914
      Peter Dillinger committed
      Summary:
      Uncommon bug seen by ASAN with
      ColumnFamilyTest.LiveIteratorWithDroppedColumnFamily, if the last two
      references to a ColumnFamilyData are both SuperVersions (during
      InstallSuperVersion). The fix is to use UnrefAndTryDelete even in
      SuperVersion::Cleanup but with a parameter to avoid re-entering Cleanup
      on the same SuperVersion being cleaned up.
      
      ColumnFamilyData::Unref is considered unsafe so removed.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7749
      
      Test Plan: ./column_family_test --gtest_filter=*LiveIter* --gtest_repeat=100
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D25354304
      
      Pulled By: pdillinger
      
      fbshipit-source-id: e78f3a3f67c40013b8432f31d0da8bec55c5321c
  10. 11 Dec, 2020 3 commits
  11. 10 Dec, 2020 9 commits
    • A
      Add further tests to ASSERT_STATUS_CHECKED (2) (#7698) · 8ff6557e
      Adam Retter committed
      Summary:
      Second batch of adding more tests to ASSERT_STATUS_CHECKED.
      
      * external_sst_file_basic_test
      * checkpoint_test
      * db_wal_test
      * db_block_cache_test
      * db_logical_block_size_cache_test
      * db_blob_index_test
      * optimistic_transaction_test
      * transaction_test
      * point_lock_manager_test
      * write_prepared_transaction_test
      * write_unprepared_transaction_test
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7698
      
      Reviewed By: cheng-chang
      
      Differential Revision: D25441664
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 9e78867f32321db5d4833e95eb96c5734526ef00
    • M
      Fix use of positional args in BUCK rules (#7760) · 8e2749fd
      Michael Lee committed
      Summary:
      Prefer to use keyword args rather than positional args for Buck rules. This appears to be the only remaining instance for `custom_unittest`
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7760
      
      Test Plan: Search for other instances of `custom_unittest` without `name`
      
      Reviewed By: cheng-chang
      
      Differential Revision: D25439887
      
      Pulled By: mzlee
      
      fbshipit-source-id: 518c541a5c01207c7b0c1f7322addf5cc4f09f92
    • M
      Invalidate iterator on transaction clear (#7733) · 71239908
      Manuel Ung committed
      Summary:
      Some clients do not close their iterators until after the transaction finishes. To handle this case, we will invalidate any iterators on transaction clear.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7733
      
      Reviewed By: cheng-chang
      
      Differential Revision: D25261158
      
      Pulled By: lth
      
      fbshipit-source-id: b91320f00c54cbe0e6882b794b34f3bb5640dbc0
    • C
      Carry over min_log_number_to_keep_2pc in new MANIFEST (#7747) · 80159f6e
      Cheng Chang committed
      Summary:
      When two phase commit is enabled, `VersionSet::min_log_number_to_keep_2pc` is set during flush.
      But when a new MANIFEST is created, the `min_log_number_to_keep_2pc` is not carried over to the new MANIFEST. So if a new MANIFEST is created and then DB is reopened, the `min_log_number_to_keep_2pc` will be lost.  This may cause DB recovery errors.
      The bug is reproduced in a new unit test in `version_set_test.cc`.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7747
      
      Test Plan: The new unit test in `version_set_test.cc` should pass.
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D25350661
      
      Pulled By: cheng-chang
      
      fbshipit-source-id: eee890d5b19f15769069670692e270ae31044ece
    • A
      Ensure that MultiGet works properly with compressed cache (#7756) · 8a1488ef
      anand76 committed
      Summary:
      Ensure that when direct IO is enabled and a compressed block cache is
      configured, MultiGet inserts compressed data blocks into the compressed
      block cache.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7756
      
      Test Plan: Add unit test to db_basic_test
      
      Reviewed By: cheng-chang
      
      Differential Revision: D25416240
      
      Pulled By: anand1976
      
      fbshipit-source-id: 75d57526370c9c0a45ff72651f3278dbd8a9086f
    • C
      Add a test for disabling tracking WAL (#7757) · 3c2a4488
      Cheng Chang committed
      Summary:
      If WAL tracking was enabled, then disabled during reopen, the previously tracked WALs should be removed from MANIFEST.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7757
      
      Test Plan: a new unit test `DBBasicTest.DisableTrackWal` is added.
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D25410508
      
      Pulled By: cheng-chang
      
      fbshipit-source-id: 9d8d9e665066135930a7c1035bb8c2f68bded6a0
    • C
      Add a new db_map_fuzzer (#7762) · 89cc06b3
      Cheng Chang committed
      Summary:
      Execute randomly generated operations on both a DB and a std::map,
      then reopen the DB and make sure that iterating the DB produces the
      same key-value pairs as iterating through the std::map.
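
The idea of checking a store against an in-memory map can be illustrated with a toy. The sketch below is not the fuzzer from this commit: `ToyDB` is a hypothetical stand-in that persists an operation log and replays it on reopen, and a Python `dict` plays the role of the `std::map`.

```python
import random


class ToyDB:
    """Hypothetical stand-in for a DB: an operation log replayed on reopen."""

    def __init__(self):
        self._log = []    # "persistent" state
        self._table = {}  # in-memory state, rebuilt by reopen()

    def put(self, key, value):
        self._log.append(("put", key, value))
        self._table[key] = value

    def delete(self, key):
        self._log.append(("delete", key, None))
        self._table.pop(key, None)

    def reopen(self):
        self._table = {}
        for op, key, value in self._log:
            if op == "put":
                self._table[key] = value
            else:
                self._table.pop(key, None)

    def items(self):
        return sorted(self._table.items())


def run_once(seed, num_ops=200):
    """Apply the same random operations to the DB and the model, then compare."""
    rng = random.Random(seed)
    db, model = ToyDB(), {}
    for _ in range(num_ops):
        key = f"k{rng.randrange(16)}"
        if rng.random() < 0.7:
            value = f"v{rng.randrange(1000)}"
            db.put(key, value)
            model[key] = value
        else:
            db.delete(key)
            model.pop(key, None)
    db.reopen()
    return db.items() == sorted(model.items())


print(all(run_once(seed) for seed in range(20)))
```

Seeding the random generator keeps any mismatch reproducible, which is the property a fuzzer input provides in the real setting.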
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7762
      
      Test Plan: cd fuzz && make db_map_fuzzer && ./db_map_fuzzer
      
      Reviewed By: pdillinger
      
      Differential Revision: D25437485
      
      Pulled By: cheng-chang
      
      fbshipit-source-id: 3a93f7efd046b194193e45d2ab1ad81565510781
    • C
      Always track WAL obsoletion (#7759) · efe827ba
      Cheng Chang committed
      Summary:
      Currently, when a WAL becomes obsolete after flushing, if VersionSet::WalSet does not contain the WAL, we do not track the WAL obsoletion event in MANIFEST.
      
      But consider this case:
      * WAL 10 is synced, a VersionEdit is LogAndApplied to MANIFEST to log this WAL addition event, but the VersionEdit is not applied to WalSet yet since its corresponding ManifestWriter is still pending in the write queue;
      * Since the above ManifestWriter is blocking, the LogAndApply will block on a condition variable and release the db mutex, so another LogAndApply can proceed to enqueue other VersionEdits concurrently;
      * Now flush happens, and WAL 10 becomes obsolete, although WalSet does not contain WAL 10 yet, we should call LogAndApply to enqueue a VersionEdit to indicate the obsoletion of WAL 10;
      * otherwise, when the queued edit indicating WAL 10 addition is logged to MANIFEST, and DB crashes and reopens, the WAL 10 might have been removed from disk, but it still exists in MANIFEST.
      
      This PR changes the behavior to: always `LogAndApply` any WAL addition or obsoletion event, without considering the order issues caused by concurrency, but when applying the edits to `WalSet`, do not add the WALs if they are already obsolete. In this approach, the logical events of WAL addition and obsoletion are always tracked in MANIFEST, so we can inspect the MANIFEST and know all the previous WAL events, but we choose to ignore certain events due to the concurrency issues such as the case above, or the case in https://github.com/facebook/rocksdb/pull/7725.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7759
      
      Test Plan: make check
      
      Reviewed By: pdillinger
      
      Differential Revision: D25423089
      
      Pulled By: cheng-chang
      
      fbshipit-source-id: 9cb9a7fbc1875bf954f2a42f9b6cfd6d49a7b21c
    • S
      LockTree library, originally from PerconaFT (#7753) · 98236fb1
      Sergei Petrunia committed
      Summary:
      To be used for implementing Range Locking.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7753
      
      Reviewed By: zhichao-cao
      
      Differential Revision: D25378980
      
      Pulled By: cheng-chang
      
      fbshipit-source-id: 801a9c5cd92a84654ca2586b73e8f69001e89320
  12. 09 Dec, 2020 4 commits
    • A
      Add further tests to ASSERT_STATUS_CHECKED (1) (#7679) · 7b2216c9
      Adam Retter committed
      Summary:
      First batch of adding more tests to ASSERT_STATUS_CHECKED.
      
      * db_iterator_test
      * db_memtable_test
      * db_merge_operator_test
      * db_merge_operand_test
      * write_batch_test
      * write_batch_with_index_test
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7679
      
      Reviewed By: ajkr
      
      Differential Revision: D25399270
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 3017d0a686aec5cd2d743fc2acbbf75df239f3ba
    • P
      Fix build on FreeBSD/powerpc64(le) (#7732) · 66e54c59
      pkubaj committed
      Summary:
      To build on FreeBSD, arch_ppc_probe needs to be adapted to FreeBSD.
      
      Since FreeBSD uses elf_aux_info as a getauxval equivalent, use it and include the necessary headers:
      - machine/cpu.h for PPC_FEATURE2_HAS_VEC_CRYPTO,
      - sys/auxv.h for elf_aux_info,
      - sys/elf_common.h for AT_HWCAP2.
      
      elf_aux_info isn't checked for being available, because it's available since FreeBSD 12.0. rocksdb assumes using Clang on FreeBSD, but powerpc* platforms switch to Clang only since 13.0.
      
      This patch makes rocksdb build on FreeBSD on powerpc64 and powerpc64le platforms.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7732
      
      Reviewed By: ltamasi
      
      Differential Revision: D25399194
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 9c905147d75f98cd2557dd2f86a940b8e6c5afcd
    • V
      Adding ARM AT_HWCAP support for FreeBSD (#7750) · 93c6c18c
      Vincent Milum Jr committed
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/7750
      
      Reviewed By: ltamasi
      
      Differential Revision: D25400609
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 13b15e2f490acc011b648fbd9615ea8e580cccc7
    • C
      Do not track obsolete WALs in MANIFEST even if they are synced (#7725) · 07030c6f
      Cheng Chang committed
      Summary:
      Consider the case:
      1. All column families are flushed, so all WALs become obsolete, but no WAL has been removed from disk yet because removal is asynchronous. A VersionEdit is written to MANIFEST indicating that WALs before a certain WAL number are obsolete; say this number is 3.
      2. `SyncWAL` is called, so all on-disk WALs are synced. If track_and_verify_wal_in_manifest=true, the WALs are tracked in MANIFEST; say the WAL numbers are 1 and 2.
      3. The DB crashes.
      4. During DB recovery, when replaying MANIFEST, we first see that WALs with numbers < 3 are obsolete, then we see that WALs 1 and 2 are synced. With the current implementation of `WalSet`, the recovered `WalSet` includes WALs 1 and 2.
      5. WALs 1 and 2 are asynchronously deleted from disk, and the WAL verification algorithm then fails with `Corruption: missing WAL`.
      
      The above case is reproduced in a new unit test `DBBasicTestTrackWal::DoNotTrackObsoleteWal`.
      
      The fix is to maintain the upper bound of the obsolete WAL numbers: any WAL with a number less than this bound is considered obsolete and should not be tracked, even if it is synced later. The bound is maintained in `WalSet`.
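
The idea behind the fix can be sketched as follows. This is a simplified, hypothetical stand-in (`ToyWalSet` and all its member names are assumptions), not RocksDB's `WalSet`; it only shows how remembering the obsolete bound keeps a later sync record from re-adding an obsolete WAL.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Hypothetical, simplified stand-in for a WAL set; not RocksDB's WalSet.
class ToyWalSet {
 public:
  // Records a synced WAL unless it is already below the obsolete bound.
  // Returns true if the WAL is tracked after the call.
  bool AddSyncedWal(uint64_t number) {
    if (number < min_wal_number_to_keep_) {
      return false;  // Obsolete: ignore the sync record.
    }
    wals_.insert(number);
    return true;
  }

  // Marks every WAL numbered below `number` as obsolete, raises the
  // remembered bound if needed, and drops those WALs from the set.
  void DeleteWalsBefore(uint64_t number) {
    if (number > min_wal_number_to_keep_) {
      min_wal_number_to_keep_ = number;
    }
    wals_.erase(wals_.begin(), wals_.lower_bound(number));
  }

  bool IsTracked(uint64_t number) const { return wals_.count(number) > 0; }

 private:
  std::set<uint64_t> wals_;
  uint64_t min_wal_number_to_keep_ = 0;
};

// Replays the scenario from the summary: WALs below 3 are declared
// obsolete first, then sync records for WALs 1 and 2 arrive.
bool ObsoleteWalsStayUntracked() {
  ToyWalSet wals;
  wals.DeleteWalsBefore(3);
  wals.AddSyncedWal(1);
  wals.AddSyncedWal(2);
  wals.AddSyncedWal(3);
  return !wals.IsTracked(1) && !wals.IsTracked(2) && wals.IsTracked(3);
}
```

With the bound in place, a verification pass over the tracked set would no longer look for WALs 1 and 2 on disk.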
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7725
      
      Test Plan:
      1. a new unit test `DBBasicTestTrackWal::DoNotTrackObsoleteWal` is added.
      2. run `make crash_test` on devserver.
      
      Reviewed By: riversand963
      
      Differential Revision: D25238914
      
      Pulled By: cheng-chang
      
      fbshipit-source-id: f5dccd57c3d89f19565ec5731f2d42f06d272b72
      07030c6f
  13. 08 Dec, 2020 3 commits
    • Y
      Refactor ProcessManifestWrites a little bit (#7751) · 11c4be22
      Yanqin Jin committed
      Summary:
      This PR removes a nested loop inside ProcessManifestWrites. The new
      implementation has the same behavior as the old code with simpler logic
      and lower complexity.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7751
      
      Test Plan:
      make check
      Ran make crash_test on devserver; it succeeded 3 times.
      
      Reviewed By: ltamasi
      
      Differential Revision: D25363526
      
      Pulled By: riversand963
      
      fbshipit-source-id: 27e681949dacd7501a752e5e517b9e85b54ccb2e
      11c4be22
    • S
      Range Locking: Allow different LockManagers, add Range Lock definitions (#7443) · d8bd9fc7
      Sergei Petrunia committed
      Summary:
      This PR has two commits:
      1.  Modify the code to allow different Lock Managers (of any kind) to be used.  It is implied that a LockManager uses its own custom LockTracker.
      2.  Add definitions for Range Locking (class Endpoint and the GetRangeLock() function).
      
      cheng-chang, is this what you had in mind (should the PR have both item 1 and item 2)?
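
One way to picture item 1 is an abstract lock manager that hands out its own tracker type. The sketch below is purely illustrative: every `Sketch*` name is hypothetical and none of these are the interfaces added by the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical endpoint of a key range (illustration only).
struct SketchEndpoint {
  std::string key;
};

// Hypothetical base class: each lock manager pairs with its own tracker.
class SketchLockTracker {
 public:
  virtual ~SketchLockTracker() = default;
  virtual std::size_t NumLocks() const = 0;
};

// Tracks individual keys.
class SketchPointTracker : public SketchLockTracker {
 public:
  void TrackKey(const std::string& key) { keys_.insert(key); }
  std::size_t NumLocks() const override { return keys_.size(); }

 private:
  std::set<std::string> keys_;
};

// Tracks [start, end] ranges.
class SketchRangeTracker : public SketchLockTracker {
 public:
  void TrackRange(const SketchEndpoint& start, const SketchEndpoint& end) {
    ranges_.emplace_back(start.key, end.key);
  }
  std::size_t NumLocks() const override { return ranges_.size(); }

 private:
  std::vector<std::pair<std::string, std::string>> ranges_;
};

// Hypothetical manager interface: callers never name a concrete tracker.
class SketchLockManager {
 public:
  virtual ~SketchLockManager() = default;
  virtual std::unique_ptr<SketchLockTracker> NewTracker() const = 0;
};

class SketchPointLockManager : public SketchLockManager {
 public:
  std::unique_ptr<SketchLockTracker> NewTracker() const override {
    return std::unique_ptr<SketchLockTracker>(new SketchPointTracker());
  }
};

class SketchRangeLockManager : public SketchLockManager {
 public:
  std::unique_ptr<SketchLockTracker> NewTracker() const override {
    return std::unique_ptr<SketchLockTracker>(new SketchRangeTracker());
  }
};

// True if the range manager produces a range tracker and the point
// manager does not.
bool ManagersSupplyOwnTrackers() {
  SketchRangeLockManager range_mgr;
  SketchPointLockManager point_mgr;
  std::unique_ptr<SketchLockTracker> r = range_mgr.NewTracker();
  std::unique_ptr<SketchLockTracker> p = point_mgr.NewTracker();
  return dynamic_cast<SketchRangeTracker*>(r.get()) != nullptr &&
         dynamic_cast<SketchRangeTracker*>(p.get()) == nullptr;
}
```

The factory method is the design choice being illustrated: swapping the manager swaps the tracker with it, so transaction code only depends on the two base classes.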
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7443
      
      Reviewed By: zhichao-cao
      
      Differential Revision: D24123172
      
      Pulled By: cheng-chang
      
      fbshipit-source-id: c6548ad6d4cc3c25f68d13b29147bc6fdf357185
      d8bd9fc7
    • M
      Change ErrorHandler methods to return const Status& (#7539) · db03172d
      mrambacher committed
      Summary:
      This change eliminates the need for a lot of the PermitUncheckedError calls on return from ErrorHandler methods.  The calls are no longer needed as the status is returned as a reference rather than a copy.  Additionally, this means that the originating status (recovery_error_, bg_error_) is not cleared implicitly as a result of calling one of these methods.
      
      For this class, I do not know if the proper behavior should be to call PermitUncheckedError in the destructor or if the checked state should be cleared when the status is cleared.  I did tests both ways.  Without the code in the destructor, the status will need to be cleared in at least some of the places where it is set to OK.  When running tests, I found no instances where this class was destructed with a non-OK, non-checked Status.
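
The effect of returning by reference can be sketched with a toy must-check status. `ToyStatus`, `ToyErrorHandler`, and the drop counter are hypothetical stand-ins; the real `Status` checking logic is more involved. The sketch assumes the destructor variant discussed above, in which the handler permits its own unchecked status on destruction.

```cpp
#include <cassert>

// Counts statuses destroyed without ever being checked (illustration only;
// a real must-check build would typically abort instead of counting).
static int g_unchecked_drops = 0;

// Hypothetical stand-in for a status type with must-check semantics.
class ToyStatus {
 public:
  explicit ToyStatus(int code = 0) : code_(code) {}
  // A copy starts out unchecked, independent of the source.
  ToyStatus(const ToyStatus& other) : code_(other.code_) {}
  ~ToyStatus() {
    if (!checked_) ++g_unchecked_drops;
  }
  bool ok() const {
    checked_ = true;
    return code_ == 0;
  }
  void PermitUncheckedError() const { checked_ = true; }

 private:
  int code_;
  mutable bool checked_ = false;
};

// Hypothetical handler holding a background error.
class ToyErrorHandler {
 public:
  ~ToyErrorHandler() { bg_error_.PermitUncheckedError(); }
  // Returns a copy: an ignored result is a new unchecked status.
  ToyStatus GetByValue() const { return bg_error_; }
  // Returns a reference: no temporary is created.
  const ToyStatus& GetByRef() const { return bg_error_; }

 private:
  ToyStatus bg_error_{1};
};

int CountDropsByValue() {
  g_unchecked_drops = 0;
  {
    ToyErrorHandler handler;
    handler.GetByValue();  // Temporary copy dies unchecked.
  }
  return g_unchecked_drops;
}

int CountDropsByRef() {
  g_unchecked_drops = 0;
  {
    ToyErrorHandler handler;
    handler.GetByRef();  // No copy, nothing to check.
  }
  return g_unchecked_drops;
}
```

In this toy, the by-value call would need a `PermitUncheckedError` at each call site to stay clean, while the by-reference call needs none.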
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7539
      
      Reviewed By: anand1976
      
      Differential Revision: D25340565
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 1730c035c81a475875ea745226112030ec25136c
      db03172d