1. 08 Sep, 2021 3 commits
  2. 02 Sep, 2021 2 commits
    • P
      Remove some unneeded code (#8736) · c9cd5d25
      Peter Dillinger committed
      Summary:
      * FullKey and ParseFullKey appear to serve no purpose in the public API
      (or anything else) so removed. Only use in one test updated.
      * NumberToString serves no purpose vs. ToString so removed, numerous
      calls updated
      * Remove unnecessary forward declarations in metadata.h by re-arranging
      class definitions.
      * Remove some unneeded semicolons
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8736
      
      Test Plan: existing tests
      
      Reviewed By: mrambacher
      
      Differential Revision: D30700039
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 1e436a576f511a6ed8b4d97af7cc8216bc729af2
    • P
      Fix a buffer size race condition in BackupEngine (#8732) · 32752551
      Peter Dillinger committed
      Summary:
      Fixes a race on the buffer size that can occur if RateLimiter burst
      bytes changes during concurrent Restore operations.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8732
      
      Test Plan: updated unit test fails with TSAN before change, passes after
      
      Reviewed By: ajkr
      
      Differential Revision: D30683879
      
      Pulled By: pdillinger
      
      fbshipit-source-id: d0ddb3587ade91ee2a4d926b475acf7781b03086
  3. 31 Aug, 2021 2 commits
    • A
      Fix a race in LRUCacheShard::Promote (#8717) · ec9f52ec
      anand76 committed
      Summary:
      In `LRUCacheShard::Promote`, a reference is released outside the LRU mutex. Fix the race condition.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8717
      
      Reviewed By: zhichao-cao
      
      Differential Revision: D30649206
      
      Pulled By: anand1976
      
      fbshipit-source-id: 09c0af05b2294a7fe2c02876a61b0bad6e3ada61
    • P
      Built-in support for generating unique IDs, bug fix (#8708) · 13ded694
      Peter Dillinger committed
      Summary:
      Env::GenerateUniqueId() works fine on Windows and on POSIX
      where /proc/sys/kernel/random/uuid exists. Our other implementation is
      flawed and easily produces collisions in a new multi-threaded test.
      As we rely more heavily on DB session ID uniqueness, this becomes a
      serious issue.
      
      This change combines several individually suitable entropy sources
      for reliable generation of random unique IDs, with the goal of uniqueness
      and portability, not cryptographic strength nor maximum speed.
      
      Specifically:
      * Moves code for getting UUIDs from the OS to port::GenerateRfcUuid
      rather than in Env implementation details. Callers are now told whether
      the operation fails or succeeds.
      * Adds an internal API GenerateRawUniqueId for generating high-quality
      128-bit unique identifiers, by combining entropy from three "tracks":
        * Lots of info from default Env like time, process id, and hostname.
        * std::random_device
        * port::GenerateRfcUuid (when working)
      * Built-in implementations of Env::GenerateUniqueId() will now always
      produce an RFC 4122 UUID string, either from platform-specific API or
      by converting the output of GenerateRawUniqueId.
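
A minimal sketch of the idea in the list above, not the RocksDB implementation: it XOR-combines two of the entropy tracks into 128 bits and formats the result as an RFC 4122 string. The names `SketchRawUniqueId` and `ToRfc4122String` are hypothetical, the OS UUID track is omitted for portability, and the choice of UUID version 4 is an assumption.

```cpp
#include <array>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <cstdio>
#include <functional>
#include <random>
#include <string>
#include <thread>

// Hypothetical sketch: XOR-combine independent entropy "tracks" into 128 bits.
std::array<uint64_t, 2> SketchRawUniqueId() {
  std::array<uint64_t, 2> id{{0, 0}};
  // Track 1 (stand-in for Env info): wall clock, monotonic clock, thread id.
  id[0] ^= static_cast<uint64_t>(
      std::chrono::system_clock::now().time_since_epoch().count());
  id[1] ^= static_cast<uint64_t>(
      std::chrono::steady_clock::now().time_since_epoch().count());
  id[1] ^= static_cast<uint64_t>(
      std::hash<std::thread::id>{}(std::this_thread::get_id()));
  // Track 2: std::random_device, 64 bits for each half.
  std::random_device rd;
  for (uint64_t& half : id) {
    const uint64_t hi = rd();
    const uint64_t lo = rd();
    half ^= (hi << 32) | lo;
  }
  // Track 3 (an OS-provided UUID) is left out of this sketch.
  return id;
}

// Formats 128 bits as an RFC 4122 string. Assumption: version 4 layout.
std::string ToRfc4122String(std::array<uint64_t, 2> id) {
  // Version nibble (4) lives in the top 4 bits of the third group.
  id[0] = (id[0] & ~uint64_t{0xF000}) | uint64_t{0x4000};
  // Variant bits (binary 10) are the top 2 bits of the fourth group.
  id[1] = (id[1] & ~(uint64_t{3} << 62)) | (uint64_t{2} << 62);
  char buf[37];
  const int written = std::snprintf(
      buf, sizeof(buf), "%08x-%04x-%04x-%04x-%012llx",
      static_cast<unsigned>(id[0] >> 32),
      static_cast<unsigned>((id[0] >> 16) & 0xFFFF),
      static_cast<unsigned>(id[0] & 0xFFFF),
      static_cast<unsigned>(id[1] >> 48),
      static_cast<unsigned long long>(id[1] & 0xFFFFFFFFFFFFULL));
  assert(written == 36);
  (void)written;
  return std::string(buf);
}
```

Calling `ToRfc4122String(SketchRawUniqueId())` yields a 36-character string. Overwriting the version and variant bits discards 6 of the 128 bits, which is the usual cost of the RFC 4122 format.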
      
      DB session IDs now use GenerateRawUniqueId while DB IDs (not as
      critical) try to use port::GenerateRfcUuid but fall back on
      GenerateRawUniqueId with conversion to an RFC 4122 UUID.
      
      GenerateRawUniqueId is declared and defined under env/ rather than util/
      or even port/ because of the Env dependency.
      
      Likely follow-up: enhance GenerateRawUniqueId to be faster after the
      first call and to guarantee uniqueness within the lifetime of a single
      process (imparting the same property onto DB session IDs).
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8708
      
      Test Plan:
      A new mini-stress test in env_test checks the various public
      and internal APIs for uniqueness, including each track of
      GenerateRawUniqueId individually. We can't hope to verify anywhere close
      to 128 bits of entropy, but it can at least detect flaws as bad as the
      old code. Serial execution of the new tests takes about 350 ms on
      my machine.
      
      Reviewed By: zhichao-cao, mrambacher
      
      Differential Revision: D30563780
      
      Pulled By: pdillinger
      
      fbshipit-source-id: de4c9ff4b2f581cf784fcedb5f39f16e5185c364
  4. 28 Aug, 2021 1 commit
    • M
      Update comments, fix typos. (#8721) · 6c2bd28a
      Merlin Mao committed
      Summary:
      - Removed the default empty constructors of `TraceWriter` and `TraceReader`.
      - Removed unused `ReadFooter()` from `ReplayerImpl`.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8721
      
      Test Plan: None
      
      Reviewed By: zhichao-cao
      
      Differential Revision: D30609743
      
      Pulled By: autopear
      
      fbshipit-source-id: 7e2626b015bd57ebb408a2836b4b4217cea10002
  5. 27 Aug, 2021 1 commit
  6. 25 Aug, 2021 2 commits
    • Y
      Fix a bug of secondary instance sequence going backward (#8653) · f235f4b0
      Yanqin Jin committed
      Summary:
      Recent refactor of `ReactiveVersionSet::ReadAndApply()` uses
      `ManifestTailer` whose `Iterate()` method can cause the db's
      `last_sequence_` to go backward. Consequently, read requests can see
      outdated data. For example, the latest changes to the primary will not be
      seen on the secondary even after a `TryCatchUpWithPrimary()` if no new
      write batches are read from the WALs and no new MANIFEST entries are
      read from the MANIFEST.
      
      Fix the bug so that `VersionEditHandler::CheckIterationResult` will
      never decrease `last_sequence_`, `last_allocated_sequence_` and
      `last_published_sequence_`.
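
The "never decrease" rule can be sketched as a monotonic update. This is an illustration only: `SequenceState` and `AdvanceSequences` are hypothetical names, and the real fix lives in `VersionEditHandler::CheckIterationResult`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the three sequence counters named above.
struct SequenceState {
  uint64_t last_sequence = 0;
  uint64_t last_allocated_sequence = 0;
  uint64_t last_published_sequence = 0;
};

// Applies a sequence number recovered from the MANIFEST, moving each
// counter forward only. A smaller recovered value leaves the state as is.
void AdvanceSequences(SequenceState& state, uint64_t recovered) {
  state.last_sequence = std::max(state.last_sequence, recovered);
  state.last_allocated_sequence =
      std::max(state.last_allocated_sequence, recovered);
  state.last_published_sequence =
      std::max(state.last_published_sequence, recovered);
  assert(state.last_sequence >= recovered);
}
```

With this shape, a stale value read from the MANIFEST cannot hide writes that the secondary has already observed through the WALs.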
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8653
      
      Test Plan: make check
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D30272084
      
      Pulled By: riversand963
      
      fbshipit-source-id: c6a49c534b2509b93ef62d8936ed0acd5b860eaa
    • Y
      Allow iterate refresh for secondary instance (#8700) · 229350ef
      Yanqin Jin committed
      Summary:
      
      Test Plan: make check
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8700
      
      Reviewed By: zhichao-cao
      
      Differential Revision: D30523907
      
      Pulled By: riversand963
      
      fbshipit-source-id: 68928ab4dafb64ce80ab7bc69d83727a4713ab91
  7. 24 Aug, 2021 2 commits
  8. 21 Aug, 2021 4 commits
    • L
      Update version.h and HISTORY.md for the 6.24 release (#8688) · 8c9e6897
      Levi Tamasi committed
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/8688
      
      Reviewed By: ajkr, riversand963
      
      Differential Revision: D30467746
      
      Pulled By: ltamasi
      
      fbshipit-source-id: 0fce0d42fe2fe3cb56d7a89607154b3b957f09b6
    • P
      Add Bloom/Ribbon hybrid API support (#8679) · 2a383f21
      Peter Dillinger committed
      Summary:
      This is essentially resurrection and fixing of the part of
      https://github.com/facebook/rocksdb/issues/8198 that was reverted in https://github.com/facebook/rocksdb/issues/8212, using data added in https://github.com/facebook/rocksdb/issues/8246. Basically,
      when configuring Ribbon filter, you can specify an LSM level before which
      Bloom will be used instead of Ribbon. But Bloom is only considered for
      Leveled and Universal compaction styles and files going into a known LSM
      level. This way, SST file writer, FIFO compaction, etc. use Ribbon filter as
      you would expect with NewRibbonFilterPolicy.
      
      So that this can be controlled with a single int value and so that flushes
      can be distinguished from intra-L0, we consider flush to go to level -1 for
      the purposes of this option. (Explained in API comment.)
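
The level rule described above can be sketched as a small decision function. This is a standalone illustration under stated assumptions: `ChooseFilter`, `FilterKind`, the parameter name `bloom_before_level`, and the `kUnknownLevel` sentinel are names chosen for the sketch. Only the convention that flush counts as level -1 comes from the text.

```cpp
#include <cassert>

enum class FilterKind { kBloom, kRibbon };

constexpr int kFlushLevel = -1;    // flush is treated as level -1 (per the text)
constexpr int kUnknownLevel = -2;  // hypothetical marker: SST file writer, FIFO, etc.

// Picks Bloom for levels strictly before `bloom_before_level`, else Ribbon.
// Files without a known LSM level always get Ribbon.
FilterKind ChooseFilter(int level, int bloom_before_level) {
  assert(level >= kUnknownLevel);
  if (level == kUnknownLevel) {
    return FilterKind::kRibbon;
  }
  return level < bloom_before_level ? FilterKind::kBloom : FilterKind::kRibbon;
}
```

Under these assumptions a threshold of 0 gives the mild hybrid (Bloom for flush, Ribbon otherwise), and a threshold of -1 gives pure Ribbon.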
      
      I also expect the most common and recommended Ribbon configuration to
      use Bloom during flush, to minimize slowing down writes and because according
      to my estimates, Ribbon only pays off if the structure lives in memory for
      more than an hour. Thus, I have changed the default for NewRibbonFilterPolicy
      to be this mild hybrid configuration. I don't really want to add something like
      NewHybridFilterPolicy because at least the mild hybrid configuration (Bloom for
      flush, Ribbon otherwise) should be considered a natural choice.
      
      C APIs also updated, but because they don't support overloading,
      rocksdb_filterpolicy_create_ribbon is kept pure ribbon for clarity and
      rocksdb_filterpolicy_create_ribbon_hybrid must be called for a hybrid
      configuration. While touching C API, I changed bits per key options from
      int to double.
      
      BuiltinFilterPolicy is needed so that LevelThresholdFilterPolicy doesn't inherit
      unused fields from BloomFilterPolicy.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8679
      
      Test Plan: new + updated tests, including crash test
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D30445797
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 6f5aeddfd6d79f7e55493b563c2d1d2d568892e1
    • A
      Add a PerfContext counter for secondary cache hits (#8685) · f35042ca
      anand76 committed
      Summary:
      Add a PerfContext counter for secondary cache hits.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8685
      
      Reviewed By: zhichao-cao
      
      Differential Revision: D30453957
      
      Pulled By: anand1976
      
      fbshipit-source-id: 42888a3ced240e1c44446d52d3b04adfb01f5665
    • A
      Fix blob callback in compaction and atomic flush (#8681) · 5efec84c
      Akanksha Mahajan committed
      Summary:
      Pass BlobFileCompletionCallback to atomic flush and compaction jobs,
      where it is currently nullptr (the default parameter).
      BlobFileCompletionCallback is used by the integrated BlobDB to report
      new blob files to SstFileManager.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8681
      
      Test Plan: CircleCI jobs
      
      Reviewed By: ltamasi
      
      Differential Revision: D30445998
      
      Pulled By: akankshamahajan15
      
      fbshipit-source-id: ba48093843864faec57f1f365cce7b5a569c4021
  9. 20 Aug, 2021 1 commit
    • M
      Fix some minor issues in the Customizable infrastructure (#8566) · 9eb002fc
      mrambacher committed
      Summary:
      - Fix issue with OptionType::Vector when the nested item is a Customizable with no names
      - Fix issue with OptionType::Vector to appropriately wrap the elements in a Vector
      - Fix an issue with a nested Customizable object with a null immutable object still appearing in the mutable options
      - Fix/Add tests for null/empty customizable objects
      - Move the RegisterTestObjects from customizable_test into testutil
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8566
      
      Reviewed By: zhichao-cao
      
      Differential Revision: D30303724
      
      Pulled By: mrambacher
      
      fbshipit-source-id: 33fa8ea2a3b663210cb356da05e64aab7585b1b5
  10. 19 Aug, 2021 1 commit
    • M
      Allow Replayer to report the results of TraceRecords. (#8657) · d10801e9
      Merlin Mao committed
      Summary:
      `Replayer::Execute()` can directly return the result (e.g., request latency, `DB::Get()` return code, returned value, etc.).
      `Replayer::Replay()` reports the results via a callback function.
      
      New interface:
      `TraceRecordResult` in "rocksdb/trace_record_result.h".
      
      `DBTest2.TraceAndReplay` and `DBTest2.TraceAndManualReplay` are updated accordingly.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8657
      
      Reviewed By: ajkr
      
      Differential Revision: D30290216
      
      Pulled By: autopear
      
      fbshipit-source-id: 3c8d4e6b180ec743de1a9d9dcaee86064c74f0d6
  11. 18 Aug, 2021 2 commits
    • Y
      Fix bug caused by releasing snapshot(s) during compaction (#8608) · 2b367fa8
      Yanqin Jin committed
      Summary:
      In debug mode, we are seeing an assertion failure as follows:
      
      ```
      db/compaction/compaction_iterator.cc:980: void rocksdb::CompactionIterator::PrepareOutput(): \
      Assertion `ikey_.type != kTypeDeletion && ikey_.type != kTypeSingleDeletion' failed.
      ```
      
      It is caused by releasing the earliest snapshot during compaction between the execution of
      `NextFromInput()` and `PrepareOutput()`.
      
      In one case, as demonstrated in unit test `WritePreparedTransaction.ReleaseEarliestSnapshotDuringCompaction_WithSD2`,
      an incorrect result may be returned by a subsequent range scan if assertions are disabled, as in the opt
      compilation level: the SingleDelete marker's sequence number is zeroed out, but the preceding PUT is also
      output to the SST file after compaction. Due to the logic of DBIter, the PUT will not be
      skipped and will be returned by the iterator in the range scan. https://github.com/facebook/rocksdb/issues/8661 illustrates what happened.
      
      Fix by taking a more conservative approach: make compaction zero out the sequence number only
      if the key is in the earliest snapshot when the compaction starts.
      
      Another assertion failure is
      ```
      Assertion `current_user_key_snapshot_ == last_snapshot' failed.
      ```
      
      It's caused by releasing the snapshot between the PUT and SingleDelete during compaction.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8608
      
      Test Plan: make check
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D30145645
      
      Pulled By: riversand963
      
      fbshipit-source-id: 699f58e66faf70732ad53810ccef43935d3bbe81
    • L
      Add statistics support to integrated BlobDB (#8667) · 6878cedc
      Levi Tamasi committed
      Summary:
      The patch adds statistics support to the integrated BlobDB implementation,
      namely the tickers `BLOB_DB_BLOB_FILE_BYTES_READ` and
      `BLOB_DB_GC_{NUM_KEYS,BYTES}_RELOCATED`, and the histograms
      `BLOB_DB_(DE)COMPRESSION_MICROS`. (Some other statistics, like
      `BLOB_DB_BLOB_FILE_BYTES_WRITTEN`, `BLOB_DB_BLOB_FILE_SYNCED`,
      `BLOB_DB_BLOB_FILE_{READ,WRITE,SYNC}_MICROS` were already supported.)
      Note that the vast majority of the old BlobDB's tickers/histograms are not
      really applicable to the new implementation, since they e.g. pertain to calling
      dedicated BlobDB APIs (which the integrated BlobDB does not have) or are
      tied to the legacy BlobDB's design of writing blob files synchronously when
      a write API is called. Such statistics are marked "legacy BlobDB only" in
      `statistics.h`.
      
      Fixes https://github.com/facebook/rocksdb/issues/8645 .
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8667
      
      Test Plan: Ran `make check` and tested the new statistics using `db_bench`.
      
      Reviewed By: riversand963
      
      Differential Revision: D30356884
      
      Pulled By: ltamasi
      
      fbshipit-source-id: 5f8a833faee60401c5643c2f0a6c0415488190a4
  12. 17 Aug, 2021 1 commit
    • A
      Add a stat to count secondary cache hits (#8666) · add68bd2
      anand76 committed
      Summary:
      Add a stat for secondary cache hits. The `Cache::Lookup` API had an unused `stats` parameter. This PR uses that to pass the pointer to a `Statistics` object that `LRUCache` uses to record the stat.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8666
      
      Test Plan: Update a unit test in lru_cache_test
      
      Reviewed By: zhichao-cao
      
      Differential Revision: D30353816
      
      Pulled By: anand1976
      
      fbshipit-source-id: 2046f78b460428877a26ffdd2bb914ae47dfbe77
  13. 16 Aug, 2021 1 commit
  14. 13 Aug, 2021 1 commit
    • M
      Code cleanup for trace replayer (#8652) · 74a652a4
      Merlin Mao committed
      Summary:
      - Remove extra `;` in trace_record.h
      - Remove some unnecessary `assert` in trace_record_handler.cc
      - Initialize `env_` after `exec_handler_` in `ReplayerImpl` so that `db` is asserted when creating the handler, before `db->GetEnv()` is called.
      - Update history to include the new `TraceReader::Reset()`
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8652
      
      Reviewed By: ajkr
      
      Differential Revision: D30276872
      
      Pulled By: autopear
      
      fbshipit-source-id: 476ee162e0f241490c6209307448343a5b326b37
  15. 12 Aug, 2021 2 commits
    • M
      Make TraceRecord and Replayer public (#8611) · f58d2767
      Merlin Mao committed
      Summary:
      New public interfaces:
      `TraceRecord` and `TraceRecord::Handler`, available in `rocksdb/trace_record.h`.
      `Replayer`, available in `rocksdb/utilities/replayer.h`.
      
      Users can use `DB::NewDefaultReplayer()` to create a Replayer that replays a trace file automatically or manually.
      
      Unit tests:
      - `./db_test2 --gtest_filter="DBTest2.TraceAndReplay"`: Updated with the internal API changes.
      - `./db_test2 --gtest_filter="DBTest2.TraceAndManualReplay"`: New for manual replay.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8611
      
      Reviewed By: ajkr
      
      Differential Revision: D30266329
      
      Pulled By: autopear
      
      fbshipit-source-id: 1ecb3cbbedae0f6a67c18f0cc82e002b4d81b6f8
    • J
      Add suggestion for btrfs user to disable preallocation (#8646) · 87e23587
      Jay Zhuang committed
      Summary:
      Add a comment for `options.allow_fallocate` noting that space
      preallocated on btrfs is not freed, along with a suggestion to disable
      preallocation.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8646
      
      Test Plan: No code change
      
      Reviewed By: ajkr
      
      Differential Revision: D30240050
      
      Pulled By: jay-zhuang
      
      fbshipit-source-id: 75b7190bc8276ce8d8ac2d0cb9064b386cbf4768
  16. 10 Aug, 2021 1 commit
    • S
      Move old files to warm tier in FIFO compactions (#8310) · e7c24168
      sdong committed
      Summary:
      Some FIFO users want to keep the data for longer, but the old data is rarely accessed. This feature allows users to configure FIFO compaction so that data older than a threshold is moved to a warm storage tier.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8310
      
      Test Plan: Add several unit tests.
      
      Reviewed By: ajkr
      
      Differential Revision: D28493792
      
      fbshipit-source-id: c14824ea634814dee5278b449ab5c98b6e0b5501
  17. 07 Aug, 2021 2 commits
    • L
      Fix the sorting of KeyContexts for batched MultiGet (#8633) · 87882736
      Levi Tamasi committed
      Summary:
      `CompareKeyContext::operator()` on the trunk has a bug: when comparing
      column family IDs, `lhs` is used for both sides of the comparison. This
      results in the `KeyContext`s getting sorted solely based on key, which
      in turn means that keys with the same column family do not necessarily
      form a single range in the sorted list. This violates an assumption of the
      batched `MultiGet` logic, leading to the same column family
      showing up multiple times in the list of `MultiGetColumnFamilyData`.
      The end result is the code attempting to check out the thread-local
      `SuperVersion` for the same CF multiple times, causing an
      assertion violation in debug builds and memory corruption/crash in
      release builds.
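
A standalone sketch of the comparator shape that the batched logic relies on. `KeyCtx`, `CompareKeyCtx`, and `CountColumnFamilyRuns` are hypothetical simplified stand-ins, not the RocksDB types.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical simplified stand-in for a KeyContext.
struct KeyCtx {
  uint32_t cf_id;
  std::string key;
};

struct CompareKeyCtx {
  bool operator()(const KeyCtx& lhs, const KeyCtx& rhs) const {
    // Compare column family IDs first, taking one ID from each side.
    // The bug described above amounts to using lhs on both sides here,
    // which makes this branch dead and sorts by key only.
    if (lhs.cf_id != rhs.cf_id) {
      return lhs.cf_id < rhs.cf_id;
    }
    return lhs.key < rhs.key;
  }
};

// Sorts the contexts and counts contiguous runs of equal cf_id. With a
// correct comparator, this equals the number of distinct column families.
std::size_t CountColumnFamilyRuns(std::vector<KeyCtx> contexts) {
  std::sort(contexts.begin(), contexts.end(), CompareKeyCtx{});
  std::size_t runs = 0;
  for (std::size_t i = 0; i < contexts.size(); ++i) {
    if (i == 0 || contexts[i].cf_id != contexts[i - 1].cf_id) {
      ++runs;
    }
  }
  return runs;
}
```

If the comparator ignored `cf_id`, keys from two column families could interleave and the run count would exceed the number of column families, which is the condition that leads to the duplicate `SuperVersion` checkout.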
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8633
      
      Test Plan: `make check`
      
      Reviewed By: riversand963
      
      Differential Revision: D30169182
      
      Pulled By: ltamasi
      
      fbshipit-source-id: a47710652df7e95b14b40fb710924c11a8478023
    • P
      Make backup restore atomic, with sync option (#8568) · a7fd1d08
      Peter Dillinger committed
      Summary:
      Guarantees that if a restore is interrupted, DB::Open will fail. This works by
      restoring CURRENT first to CURRENT.tmp then as a final step renaming to CURRENT.
      
      Also makes restore respect BackupEngineOptions::sync (default true). When set,
      the restore is guaranteed persisted by the time it returns OK. Also makes the above
      atomicity guarantee work in case the interruption is power loss or OS crash (not just
      process interruption or crash).
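
The write-then-rename step can be sketched as follows. `InstallCurrentFile` is a hypothetical helper that uses only the C standard library. It is not the BackupEngine code and does not sync, so it illustrates only the process-crash half of the guarantee.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: writes `contents` to <db_dir>/CURRENT.tmp, then
// renames it to <db_dir>/CURRENT as the final step. Returns true on success.
bool InstallCurrentFile(const std::string& db_dir,
                        const std::string& contents) {
  const std::string tmp_path = db_dir + "/CURRENT.tmp";
  const std::string final_path = db_dir + "/CURRENT";

  std::FILE* file = std::fopen(tmp_path.c_str(), "wb");
  if (file == nullptr) {
    return false;
  }
  const bool written =
      std::fwrite(contents.data(), 1, contents.size(), file) ==
      contents.size();
  const bool closed = std::fclose(file) == 0;
  if (!written || !closed) {
    std::remove(tmp_path.c_str());
    return false;
  }
  // A durable variant would fsync the file and its directory around this
  // point; that needs platform-specific calls and is omitted here.
  // On POSIX, rename replaces an existing target atomically. Other
  // platforms may fail if the target already exists.
  return std::rename(tmp_path.c_str(), final_path.c_str()) == 0;
}
```

Until the rename happens there is no CURRENT file, so an interrupted restore leaves a directory that cannot be opened as a DB rather than one that opens with partial data.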
      
      Fixes https://github.com/facebook/rocksdb/issues/8500
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8568
      
      Test Plan:
      added to backup mini-stress unit test. Passes with
      gtest_repeat=100 (whereas fails 7 times without the CURRENT.tmp)
      
      Reviewed By: akankshamahajan15
      
      Differential Revision: D29812605
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 24e9a993b305b1835ca95558fa7a7152e54cda8e
  18. 06 Aug, 2021 1 commit
  19. 05 Aug, 2021 2 commits
  20. 04 Aug, 2021 2 commits
  21. 03 Aug, 2021 1 commit
  22. 31 Jul, 2021 1 commit
    • M
      Allow WAL dir to change with db dir (#8582) · ab7f7c9e
      mrambacher committed
      Summary:
      Prior to this change, the "wal_dir" DBOption would always be set (defaults to dbname) when the DBOptions were sanitized. Because of this setting in the options file, it was not possible to rename/relocate a database directory after it had been created and use the existing options file.
      
      After this change, the "wal_dir" option is only set under specific circumstances.  Methods were added to the ImmutableDBOptions class to see if it is set and if it is set to something other than the dbname.  Additionally, a method was added to retrieve the effective value of the WAL dir (either the option or the dbname/path).
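
A rough sketch of the "effective WAL dir" idea, assuming an empty string means the option is unset. `DbOptionsSketch`, `IsWalDirSameAsDbPath`, and `EffectiveWalDir` are illustrative names, not necessarily the methods added to ImmutableDBOptions.

```cpp
#include <cassert>
#include <string>

// Hypothetical options holder. An empty wal_dir means "not set".
struct DbOptionsSketch {
  std::string wal_dir;
};

// True if the WAL files live in the DB directory itself.
bool IsWalDirSameAsDbPath(const DbOptionsSketch& options,
                          const std::string& dbname) {
  return options.wal_dir.empty() || options.wal_dir == dbname;
}

// Returns the explicit wal_dir when set, otherwise falls back to dbname.
std::string EffectiveWalDir(const DbOptionsSketch& options,
                            const std::string& dbname) {
  assert(!dbname.empty());
  return options.wal_dir.empty() ? dbname : options.wal_dir;
}
```

Because an unset `wal_dir` is resolved at the point of use rather than persisted, a saved options file stays valid after the database directory is moved.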
      
      Tests were added to the core and ldb to test that a database could be created and renamed without issue.  Additional tests for various permutations of wal_dir were also added.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8582
      
      Reviewed By: pdillinger, autopear
      
      Differential Revision: D29881122
      
      Pulled By: mrambacher
      
      fbshipit-source-id: 67d3d033dc8813d59917b0a3fba2550c0efd6dfb
  23. 29 Jul, 2021 1 commit
  24. 27 Jul, 2021 1 commit
    • M
      Make EventListener into a Customizable Class (#8473) · 3aee4fbd
      mrambacher committed
      Summary:
      - Added Type/CreateFromString
      - Added ability to load EventListeners to DBOptions
      - Since EventListeners did not previously have a Name(), defaulted to "".  If there is no name, the listener cannot be loaded from the ObjectRegistry.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8473
      
      Reviewed By: zhichao-cao
      
      Differential Revision: D29901488
      
      Pulled By: mrambacher
      
      fbshipit-source-id: 2d3a4aa6db1562ac03e7ad41b360e3521d486254
  25. 23 Jul, 2021 1 commit
  26. 22 Jul, 2021 1 commit
    • J
      Avoid updating option if there's no value updated (#8518) · 42eaa45c
      Jay Zhuang committed
      Summary:
      Try to avoid the expensive option-update operation if
      `SetDBOptions()` does not change any option value.
      Skipping the update is not guaranteed; for example, changing `bytes_per_sync`
      to `0` may still trigger an update, as the value could be sanitized.
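
One way to picture the early-out, as a sketch under assumptions: options are modeled as string key/value pairs and `HasEffectiveChange` is a hypothetical helper. The comparison is on raw strings, so a value that gets sanitized to something else would still count as a change, which matches the caveat above.

```cpp
#include <cassert>
#include <map>
#include <string>

using OptionMap = std::map<std::string, std::string>;

// Returns true if applying `requested` on top of `current` would change
// at least one stored value. Unknown keys count as a change so that the
// normal update path can validate and report them.
bool HasEffectiveChange(const OptionMap& current, const OptionMap& requested) {
  for (const auto& entry : requested) {
    const auto it = current.find(entry.first);
    if (it == current.end() || it->second != entry.second) {
      return true;
    }
  }
  return false;
}
```

A caller would run the expensive update only when `HasEffectiveChange` returns true.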
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8518
      
      Test Plan: added unittest
      
      Reviewed By: riversand963
      
      Differential Revision: D29672639
      
      Pulled By: jay-zhuang
      
      fbshipit-source-id: b7931de62ceea6f1bdff0d1209adf1197d3ed1f4