1. 01 12月, 2021 7 次提交
    • Y
      Add initial CMake support to plugin (#9214) · 29954b8b
      Yanqin Jin 提交于
      Summary:
      Not a CMake expert, and the current CMake build support added by this PR is
      unlikely the best way of doing it. Sending out the PR to demonstrate it
      can work.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9214
      
      Test Plan:
      Will need to update https://github.com/ajkr/dedupfs with CMake build.
      Also, PR https://github.com/facebook/rocksdb/issues/9170 and PR https://github.com/facebook/rocksdb/issues/9206 both include CMake support for their
      plugins, and can be used as a proof of concept.
      
      Reviewed By: ajkr
      
      Differential Revision: D32738273
      
      Pulled By: riversand963
      
      fbshipit-source-id: da87fb4377c716bbbd577a69763b48d22483f845
      29954b8b
    • A
      Add rocksdb_livefiles_column_family_name C API (#9232) · 552256cb
      Artem Krylysov 提交于
      Summary:
      Extend C API to add new function `rocksdb_livefiles_column_family_name`.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9232
      
      Reviewed By: akankshamahajan15
      
      Differential Revision: D32736516
      
      Pulled By: ajkr
      
      fbshipit-source-id: a854256a0f4652c903ab5ad8355ded051ac19987
      552256cb
    • A
      Allow plugins to add pkg-config dependencies to rocksdb.pc (#9198) · 07456222
      Andreas Hindborg 提交于
      Summary:
      This patch fixes an issue that occur when dependencies of plugins are not
      installed to the same prefix as librocksdb. Because plugin dependencies are
      declared in the `Libs` field of rocksdb.pc, programs that link against
      librocksdb with `pkg-config --libs rocksdb` will link with `-L` flag for the
      path of librocksdb only. This patch allows plugin dependencies to be declared in
      the `Requires` field of rocksdb.pc, so that pkg-config will correctly provide
      `-L` flags for dependencies of plugins that are installed in other locations.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9198
      
      Reviewed By: akankshamahajan15
      
      Differential Revision: D32596620
      
      Pulled By: ajkr
      
      fbshipit-source-id: e17b2b6452b5f2e955b430140197c57e26a4a518
      07456222
    • L
      Fix num files in single compaction for universal compaction (#9168) · c712b68f
      leipeng 提交于
      Summary:
      https://github.com/facebook/rocksdb/issues/9026 fixed histogram NUM_FILES_IN_SINGLE_COMPACTION for level compaction, but missed fix for universal compaction.
      
      This PR fixed NUM_FILES_IN_SINGLE_COMPACTION for universal compaction.
      
      Quote from https://github.com/facebook/rocksdb/issues/9026:
      > currently histogram `NUM_FILES_IN_SINGLE_COMPACTION` just counted files in first level of compaction input, this fix counts files in all levels of compaction input.
      
      Thanks for ajkr pointed this missed fix!
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9168
      
      Reviewed By: akankshamahajan15
      
      Differential Revision: D32434494
      
      Pulled By: ajkr
      
      fbshipit-source-id: 93ea092af4afbd8dce67898ffb350cf26b065ed2
      c712b68f
    • P
      HISTORY for #9208 (#9227) · e8b5d05e
      Peter Dillinger 提交于
      Summary:
      Update HISTORY for bug fix. This is going into 6.27 initial
      release. (Technically 6.27.1)
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9227
      
      Test Plan: n/a
      
      Reviewed By: ajkr
      
      Differential Revision: D32727912
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 75e7a81749a188a590d44ef47e261eaaa8667152
      e8b5d05e
    • A
      db_stress: db_stress segmentation fault (#9219) · 9c932816
      Aravind Ramesh 提交于
      Summary:
      db_stress asserts/seg-faults with below command (on debug and release builds)
      
      ```
      "rm -rf /tmp/rocksdbtest*; db_stress --ops_per_thread=1000 --reopen=5"
      =======================================
      Error opening unique id file for append: IO error: No such file or directory:
      While open a file for appending: /tmp/rocksdbtest-0/dbstress/.unique_ids:
      No such file or directory
      Choosing random keys with no overwrite
      Creating 2621440 locks
      Starting continuous_verification_thread
      2021/11/15-08:46:49  Initializing worker threads
      2021/11/15-08:46:49  Starting database operations
      2021/11/15-08:46:49  Reopening database for the 1th time
      WARNING: prefix_size is non-zero but memtablerep != prefix_hash
      DB path: [/tmp/rocksdbtest-0/dbstress]
      Segmentation fault
      =======================================
      
      ```
      StressTest() constructor deletes the directory "dbstress" because
      the option --destroy_db_initially is true by default in db_stress.
      
      This Seg fault happens on a new database, UniqueIdVerifier's constructor
      tries to read the ".unique_ids" file, if the file is not present,
      ReopenWritableFile() tries to create .unique_ids file, but fails
      as the directory db_stress is not available. The data_file_writer_
      is set as an invalid(null) pointer and in subsequent calls (~UniqueIdVerifier()
      and UniqueIdVerifier::Verify()) it accesses this null pointer and crashes.
      
      This patch creates db_stress directory if it is missing, so the .unique_ids file
      is created.
      Signed-off-by: NAravind Ramesh <aravind.ramesh@wdc.com>
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9219
      
      Reviewed By: ajkr
      
      Differential Revision: D32730151
      
      Pulled By: pdillinger
      
      fbshipit-source-id: f47baba56b380d93c3ba5608904756e86bbf14f5
      9c932816
    • M
      Fix GetOptionsPtr for Wrapped Customizable; Allow null options map (#9213) · 7aa31ba4
      mrambacher 提交于
      Summary:
      1.  Fix GetOptionsPtr for Wrapped (Inner() != nullptr) Customizable objects.  This allows the inner options to be returned via this method.
      
      2.  Allow the option type map to be nullptr.  This allows objects to be registered as options (for GetOptionsPtr) but not be used by the configuration methods.
      
      Added tests as appropriate.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9213
      
      Reviewed By: zhichao-cao
      
      Differential Revision: D32718882
      
      Pulled By: mrambacher
      
      fbshipit-source-id: 563203d1f006a2629060feb31c5dff9a233e1e83
      7aa31ba4
  2. 30 11月, 2021 3 次提交
  3. 25 11月, 2021 2 次提交
    • P
      Fix bug affecting GetSortedWalFiles, Backups, Checkpoint (#9208) · 2a67d475
      Peter Dillinger 提交于
      Summary:
      Saw error like this:
      `Backup failed -- IO error: No such file or directory: While opening a
      file for sequentially reading:
      /dev/shm/rocksdb/rocksdb_crashtest_blackbox/004426.log: No such file or
      directory`
      
      Unfortunately, GetSortedWalFiles (used by Backups, Checkpoint, etc.)
      relies on no file deletions happening while its operating, which
      means not only disabling (more) deletions, but ensuring any pending
      deletions are completed. Two fixes related to this:
      
      * There was a gap in several places between decrementing
      pending_purge_obsolete_files_ and incrementing bg_purge_scheduled_ where
      the db mutex would be released and GetSortedWalFiles (and others) could
      get false information that no deletions are pending.
      
      * The fix to https://github.com/facebook/rocksdb/issues/8591 (disabling deletions in GetSortedWalFiles) seems
      incomplete because it doesn't prevent pending deletions from occuring
      during the operation (if deletions not already disabled, the case that
      was to be fixed by the change).
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9208
      
      Test Plan:
      existing tests (it's hard to write a test for interleavings
      that are now excluded - this is what stress test is for)
      
      Reviewed By: ajkr
      
      Differential Revision: D32630675
      
      Pulled By: pdillinger
      
      fbshipit-source-id: a121e3da648de130cd24d44c524232f4eb22f178
      2a67d475
    • J
      Fix compile warnings (#9199) · 5384f0af
      jsteemann 提交于
      Summary:
      * added missing override specifiers for overriden methods
        this fixes compiler warnings emitted by g++ and clang++ when compile option `-Wsuggest-override` is turned on.
      * fix compile warning with -Wmaybe-uninitialized
        g++-11 warns about a _potentially_ uninitialized variable when using `-Wmaybe_uninitialized`:
        ```
            env/env.cc: In member function ‘virtual rocksdb::Status rocksdb::Env::GetHostNameString(std::string*)’:
            env/env.cc:738:66: error: ‘hostname_buf’ may be used uninitialized [-Werror=maybe-uninitialized]
              738 |   Status s = GetHostName(hostname_buf.data(), hostname_buf.size());
                  |                                                                  ^
            In file included from /usr/include/c++/11/tuple:39,
                             from /usr/include/c++/11/functional:54,
                             from ./include/rocksdb/env.h:22,
                             from env/env.cc:10:
            /usr/include/c++/11/array:176:7: note: by argument 1 of type ‘const std::array<char, 256>*’ to ‘constexpr std::array<_Tp, _Nm>::size_type std::array<_Tp, _Nm>::size() const [with _Tp = char; long unsigned int _Nm = 256]’ declared here
              176 |       size() const noexcept { return _Nm; }
                  |       ^~~~
            env/env.cc:737:37: note: ‘hostname_buf’ declared here
              737 |   std::array<char, kMaxHostNameLen> hostname_buf;
        ```
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9199
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D32630703
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 9ea3010b1105a582548e3c3c0db4475b201e4a10
      5384f0af
  4. 24 11月, 2021 1 次提交
  5. 23 11月, 2021 2 次提交
    • Y
      Fix internal build error (#9195) · dc0ee3e5
      Yanqin Jin 提交于
      Summary:
      Internal build reported:
      ```
      rocksdb/listener.h:470:3: error: extra ';' inside a struct [-Werror,-Wextra-semi]
      ```
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9195
      
      Test Plan: import to fbcode and compile.
      
      Reviewed By: akankshamahajan15
      
      Differential Revision: D32590138
      
      Pulled By: riversand963
      
      fbshipit-source-id: ca4ed9cca210a1a9a12d3de17c789ef9151c57e8
      dc0ee3e5
    • Y
      Print file checksum in hex (#9196) · 12e98add
      Yanqin Jin 提交于
      Summary:
      Printing file checksum (usually an integer) in non-hex format is barely useful. To make the matter
      worse, it can mess with the output format. If you use `less` to redirect the output of `ldb manifest_dump`,
      non-hex file checksum can cause `less` not to function as expected.
      
      Also output some additional fields to json output.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9196
      
      Test Plan: manually test `ldb manifest_dump`.
      
      Reviewed By: ajkr
      
      Differential Revision: D32590253
      
      Pulled By: riversand963
      
      fbshipit-source-id: de434b7e60dd05b0b7cb76eff2240b21f9ae4b32
      12e98add
  6. 21 11月, 2021 1 次提交
    • P
      Fix some CI output (#9193) · d561432d
      Peter Dillinger 提交于
      Summary:
      Address some issues with https://github.com/facebook/rocksdb/issues/9188
      * Internal CI doesn't render \r as anything, so use \n for "not
      connected to terminal" case
      * CircleCI apparently uses a pseudo-tty for output and although
      rerdirect stdout (because of EAGAIN bug) we don't redirect stderr, so it
      is detected as a terminal-connected case. Fix by redirecting stderr
      also.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9193
      
      Test Plan: manual, CI
      
      Reviewed By: akankshamahajan15
      
      Differential Revision: D32581128
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 5ae7c3209128d8dbd4153c5b9fdb2b810e6deb2e
      d561432d
  7. 20 11月, 2021 12 次提交
    • Z
      Disable the QPS verification in test temporally (#9190) · 4340e1ff
      Zhichao Cao 提交于
      Summary:
      Disable the QPS verification in test temporally, which causes the test failure due to different system delays.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9190
      
      Test Plan: make check
      
      Reviewed By: siying
      
      Differential Revision: D32576289
      
      Pulled By: zhichao-cao
      
      fbshipit-source-id: 1df972e77dd82eed5af3462e5db5e141aadf8fae
      4340e1ff
    • Y
      Update HISTORY and version.h for 6.27 release (#9192) · 81016436
      Yanqin Jin 提交于
      Summary:
      As title.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9192
      
      Reviewed By: ltamasi
      
      Differential Revision: D32578141
      
      Pulled By: riversand963
      
      fbshipit-source-id: 16216451c87e383ca8fd309acf15106e46172aaa
      81016436
    • L
      Update HISTORY for PR 9187 (#9191) · 3a9f5574
      Levi Tamasi 提交于
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/9191
      
      Reviewed By: riversand963
      
      Differential Revision: D32577939
      
      Pulled By: ltamasi
      
      fbshipit-source-id: 3c52067a0c3e9219c1aafdb711718dfcce5dedf5
      3a9f5574
    • Y
      Fix an assertion failure when ManifestTailer switches to new Manifest in multi-cf mode (#9143) · 43ac7a27
      Yanqin Jin 提交于
      Summary:
      Original unit test fail to test the case of multi-cf mode switching to new manifest. The assertion
      failure will trigger when the primary instance reopens and secondary continues to tail the
      newly-created MANIFEST. Fix the assertion failure and update existing unit tests.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9143
      
      Test Plan: make check
      
      Reviewed By: ltamasi
      
      Differential Revision: D32574233
      
      Pulled By: riversand963
      
      fbshipit-source-id: 857ddbe994019091276458abebcf8e2b65340468
      43ac7a27
    • L
      Support readahead during compaction for blob files (#9187) · dc5de45a
      Levi Tamasi 提交于
      Summary:
      The patch adds a new BlobDB configuration option `blob_compaction_readahead_size`
      that can be used to enable prefetching data from blob files during compaction.
      This is important when using storage with higher latencies like HDDs or remote filesystems.
      If enabled, prefetching is used for all cases when blobs are read during compaction,
      namely garbage collection, compaction filters (when the existing value has to be read from
      a blob file), and `Merge` (when the value of the base `Put` is stored in a blob file).
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9187
      
      Test Plan: Ran `make check` and the stress/crash test.
      
      Reviewed By: riversand963
      
      Differential Revision: D32565512
      
      Pulled By: ltamasi
      
      fbshipit-source-id: 87be9cebc3aa01cc227bec6b5f64d827b8164f5d
      dc5de45a
    • P
      Fix backward compatibility with 2.5 through 2.7 (#9189) · cd4ea675
      Peter Dillinger 提交于
      Summary:
      A bug in https://github.com/facebook/rocksdb/issues/9163 can cause checksum verification to fail if
      parsing a properties block fails.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9189
      
      Test Plan:
      check_format_compatible.sh (never quite works locally but
      this particular case seems fixed using variants of SHORT_TEST=1).
      And added new unit test case.
      
      Reviewed By: ajkr
      
      Differential Revision: D32574626
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 6fa5c8595737b71a3c3d011a52daf6d6c08715d7
      cd4ea675
    • J
      Deprecating `iter_start_seqnum` and `preserve_deletes` (#9091) · 6cde8d21
      Jay Zhuang 提交于
      Summary:
      `ReadOptions::iter_start_seqnum` and `DBOptions::preserve_deletes` are
      deprecated, please try using user defined timestamp feature instead.
      The feature is used to support differential snapshots, but not well
      maintained (https://github.com/facebook/rocksdb/issues/6837, https://github.com/facebook/rocksdb/issues/8472) and the interface is not user friendly which
      returns an internal key from the iterator. The user defined timestamp
      feature is a more flexible feature to support similar usecase, please
      switch to that if you have such usecase.
      The deprecated feature will be removed in a future release.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9091
      
      Test Plan:
      check LOG
      
      Fix https://github.com/facebook/rocksdb/issues/9090
      
      Reviewed By: ajkr
      
      Differential Revision: D32071750
      
      Pulled By: jay-zhuang
      
      fbshipit-source-id: b882c4668dd1bf26ce03c4c192f1bba584bf6104
      6cde8d21
    • P
      Print failures in parallel `make check` (#9188) · 3ce4d4f5
      Peter Dillinger 提交于
      Summary:
      Generating megabytes of successful test output has caused
      issues / inconveniences for CI and internal sandcastle runs. This
      changes their configuration to only print output from failed tests.
      (Successful test output is still available in files under t/.)
      
      This likewise changes default behavior of parallel `make check` as
      a quick team poll showed interest in that.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9188
      
      Test Plan:
      Seed some test failures and observe
      * `make -j24 check` (new behavior)
      * `PRINT_PARALLEL_OUTPUTS=1 make -j24 check` (old CI behavior)
      * `QUIET_PARALLEL_TESTS=1 make -j24 check` (old manual run behavior)
      
      Reviewed By: siying
      
      Differential Revision: D32567392
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 8d8fb64aebd16bca103b11e3bd1f13c488a69611
      3ce4d4f5
    • A
      Some small changes to RocksJava build (#9186) · ad40b0be
      Adam Retter 提交于
      Summary:
      1. Added a target for building a bundle jar for Sonatype Nexus - sometimes if the OSS Maven Central is misbehaving, it is quicker to upload a bundle to be processed for release.
      
      2. Simplify the publish code by using a for-loop.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9186
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D32564469
      
      Pulled By: ajkr
      
      fbshipit-source-id: aaceac27e9143fb65b61dad2a46df346586672cd
      ad40b0be
    • S
      Track each SST's timestamp information as user properties (#9093) · e12753eb
      slk 提交于
      Summary:
      Track each SST's timestamp information as user properties https://github.com/facebook/rocksdb/issues/8959
      
      Rockdb has supported user-defined timestamp feature. Application can specify a timestamp
      when writing each k-v pair. When data flush from memory to disk file called SST files.
      Each SST files consist of multiple data blocks and several metadata blocks. Among the metadata
      blocks, there is one called Properties block that tracks some pre-defined properties of this SST file.
      
      This PR is for collecting the properties of min and max timestamps of all keys in the file. With those
      properties the SST file is more convenient to tell whether the keys in the SST have timestamps or not.
      
      The changes involved are as follows:
      
      1) Add a class TimestampTablePropertiesCollector to collect min/max timestamp when add keys to table,
         The way TimestampTablePropertiesCollector use to compare timestamp of key should defined by
         user by implementing the Comparator::CompareTimestamp function in the user defined comparator.
      2) Add corresponding unit tests.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9093
      
      Reviewed By: ltamasi
      
      Differential Revision: D32406927
      
      Pulled By: riversand963
      
      fbshipit-source-id: 25922971b7e67bacf4d53a1fb67c4c5ddaa61573
      e12753eb
    • S
      Fix flaky DBTest2.RateLimitedCompactionReads (#9185) · 12117b26
      sdong 提交于
      Summary:
      DBTest2.RateLimitedCompactionReads sometime shows following failure:
      
        what():  db/db_test2.cc:3976: Failure
      Expected equality of these values:
        i + 1
          Which is: 4
        NumTableFilesAtLevel(0)
          Which is: 0
      
      The assertion itself doesn't appear to be correct. Fix it.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9185
      
      Test Plan: Removing an assertion shouldn't break anything.
      
      Reviewed By: ajkr
      
      Differential Revision: D32549530
      
      fbshipit-source-id: 9993372d8af89161f903337a13f3e316e690a6b8
      12117b26
    • Y
      Fix a bug in FlushJob picking more memtables beyond synced WALs (#9142) · 1e8322c0
      Yanqin Jin 提交于
      Summary:
      After RocksDB 6.19 and before this PR, RocksDB FlushJob may pick more memtables to flush beyond synced WALs.
      This can be problematic if there are multiple column families, since it can prematurely advance the flushed column
      family's log_number. Should subsequent attempts fail to sync the latest WALs and the database goes
      through a recovery, it may detect corrupted WAL number below the flushed column family's log number
      and complain about column family inconsistency.
      To fix, we record the maximum memtable ID of the column family being flushed. Then we call SyncClosedLogs()
      so that all closed WALs at the time when memtable ID is recorded will be synced.
      I also disabled a unit test temporarily due to reasons described in https://github.com/facebook/rocksdb/issues/9151
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9142
      
      Test Plan: make check
      
      Reviewed By: ajkr
      
      Differential Revision: D32299956
      
      Pulled By: riversand963
      
      fbshipit-source-id: 0da75888177d91905cf8c9d00605b73afb5970a7
      1e8322c0
  8. 19 11月, 2021 6 次提交
    • A
      Adhere to per-DB concurrency limit when bottom-pri compactions exist (#9179) · 8cf4294e
      Andrew Kryczka 提交于
      Summary:
      - Fixed bug where bottom-pri manual compactions were counting towards `bg_compaction_scheduled_` instead of `bg_bottom_compaction_scheduled_`. It seems to have no negative effect.
      - Fixed bug where automatic compaction scheduling did not consider `bg_bottom_compaction_scheduled_`. Now automatic compactions cannot be scheduled that exceed the per-DB compaction concurrency limit (`max_compactions`) when some existing compactions are bottommost.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9179
      
      Test Plan: new unit test for manual/automatic. Also verified the existing automatic/automatic test ("ConcurrentBottomPriLowPriCompactions") hanged until changing it to explicitly enable concurrency.
      
      Reviewed By: riversand963
      
      Differential Revision: D32488048
      
      Pulled By: ajkr
      
      fbshipit-source-id: 20c4c0693678e81e43f85ed3cc3402fcf26e3310
      8cf4294e
    • A
      Add listener API that notifies on IOError (#9177) · 4a7c1dc3
      Akanksha Mahajan 提交于
      Summary:
      Add a new API in listener.h that notifies about IOErrors on
      Read/Write/Append/Flush etc. The API reports about IOStatus, filename, Operation
      name, offset and length.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9177
      
      Test Plan: Added new unit tests
      
      Reviewed By: anand1976
      
      Differential Revision: D32470627
      
      Pulled By: akankshamahajan15
      
      fbshipit-source-id: 189a717033590ae227b3beae8b1e7e185e4cdc12
      4a7c1dc3
    • A
      Check that newIteratorWithBase regardless of WBWI Overwrite Mode (#8134) · d9493232
      Adam Retter 提交于
      Summary:
      The behaviour of WBWI has changed when calling newIteratorWithBase when overwrite is set to true or false. This PR simply adds tests to assert the new correct behaviour.
      
      Closes https://github.com/facebook/rocksdb/issues/7370
      Closes https://github.com/facebook/rocksdb/pull/8134
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9107
      
      Reviewed By: pdillinger
      
      Differential Revision: D32099475
      
      Pulled By: mrambacher
      
      fbshipit-source-id: 245f483f73db866cc8a51219a2bff2e09e59faa0
      d9493232
    • P
      Improve / clean up meta block code & integrity (#9163) · 230660be
      Peter Dillinger 提交于
      Summary:
      * Checksums are now checked on meta blocks unless specifically
      suppressed or not applicable (e.g. plain table). (Was other way around.)
      This means a number of cases that were not checking checksums now are,
      including direct read TableProperties in Version::GetTableProperties
      (fixed in meta_blocks ReadTableProperties), reading any block from
      PersistentCache (fixed in BlockFetcher), read TableProperties in
      SstFileDumper (ldb/sst_dump/BackupEngine) before table reader open,
      maybe more.
      * For that to work, I moved the global_seqno+TableProperties checksum
      logic to the shared table/ code, because that is used by many utilies
      such as SstFileDumper.
      * Also for that to work, we have to know when we're dealing with a block
      that has a checksum (trailer), so added that capability to Footer based
      on magic number, and from there BlockFetcher.
      * Knowledge of trailer presence has also fixed a problem where other
      table formats were reading blocks including bytes for a non-existant
      trailer--and awkwardly kind-of not using them, e.g. no shared code
      checking checksums. (BlockFetcher compression type was populated
      incorrectly.) Now we only read what is needed.
      * Minimized code duplication and differing/incompatible/awkward
      abstractions in meta_blocks.{cc,h} (e.g. SeekTo in metaindex block
      without parsing block handle)
      * Moved some meta block handling code from table_properties*.*
      * Moved some code specific to block-based table from shared table/ code
      to BlockBasedTable class. The checksum stuff means we can't completely
      separate it, but things that don't need to be in shared table/ code
      should not be.
      * Use unique_ptr rather than raw ptr in more places. (Note: you can
      std::move from unique_ptr to shared_ptr.)
      
      Without enhancements to GetPropertiesOfAllTablesTest (see below),
      net reduction of roughly 100 lines of code.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9163
      
      Test Plan:
      existing tests and
      * Enhanced DBTablePropertiesTest.GetPropertiesOfAllTablesTest to verify that
      checksums are now checked on direct read of table properties by TableCache
      (new test would fail before this change)
      * Also enhanced DBTablePropertiesTest.GetPropertiesOfAllTablesTest to test
      putting table properties under old meta name
      * Also generally enhanced that same test to actually test what it was
      supposed to be testing already, by kicking things out of table cache when
      we don't want them there.
      
      Reviewed By: ajkr, mrambacher
      
      Differential Revision: D32514757
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 507964b9311d186ae8d1131182290cbd97a99fa9
      230660be
    • Z
      Fix the analyzer test failure caused by inaccurate timing wait (#9181) · f4294669
      Zhichao Cao 提交于
      Summary:
      Fix the analyzer test failure caused by inaccurate timing wait. The wait time at different system might be different or cause the delay, now we do not accurately count the lines. Only in a very rare extreme case, test will ignore the part exceed the timing of 1 second.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9181
      
      Test Plan: make check
      
      Reviewed By: pdillinger
      
      Differential Revision: D32511319
      
      Pulled By: zhichao-cao
      
      fbshipit-source-id: e694c8cb465c750cfa5a43dab3eff6707b9a11c8
      f4294669
    • H
      Account Bloom/Ribbon filter construction memory in global memory limit (#9073) · 74544d58
      Hui Xiao 提交于
      Summary:
      Note: This PR is the 4th part of a bigger PR stack (https://github.com/facebook/rocksdb/pull/9073) and will rebase/merge only after the first three PRs (https://github.com/facebook/rocksdb/pull/9070, https://github.com/facebook/rocksdb/pull/9071, https://github.com/facebook/rocksdb/pull/9130) merge.
      
      **Context:**
      Similar to https://github.com/facebook/rocksdb/pull/8428, this PR is to track memory usage during (new) Bloom Filter (i.e,FastLocalBloom) and Ribbon Filter (i.e, Ribbon128) construction, moving toward the goal of [single global memory limit using block cache capacity](https://github.com/facebook/rocksdb/wiki/Projects-Being-Developed#improving-memory-efficiency). It also constrains the size of the banding portion of Ribbon Filter during construction by falling back to Bloom Filter if that banding is, at some point, larger than the available space in the cache under `LRUCacheOptions::strict_capacity_limit=true`.
      
      The option to turn on this feature is `BlockBasedTableOptions::reserve_table_builder_memory = true` which by default is set to `false`. We [decided](https://github.com/facebook/rocksdb/pull/9073#discussion_r741548409) not to have separate option for separate memory user in table building therefore their memory accounting are all bundled under one general option.
      
      **Summary:**
      - Reserved/released cache for creation/destruction of three main memory users with the passed-in `FilterBuildingContext::cache_res_mgr` during filter construction:
         - hash entries (i.e`hash_entries`.size(), we bucket-charge hash entries during insertion for performance),
         - banding (Ribbon Filter only, `bytes_coeff_rows` +`bytes_result_rows` + `bytes_backtrack`),
         - final filter (i.e, `mutable_buf`'s size).
            - Implementation details: in order to use `CacheReservationManager::CacheReservationHandle` to account final filter's memory, we have to store the `CacheReservationManager` object and `CacheReservationHandle` for final filter in `XXPH3BitsFilterBuilder` as well as  explicitly delete the filter bits builder when done with the final filter in block based table.
      - Added option fo run `filter_bench` with this memory reservation feature
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9073
      
      Test Plan:
      - Added new tests in `db_bloom_filter_test` to verify filter construction peak cache reservation under combination of  `BlockBasedTable::Rep::FilterType` (e.g, `kFullFilter`, `kPartitionedFilter`), `BloomFilterPolicy::Mode`(e.g, `kFastLocalBloom`, `kStandard128Ribbon`, `kDeprecatedBlock`) and `BlockBasedTableOptions::reserve_table_builder_memory`
        - To address the concern for slow test: tests with memory reservation under `kFullFilter` + `kStandard128Ribbon` and `kPartitionedFilter` take around **3000 - 6000 ms** and others take around **1500 - 2000 ms**, in total adding **20000 - 25000 ms** to the test suit running locally
      - Added new test in `bloom_test` to verify Ribbon Filter fallback on large banding in FullFilter
      - Added test in `filter_bench` to verify that this feature does not significantly slow down Bloom/Ribbon Filter construction speed. Local result averaged over **20** run as below:
         - FastLocalBloom
            - baseline `./filter_bench -impl=2 -quick -runs 20 | grep 'Build avg'`:
               - **Build avg ns/key: 29.56295** (DEBUG_LEVEL=1), **29.98153** (DEBUG_LEVEL=0)
            - new feature (expected to be similar as above)`./filter_bench -impl=2 -quick -runs 20 -reserve_table_builder_memory=true | grep 'Build avg'`:
               - **Build avg ns/key: 30.99046** (DEBUG_LEVEL=1), **30.48867** (DEBUG_LEVEL=0)
            - new feature of RibbonFilter with fallback  (expected to be similar as above) `./filter_bench -impl=2 -quick -runs 20 -reserve_table_builder_memory=true -strict_capacity_limit=true | grep 'Build avg'` :
               - **Build avg ns/key: 31.146975** (DEBUG_LEVEL=1), **30.08165** (DEBUG_LEVEL=0)
      
          - Ribbon128
             - baseline `./filter_bench -impl=3 -quick -runs 20 | grep 'Build avg'`:
                 - **Build avg ns/key: 129.17585** (DEBUG_LEVEL=1), **130.5225** (DEBUG_LEVEL=0)
             - new feature  (expected to be similar as above) `./filter_bench -impl=3 -quick -runs 20 -reserve_table_builder_memory=true | grep 'Build avg' `:
                 - **Build avg ns/key: 131.61645** (DEBUG_LEVEL=1), **132.98075** (DEBUG_LEVEL=0)
             - new feature of RibbonFilter with fallback (expected to be a lot faster than above due to fallback) `./filter_bench -impl=3 -quick -runs 20 -reserve_table_builder_memory=true -strict_capacity_limit=true | grep 'Build avg'` :
                - **Build avg ns/key: 52.032965** (DEBUG_LEVEL=1), **52.597825** (DEBUG_LEVEL=0)
                - And the warning message of `"Cache reservation for Ribbon filter banding failed due to cache full"` is indeed logged to console.
      
      Reviewed By: pdillinger
      
      Differential Revision: D31991348
      
      Pulled By: hx235
      
      fbshipit-source-id: 9336b2c60f44d530063da518ceaf56dac5f9df8e
      74544d58
  9. 18 11月, 2021 2 次提交
    • P
      Don't allow parallel crash_test in Makefile (#9180) · 4f678b52
      Peter Dillinger 提交于
      Summary:
      Using deps for running blackbox and whitebox allows them to be
      parallelized, which doesn't seem to be working well.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9180
      
      Test Plan: make -j24 crash_test
      
      Reviewed By: siying
      
      Differential Revision: D32500851
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 364288c8d023b93e7ca2724ea40edae2f4eb0407
      4f678b52
    • D
      Fix integer overflow in TraceOptions (#9157) · c9539ede
      Davide Angelocola 提交于
      Summary:
      Hello from a happy user of rocksdb java :-)
      
      Default constructor of TraceOptions is supposed to initialize size to 64GB but the expression contains an integer overflow.
      
      Simple test case with JShell:
      ```
      jshell> 64 * 1024 * 1024 * 1024
      $1 ==> 0
      
      jshell> 64L * 1024 * 1024 * 1024
      $2 ==> 68719476736
      ```
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9157
      
      Reviewed By: pdillinger, zhichao-cao
      
      Differential Revision: D32369273
      
      Pulled By: mrambacher
      
      fbshipit-source-id: 6a0c95fff7a91f27ff15d65b662c6b101756b450
      c9539ede
  10. 17 11月, 2021 4 次提交