1. 16 Jul, 2021 5 commits
    • The formal parameter types of CompressionOptions constructor should b… (#8510) · b678cb1f
      zaorangyang committed
      Summary:
      …e consistent with the member variables'
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8510
      
      Reviewed By: ajkr
      
      Differential Revision: D29654067
      
      Pulled By: mrambacher
      
      fbshipit-source-id: 908baaddfb20c266db7c5aca6a87971393d62ee6
    • Mempurge support for wal (#8528) · 206845c0
      Baptiste Lemaire committed
      Summary:
      In this PR, `mempurge` is made compatible with the Write Ahead Log: in case of recovery, the DB is now capable of recovering the data that was "mempurged" and kept in the `imm()` list of immutable memtables.
      The twist was to add a uint64_t to the `memtable` struct to store the number of the earliest log file containing entries from the `memtable`. When a `Flush` operation is replaced with a `MemPurge`, the `VersionEdit` (which usually contains the new min log file number to pick up for recovery and the level 0 file path of the newly created SST file) is no longer appended to the manifest log, and every time the `deleteWal` method is called, a check is made on the list of immutable memtables.
      This PR also includes a unit test that verifies that no data is lost upon Reopening of the database when the mempurge feature is activated. This extensive unit test includes two column families, with valid data contained in the imm() at time of "crash"/reopening (recovery).
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8528
      
      Reviewed By: pdillinger
      
      Differential Revision: D29701097
      
      Pulled By: bjlemaire
      
      fbshipit-source-id: 072a900fb6ccc1edcf5eef6caf88f3060238edf9
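The WAL bookkeeping described in the mempurge commit above can be illustrated with a minimal sketch. It assumes each memtable records the number of the earliest log file holding its entries, as the commit message says; the type and function names (`MemTableSketch`, `MinLogNumberToKeep`, `CanDeleteWal`) are hypothetical and are not RocksDB's API.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a memtable; only the field relevant here.
struct MemTableSketch {
  // Number of the earliest WAL file that contains entries of this memtable.
  uint64_t earliest_log_number;
};

// Returns the smallest WAL number still needed for recovery: the minimum over
// the active log and every immutable memtable that has not been flushed.
inline uint64_t MinLogNumberToKeep(const std::vector<MemTableSketch>& imm,
                                   uint64_t active_log_number) {
  uint64_t min_log = active_log_number;
  for (const MemTableSketch& m : imm) {
    min_log = std::min(min_log, m.earliest_log_number);
  }
  return min_log;
}

// A WAL file may be deleted only if no unflushed memtable depends on it.
inline bool CanDeleteWal(uint64_t log_number,
                         const std::vector<MemTableSketch>& imm,
                         uint64_t active_log_number) {
  return log_number < MinLogNumberToKeep(imm, active_log_number);
}
```

With immutable memtables whose earliest logs are 7 and 9 and an active log 12, this sketch keeps every log numbered 7 or higher, which is the kind of check the commit describes making on the immutable memtable list before deleting a WAL.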
    • Replace the namespace "rocksdb" to "ROCKSDB_NAMESPACE" (#8531) · 4e4ec169
      longlijian committed
      Summary:
      For more details, see https://github.com/facebook/rocksdb/issues/6433
      (https://github.com/facebook/rocksdb/pull/6433)
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8531
      
      Reviewed By: siying
      
      Differential Revision: D29717057
      
      Pulled By: ajkr
      
      fbshipit-source-id: 3ccad9501e5612590e54a7cf8c447118f323c7f4
    • Work around falsely reported data race on LRUHandle::flags (#8539) · 5ad32276
      Peter Dillinger committed
      Summary:
      Some bits are mutated and read while holding a lock, other
      immutable bits (esp. secondary cache compatibility) can be read by
      arbitrary threads without holding a lock. AFAIK, this doesn't cause an
      issue on any architecture we care about, because you will get some
      legitimate version of the value that includes the initialization, as
      long as synchronization guarantees the initialization happens before the
      read.
      
      I've only seen this in https://github.com/facebook/rocksdb/issues/8538 so far, but it should be fixed regardless.
      Otherwise, we'll surely get these false reports again some time.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8539
      
      Test Plan: some local TSAN test runs and in CircleCI
      
      Reviewed By: zhichao-cao
      
      Differential Revision: D29720262
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 365fd7e565577c648815161f71b339bcb5ce12d5
    • Add missing steps for cmake build (#8524) · 31193a73
      Jay Zhuang committed
      Summary:
      Some cmake and test configuration is set in pre-step
      environment variables. Add the missing steps.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8524
      
      Test Plan: CI pass
      
      Reviewed By: siying
      
      Differential Revision: D29682731
      
      Pulled By: jay-zhuang
      
      fbshipit-source-id: afda1acf6a7b76989db450442b0b27f387388b9d
  2. 15 Jul, 2021 1 commit
  3. 14 Jul, 2021 2 commits
  4. 13 Jul, 2021 7 commits
  5. 12 Jul, 2021 2 commits
    • Correct CVS -> CSV typo (#8513) · 5afd1e30
      Adam Retter committed
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/8513
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D29654066
      
      Pulled By: mrambacher
      
      fbshipit-source-id: b8f492fe21edd37fe1f1c5a4a0e9153f58bbf3e2
    • Avoid passing existing BG error to WriteStatusCheck (#8511) · d1b70b05
      anand76 committed
      Summary:
      In `DBImpl::WriteImpl()`, we call `PreprocessWrite()` which, among other things, checks the BG error and returns it if set. This return status is later passed to `WriteStatusCheck()`, which calls `SetBGError()`. This results in a spurious call, and info logs, on every user write request. We should avoid passing the `PreprocessWrite()` return status to `WriteStatusCheck()`, as the former would have called `SetBGError()` already if it encountered any new errors, such as an error when creating a new WAL file.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8511
      
      Test Plan: Run existing tests
      
      Reviewed By: zhichao-cao
      
      Differential Revision: D29639917
      
      Pulled By: anand1976
      
      fbshipit-source-id: 19234163969e1645dbeb273712aaf5cd9ea2b182
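The control flow described in the BG error commit above can be modeled with a small, self-contained sketch. Every type and function here (`StatusSketch`, `ErrorHandlerSketch`, `WriteSketch`, and so on) is a hypothetical stand-in that only borrows names from the commit message; it is not RocksDB's implementation. The point it shows is that an already recorded background error is returned to the caller without being reported a second time.

```cpp
#include <string>

// Minimal status type for this sketch (not RocksDB's Status class).
struct StatusSketch {
  bool ok;
  std::string msg;
};

// Hypothetical stand-in for the error handler; counts SetBGError() calls.
struct ErrorHandlerSketch {
  StatusSketch bg_error{true, ""};
  int set_bg_error_calls = 0;

  void SetBGError(const StatusSketch& s) {
    ++set_bg_error_calls;
    bg_error = s;
  }
};

// Models PreprocessWrite(): returns the existing BG error, if any.
inline StatusSketch PreprocessWriteSketch(const ErrorHandlerSketch& h) {
  return h.bg_error;
}

// Models WriteStatusCheck(): reports a failed status as a BG error.
inline void WriteStatusCheckSketch(ErrorHandlerSketch& h,
                                   const StatusSketch& s) {
  if (!s.ok) h.SetBGError(s);
}

// Models the write path after the fix: only the status of the write itself
// reaches WriteStatusCheckSketch, never the preprocess status.
inline StatusSketch WriteSketch(ErrorHandlerSketch& h,
                                const StatusSketch& io_status) {
  StatusSketch pre = PreprocessWriteSketch(h);
  if (!pre.ok) return pre;  // existing BG error: return it, do not re-report
  WriteStatusCheckSketch(h, io_status);
  return io_status;
}
```

In this model, a write issued while a background error is already set fails with that error, and the `SetBGError()` call count stays unchanged.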
  6. 10 Jul, 2021 2 commits
    • Make mempurge a background process (equivalent to in-memory compaction). (#8505) · 837705ad
      Baptiste Lemaire committed
      Summary:
      In https://github.com/facebook/rocksdb/issues/8454, I introduced a new process baptized `MemPurge` (memtable garbage collection). This new PR is built upon this past mempurge prototype.
      In this PR, I made the `mempurge` process a background task, which provides superior performance since the mempurge process no longer holds the db_mutex, and addresses severe restrictions from the past iteration (including a scenario where the past mempurge was failing, when a memtable was mempurged but was still referred to by an iterator/snapshot/...).
      Now the mempurge process resembles an in-memory compaction process: the stack of immutable memtables is filtered out, and the useful payload is used to populate an output memtable. If the output memtable is filled to more than 60% of its capacity (arbitrary heuristic), the mempurge process is aborted and a regular flush process takes place; otherwise the output memtable is kept in the immutable memtable stack. Note that adding this output memtable to the `imm()` memtable stack does not trigger another flush process, so that the flush thread can go to sleep at the end of a successful mempurge.
      MemPurge is activated by making the `experimental_allow_mempurge` flag `true`. When activated, the `MemPurge` process will always happen when the flush reason is `kWriteBufferFull`.
      The 3 unit tests confirm that this process supports `Put`, `Get`, `Delete`, `DeleteRange` operators and is compatible with `Iterators` and `CompactionFilters`.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8505
      
      Reviewed By: pdillinger
      
      Differential Revision: D29619283
      
      Pulled By: bjlemaire
      
      fbshipit-source-id: 8a99bee76b63a8211bff1a00e0ae32360aaece95
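The 60% rule from the background mempurge commit above can be written as a short decision function. This is a sketch under stated assumptions: the names `MemPurgeOutcome`, `kMaxFillRatio`, and `DecideAfterMemPurge` are hypothetical, and treating an empty capacity as a fallback to flush is a choice made for this example, not something the commit states.

```cpp
#include <cstddef>

// Possible outcomes once the output memtable has been populated.
enum class MemPurgeOutcome { kKeepInImm, kFallBackToFlush };

// 60% is the "arbitrary heuristic" quoted in the commit message.
constexpr double kMaxFillRatio = 0.60;

// Decides whether the purged output memtable stays in the immutable list or
// whether the mempurge is aborted in favor of a regular flush.
inline MemPurgeOutcome DecideAfterMemPurge(std::size_t output_bytes,
                                           std::size_t capacity_bytes) {
  if (capacity_bytes == 0) {
    return MemPurgeOutcome::kFallBackToFlush;  // assumption: nothing to reuse
  }
  const double fill = static_cast<double>(output_bytes) /
                      static_cast<double>(capacity_bytes);
  return fill > kMaxFillRatio ? MemPurgeOutcome::kFallBackToFlush
                              : MemPurgeOutcome::kKeepInImm;
}
```

An output memtable at half capacity is kept, while one at three quarters of capacity triggers the regular flush path.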
    • Add ribbon filter to C API (#8486) · bb485e98
      qieqieplus committed
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/8486
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D29625501
      
      Pulled By: pdillinger
      
      fbshipit-source-id: e6e2a455ae62a71f3a202278a751b9bba17ad03c
  7. 09 Jul, 2021 3 commits
  8. 08 Jul, 2021 2 commits
    • FaultInjectionTestFS::DeleteFilesCreatedAfterLastDirSync() to recover… (#8501) · b1a53db3
      sdong committed
      Summary:
      … small overwritten files.
      If a file is overwritten by a rename and the parent directory is not synced, FaultInjectionTestFS::DeleteFilesCreatedAfterLastDirSync() will delete the file. However, RocksDB relies on file renaming being atomic regardless of whether the parent directory is synced, and the current behavior breaks that assumption and caused some false positives: https://github.com/facebook/rocksdb/pull/8489
      
      Since atomic renaming is used for CURRENT files, we fix the problem by recovering the state of the overwritten file in FaultInjectionTestFS::DeleteFilesCreatedAfterLastDirSync() if the file is small.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8501
      
      Test Plan: Run stress test for a while and see it doesn't break.
      
      Reviewed By: anand1976
      
      Differential Revision: D29594384
      
      fbshipit-source-id: 589b5c2f0a9d2aca53752d7bdb0231efa5b3ae92
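The recovery behavior described in the fault injection commit above can be sketched with an in-memory model. Everything here is an assumption made for illustration: the class `FaultInjectionFsSketch`, its methods other than the name `DeleteFilesCreatedAfterLastDirSync`, and the 1024-byte cutoff for a "small" file are hypothetical and do not reflect the actual FaultInjectionTestFS code.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical cutoff for what counts as a "small" file in this sketch.
constexpr std::size_t kSmallFileBytes = 1024;

class FaultInjectionFsSketch {
 public:
  void CreateFile(const std::string& name, const std::string& data) {
    files_[name] = data;
    unsynced_.emplace(name, Saved{false, ""});  // no prior content to restore
  }

  // Renames src over dst, remembering dst's old content if it was small.
  void Rename(const std::string& src, const std::string& dst) {
    auto src_it = files_.find(src);
    if (src_it == files_.end() || src == dst) return;
    Saved saved{false, ""};
    auto dst_it = files_.find(dst);
    if (dst_it != files_.end() && dst_it->second.size() <= kSmallFileBytes) {
      saved = Saved{true, dst_it->second};
    }
    unsynced_.emplace(dst, saved);  // keeps the oldest saved state, if any
    files_[dst] = src_it->second;
    files_.erase(src);
    unsynced_.erase(src);
  }

  void SyncDir() { unsynced_.clear(); }

  // Simulates a crash: unsynced files vanish, except that a small file that
  // was overwritten by a rename gets its previous content back.
  void DeleteFilesCreatedAfterLastDirSync() {
    for (const auto& entry : unsynced_) {
      if (entry.second.had_old) {
        files_[entry.first] = entry.second.old_data;
      } else {
        files_.erase(entry.first);
      }
    }
    unsynced_.clear();
  }

  bool Exists(const std::string& name) const { return files_.count(name) > 0; }

  std::string Read(const std::string& name) const {
    auto it = files_.find(name);
    return it == files_.end() ? std::string() : it->second;
  }

 private:
  struct Saved {
    bool had_old;
    std::string old_data;
  };
  std::map<std::string, std::string> files_;
  std::map<std::string, Saved> unsynced_;
};
```

In this model, renaming a temporary file over a synced CURRENT file and then simulating a crash leaves CURRENT with its earlier content instead of deleting it.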
    • Move slow valgrind tests behind -DROCKSDB_FULL_VALGRIND_RUN (#8475) · ed8eb436
      Andrew Kryczka committed
      Summary:
      Various tests had disabled valgrind due to it slowing down and timing
      out (as is the case right now) the CI runs. Where a test was disabled with no comment,
      I assumed slowness was the cause. For these tests that were slow under
      valgrind, as well as the ones identified in https://github.com/facebook/rocksdb/issues/8352, this PR moves them
      behind the compiler flag `-DROCKSDB_FULL_VALGRIND_RUN`.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8475
      
      Test Plan: running `make full_valgrind_test`, `make valgrind_test`, `make check`; will verify they appear working correctly
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D29504843
      
      Pulled By: ajkr
      
      fbshipit-source-id: 2aac90749cfbd30d5ce11cb29a07a1b9314eeea7
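Gating a slow test behind a compiler flag, as the valgrind commit above describes, might look like the following sketch. `ROCKSDB_FULL_VALGRIND_RUN` comes from the commit message; `ROCKSDB_VALGRIND_RUN` is assumed here to be the macro that marks an ordinary valgrind build, and `kRunSlowTests` and `SlowTestSketch` are hypothetical names.

```cpp
// ROCKSDB_FULL_VALGRIND_RUN comes from the commit message above.
// ROCKSDB_VALGRIND_RUN is an assumption: a macro taken to mark valgrind builds.
#if defined(ROCKSDB_VALGRIND_RUN) && !defined(ROCKSDB_FULL_VALGRIND_RUN)
constexpr bool kRunSlowTests = false;  // ordinary valgrind run: skip slow tests
#else
constexpr bool kRunSlowTests = true;  // non-valgrind or full valgrind run
#endif

// Hypothetical slow test body; returns the number of iterations executed.
inline int SlowTestSketch() {
  if (!kRunSlowTests) {
    return 0;  // skipped under an ordinary valgrind run
  }
  int iterations = 0;
  for (int i = 0; i < 1000; ++i) {
    ++iterations;
  }
  return iterations;
}
```

Compiled with neither macro defined, the slow path runs; adding only `-DROCKSDB_VALGRIND_RUN` would skip it, and adding `-DROCKSDB_FULL_VALGRIND_RUN` as well would enable it again.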
  9. 07 Jul, 2021 6 commits
  10. 02 Jul, 2021 6 commits
    • Memtable "MemPurge" prototype (#8454) · 9dc887ec
      Baptiste Lemaire committed
      Summary:
      Implement an experimental feature called "MemPurge", which consists of purging "garbage" bytes out of a memtable and reusing the memtable struct instead of making it immutable and eventually flushing its content to storage.
      The prototype is deactivated by default and is not intended for use. It is intended for correctness and validation testing. At the moment, the "MemPurge" feature can be switched on by using the `options.experimental_allow_mempurge` flag. For this early stage, when the allow_mempurge flag is set to `true`, all the flush operations will be rerouted to perform a MemPurge. This is a temporary design decision that will give us the time to explore meaningful heuristics to use MemPurge at the right time for relevant workloads. Moreover, the current MemPurge operation only supports `Puts`, `Deletes`, `DeleteRange` operations, and handles `Iterators` as well as `CompactionFilter`s that are invoked at flush time.
      Three unit tests are added to `db_flush_test.cc` to test if MemPurge works correctly (and check that the previously mentioned operations are fully supported and thoroughly tested).
      One noticeable design decision is the timing of the MemPurge operation in the memtable workflow: for this prototype, the mempurge happens when the memtable is switched (and usually made immutable). This is an inefficient process because it implies that the entirety of the MemPurge operation happens while holding the db_mutex. Future commits will make the MemPurge operation a background task (akin to the regular flush operation) and aim at drastically enhancing the performance of this operation. The MemPurge is also not fully "WAL-compatible" yet, but when the WAL is full, or when the regular MemPurge operation fails (or when the purged memtable still needs to be flushed), a regular flush operation takes place. Later commits will also correct these behaviors.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8454
      
      Reviewed By: anand1976
      
      Differential Revision: D29433971
      
      Pulled By: bjlemaire
      
      fbshipit-source-id: 6af48213554e35048a7e03816955100a80a26dc5
    • Call OnCompactionCompleted API in case of DisableManualCompaction (#8469) · c76778e2
      Akanksha Mahajan committed
      Summary:
      Call OnCompactionCompleted API in case of
      DisableManualCompaction() with updated Status::Incomplete
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8469
      
      Reviewed By: ajkr
      
      Differential Revision: D29475517
      
      Pulled By: akankshamahajan15
      
      fbshipit-source-id: a1726c5e6ee18c0b5097ea04f5e6975fbe108055
    • Add -report_open_timing to db_bench (#8464) · b2073770
      Peter (Stig) Edwards committed
      Summary:
      Hello and thanks for RocksDB,
      
      This PR adds support for ```-report_open_timing true``` to ```db_bench```.
      It can be useful when tuning RocksDB on filesystem/env with high latencies for file level operations (create/delete/rename...) seen during ```((Optimistic)Transaction)DB::Open```.
      
      Some examples:
      
      ```
      > db_bench -benchmarks updaterandom -num 1 -db /dev/shm/db_bench
      > db_bench -benchmarks updaterandom -num 0 -db /dev/shm/db_bench -use_existing_db true -report_open_timing true -readonly true 2>&1 | grep OpenDb
      OpenDb:     3.90133 milliseconds
      > db_bench -benchmarks updaterandom -num 0 -db /dev/shm/db_bench -use_existing_db true -report_open_timing true -use_secondary_db true 2>&1 | grep OpenDb
      OpenDb:     3.33414 milliseconds
      > db_bench -benchmarks updaterandom -num 0 -db /dev/shm/db_bench -use_existing_db true -report_open_timing true 2>&1 | grep -A1 OpenDb
      OpenDb:     6.05423 milliseconds
      
      > db_bench -benchmarks updaterandom -num 1
      > db_bench -benchmarks updaterandom -num 0 -use_existing_db true -report_open_timing true -readonly true 2>&1 | grep OpenDb
      OpenDb:     4.06859 milliseconds
      > db_bench -benchmarks updaterandom -num 0 -use_existing_db true -report_open_timing true -use_secondary_db true 2>&1 | grep OpenDb
      OpenDb:     2.85794 milliseconds
      > db_bench -benchmarks updaterandom -num 0 -use_existing_db true -report_open_timing true 2>&1 | grep OpenDb
      OpenDb:     6.46376 milliseconds
      
      > db_bench -benchmarks updaterandom -num 1 -db /clustered_fs/db_bench
      > db_bench -benchmarks updaterandom -num 0 -db /clustered_fs/db_bench -use_existing_db true -report_open_timing true -readonly true 2>&1 | grep OpenDb
      OpenDb:     3.79805 milliseconds
      > db_bench -benchmarks updaterandom -num 0 -db /clustered_fs/db_bench -use_existing_db true -report_open_timing true -use_secondary_db true 2>&1 | grep OpenDb
      OpenDb:     3.00174 milliseconds
      > db_bench -benchmarks updaterandom -num 0 -db /clustered_fs/db_bench -use_existing_db true -report_open_timing true 2>&1 | grep OpenDb
      OpenDb:     24.8732 milliseconds
      ```
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8464
      
      Reviewed By: hx235
      
      Differential Revision: D29398096
      
      Pulled By: zhichao-cao
      
      fbshipit-source-id: 8f05dc3284f084612a3f30234e39e1c37548f50c
    • Inject fatal write failures to db_stress when DB is running (#8479) · a95a776d
      Zhichao Cao committed
      Summary:
      Add the injest_error_severity option to control whether the injected error is a retryable IO error or a fatal (unrecoverable) error. A flag indicates whether a fatal error has occurred: when one does, the flag is set and the DB is stopped (but not corrupted).
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8479
      
      Test Plan: run  ./db_stress --reopen=0 --read_fault_one_in=1000 --write_fault_one_in=5 --disable_wal=true --write_buffer_size=3000000 -writepercent=5 -readpercent=50 --injest_error_severity=2 --column_families=1, make check
      
      Reviewed By: anand1976
      
      Differential Revision: D29524271
      
      Pulled By: zhichao-cao
      
      fbshipit-source-id: 1aa9fb9b5655b0adba6f5ad12005ca8c074c795b
    • Enable crash test to run using fbcode components (#8471) · 41d32152
      anand76 committed
      Summary:
      Add a new test ```fbcode_crash_test``` to rocksdb-lego-determinator. This test allows the crash test to be run on Facebook Sandcastle infra using fbcode components. Also use the default Env in db_stress to access the expected values path as it requires a memory mapped file and may not work with custom Envs.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8471
      
      Reviewed By: ajkr
      
      Differential Revision: D29474722
      
      Pulled By: anand1976
      
      fbshipit-source-id: 7d086d82dd7091ae48e08cb4ace763ce3e3b87ef
    • Fix TSAN issue (#8477) · d45b8377
      mrambacher committed
      Summary:
      Added mutex to fix TSAN issue
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8477
      
      Reviewed By: zhichao-cao
      
      Differential Revision: D29517053
      
      Pulled By: mrambacher
      
      fbshipit-source-id: 661ccb1f495b7d34874a79e0a3d7aea1123d6047
  11. 01 Jul, 2021 3 commits
    • Stress Test to inject write failures in reopen (#8474) · ba224b75
      sdong committed
      Summary:
      Previously, the stress test could inject metadata write failures when reopening a DB. We extend it to file appends too, in the same way.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8474
      
      Test Plan: manually run crash test with various settings and make sure the failures are triggered as expected.
      
      Reviewed By: zhichao-cao
      
      Differential Revision: D29503116
      
      fbshipit-source-id: e73a446e80ccbd09301a579280e56ff949381fab
    • Fix PrepareOptions for Customizable Classes (#8468) · 41c4b665
      mrambacher committed
      Summary:
      Added the Customizable::ConfigureNewObject method.  The method will configure the object if options are found and invoke PrepareOptions if the flag is set properly.
      
      Added tests to test that PrepareOptions is properly called and to test if PrepareOptions fails.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8468
      
      Reviewed By: zhichao-cao
      
      Differential Revision: D29494703
      
      Pulled By: mrambacher
      
      fbshipit-source-id: d5767dee5d7a98620ac66190262101cd0aa9d2b7
    • Fix assertion failure when releasing a handle after secondary cache lookup fails (#8470) · a0cbb694
      anand76 committed
      Summary:
      When the secondary cache lookup fails, we may still allocate a handle and charge the cache for metadata usage. If the cache is full, this can cause the usage to go over capacity. Later, when an (unrelated) handle is released, it trips up an assertion that checks that usage is less than capacity. To prevent this assertion failure, don't charge the cache for a failed secondary cache lookup.
      
      Tests:
      Run crash_test
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8470
      
      Reviewed By: zhichao-cao
      
      Differential Revision: D29474713
      
      Pulled By: anand1976
      
      fbshipit-source-id: 27191969c95470a7b070d292b458efce71395bf2
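The accounting rule from the secondary cache commit above can be shown with a toy usage counter. This is only a sketch: `CacheUsageSketch` and its methods are hypothetical names, eviction is left out entirely, and the assertion is written as usage not exceeding capacity, which is an assumption about the exact check.

```cpp
#include <cassert>
#include <cstddef>

// Toy model of cache charge accounting; a real cache would also evict
// entries to make room, which is omitted here.
class CacheUsageSketch {
 public:
  explicit CacheUsageSketch(std::size_t capacity) : capacity_(capacity) {}

  // Charges the cache if the entry fits; returns false otherwise.
  bool Insert(std::size_t charge) {
    if (usage_ + charge > capacity_) return false;
    usage_ += charge;
    return true;
  }

  // Models a primary-cache miss that falls through to the secondary cache.
  // Handle metadata is charged only when the secondary lookup succeeds.
  bool LookupViaSecondary(bool secondary_hit, std::size_t metadata_charge) {
    if (!secondary_hit) return false;  // failed lookup: charge nothing
    usage_ += metadata_charge;
    return true;
  }

  // Releases a previously charged entry and checks the usage invariant.
  void Release(std::size_t charge) {
    assert(usage_ <= capacity_);
    assert(charge <= usage_);
    usage_ -= charge;
  }

  std::size_t usage() const { return usage_; }

 private:
  std::size_t capacity_;
  std::size_t usage_ = 0;
};
```

With a full cache, a failed secondary lookup leaves the usage untouched, so releasing an unrelated handle afterwards does not trip the invariant check.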
  12. 30 Jun, 2021 1 commit