1. 08 Mar, 2022 · 5 commits
  2. 05 Mar, 2022 · 5 commits
    • Adding Social Banner in Support of Ukraine (#9652) · f20b6747
      Committed by Dmitry Vinnik
      Summary:
      Our mission at [Meta Open Source](https://opensource.facebook.com/) is to empower communities through open source, and we believe that it means building a welcoming and safe environment for all. As a part of this work, we are adding this banner in support of Ukraine during this crisis.
      
      ## Testing
      <img width="1080" alt="image" src="https://user-images.githubusercontent.com/12485205/156454047-9c153135-f3a6-41f7-adbe-8139759565ae.png">
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9652
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D34647211
      
      Pulled By: dmitryvinn-fb
      
      fbshipit-source-id: b89cdc7eafcc58b1f503ee8e1939e43bffcb3b3f
    • Test refactoring for Backups+Temperatures (#9655) · ce60d0cb
      Committed by Peter Dillinger
      Summary:
      In preparation for more support for file Temperatures in BackupEngine,
      this change does some test refactoring:
      * Move DBTest2::BackupFileTemperature test to
      BackupEngineTest::FileTemperatures, with some updates to make it work
      in the new home. This test will soon be expanded for deeper backup work.
      * Move FileTemperatureTestFS from db_test2.cc to db_test_util.h so it can
      be shared with the moved test, splitting off the "no link" part into the
      one test that needs it.
      * Use custom FileSystems in backupable_db_test rather than custom Envs,
      because going through Env file interfaces doesn't support temperatures.
      * Fix RemapFileSystem to map DirFsyncOptions::renamed_new_name
      parameter to FsyncWithDirOptions, which was required because this
      limitation caused a crash only after moving to higher fidelity of
      FileSystem interface (vs. LegacyDirectoryWrapper throwing away some
      parameter details)
      * `backupable_options_` -> `engine_options_` as part of the ongoing
      work to get rid of the obsolete "backupable" naming.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9655
      
      Test Plan: test code updates only
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D34622183
      
      Pulled By: pdillinger
      
      fbshipit-source-id: f24b7a596a89b9e089e960f4e5d772575513e93f
    • Attempt to deflake DBLogicalBlockSizeCacheTest.CreateColumnFamilies (#9516) · fc61e98a
      Committed by Hui Xiao
      Summary:
      **Context:**
      `DBLogicalBlockSizeCacheTest.CreateColumnFamilies` is flaky, occasionally failing with the assertion below
      ```
      db/db_logical_block_size_cache_test.cc:210
      Expected equality of these values:
        1
        cache_->GetRefCount(cf_path_0_)
          Which is: 2
      ```
      
      Root cause: `ASSERT_OK(db->DestroyColumnFamilyHandle(cfs[0]));` in the test may not actually decrease the ref count of `cf_path_0_`. The decrement only happens during clean-up of the `ColumnFamilyData`, which is deferred while anything still references it, and that can be the case when `db->DestroyColumnFamilyHandle(cfs[0])` is called because background work such as `DumpStats()` may be holding a reference to that `ColumnFamilyData` (suggested and reproduced by ajkr). The same applies to `ASSERT_OK(db->DestroyColumnFamilyHandle(cfs[1]));`.
      
      See following for a deterministic repro:
      ```
       diff --git a/db/db_impl/db_impl.cc b/db/db_impl/db_impl.cc
      index 196b428a3..4e7a834c4 100644
       --- a/db/db_impl/db_impl.cc
      +++ b/db/db_impl/db_impl.cc
      @@ -956,10 +956,16 @@ void DBImpl::DumpStats() {
               // near-atomically.
               // Get a ref before unlocking
               cfd->Ref();
      +        if (cfd->GetName() == "cf1" || cfd->GetName() == "cf2") {
      +          TEST_SYNC_POINT("DBImpl::DumpStats:PostCFDRef");
      +        }
               {
                 InstrumentedMutexUnlock u(&mutex_);
                 cfd->internal_stats()->CollectCacheEntryStats(/*foreground=*/false);
               }
      +        if (cfd->GetName() == "cf1" || cfd->GetName() == "cf2") {
      +          TEST_SYNC_POINT("DBImpl::DumpStats::PreCFDUnrefAndTryDelete");
      +        }
               cfd->UnrefAndTryDelete();
             }
           }
       diff --git a/db/db_logical_block_size_cache_test.cc b/db/db_logical_block_size_cache_test.cc
      index 1057871c9..c3872c036 100644
       --- a/db/db_logical_block_size_cache_test.cc
      +++ b/db/db_logical_block_size_cache_test.cc
      @@ -9,6 +9,7 @@
       #include "env/io_posix.h"
       #include "rocksdb/db.h"
       #include "rocksdb/env.h"
      +#include "test_util/sync_point.h"
      
       namespace ROCKSDB_NAMESPACE {
       class EnvWithCustomLogicalBlockSizeCache : public EnvWrapper {
      @@ -183,6 +184,15 @@ TEST_F(DBLogicalBlockSizeCacheTest, CreateColumnFamilies) {
         ASSERT_EQ(1, cache_->GetRefCount(dbname_));
      
         std::vector<ColumnFamilyHandle*> cfs;
      +  ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->EnableProcessing();
      +  ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->LoadDependency(
      +      {{"DBLogicalBlockSizeCacheTest::CreateColumnFamilies::PostSetupTwoCFH",
      +        "DBImpl::DumpStats:StartRunning"},
      +       {"DBImpl::DumpStats:PostCFDRef",
      +        "DBLogicalBlockSizeCacheTest::CreateColumnFamilies::PreDeleteTwoCFH"},
      +       {"DBLogicalBlockSizeCacheTest::CreateColumnFamilies::"
      +        "PostFinishCheckingRef",
      +        "DBImpl::DumpStats::PreCFDUnrefAndTryDelete"}});
         ASSERT_OK(db->CreateColumnFamilies(cf_options, {"cf1", "cf2"}, &cfs));
         ASSERT_EQ(2, cache_->Size());
         ASSERT_TRUE(cache_->Contains(dbname_));
      @@ -190,7 +200,7 @@ TEST_F(DBLogicalBlockSizeCacheTest, CreateColumnFamilies) {
         ASSERT_TRUE(cache_->Contains(cf_path_0_));
         ASSERT_EQ(2, cache_->GetRefCount(cf_path_0_));
         }
      
          // Delete one handle will not drop cache because another handle is still
         // referencing cf_path_0_.
      +  TEST_SYNC_POINT(
      +      "DBLogicalBlockSizeCacheTest::CreateColumnFamilies::PostSetupTwoCFH");
      +  TEST_SYNC_POINT(
      +      "DBLogicalBlockSizeCacheTest::CreateColumnFamilies::PreDeleteTwoCFH");
         ASSERT_OK(db->DestroyColumnFamilyHandle(cfs[0]));
         ASSERT_EQ(2, cache_->Size());
         ASSERT_TRUE(cache_->Contains(dbname_));
      @@ -209,16 +221,20 @@ TEST_F(DBLogicalBlockSizeCacheTest, CreateColumnFamilies) {
         ASSERT_TRUE(cache_->Contains(cf_path_0_));
          // Will fail
         ASSERT_EQ(1, cache_->GetRefCount(cf_path_0_));
      
         // Delete the last handle will drop cache.
         ASSERT_OK(db->DestroyColumnFamilyHandle(cfs[1]));
         ASSERT_EQ(1, cache_->Size());
         ASSERT_TRUE(cache_->Contains(dbname_));
         // Will fail
         ASSERT_EQ(1, cache_->GetRefCount(dbname_));
      
      +  TEST_SYNC_POINT(
      +      "DBLogicalBlockSizeCacheTest::CreateColumnFamilies::"
      +      "PostFinishCheckingRef");
         delete db;
         ASSERT_EQ(0, cache_->Size());
         ASSERT_OK(DestroyDB(dbname_, options,
             {{"cf1", cf_options}, {"cf2", cf_options}}));
      +  ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->DisableProcessing();
       }
      ```
      
      **Summary**
      - Removed the flaky assertion
      - Clarified the comments for the test
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9516
      
      Test Plan:
      - CI
      - Monitor for future flakiness
      
      Reviewed By: ajkr
      
      Differential Revision: D34055232
      
      Pulled By: hx235
      
      fbshipit-source-id: 9bf83ae5fa88bf6fc829876494d4692082e4c357
    • Dynamic toggling of BlockBasedTableOptions::detect_filter_construct_corruption (#9654) · 4a776d81
      Committed by Hui Xiao
      Summary:
      **Context/Summary:**
      As requested, `BlockBasedTableOptions::detect_filter_construct_corruption` can now be dynamically configured using `DB::SetOptions` after this PR
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9654
      
      Test Plan: - New unit test
      
      Reviewed By: pdillinger
      
      Differential Revision: D34622609
      
      Pulled By: hx235
      
      fbshipit-source-id: c06773ef3d029e6bf1724d3a72dffd37a8ec66d9
    • Avoid usage of ReopenWritableFile in db_stress (#9649) · 3362a730
      Committed by anand76
      Summary:
      The UniqueIdVerifier constructor currently calls ReopenWritableFile on
      the FileSystem, which might not be supported. Instead of relying on
      reopening the unique IDs file for writing, create a new file and copy
      the original contents.
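      The copy-on-create pattern described above can be sketched with plain standard-library file I/O (a simplified illustration of the approach, not the db_stress code itself; the function name and file handling are hypothetical):

      ```cpp
      #include <fstream>
      #include <sstream>
      #include <string>

      // Instead of reopening an existing file for append (which the underlying
      // FileSystem may not support), read the original contents, create a fresh
      // file, and write the old contents back before appending the new record.
      std::string ReplaceInsteadOfReopen(const std::string& path,
                                         const std::string& new_record) {
        std::string old_contents;
        {
          std::ifstream in(path, std::ios::binary);
          std::ostringstream buf;
          buf << in.rdbuf();
          old_contents = buf.str();  // empty if the file did not exist
        }
        std::ofstream out(path, std::ios::binary | std::ios::trunc);
        out << old_contents << new_record;  // original contents + appended record
        out.close();
        std::ifstream check(path, std::ios::binary);
        std::ostringstream buf;
        buf << check.rdbuf();
        return buf.str();
      }
      ```

      The effect is the same as append-after-reopen, but only file creation and sequential writes are required of the file system.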
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9649
      
      Test Plan: Run db_stress
      
      Reviewed By: pdillinger
      
      Differential Revision: D34572307
      
      Pulled By: anand1976
      
      fbshipit-source-id: 3a777908582d79dae57488d4278bad126774f698
  3. 04 Mar, 2022 · 1 commit
    • Improve build speed (#9605) · 67542bfa
      Committed by Jay Zhuang
      Summary:
      Improve the CI build speed:
      - split the macos tests to 2 parallel jobs
      - split tsan tests to 2 parallel jobs
      - move non-shm tests to nightly build
      - slow jobs use larger machines
      - fast jobs use smaller machines
      - add microbench to no-test jobs
      - add run-microbench to nightly build
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9605
      
      Reviewed By: riversand963
      
      Differential Revision: D34358982
      
      Pulled By: jay-zhuang
      
      fbshipit-source-id: d5091b3f4ef6d25c5c37920fb614f3342ee60e4a
  4. 03 Mar, 2022 · 4 commits
    • Fix bug causing incorrect data returned by snapshot read (#9648) · 659a16d5
      Committed by Yanqin Jin
      Summary:
      This bug affects use cases that meet both of the following conditions:
      - the DB has only the default column family, or WAL is disabled, and
      - at least one event listener is registered.
      Atomic flush is NOT affected.
      
      If the above conditions are met, RocksDB can release the db mutex before picking all the
      existing memtables to flush. In the meantime, a snapshot can be created and the db's sequence
      number can still be incremented. The upcoming flush will ignore this snapshot,
      so a later read using it can return an incorrect result.
      
      To fix this issue, we invoke the listeners' callbacks after picking the memtables, so that
      no snapshot can be created during this interval.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9648
      
      Test Plan: make check
      
      Reviewed By: ajkr
      
      Differential Revision: D34555456
      
      Pulled By: riversand963
      
      fbshipit-source-id: 1438981e9f069a5916686b1a0ad7627f734cf0ee
    • Do not rely on ADL when invoking std::max_element (#9608) · 73fd589b
      Committed by Yuriy Chernyshov
      Summary:
      Certain STL implementations use raw pointers as container iterators, and argument-dependent lookup (ADL) does not find the algorithm for them, so calls to `std::max_element` must be explicitly qualified.
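      A minimal illustration of the issue, assuming a standard library whose container iterators are plain pointers: the unqualified call finds nothing via ADL, so the call must be qualified.

      ```cpp
      #include <algorithm>

      // When a standard-library implementation uses raw pointers as vector
      // iterators, argument-dependent lookup cannot find max_element: the
      // associated namespaces of `int*` do not include namespace std.
      int LargestOf(const int* begin, const int* end) {
        // max_element(begin, end);             // unqualified: ADL finds nothing
        return *std::max_element(begin, end);   // qualified call always works
      }
      ```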
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9608
      
      Reviewed By: ajkr
      
      Differential Revision: D34583012
      
      Pulled By: riversand963
      
      fbshipit-source-id: 7de6bbc8a080c3e7243ce0d758fe83f1663168aa
    • Fix corruption error when compressing blob data with zlib. (#9572) · 926ee138
      Committed by jingkai.yuan
      Summary:
      A buffer sized to the plain data length may not be big enough if the compression actually expands the data. So use `deflateBound()` to get an upper bound on the compressed output size before calling `deflate()`.
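      The sizing issue can be illustrated directly with zlib: `deflateBound()` reports a worst-case output size that exceeds the input length, so a buffer of only the plain data length can be too small for incompressible input (a standalone zlib sketch, not the RocksDB blob code):

      ```cpp
      #include <zlib.h>

      #include <cstddef>

      // Return the worst-case compressed size zlib may produce for input_len
      // bytes. For incompressible input the output is larger than the input,
      // so a buffer sized to the plain data length can be overrun.
      size_t WorstCaseDeflateSize(size_t input_len) {
        z_stream strm{};
        deflateInit(&strm, Z_DEFAULT_COMPRESSION);
        size_t bound = deflateBound(&strm, static_cast<uLong>(input_len));
        deflateEnd(&strm);
        return bound;
      }
      ```

      Allocating `WorstCaseDeflateSize(n)` bytes before `deflate()` guarantees the output never overruns the buffer, which is exactly the fix described above.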
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9572
      
      Reviewed By: riversand963
      
      Differential Revision: D34326475
      
      Pulled By: ajkr
      
      fbshipit-source-id: 4b679cb7a83a62782a127785b4d5eb9aa4646449
    • Unschedule manual compaction from thread-pool queue (#9625) · db864796
      Committed by Jay Zhuang
      Summary:
      PR https://github.com/facebook/rocksdb/issues/9557 introduced a race condition between the manual compaction
      foreground thread and the background compaction thread.
      This PR adds the ability to truly unschedule a manual compaction from the
      thread-pool queue by using a distinct tag for manual compactions versus
      other tasks.
      It also fixes an issue where db `Close()` did not cancel the manual compaction thread.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9625
      
      Test Plan: unit tests no longer hang
      
      Reviewed By: ajkr
      
      Differential Revision: D34410811
      
      Pulled By: jay-zhuang
      
      fbshipit-source-id: cb14065eabb8cf1345fa042b5652d4f788c0c40c
  5. 02 Mar, 2022 · 9 commits
  6. 01 Mar, 2022 · 3 commits
    • Improve build detect for RISCV (#9366) · 7d7e88c7
      Committed by Adam Retter
      Summary:
      Related to: https://github.com/facebook/rocksdb/pull/9215
      
      * Adds build_detect_platform support for RISCV on Linux (at least on SiFive Unmatched platforms)
      
      This still leaves some linking issues on RISC-V (e.g. when building `db_test`):
      ```
      /usr/bin/ld: ./librocksdb_debug.a(memtable.o): in function `__gnu_cxx::new_allocator<char>::deallocate(char*, unsigned long)':
      /usr/include/c++/10/ext/new_allocator.h:133: undefined reference to `__atomic_compare_exchange_1'
      /usr/bin/ld: ./librocksdb_debug.a(memtable.o): in function `std::__atomic_base<bool>::compare_exchange_weak(bool&, bool, std::memory_order, std::memory_order)':
      /usr/include/c++/10/bits/atomic_base.h:464: undefined reference to `__atomic_compare_exchange_1'
      /usr/bin/ld: /usr/include/c++/10/bits/atomic_base.h:464: undefined reference to `__atomic_compare_exchange_1'
      /usr/bin/ld: /usr/include/c++/10/bits/atomic_base.h:464: undefined reference to `__atomic_compare_exchange_1'
      /usr/bin/ld: /usr/include/c++/10/bits/atomic_base.h:464: undefined reference to `__atomic_compare_exchange_1'
      /usr/bin/ld: ./librocksdb_debug.a(memtable.o):/usr/include/c++/10/bits/atomic_base.h:464: more undefined references to `__atomic_compare_exchange_1' follow
      /usr/bin/ld: ./librocksdb_debug.a(db_impl.o): in function `rocksdb::DBImpl::NewIteratorImpl(rocksdb::ReadOptions const&, rocksdb::ColumnFamilyData*, unsigned long, rocksdb::ReadCallback*, bool, bool)':
      /home/adamretter/rocksdb/db/db_impl/db_impl.cc:3019: undefined reference to `__atomic_exchange_1'
      /usr/bin/ld: ./librocksdb_debug.a(write_thread.o): in function `rocksdb::WriteThread::Writer::CreateMutex()':
      /home/adamretter/rocksdb/./db/write_thread.h:205: undefined reference to `__atomic_compare_exchange_1'
      /usr/bin/ld: ./librocksdb_debug.a(write_thread.o): in function `rocksdb::WriteThread::SetState(rocksdb::WriteThread::Writer*, unsigned char)':
      /home/adamretter/rocksdb/db/write_thread.cc:222: undefined reference to `__atomic_compare_exchange_1'
      collect2: error: ld returned 1 exit status
      make: *** [Makefile:1449: db_test] Error 1
      ```
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9366
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D34377664
      
      Pulled By: mrambacher
      
      fbshipit-source-id: c86f9d0cd1cb0c18de72b06f1bf5847f23f51118
    • Handle failures in block-based table size/offset approximation (#9615) · 0a89cea5
      Committed by Andrew Kryczka
      Summary:
      In crash test with fault injection, we were seeing stack traces like the following:
      
      ```
      #3  0x00007f75f763c533 in __GI___assert_fail (assertion=assertion@entry=0x1c5b2a0 "end_offset >= start_offset", file=file@entry=0x1c580a0 "table/block_based/block_based_table_reader.cc", line=line@entry=3245,
      function=function@entry=0x1c60e60 "virtual uint64_t rocksdb::BlockBasedTable::ApproximateSize(const rocksdb::Slice&, const rocksdb::Slice&, rocksdb::TableReaderCaller)") at assert.c:101
      #4  0x00000000010ea9b4 in rocksdb::BlockBasedTable::ApproximateSize (this=<optimized out>, start=..., end=..., caller=<optimized out>) at table/block_based/block_based_table_reader.cc:3224
      #5  0x0000000000be61fb in rocksdb::TableCache::ApproximateSize (this=0x60f0000161b0, start=..., end=..., fd=..., caller=caller@entry=rocksdb::kCompaction, internal_comparator=..., prefix_extractor=...) at db/table_cache.cc:719
      #6  0x0000000000c3eaec in rocksdb::VersionSet::ApproximateSize (this=<optimized out>, v=<optimized out>, f=..., start=..., end=..., caller=<optimized out>) at ./db/version_set.h:850
      #7  0x0000000000c6ebc3 in rocksdb::VersionSet::ApproximateSize (this=<optimized out>, options=..., v=v@entry=0x621000047500, start=..., end=..., start_level=start_level@entry=0, end_level=<optimized out>, caller=<optimized out>)
      at db/version_set.cc:5657
      #8  0x000000000166e894 in rocksdb::CompactionJob::GenSubcompactionBoundaries (this=<optimized out>) at ./include/rocksdb/options.h:1869
      #9  0x000000000168c526 in rocksdb::CompactionJob::Prepare (this=this@entry=0x7f75f3ffcf00) at db/compaction/compaction_job.cc:546
      ```
      
      The problem occurred in `ApproximateSize()` when the index `Seek()` for the first `ApproximateDataOffsetOf()` encountered an I/O error while the second `Seek()` did not. In the old code that scenario caused `start_offset == data_size`, making it easy to trip the assertion that `end_offset >= start_offset`.
      
      The fix is to set `start_offset == 0` when the first index `Seek()` fails, and `end_offset == data_size` when the second index `Seek()` fails. I doubt these give an "on average correct" answer for how this function is used, but I/O errors in index seeks are hopefully rare, it looked consistent with what was already there, and it was easier to calculate.
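      The fallback described above can be sketched as follows (a simplified model of the logic with the index seeks stubbed out as optional offsets; the names are illustrative, not the actual RocksDB signatures):

      ```cpp
      #include <cstdint>
      #include <optional>

      // Model of the fix: if the index seek for the start key fails, fall back
      // to offset 0; if the seek for the end key fails, fall back to the total
      // data size. Either fallback keeps end_offset >= start_offset.
      uint64_t ApproximateSizeWithFallback(std::optional<uint64_t> start_seek,
                                           std::optional<uint64_t> end_seek,
                                           uint64_t data_size) {
        const uint64_t start_offset = start_seek.value_or(0);         // seek failed -> 0
        const uint64_t end_offset = end_seek.value_or(data_size);     // seek failed -> data_size
        return end_offset - start_offset;
      }
      ```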
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9615
      
      Test Plan:
      run the repro command for a while and stopped seeing coredumps -
      
      ```
      $ while !  ./db_stress --block_size=128 --cache_size=32768 --clear_column_family_one_in=0 --column_families=1 --continuous_verification_interval=0 --db=/dev/shm/rocksdb_crashtest --delpercent=4 --delrangepercent=1 --destroy_db_initially=0 --expected_values_dir=/dev/shm/rocksdb_crashtest_expected --index_type=2 --iterpercent=10  --kill_random_test=18887 --max_key=1000000 --max_bytes_for_level_base=2048576 --nooverwritepercent=1 --open_files=-1 --open_read_fault_one_in=32 --ops_per_thread=1000000 --prefixpercent=5 --read_fault_one_in=0 --readpercent=45 --reopen=0 --skip_verifydb=1 --subcompactions=2 --target_file_size_base=524288 --test_batches_snapshots=0 --value_size_mult=32 --write_buffer_size=524288 --writepercent=35  ; do : ; done
      ```
      
      Reviewed By: pdillinger
      
      Differential Revision: D34383069
      
      Pulled By: ajkr
      
      fbshipit-source-id: fac26c3b20ea962e75387515ba5f2724dc48719f
    • Fix trivial Javadoc omissions (#9534) · ddb7620a
      Committed by stefan-zobel
      Summary:
      - fix spelling of `valueSizeSofLimit` and add "param" description in ReadOptions
      - add 3 missing "return" in RocksDB
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9534
      
      Reviewed By: riversand963
      
      Differential Revision: D34131186
      
      Pulled By: mrambacher
      
      fbshipit-source-id: 7eb7ec177906052837180b291d67fb1c838c49e1
  7. 28 Feb, 2022 · 1 commit
    • Dedicate cacheline for DB mutex (#9637) · 9983eecd
      Committed by Andrew Kryczka
      Summary:
      We found a case of cacheline bouncing due to writers locking/unlocking `mutex_` and readers accessing `block_cache_tracer_`. We discovered it only after the issue was fixed by https://github.com/facebook/rocksdb/issues/9462 shifting the `DBImpl` members such that `mutex_` and `block_cache_tracer_` were naturally placed in separate cachelines in our regression testing setup. This PR forces the cacheline alignment of `mutex_` so we don't accidentally reintroduce the problem.
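      The fix pattern can be shown with a standalone struct: force the mutex member onto its own 64-byte cacheline so that readers of neighboring members never share a line with writers locking the mutex (a generic sketch with hypothetical member names, not the actual `DBImpl` layout):

      ```cpp
      #include <mutex>

      // Hypothetical layout sketch: aligning the mutex to a 64-byte cacheline
      // keeps lock/unlock traffic from invalidating the line that holds
      // read-mostly members such as the tracer pointer.
      struct DBCore {
        alignas(64) std::mutex mutex_;                          // its own cacheline
        alignas(64) const void* block_cache_tracer_ = nullptr;  // read-mostly member
      };

      static_assert(alignof(DBCore) >= 64, "struct adopts the member alignment");
      ```

      With both members cacheline-aligned, a writer bouncing the mutex's line cannot invalidate the line that readers of the tracer pointer are using.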
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9637
      
      Reviewed By: riversand963
      
      Differential Revision: D34502233
      
      Pulled By: ajkr
      
      fbshipit-source-id: 46aa313b7fe83e80c3de254e332b6fb242434c07
  8. 26 Feb, 2022 · 2 commits
  9. 24 Feb, 2022 · 2 commits
    • Streaming Compression API for WAL compression. (#9619) · 21345d28
      Committed by Siddhartha Roychowdhury
      Summary:
      Implement a streaming compression API (compress/uncompress) to use for WAL compression. The log_writer would use the compress class/API to compress a record before writing it out in chunks. The log_reader would use the uncompress class/API to uncompress the chunks and combine into a single record.
      
      Added unit test to verify the API for different sizes/compression types.
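      The chunked compress/uncompress flow described above can be sketched with zlib's streaming interface (an illustration of the streaming idea only; the actual RocksDB classes and record framing differ):

      ```cpp
      #include <zlib.h>

      #include <string>

      // Compress a record with deflate() and decompress it back with inflate(),
      // draining the output through a small fixed-size chunk buffer, the way a
      // log writer would emit chunks and a log reader would recombine them.
      std::string StreamRoundTrip(const std::string& record) {
        std::string compressed, restored;
        char chunk[64];

        z_stream c{};
        deflateInit(&c, Z_DEFAULT_COMPRESSION);
        c.next_in = reinterpret_cast<Bytef*>(const_cast<char*>(record.data()));
        c.avail_in = static_cast<uInt>(record.size());
        do {  // writer side: drain compressed bytes chunk by chunk
          c.next_out = reinterpret_cast<Bytef*>(chunk);
          c.avail_out = sizeof(chunk);
          deflate(&c, Z_FINISH);
          compressed.append(chunk, sizeof(chunk) - c.avail_out);
        } while (c.avail_out == 0);
        deflateEnd(&c);

        z_stream d{};
        inflateInit(&d);
        d.next_in = reinterpret_cast<Bytef*>(const_cast<char*>(compressed.data()));
        d.avail_in = static_cast<uInt>(compressed.size());
        int ret;
        do {  // reader side: recombine chunks into the original record
          d.next_out = reinterpret_cast<Bytef*>(chunk);
          d.avail_out = sizeof(chunk);
          ret = inflate(&d, Z_NO_FLUSH);
          restored.append(chunk, sizeof(chunk) - d.avail_out);
        } while (ret != Z_STREAM_END);
        inflateEnd(&d);
        return restored;
      }
      ```

      The chunk buffer stands in for the fixed-size fragments a log writer emits; any chunk size works because the stream state carries over between calls.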
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9619
      
      Test Plan: make -j24 check
      
      Reviewed By: anand1976
      
      Differential Revision: D34437346
      
      Pulled By: sidroyc
      
      fbshipit-source-id: b180569ad2ddcf3106380f8758b556cc0ad18382
    • Add a secondary cache implementation based on LRUCache 1 (#9518) · f706a9c1
      Committed by Bo Wang
      Summary:
      RocksDB uses a block cache to reduce IO and make queries more efficient. The block cache is based on the LRU algorithm (LRUCache) and keeps objects containing uncompressed data, such as Block, ParsedFullFilterBlock etc. It allows the user to configure a second level cache (rocksdb::SecondaryCache) to extend the primary block cache by holding items evicted from it. Some of the major RocksDB users, like MyRocks, use direct IO and would like to use a primary block cache for uncompressed data and a secondary cache for compressed data. The latter allows us to mitigate the loss of the Linux page cache due to direct IO.
      
      This PR includes a concrete implementation of rocksdb::SecondaryCache that integrates with compression libraries such as LZ4 and implements an LRU cache to hold compressed blocks.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9518
      
      Test Plan:
      In this PR, lru_secondary_cache_test.cc includes the following tests:
      1. Unit tests for the secondary cache with and without compression, such as basic tests and failure tests.
      2. Integration tests with both the primary cache and this secondary cache.
      
      **Follow Up:**
      
      1. Statistics (e.g. compression ratio) will be added in another PR.
      2. Once this implementation is ready, I will do some shadow testing and benchmarking with UDB to measure the impact.
      
      Reviewed By: anand1976
      
      Differential Revision: D34430930
      
      Pulled By: gitbw95
      
      fbshipit-source-id: 218d78b672a2f914856d8a90ff32f2f5b5043ded
  10. 23 Feb, 2022 · 5 commits
    • Support WBWI for keys having timestamps (#9603) · 6f125998
      Committed by Yanqin Jin
      Summary:
      This PR supports inserting keys to a `WriteBatchWithIndex` for column families that enable user-defined timestamps
      and reading the keys back. **The index does not have timestamps.**
      
      Writing a key to WBWI is unchanged, because the underlying WriteBatch already supports it.
      When reading the keys back, we need to make sure to distinguish between keys with and without timestamps before
      comparison.
      
      When user calls `GetFromBatchAndDB()`, no timestamp is needed to query the batch, but a timestamp has to be
      provided to query the db. The assumption is that data in the batch must be newer than data from the db.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9603
      
      Test Plan: make check
      
      Reviewed By: ltamasi
      
      Differential Revision: D34354849
      
      Pulled By: riversand963
      
      fbshipit-source-id: d25d1f84e2240ce543e521fa30595082fb8db9a0
    • Fix test race conditions with OnFlushCompleted() (#9617) · 8ca433f9
      Committed by Andrew Kryczka
      Summary:
      We often see flaky tests due to `DB::Flush()` or `DBImpl::TEST_WaitForFlushMemTable()` not waiting until event listeners complete. For example, https://github.com/facebook/rocksdb/issues/9084, https://github.com/facebook/rocksdb/issues/9400, https://github.com/facebook/rocksdb/issues/9528, plus two new ones this week: "EventListenerTest.OnSingleDBFlushTest" and "DBFlushTest.FireOnFlushCompletedAfterCommittedResult". I ran `make check` with the race condition-coercing patch below and fixed the issues it found, except in old BlobDB.
      
      ```
       diff --git a/db/db_impl/db_impl_compaction_flush.cc b/db/db_impl/db_impl_compaction_flush.cc
      index 0e1864788..aaba68c4a 100644
       --- a/db/db_impl/db_impl_compaction_flush.cc
      +++ b/db/db_impl/db_impl_compaction_flush.cc
      @@ -861,6 +861,8 @@ void DBImpl::NotifyOnFlushCompleted(
              mutable_cf_options.level0_stop_writes_trigger);
         // release lock while notifying events
         mutex_.Unlock();
      +  bg_cv_.SignalAll();
      +  sleep(1);
         {
           for (auto& info : *flush_jobs_info) {
             info->triggered_writes_slowdown = triggered_writes_slowdown;
      ```
      
      The reason I did not fix old BlobDB issues is because it appears to have a fundamental (non-test) issue. In particular, it uses an EventListener to keep track of the files. OnFlushCompleted() could be delayed until even after a compaction involving that flushed file completes, causing the compaction to unexpectedly delete an untracked file.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9617
      
      Test Plan: `make check` including the race condition coercing patch
      
      Reviewed By: hx235
      
      Differential Revision: D34384022
      
      Pulled By: ajkr
      
      fbshipit-source-id: 2652ded39b415277c5d6a628414345223930514e
    • Enable core dumps in TSAN/UBSAN crash tests (#9616) · 96978e4d
      Committed by Andrew Kryczka
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/9616
      
      Reviewed By: hx235
      
      Differential Revision: D34383489
      
      Pulled By: ajkr
      
      fbshipit-source-id: e4299000ef38073ec57e6ab5836150fdf8ce43d4
    • Combine data members of IOStatus with Status (#9549) · d795a730
      Committed by anand76
      Summary:
      Combine the data members `retryable_`, `data_loss_` and `scope_` of IOStatus
      with Status, as protected members. IOStatus is now defined as a derived class of Status with
      no new data members, only additional methods. This will allow us to eventually
      track the result of FileSystem calls in RocksDB with one variable
      instead of two.
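      The restructuring can be sketched as a derived class that reuses the base's storage rather than adding fields (a simplified model; the real classes carry more state):

      ```cpp
      #include <cstdint>

      // Simplified model of the change: the subclass adds methods but no data
      // members, so an IOStatus is the same size as a Status and the two can
      // eventually be tracked in one variable.
      class Status {
       public:
        bool ok() const { return code_ == 0; }

       protected:
        uint8_t code_ = 0;
        // Formerly IOStatus-only members, now protected members of the base:
        bool retryable_ = false;
        bool data_loss_ = false;
        uint8_t scope_ = 0;
      };

      class IOStatus : public Status {
       public:
        bool IsRetryable() const { return retryable_; }
        bool IsDataLoss() const { return data_loss_; }
      };

      static_assert(sizeof(IOStatus) == sizeof(Status),
                    "derived class adds no new data");
      ```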
      
      Benchmark commands and results are below. The performance after changes seems slightly better.
      
      ```./db_bench -db=/data/mysql/rocksdb/prefix_scan -benchmarks="fillseq" -key_size=32 -value_size=512 -num=5000000 -use_direct_io_for_flush_and_compaction=true -target_file_size_base=16777216```
      
      ```./db_bench -use_existing_db=true --db=/data/mysql/rocksdb/prefix_scan -benchmarks="readseq,seekrandom,readseq" -key_size=32 -value_size=512 -num=5000000 -seek_nexts=10000 -use_direct_reads=true -duration=60 -ops_between_duration_checks=1 -readonly=true -adaptive_readahead=false -threads=1 -cache_size=10485760000```
      
      Before -
      seekrandom   :    3715.432 micros/op 269 ops/sec; 1394.9 MB/s (16149 of 16149 found)
      seekrandom   :    3687.177 micros/op 271 ops/sec; 1405.6 MB/s (16273 of 16273 found)
      seekrandom   :    3709.646 micros/op 269 ops/sec; 1397.1 MB/s (16175 of 16175 found)
      
      readseq      :       0.369 micros/op 2711321 ops/sec; 1406.6 MB/s
      readseq      :       0.363 micros/op 2754092 ops/sec; 1428.8 MB/s
      readseq      :       0.372 micros/op 2688046 ops/sec; 1394.6 MB/s
      
      After -
      seekrandom   :    3606.830 micros/op 277 ops/sec; 1436.9 MB/s (16636 of 16636 found)
      seekrandom   :    3594.467 micros/op 278 ops/sec; 1441.9 MB/s (16693 of 16693 found)
      seekrandom   :    3597.919 micros/op 277 ops/sec; 1440.5 MB/s (16677 of 16677 found)
      
      readseq      :       0.354 micros/op 2822809 ops/sec; 1464.5 MB/s
      readseq      :       0.358 micros/op 2795080 ops/sec; 1450.1 MB/s
      readseq      :       0.354 micros/op 2822889 ops/sec; 1464.5 MB/s
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9549
      
      Reviewed By: pdillinger
      
      Differential Revision: D34310362
      
      Pulled By: anand1976
      
      fbshipit-source-id: 54b27756edf9c9ecfe730a2dce542a7a46743096
    • configure microbenchmarks, regenerate targets (#9599) · ba65cfff
      Committed by Patrick Somaru
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/9599
      
      Reviewed By: jay-zhuang, hodgesds
      
      Differential Revision: D34214408
      
      fbshipit-source-id: 6932200772f52ce77e550646ee3d1a928295844a
  11. 22 Feb, 2022 · 1 commit
    • Fix DBTest2.BackupFileTemperature memory leak (#9610) · 3379d146
      Committed by Andrew Kryczka
      Summary:
      Valgrind was failing with the below error because we forgot to destroy
      the `BackupEngine` object:
      
      ```
      ==421173== Command: ./db_test2 --gtest_filter=DBTest2.BackupFileTemperature
      ==421173==
      Note: Google Test filter = DBTest2.BackupFileTemperature
      [==========] Running 1 test from 1 test case.
      [----------] Global test environment set-up.
      [----------] 1 test from DBTest2
      [ RUN      ] DBTest2.BackupFileTemperature
      --421173-- WARNING: unhandled amd64-linux syscall: 425
      --421173-- You may be able to write your own handler.
      --421173-- Read the file README_MISSING_SYSCALL_OR_IOCTL.
      --421173-- Nevertheless we consider this a bug.  Please report
      --421173-- it at http://valgrind.org/support/bug_reports.html.
      [       OK ] DBTest2.BackupFileTemperature (3366 ms)
      [----------] 1 test from DBTest2 (3371 ms total)
      
      [----------] Global test environment tear-down
      [==========] 1 test from 1 test case ran. (3413 ms total)
      [  PASSED  ] 1 test.
      ==421173==
      ==421173== HEAP SUMMARY:
      ==421173==     in use at exit: 13,042 bytes in 195 blocks
      ==421173==   total heap usage: 26,022 allocs, 25,827 frees, 27,555,265 bytes allocated
      ==421173==
      ==421173== 8 bytes in 1 blocks are possibly lost in loss record 6 of 167
      ==421173==    at 0x4838DBF: operator new(unsigned long) (vg_replace_malloc.c:344)
      ==421173==    by 0x8D4606: allocate (new_allocator.h:114)
      ==421173==    by 0x8D4606: allocate (alloc_traits.h:445)
      ==421173==    by 0x8D4606: _M_allocate (stl_vector.h:343)
      ==421173==    by 0x8D4606: reserve (vector.tcc:78)
      ==421173==    by 0x8D4606: rocksdb::BackupEngineImpl::Initialize() (backupable_db.cc:1174)
      ==421173==    by 0x8D5473: Initialize (backupable_db.cc:918)
      ==421173==    by 0x8D5473: rocksdb::BackupEngine::Open(rocksdb::BackupEngineOptions const&, rocksdb::Env*, rocksdb::BackupEngine**) (backupable_db.cc:937)
      ==421173==    by 0x50AC8F: Open (backup_engine.h:585)
      ==421173==    by 0x50AC8F: rocksdb::DBTest2_BackupFileTemperature_Test::TestBody() (db_test2.cc:6996)
      ...
      ```
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9610
      
      Test Plan:
      ```
      $ make -j24 ROCKSDBTESTS_SUBSET=db_test2 valgrind_check_some
      ```
      
      Reviewed By: akankshamahajan15
      
      Differential Revision: D34371210
      
      Pulled By: ajkr
      
      fbshipit-source-id: 68154fcb0c51b28222efa23fa4ee02df8d925a18
  12. 21 Feb, 2022 · 1 commit
  13. 19 Feb, 2022 · 1 commit