1. 27 Apr, 2021 3 commits
    • M
      Rename variables in ImmutableCFOptions to avoid conflicts with ImmutableDBOptions (#8227) · 0ca6d629
      mrambacher committed
      Summary:
      Rename ImmutableCFOptions::info_log and statistics to logger and stats. This is stage 2 in creating an ImmutableOptions class. It is necessary because ImmutableDBOptions has fields with the same names but different types.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8227
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D28000967
      
      Pulled By: mrambacher
      
      fbshipit-source-id: 3bf2aa04e8f1e8724d825b7deacf41080c14420b
    • M
      Fix cast-function-type warning (#8230) · c2c7d5e9
      Mr-Leshiy committed
      Summary:
      Fix the cast-function-type warning that appears during the following build:
      ```bash
      cmake ..  -DFAIL_ON_WARNINGS=ON -DCMAKE_C_COMPILER=x86_64-w64-mingw32-gcc -DCMAKE_CXX_COMPILER=x86_64-w64-mingw32-g++ -DCMAKE_SYSTEM_NAME=Windows
      make rocksdb
      ```
      Here is the log:
      ```
      /home/leshiy/Work/rocksdb/port/win/env_win.cc: In constructor ‘rocksdb::port::WinClock::WinClock()’:
      /home/leshiy/Work/rocksdb/port/win/env_win.cc:92:9: error: cast between incompatible function types from ‘FARPROC’ {aka ‘long long int (*)()’} to ‘rocksdb::port::WinClock::FnGetSystemTimePreciseAsFileTime’ {aka ‘void (*)(_FILETIME*)’} [-Werror=cast-function-type]
         92 |         (FnGetSystemTimePreciseAsFileTime)GetProcAddress(
            |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         93 |             module, "GetSystemTimePreciseAsFileTime");
            |             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      cc1plus: all warnings being treated as errors
      make[2]: *** [CMakeFiles/rocksdb.dir/build.make:4337: CMakeFiles/rocksdb.dir/port/win/env_win.cc.obj] Error 1
      make[1]: *** [CMakeFiles/Makefile2:83: CMakeFiles/rocksdb.dir/all] Error 2
      make: *** [Makefile:91: all] Error 2
      ```
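
      A common way to avoid this class of warning is to route the cast through the generic `void (*)()` type, which GCC documents as exempt from `-Wcast-function-type`. The sketch below shows the pattern in portable C++. `GenericProc`, `FillFn`, `LookupProc`, and `ResolveFillFn` are hypothetical stand-ins for `FARPROC`, `FnGetSystemTimePreciseAsFileTime`, and `GetProcAddress`; this is an illustration, not necessarily the exact change made in the PR.

      ```cpp
      #include <cstdint>

      // Hypothetical stand-in for Windows' FARPROC: a generic function pointer.
      using GenericProc = long long (*)();
      // Hypothetical stand-in for the real target signature.
      using FillFn = void (*)(std::uint64_t*);

      // The function that the lookup "finds".
      void FillWithFortyTwo(std::uint64_t* out) { *out = 42; }

      // Hypothetical stand-in for GetProcAddress(): hands back the generic type.
      GenericProc LookupProc() {
        return reinterpret_cast<GenericProc>(
            reinterpret_cast<void (*)()>(&FillWithFortyTwo));
      }

      // Cast in two steps through void (*)() so that no single cast converts
      // directly between two incompatible function types.
      FillFn ResolveFillFn() {
        return reinterpret_cast<FillFn>(
            reinterpret_cast<void (*)()>(LookupProc()));
      }
      ```

      The function is only ever called through `FillFn`, which matches its real signature, so the intermediate pointer types are never used for a call.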
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8230
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D28000215
      
      Pulled By: mrambacher
      
      fbshipit-source-id: 874782cf48f70470e3fbd9097585bf42e810ca61
    • A
      WBWI Internal Move implementation from .h into .cpp (#8229) · 2760c2ae
      Adam Retter committed
      Summary:
      Moves some of the structural refactoring from https://github.com/facebook/rocksdb/pull/8135 into this PR.
      This just cleans up the code by moving implementation out of the .h file and into the .cc file.
      
      Should be considered for merge before both https://github.com/facebook/rocksdb/pull/7214 and https://github.com/facebook/rocksdb/pull/8135
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8229
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D27999669
      
      Pulled By: mrambacher
      
      fbshipit-source-id: 6eccecbf1f11bb9f5a173e86d1e7bc448bc96071
  2. 26 Apr, 2021 2 commits
  3. 24 Apr, 2021 1 commit
    • S
      Eliminate double-buffering of keys in block_based_table_builder (#8219) · cc1c3ee5
      Saketh Are committed
      Summary:
      The block_based_table_builder buffers some blocks in memory to construct a good compression dictionary. Before this commit, the keys from each block were buffered separately for convenience. However, the buffered block data implicitly contains all keys. This commit eliminates the redundant key buffers and reduces memory usage.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8219
      
      Reviewed By: ajkr
      
      Differential Revision: D27945851
      
      Pulled By: saketh-are
      
      fbshipit-source-id: caf3cac1217201e080a1e24b542bedf20973afee
  4. 23 Apr, 2021 5 commits
    • S
      Expose JemallocNodumpAllocator to C API (#8178) · d65d7d65
      Sahir Hoda committed
      Summary:
      Add new C APIs to create the JemallocNodumpAllocator and set it on a Cache object.
      
      `make test` passes with and without `DISABLE_JEMALLOC=1`.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8178
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D27944631
      
      Pulled By: ajkr
      
      fbshipit-source-id: 2531729aa285a8985c58f22f093c4d53029c4a7b
    • M
      Make types of Immutable/Mutable Options fields match that of the underlying Option (#8176) · 01e460d5
      mrambacher committed
      Summary:
      This PR is a first step at attempting to clean up some of the Mutable/Immutable Options code.  With this change, a DBOption and a ColumnFamilyOption can be reconstructed from their Mutable and Immutable equivalents, respectively.
      
      readrandom tests do not show any performance degradation versus master (though both are slightly slower than the current 6.19 release).
      
      There are still fields in the ImmutableCFOptions that are not CF options but DB options.  Eventually, I would like to move those into an ImmutableOptions (= ImmutableDBOptions+ImmutableCFOptions).  But that will be part of a future PR to minimize changes and disruptions.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8176
      
      Reviewed By: pdillinger
      
      Differential Revision: D27954339
      
      Pulled By: mrambacher
      
      fbshipit-source-id: ec6b805ba9afe6e094bffdbd76246c2d99aa9fad
    • J
      Add internal compaction API for Secondary instance (#8171) · f0fca2b1
      Jay Zhuang committed
      Summary:
      Add a compaction API for secondary instances, which compacts the files to a secondary DB path without installing the results into the LSM tree.
      The API will be used for remote compaction.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8171
      
      Test Plan: `make check`
      
      Reviewed By: ajkr
      
      Differential Revision: D27694545
      
      Pulled By: jay-zhuang
      
      fbshipit-source-id: 8ff3ec1bffdb2e1becee994918850c8902caf731
    • H
      Add ZenFS to plugin list (#8218) · e85d8a65
      Hans Holmberg committed
      Summary:
      Add ZenFS, a file system for zoned block devices, to PLUGINS.md
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8218
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D27944376
      
      Pulled By: ajkr
      
      fbshipit-source-id: c9ea2e9814001ccd7c56d7ef4d38e20dfeb48d1e
    • Z
      Fix the false positive alert of CF consistency check in WAL recovery (#8207) · 09a9ec3a
      Zhichao Cao committed
      Summary:
      Currently, when recovering from the WAL, RocksDB runs a consistency check on each column family if a WAL file is corrupted and PointInTimeRecovery is set. However, the check reports a false positive "SST file is ahead of WALs" alert when a CF's current log number is greater than the corrupted WAL number (that is, the CF appears to contain data beyond the corrupted WAL) because a new column family was created during a flush. In this case, a new, empty WAL is created during the flush, and the old WAL is corrupted for some reason (e.g., a storage issue, or a crash before SyncCloseLog is called). The new CF has no data, so it has no consistency issue.
      
      Fix: when checking cfd->GetLogNumber() > corrupted_wal_number also check cfd->GetLiveSstFilesSize() > 0. So the CFs with no SST file data will skip the check here.
      
      Note a potential inconsistency that this fix can mask: an empty CF can also be caused by write+delete. In this case, no SST files are generated after flush, but the CF still has log entries in the WAL. If that WAL is corrupted, the DB might be inconsistent.
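
      The combined condition can be sketched as a small predicate. `ColumnFamilyState` and `IsSstAheadOfWal` are hypothetical names used only for illustration; the two fields stand in for the values returned by `cfd->GetLogNumber()` and `cfd->GetLiveSstFilesSize()`.

      ```cpp
      #include <cstdint>

      // Hypothetical, simplified view of the per-column-family state used by the check.
      struct ColumnFamilyState {
        std::uint64_t log_number;           // stands in for cfd->GetLogNumber()
        std::uint64_t live_sst_files_size;  // stands in for cfd->GetLiveSstFilesSize()
      };

      // Returns true when the column family should be reported as
      // "SST file is ahead of WALs" for the given corrupted WAL.
      bool IsSstAheadOfWal(const ColumnFamilyState& cf,
                           std::uint64_t corrupted_wal_number) {
        // A CF with no live SST data is skipped even if its log number is newer.
        return cf.log_number > corrupted_wal_number && cf.live_sst_files_size > 0;
      }
      ```

      A newly created, empty CF (size 0) no longer triggers the alert, while a CF that already has SST data beyond the corrupted WAL still does.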
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8207
      
      Test Plan: added unit test, make crash_test
      
      Reviewed By: riversand963
      
      Differential Revision: D27898839
      
      Pulled By: zhichao-cao
      
      fbshipit-source-id: 931fc2d8b92dd00b4169bf84b94e712fd688a83e
  5. 22 Apr, 2021 4 commits
    • M
      Add check to cmake to see if we need to link against -latomic (#8183) · 47b424f4
      mrambacher committed
      Summary:
      For some compilers/environments (e.g. Clang, riscv64), we need to link against -latomic.  Check if this is a requirement and add the library to the third-party libs if it is.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8183
      
      Reviewed By: pdillinger
      
      Differential Revision: D27773564
      
      Pulled By: mrambacher
      
      fbshipit-source-id: 68e15d823144f83fb02221c7bf5b1e43323419bf
    • Y
      Ignore comparator name mismatch in ldb manifest dump (#8216) · 31435276
      Yanqin Jin committed
      Summary:
      RocksDB allows user-specified custom comparators which may not be known to `ldb`,
      a built-in tool for checking/mutating the database. Therefore, a column family comparator
      name mismatch encountered during manifest dump should not prevent the dump from
      proceeding.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8216
      
      Test Plan:
      ```
      make check
      ```
      
      Also manually do the following
      ```
      KEEP_DB=1 ./db_with_timestamp_basic_test
      ./ldb --db=<db> manifest_dump --verbose
      ```
      The ldb should succeed and print something like:
      ```
      ...
      --------------- Column family "default"  (ID 0) --------------
      log number: 6
      comparator: <TestComparator>, but the comparator object is not available.
      ...
      ```
      
      Reviewed By: ltamasi
      
      Differential Revision: D27927581
      
      Pulled By: riversand963
      
      fbshipit-source-id: f610b2c842187d17f575362070209ee6b74ec6d4
    • S
      Add comment to DisableManualCompaction() (#8186) · 4985cea1
      sdong committed
      Summary:
      Add comment to DisableManualCompaction() which was missing.
      Also explicitly return from DBImpl::CompactRange() to avoid memtable flush when manual compaction is disabled.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8186
      
      Test Plan: Run existing unit tests.
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D27744517
      
      fbshipit-source-id: 449548a48905903b888dc9612bd17480f6596a71
    • A
      Stall writes in WriteBufferManager when memory_usage exceeds buffer_size (#7898) · 596e9008
      Akanksha Mahajan committed
      Summary:
      When a WriteBufferManager is shared across DBs and column families
      to keep memory usage under a limit, OOMs have been observed when flush cannot
      finish but writes keep inserting into memtables.
      To avoid OOMs, when memory usage goes beyond buffer_size_ and a DB tries to write,
      this change stalls incoming writers until flush completes and memory usage
      drops.
      
      Design: Stall condition: When total memory usage exceeds WriteBufferManager::buffer_size_
      (memory_usage() >= buffer_size_), WriteBufferManager::ShouldStall() returns true.
      
      DBImpl first blocks incoming/future writers by calling write_thread_.BeginWriteStall()
      (which adds a dummy stall object to the writer's queue).
      Then the DB is blocked in state State::Blocked (the current write doesn't go
      through). The WBMStallInterface object maintained by every DB instance is added to the queue of
      the WriteBufferManager.
      
      If multiple DBs try to write during this stall, they are also
      blocked when the check WriteBufferManager::ShouldStall() returns true.
      
      End Stall condition: When flush finishes and memory usage goes down, the stall ends only if memory
      waiting to be flushed is less than buffer_size/2. This lower limit gives flush time
      to complete and avoids continuous stalling if memory usage remains close to buffer_size.
      
      WriteBufferManager::EndWriteStall() is called,
      which removes all instances from its queue and signals them to continue.
      Their state is changed to State::Running and they are unblocked. DBImpl
      then signals all incoming writers of that DB to continue by calling
      write_thread_.EndWriteStall() (which removes the dummy stall object from the
      queue).
      
      Each DB instance creates a WBMStallInterface, which is an interface to block and
      signal DBs during a stall.
      When a DB needs to be blocked or signalled by the WriteBufferManager,
      its state_for_wbm_ state is changed accordingly (RUNNING or BLOCKED).
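
      The begin/end thresholds described above form a hysteresis band, which can be sketched in isolation. `StallTracker` is a hypothetical class, not part of RocksDB. As a simplifying assumption it compares a single usage counter against both thresholds (the description compares "memory waiting to be flushed" against buffer_size/2), and it omits the writer queue, threading, and signalling.

      ```cpp
      #include <cstddef>

      // Hypothetical single-threaded model of the stall thresholds.
      class StallTracker {
       public:
        explicit StallTracker(std::size_t buffer_size) : buffer_size_(buffer_size) {}

        // Memory charged by memtable inserts.
        void Reserve(std::size_t bytes) {
          usage_ += bytes;
          Update();
        }

        // Memory released when a flush completes (clamped at zero).
        void Free(std::size_t bytes) {
          usage_ -= (bytes < usage_ ? bytes : usage_);
          Update();
        }

        bool stalled() const { return stalled_; }

       private:
        void Update() {
          if (!stalled_ && usage_ >= buffer_size_) {
            stalled_ = true;   // begin stall: usage reached the limit
          } else if (stalled_ && usage_ < buffer_size_ / 2) {
            stalled_ = false;  // end stall: usage fell below half the limit
          }
        }

        std::size_t buffer_size_;
        std::size_t usage_ = 0;
        bool stalled_ = false;
      };
      ```

      With a limit of 100, the tracker starts stalling at a usage of 100 and keeps stalling at 70; it only resumes once usage drops below 50, which is what prevents rapid toggling near the limit.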
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7898
      
      Test Plan: Added a new test db/db_write_buffer_manager_test.cc
      
      Reviewed By: anand1976
      
      Differential Revision: D26093227
      
      Pulled By: akankshamahajan15
      
      fbshipit-source-id: 2bbd982a3fb7033f6de6153aa92a221249861aae
  6. 21 Apr, 2021 4 commits
  7. 20 Apr, 2021 5 commits
    • J
      Fix unittest no space issue (#8204) · a89740fb
      Jay Zhuang committed
      Summary:
      The unit test reports "no space" from time to time, which can be reproduced on a small-memory machine with SHM. It is caused by large WAL files generated during the test, which are preallocated but not truncated during close(). This change adds the missing APIs to set preallocation.
      It also adds the ARM test as a nightly build, since the test runs for more than 1 hour.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8204
      
      Test Plan: test on small memory arm machine
      
      Reviewed By: mrambacher
      
      Differential Revision: D27873145
      
      Pulled By: jay-zhuang
      
      fbshipit-source-id: f797c429d6bc13cbcc673bc03fcc72adda55f506
    • J
      Move arm build from travis to circleci (#8203) · a345b4d6
      Jay Zhuang committed
      Summary:
      Moving ARM build from travis to CircleCI.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8203
      
      Test Plan: CI
      
      Reviewed By: ajkr
      
      Differential Revision: D27861753
      
      Pulled By: jay-zhuang
      
      fbshipit-source-id: 5e36a67f6fbb921c2ed80b284ba2de485411937b
    • Y
      Handle rename() failure in non-local FS (#8192) · a376c220
      Yanqin Jin committed
      Summary:
      In a distributed environment, a file `rename()` operation can succeed on server (remote)
      side, but the client can somehow return non-ok status to RocksDB. Possible reasons include
      network partition, connection issue, etc. This happens in `rocksdb::SetCurrentFile()`, which
      can be called in `LogAndApply() -> ProcessManifestWrites()` if RocksDB tries to switch to a
      new MANIFEST. We currently always delete the new MANIFEST if an error occurs.
      
      This is problematic in distributed world. If the server-side successfully updates the CURRENT
      file via renaming, then a subsequent `DB::Open()` will try to look for the new MANIFEST and fail.
      
      As a fix, we can track the execution result of IO operations on the new MANIFEST.
      - If IO operations on the new MANIFEST fail, then we know the CURRENT must point to the original
        MANIFEST. Therefore, it is safe to remove the new MANIFEST.
      - If IO operations on the new MANIFEST all succeed, but somehow we end up in the clean up
        code block, then we do not know whether CURRENT points to the new or old MANIFEST. (For local
        POSIX-compliant FS, it should still point to old MANIFEST, but it does not matter if we keep the
        new MANIFEST.) Therefore, we keep the new MANIFEST.
          - Any future `LogAndApply()` will switch to a new MANIFEST and update CURRENT.
          - If process reopens the db immediately after the failure, then the CURRENT file can point
            to either the new MANIFEST or the old one, both of which exist. Therefore, recovery can
            succeed and ignore the other.
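
      The cleanup rule in the list above reduces to a single decision, sketched here. `NewManifestAction` and `DecideOnError` are hypothetical names for illustration; the real logic lives in the error path of `ProcessManifestWrites()` and tracks more state than a single flag.

      ```cpp
      // Hypothetical outcome of the cleanup step after a failed MANIFEST switch.
      enum class NewManifestAction { kDelete, kKeep };

      // `new_manifest_io_succeeded` is true only if every I/O operation on the
      // new MANIFEST returned OK before the error path was entered.
      NewManifestAction DecideOnError(bool new_manifest_io_succeeded) {
        if (!new_manifest_io_succeeded) {
          // CURRENT must still point to the original MANIFEST, so the new
          // one is safe to remove.
          return NewManifestAction::kDelete;
        }
        // The error came from a later step (for example the rename of CURRENT),
        // so CURRENT may point to either file: keep the new MANIFEST.
        return NewManifestAction::kKeep;
      }
      ```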
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8192
      
      Test Plan: make check
      
      Reviewed By: zhichao-cao
      
      Differential Revision: D27804648
      
      Pulled By: riversand963
      
      fbshipit-source-id: 9c16f2a5ce41bc6aadf085e48449b19ede8423e4
    • L
      Fix a data race related to DB properties (#8206) · 0c6e4674
      Levi Tamasi committed
      Summary:
      Historically, the DB properties `rocksdb.cur-size-active-mem-table`,
      `rocksdb.cur-size-all-mem-tables`, and `rocksdb.size-all-mem-tables` called
      the method `MemTable::ApproximateMemoryUsage` for mutable memtables,
      which is not safe without synchronization. This resulted in data races with
      memtable inserts. The patch changes the code handling these properties
      to use `MemTable::ApproximateMemoryUsageFast` instead, which returns a
      cached value backed by an atomic variable. Two test cases had to be updated
      for this change. `MemoryTest.MemTableAndTableReadersTotal` was fixed by
      increasing the value size used so each value ends up in its own memtable,
      which was the original intention (note: the test has been broken in the sense
      that the test code didn't consider that memtable sizes below 64 KB get
      increased to 64 KB by `SanitizeOptions`, and has been passing only by
      accident). `DBTest.MemoryUsageWithMaxWriteBufferSizeToMaintain` relies on
      completely up-to-date values and thus was changed to use `ApproximateMemoryUsage`
      directly instead of going through the DB properties. Note: this should be safe in this case
      since there's only a single thread involved.
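
      The difference between the two accessors can be sketched as follows. `MemTableSketch` is a hypothetical class that borrows only the two method names from the description. It assumes a single writer and, unlike the real memtable, refreshes the cached value on every insert, so here both accessors always agree.

      ```cpp
      #include <atomic>
      #include <cstddef>

      // Hypothetical model: one plain counter owned by the writer, plus an
      // atomic copy that other threads may read without synchronization.
      class MemTableSketch {
       public:
        // Called by the single writer thread.
        void Insert(std::size_t bytes) {
          exact_usage_ += bytes;
          cached_usage_.store(exact_usage_, std::memory_order_relaxed);
        }

        // Reads the plain counter: a data race if an insert runs concurrently.
        std::size_t ApproximateMemoryUsage() const { return exact_usage_; }

        // Reads the atomic copy: safe to call from any thread.
        std::size_t ApproximateMemoryUsageFast() const {
          return cached_usage_.load(std::memory_order_relaxed);
        }

       private:
        std::size_t exact_usage_ = 0;
        std::atomic<std::size_t> cached_usage_{0};
      };
      ```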
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8206
      
      Test Plan: `make check`
      
      Reviewed By: riversand963
      
      Differential Revision: D27866811
      
      Pulled By: ltamasi
      
      fbshipit-source-id: 7bd754d0565e0a65f1f7f0e78ffc093beef79394
    • Y
      Handle blob files when options.best_efforts_recovery is true (#8180) · b0e20194
      Yanqin Jin committed
      Summary:
      If `options.best_efforts_recovery == true`, RocksDB currently tolerates missing table files and recovers to the latest version without missing table files (not considering WAL). It is necessary to handle blob files as well to make the feature more complete.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8180
      
      Test Plan: make check
      
      Reviewed By: ltamasi
      
      Differential Revision: D27840556
      
      Pulled By: riversand963
      
      fbshipit-source-id: 041685d0dc2e7779ac4f0374c07a8a327704aa5e
  8. 19 Apr, 2021 1 commit
  9. 17 Apr, 2021 3 commits
    • A
      Update release version to 6.20 (#8199) · 531a5f88
      Akanksha Mahajan committed
      Summary:
      Update release version to 6.20
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8199
      
      Test Plan: No code change
      
      Reviewed By: ajkr
      
      Differential Revision: D27838750
      
      Pulled By: akankshamahajan15
      
      fbshipit-source-id: f02f722fc6bdd37d626d47a0e932bbecea3507a8
    • P
      Ribbon long-term support, starting level support (#8198) · 10196d7e
      Peter Dillinger committed
      Summary:
      Since the Ribbon filter schema seems good (compatible back to
      6.15.0), this change commits to long term support of the SST schema,
      even though we expect the API for enabling Ribbon to change (still
      called NewExperimentalRibbonFilterPolicy).
      
      This also adds support for "hybrid" configuration in which some levels
      use Bloom (higher levels, lower numbered) for speed and the rest use
      Ribbon (lower levels, higher numbered) for memory space efficiency.
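
      The per-level choice in a hybrid configuration can be sketched as a simple cutoff. `FilterKind`, `ChooseFilterForLevel`, and the `bloom_before_level` parameter are hypothetical names for illustration and are not presented as the signature of the RocksDB API.

      ```cpp
      // Hypothetical filter kinds for the sketch.
      enum class FilterKind { kBloom, kRibbon };

      // Levels numbered below the cutoff use Bloom (faster to build and query);
      // the remaining, higher-numbered levels use Ribbon (more space-efficient).
      FilterKind ChooseFilterForLevel(int level, int bloom_before_level) {
        return level < bloom_before_level ? FilterKind::kBloom
                                          : FilterKind::kRibbon;
      }
      ```

      With a cutoff of 2, levels 0 and 1 get Bloom filters and level 2 onward gets Ribbon filters.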
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8198
      
      Test Plan: unit test added, crash test support
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D27831232
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 90e528677689474d293ed6710b42ba89fbd5b5ab
    • A
      Fix Windows strcmp for Unicode (#8190) · 90e24569
      Adam Retter committed
      Summary:
      The existing strcmp code does not work when compiled for Windows Unicode file paths.
      
      Needs backporting to:
      * 6.17.fb
      * 6.18.fb
      * 6.19.fb
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8190
      
      Reviewed By: akankshamahajan15
      
      Differential Revision: D27765588
      
      Pulled By: jay-zhuang
      
      fbshipit-source-id: 89f8a5ac61fd7edc758340dfd335b0a5f96dae6e
  10. 16 Apr, 2021 5 commits
  11. 15 Apr, 2021 2 commits
  12. 14 Apr, 2021 1 commit
  13. 13 Apr, 2021 3 commits
    • Y
      Disable IOStatsContext/PerfContext if no thread local (#8117) · fd00f39f
      Yanqin Jin committed
      Summary:
      Before this PR, `get_iostats_context()` silently returned a nullptr if no thread_local support was detected.
      This can be the result of build_detect_platform's failure to compile the simple code snippet on certain platforms, as
      reported in https://github.com/facebook/mysql-5.6/issues/904.
      To be safe, we should fail the compilation if the user does not opt out of IOStatsContext and
      ROCKSDB_SUPPORT_THREAD_LOCAL is not defined.
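
      The fail-fast idea can be sketched with a preprocessor guard. All `SKETCH_*` macros and the `IOStatsContextSketch` names below are hypothetical stand-ins, not RocksDB's own macros or types; the first `#define` simulates what platform detection would normally provide.

      ```cpp
      // Normally set by the build system when thread_local is usable;
      // defined here only so that the sketch compiles on its own.
      #define SKETCH_SUPPORT_THREAD_LOCAL 1

      // Refuse to build instead of silently returning nullptr at runtime.
      #if !defined(SKETCH_NO_IOSTATS_CONTEXT) && !defined(SKETCH_SUPPORT_THREAD_LOCAL)
      #error "thread_local is unavailable: define SKETCH_NO_IOSTATS_CONTEXT to opt out"
      #endif

      // Hypothetical per-thread statistics record.
      struct IOStatsContextSketch {
        unsigned long long bytes_read = 0;
      };

      IOStatsContextSketch* GetIOStatsContextSketch() {
      #if defined(SKETCH_NO_IOSTATS_CONTEXT)
        return nullptr;  // the user explicitly opted out
      #else
        static thread_local IOStatsContextSketch context;
        return &context;
      #endif
      }
      ```

      The only way to get a null context in this sketch is an explicit opt-out; a failed platform probe stops the build rather than changing runtime behavior.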
      
      If RocksDB relies on C++11, can we just always use thread_local? It turns out there might be
      performance concerns (https://github.com/facebook/rocksdb/issues/5774),
      which is beyond the scope of this PR. We can revisit this later. Here, we stick to the original impl.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8117
      
      Reviewed By: ajkr
      
      Differential Revision: D27356847
      
      Pulled By: riversand963
      
      fbshipit-source-id: f7d5776842277598d8341b955febb601946801ae
    • P
      Misc Backup API enhancements (#8170) · bb750925
      Peter Dillinger committed
      Summary:
      * CreateNewBackup(WithMetadata) now returns the BackupID of the new backup
      through an optional new output param. This is especially useful with the
      new multithreading support, so that you can transactionally determine the
      ID of a backup you create.
      * GetBackupInfo / GetLatestBackupInfo for individual backups, so that
      you don't have to comb through a vector of backups if you don't want to.
      
      Updated HISTORY.md (including re: BlobDB support as new feature)
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8170
      
      Test Plan:
      Added test logic to existing tests, to minimize increase in
      cost of running tests
      
      Reviewed By: zhichao-cao
      
      Differential Revision: D27680410
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 1fc45b73d81aae293ccd4a43d9583d7fd915d3eb
    • X
      Add util/crc32c_arm64.cc to TARGETS (#8168) · 8972dd1f
      Xavier Deguillard committed
      Summary:
      When compiling RocksDB with Buck for ARM64, the linker complains about missing crc32 symbols that are defined in the crc32c_arm64.cc file. Since this file wasn't included in the build, this is expected.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8168
      
      Test Plan:
      The following no longer fails to link rocksdb:
        buck build mode/mac-xcode //eden/fs/service:edenfs#macosx-arm64
      
      Reviewed By: zhichao-cao
      
      Differential Revision: D27664627
      
      Pulled By: xavierd
      
      fbshipit-source-id: fb9d7a538599ee7a08882f87628731de6e641f8d
  14. 10 Apr, 2021 1 commit