1. 26 Jan, 2021 3 commits
    • M
      Add a SystemClock class to capture the time functions of an Env (#7858) · 12f11373
      mrambacher committed
      Summary:
      Introduces a SystemClock class to RocksDB and starts using it.  This class contains the time-related functions of an Env, and these functions can be redirected from the Env to the SystemClock.
      
      Many of the places that used an Env (Timer, PerfStepTimer, RepeatableThread, RateLimiter, WriteController) for time-related functions have been changed to use SystemClock instead.  There are likely more places that can be changed, but this is a start to show what can/should be done.  Over time it would be nice to migrate most (if not all) of the uses of the time functions from the Env to the SystemClock.
      
      There are several Env classes that implement these functions.  Most of these have not been converted yet to SystemClock implementations; that will come in a subsequent PR.  It would be good to unify many of the Mock Timer implementations, so that they behave similarly and can be tested similarly (some override Sleep, some use a MockSleep, etc.).
      
      Additionally, this change will allow new methods to be introduced to the SystemClock (like https://github.com/facebook/rocksdb/issues/7101 WaitFor) in a consistent manner across a smaller number of classes.
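
The idea of separating the time functions from the rest of the environment can be sketched as follows. This is a minimal illustration, not the RocksDB API: the names `Clock`, `RealClock`, `MockClock`, `Advance`, and `StepTimer` are hypothetical and chosen only for this example.

```cpp
#include <chrono>
#include <cstdint>
#include <memory>
#include <utility>

// Hypothetical interface: only the time-related call, split out of a
// larger environment object.
class Clock {
 public:
  virtual ~Clock() = default;
  virtual uint64_t NowMicros() = 0;
};

// Reads a monotonic clock from the standard library.
class RealClock : public Clock {
 public:
  uint64_t NowMicros() override {
    const auto since_epoch = std::chrono::steady_clock::now().time_since_epoch();
    return static_cast<uint64_t>(
        std::chrono::duration_cast<std::chrono::microseconds>(since_epoch).count());
  }
};

// Test double: time only moves when the test says so.
class MockClock : public Clock {
 public:
  uint64_t NowMicros() override { return now_micros_; }
  void Advance(uint64_t micros) { now_micros_ += micros; }

 private:
  uint64_t now_micros_ = 0;
};

// A component that needs time depends on the clock only, not on a full
// environment object.
class StepTimer {
 public:
  explicit StepTimer(std::shared_ptr<Clock> clock)
      : clock_(std::move(clock)), start_(clock_->NowMicros()) {}

  uint64_t ElapsedMicros() const { return clock_->NowMicros() - start_; }

 private:
  std::shared_ptr<Clock> clock_;  // declared first, so initialized first
  uint64_t start_;
};
```

With a single mock clock type, a test can construct a `StepTimer` around a `MockClock`, call `Advance(250)`, and expect `ElapsedMicros()` to report 250 without any real sleeping.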
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7858
      
      Reviewed By: pdillinger
      
      Differential Revision: D26006406
      
      Pulled By: mrambacher
      
      fbshipit-source-id: ed10a8abbdab7ff2e23d69d85bd25b3e7e899e90
    • A
      In IOTracing, add filename with each operation in trace file. (#7885) · 1d226018
      Akanksha Mahajan committed
      Summary:
      1. In IOTracing, add the filename to each IOTrace record. The filename is stored in the file object (Tracing Wrappers).
      2. Change the logic that determines which additional information (file_size, length, offset, etc.) needs to be stored with each operation, since this differs between operations. When new operation-dependent information is added in the future, this change should make such additions simple.
      
      Logic: io_op_data is added to IOTraceRecord. Its bit positions indicate which additional fields from enum IOTraceOp need to be added to the record; the values in IOTraceOp are bit positions. So if length and offset need to be stored (IOTraceOp::kIOLen is 1 and IOTraceOp::kIOOffset is 2), bits 1 and 2 (counting from the rightmost bit, which is bit 0) will be set and io_op_data will contain binary 110.
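
The bit-position scheme described above can be sketched like this. The enum below is a hypothetical stand-in for `IOTraceOp`: the values 1 and 2 for length and offset come from the commit message, while the value 0 for file size and the helper names are assumptions made for the example.

```cpp
#include <cstdint>

// Hypothetical stand-in for IOTraceOp: each value is a bit position.
enum class IOField : uint32_t {
  kFileSize = 0,  // assumed position, not stated in the commit message
  kLen = 1,       // from the commit message
  kOffset = 2,    // from the commit message
};

// Bit that represents one field inside the mask.
constexpr uint64_t BitFor(IOField field) {
  return uint64_t{1} << static_cast<uint32_t>(field);
}

// Returns the mask with the given field marked as present.
constexpr uint64_t WithField(uint64_t mask, IOField field) {
  return mask | BitFor(field);
}

// Tells a reader of the record whether the field was stored.
constexpr bool HasField(uint64_t mask, IOField field) {
  return (mask & BitFor(field)) != 0;
}
```

Setting `kLen` and `kOffset` on an empty mask gives binary 110 (decimal 6), matching the example in the commit message. A new field only needs a new bit position; existing readers can keep testing the bits they know.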
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7885
      
      Test Plan: Updated io_tracer_test and verified the trace file manually.
      
      Reviewed By: anand1976
      
      Differential Revision: D25982353
      
      Pulled By: akankshamahajan15
      
      fbshipit-source-id: ebfc5539cc0e231d7794a6b42b73f5403e360b22
    • L
      Do not explicitly flush blob files when using the integrated BlobDB (#7892) · 431e8afb
      Levi Tamasi committed
      Summary:
      In the original stacked BlobDB implementation, which writes blobs to blob files
      immediately and treats blob files as logs, it makes sense to flush the file after
      writing each blob to protect against process crashes; however, in the integrated
      implementation, which builds blob files in the background jobs, this unnecessarily
      reduces performance. This patch fixes this by simply adding a `do_flush` flag to
      `BlobLogWriter`, which is set to `true` by the stacked implementation and to `false`
      by the new code. Note: the change itself is trivial but the tests needed some work;
      since in the new implementation, blobs are now buffered, adding a blob to
      `BlobFileBuilder` is no longer guaranteed to result in an actual I/O. Therefore, we can
      no longer rely on `FaultInjectionTestEnv` when testing failure cases; instead, we
      manipulate the return values of I/O methods directly using `SyncPoint`s.
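
A minimal sketch of the flag-controlled flush, under stated assumptions: `BufferedFile` and `LogWriter` are hypothetical classes written for this example and are not the actual `BlobLogWriter` code; only the `do_flush` flag name comes from the commit message.

```cpp
#include <cstddef>
#include <string>

// Hypothetical in-memory stand-in for a buffered file.
class BufferedFile {
 public:
  void Append(const std::string& data) { buffer_ += data; }

  // Moves buffered bytes to the "flushed" area and counts the call.
  void Flush() {
    flushed_ += buffer_;
    buffer_.clear();
    ++flush_calls_;
  }

  std::size_t buffered_bytes() const { return buffer_.size(); }
  std::size_t flushed_bytes() const { return flushed_.size(); }
  int flush_calls() const { return flush_calls_; }

 private:
  std::string buffer_;
  std::string flushed_;
  int flush_calls_ = 0;
};

// Hypothetical writer: the caller decides at construction time whether
// every record is flushed immediately.
class LogWriter {
 public:
  LogWriter(BufferedFile* file, bool do_flush)
      : file_(file), do_flush_(do_flush) {}

  void AddRecord(const std::string& record) {
    file_->Append(record);
    if (do_flush_) {
      file_->Flush();  // log-style usage: protect each record right away
    }
  }

 private:
  BufferedFile* file_;
  bool do_flush_;
};
```

With `do_flush` set to `false`, adding a record only grows the buffer, which is why a test can no longer assume that adding a blob results in an I/O call.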
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7892
      
      Test Plan: `make check`
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D26022814
      
      Pulled By: ltamasi
      
      fbshipit-source-id: b3dce419f312137fa70d84cdd9b908fd5d60d8cd
  2. 22 Jan, 2021 5 commits
  3. 21 Jan, 2021 3 commits
  4. 20 Jan, 2021 4 commits
    • C
      Make it able to ignore WAL related VersionEdits in older versions (#7873) · e4494829
      Cheng Chang committed
      Summary:
      Although the tags for `WalAddition`, `WalDeletion` are after `kTagSafeIgnoreMask`, to actually be able to skip these entries in older versions of RocksDB, we require that they are encoded with their encoded size as the prefix. This requirement is not met in the current codebase, so a downgraded DB may fail to open if these entries exist in the MANIFEST.
      
      If a DB wants to downgrade, and its MANIFEST contains `WalAddition` or `WalDeletion`, it can set `track_and_verify_wals_in_manifest` to `false`, then restart twice, then downgrade. On the first restart, a new MANIFEST will be created with a `WalDeletion` indicating that all previously tracked WALs are removed from MANIFEST. On the second restart, since there are no tracked WALs in MANIFEST now, a new MANIFEST will be created with neither `WalAddition` nor `WalDeletion`. Then the DB can downgrade.
      
      Tags for `BlobFileAddition`, `BlobFileGarbage` also have the same problem, but this PR focuses on solving the problem for WAL edits.
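
Why a size prefix is needed to skip unknown entries can be shown with a simplified decoder. The byte layout here is hypothetical (one-byte tags, one-byte length, a mask of 0x80) and much simpler than the real MANIFEST encoding; it only illustrates the principle that a reader can step over an entry it does not understand when the entry states its own size.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical layout for this sketch:
//   known tag 0x01        -> followed by exactly one value byte
//   tag with bit 0x80 set -> "safe to ignore", followed by a one-byte
//                            length and that many payload bytes
constexpr uint8_t kKnownValueTag = 0x01;
constexpr uint8_t kSafeIgnoreMask = 0x80;

// Collects the values of known entries. Returns false on truncated input
// or on an unknown tag that is not marked as safe to ignore.
bool DecodeKnownValues(const std::vector<uint8_t>& in,
                       std::vector<uint8_t>* values) {
  values->clear();
  std::size_t pos = 0;
  while (pos < in.size()) {
    const uint8_t tag = in[pos++];
    if (tag == kKnownValueTag) {
      if (pos >= in.size()) return false;
      values->push_back(in[pos++]);
    } else if ((tag & kSafeIgnoreMask) != 0) {
      if (pos >= in.size()) return false;
      const std::size_t len = in[pos++];
      if (in.size() - pos < len) return false;
      pos += len;  // skip the payload without interpreting it
    } else {
      return false;  // unknown and not ignorable
    }
  }
  return true;
}
```

Without the length byte, the decoder would have no way to find the start of the next entry after an unknown tag, which is the failure mode the commit message describes for downgraded versions.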
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7873
      
      Test Plan: Added a `VersionEditTest::IgnorableTags` unit test to verify all entries with tags larger than `kTagSafeIgnoreMask` can actually be skipped and won't affect parsing of other entries.
      
      Reviewed By: ajkr
      
      Differential Revision: D25935930
      
      Pulled By: cheng-chang
      
      fbshipit-source-id: 7a02fdba4311d6084328c14aed110a26d08c3efb
    • C
      Update HISTORY.md (#7874) · 928dea0e
      Cheng Chang committed
      Summary:
      I find that the `track_and_verify_wals_in_manifest` option was only removed from 6.15 branch's HISTORY, but still appears under 6.15 in master branch's HISTORY. It should be moved to 6.16 since that's when the feature should be available.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7874
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D25935971
      
      Pulled By: cheng-chang
      
      fbshipit-source-id: fe8bf1ec111597f9207e109aa3be65f8f919f1fd
    • C
      Add Apache Doris to USERS (#7865) · 4aa1a19d
      Cheng Chang committed
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/7865
      
      Reviewed By: ajkr
      
      Differential Revision: D25916166
      
      Pulled By: cheng-chang
      
      fbshipit-source-id: 24776b0203b21a733b5358dfa5dd66f639106dad
    • V
      Fix write-ahead log file size overflow (#7870) · 4db58bcf
      Vladimir Maksimovski committed
      Summary:
      The WAL's file size is stored as an unsigned 64 bit integer.
      
      In db_info_dumper.cc, this integer gets converted to a string. Since the largest 64-bit value, 2^64 - 1, is approximately 1.8 × 10^19, we need up to 20 digits to represent the integer correctly. To store the decimal representation, we need 21 bytes (+1 due to the '\0' terminator at the end). The code previously used 16 bytes, which leaves room for only 15 digits and would overflow if the log is really big (>1 petabyte).
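
The buffer arithmetic above can be checked at compile time. This is a sketch under stated assumptions, not the code from db_info_dumper.cc: `kUint64DecimalBufSize` and `FormatFileSize` are hypothetical names used only here.

```cpp
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <limits>
#include <string>

// digits10 is 19 for uint64_t: every 19-digit number fits, but the
// largest value needs a 20th digit. One more byte holds the '\0'.
constexpr std::size_t kUint64DecimalBufSize =
    std::numeric_limits<uint64_t>::digits10 + 1 /* 20th digit */ +
    1 /* terminator */;
static_assert(kUint64DecimalBufSize == 21, "20 digits plus terminator");

// Formats a file size into a buffer that is large enough for any value.
std::string FormatFileSize(uint64_t size) {
  char buf[kUint64DecimalBufSize];
  const int written = std::snprintf(buf, sizeof(buf), "%" PRIu64, size);
  return std::string(buf, static_cast<std::size_t>(written));
}
```

Using `snprintf` with `sizeof(buf)` also bounds the write even if the size constant were wrong, so a too-small buffer would truncate rather than overflow.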
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7870
      
      Reviewed By: ajkr
      
      Differential Revision: D25938776
      
      Pulled By: jay-zhuang
      
      fbshipit-source-id: 6ee9e21ebd65d297ea90fa1e7e74f3e1c533299d
  5. 16 Jan, 2021 8 commits
    • A
      Cover all status codes in `Status::ToString()` (#7872) · 5b748b9e
      Andrew Kryczka committed
      Summary:
      - Completed the switch statement for all possible `Code` values (the only one missing was `kCompactionTooLarge`).
      - Removed the default case so the compiler can alert us if a new value is added to `Code` without handling it in `Status::ToString()`.
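
The effect of dropping the default case can be sketched with a small enum. The enum below is a hypothetical three-value subset, and `CodeToString` is an illustrative helper, not the real `Status::ToString()`.

```cpp
#include <string>

// Hypothetical subset of status codes for this sketch.
enum class Code { kOk, kNotFound, kCompactionTooLarge };

std::string CodeToString(Code code) {
  // No default case: with warnings such as -Wswitch enabled (part of
  // -Wall in GCC and Clang), adding a new enumerator without a matching
  // case produces a compiler warning here.
  switch (code) {
    case Code::kOk:
      return "OK";
    case Code::kNotFound:
      return "NotFound: ";
    case Code::kCompactionTooLarge:
      return "Compaction too large: ";
  }
  // Reached only if `code` holds a value outside the enum, for example
  // after a cast from an arbitrary integer.
  return "Unknown code";
}
```

A `default:` label would silence that warning, because the compiler would then consider every value handled.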
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7872
      
      Test Plan:
      verified the log message for this scenario looks right
      
      ```
      2021/01/15-17:26:34.564450 7fa6845fe700 [ERROR] [/db_impl/db_impl_compaction_flush.cc:2621] Waiting after background compaction error: Compaction too large: , Accumulated background error counts: 1
      ```
      
      Reviewed By: ramvadiv
      
      Differential Revision: D25934539
      
      Pulled By: ajkr
      
      fbshipit-source-id: 2e0b3c0d993e356a4987276d6f8a163f0ee8be7a
    • O
      Fix various spelling errors still found in code (#7785) · acc9679c
      Otto Kekäläinen committed
      Summary:
      dont -> don't
      refered -> referred
      
      Merging this would reduce the size of the downstream patch at https://salsa.debian.org/mariadb-team/mariadb-10.5/-/blob/master/debian/patches/fix-spelling.patch
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7785
      
      Reviewed By: akankshamahajan15
      
      Differential Revision: D25761408
      
      Pulled By: jay-zhuang
      
      fbshipit-source-id: 290406ef2a3b05a3daeedbe3b20a00798ef581e7
    • L
      Update version to 6.17 (#7871) · ffe49061
      Levi Tamasi committed
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/7871
      
      Test Plan: `make check`
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D25932233
      
      Pulled By: ltamasi
      
      fbshipit-source-id: 8b80b0638a4f34f21a27ba80b3eda7d75410b2e8
    • T
      Fixing Windows build using CMake (#7854) · d76a8eee
      Tomas Kolda committed
      Summary:
      Builds were not producing Windows binaries properly in the 6.15 branch:
      
      ```
      00:00:46.413 Tests run: 11, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 0.183 sec <<< FAILURE! - in org.rocksdb.EventListenerTest
      00:00:46.414 testAllCallbacksInvocation(org.rocksdb.EventListenerTest)  Time elapsed: 0.012 sec  <<< ERROR!
      00:00:46.414 java.lang.UnsatisfiedLinkError: org.rocksdb.test.TestableEventListener.invokeAllCallbacks(J)V
      00:00:46.414 	at org.rocksdb.test.TestableEventListener.invokeAllCallbacks(Native Method)
      00:00:46.414 	at org.rocksdb.test.TestableEventListener.invokeAllCallbacks(TestableEventListener.java:19)
      00:00:46.414 	at org.rocksdb.EventListenerTest.testAllCallbacksInvocation(EventListenerTest.java:436)
      ```
      
      ```
      00:00:41.497        "D:\j\workspace\RocksDB_Build_Windows\build\java\rocksdbjni_headers.vcxproj" (default target) (3) ->
      00:00:41.497        (CustomBuild target) ->
      00:00:41.497          CUSTOMBUILD : error : Could not find class file for 'org.rocksdb.TestableEventListener'. [D:\j\workspace\RocksDB_Build_Windows\build\java\rocksdbjni_headers.vcxproj]
      ```
      
      The build also failed on Linux because the library was not initialized yet:
      
      ```
      00:01:25.103 Running org.rocksdb.NativeComparatorWrapperTest
      00:01:25.133 Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.006 sec <<< FAILURE! - in org.rocksdb.NativeComparatorWrapperTest
      00:01:25.133 rountrip(org.rocksdb.NativeComparatorWrapperTest)  Time elapsed: 0.002 sec  <<< ERROR!
      00:01:25.133 java.lang.UnsatisfiedLinkError: org.rocksdb.NativeComparatorWrapperTest$NativeStringComparatorWrapper.newStringComparator()J
      00:01:25.133 	at org.rocksdb.NativeComparatorWrapperTest$NativeStringComparatorWrapper.newStringComparator(Native Method)
      00:01:25.133 	at org.rocksdb.NativeComparatorWrapperTest$NativeStringComparatorWrapper.initializeNative(NativeComparatorWrapperTest.java:87)
      00:01:25.133 	at org.rocksdb.RocksCallbackObject.<init>(RocksCallbackObject.java:28)
      00:01:25.133 	at org.rocksdb.AbstractComparator.<init>(AbstractComparator.java:20)
      00:01:25.133 	at org.rocksdb.NativeComparatorWrapper.<init>(NativeComparatorWrapper.java:16)
      00:01:25.133 	at org.rocksdb.NativeComparatorWrapperTest$NativeStringComparatorWrapper.<init>(NativeComparatorWrapperTest.java:82)
      00:01:25.133 	at org.rocksdb.NativeComparatorWrapperTest.rountrip(NativeComparatorWrapperTest.java:30)
      ```
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7854
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D25873378
      
      Pulled By: ajkr
      
      fbshipit-source-id: 88afb08bfd30edff31f17da063e636df0769cbfe
    • T
      Read Options to support direct slice (#7132) · 1001bc01
      Tomas Kolda committed
      Summary:
      This request adds support for using DirectSlice in the ReadOptions lower/upper bounds.
      
      To be more efficient, I have added setLength to DirectSlice so that I can just update the length the slice uses from the direct buffer. It is also needed because an iterator, once created, keeps a pointer to the original slice, so setting a new slice in the options does not help (the existing one must be reused). With this approach, one can modify the slice at any time while working with the iterator.
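
The reason a mutable length helps can be sketched in C++ (the change itself is in the Java API). Everything here is hypothetical: `DirectBound`, `SetLength`, and `BoundedScanner` only model a consumer that keeps a pointer to a bound object whose backing buffer is rewritten in place.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical view over a caller-owned buffer.
struct DirectBound {
  const char* data;
  std::size_t length;
  void SetLength(std::size_t n) { length = n; }
};

// Hypothetical consumer that keeps a pointer to the bound, the way an
// iterator keeps a pointer to the slice it was created with.
class BoundedScanner {
 public:
  explicit BoundedScanner(const DirectBound* upper) : upper_(upper) {}

  // Reads the bound at call time, so in-place updates are visible.
  bool IsBelowUpperBound(const std::string& key) const {
    return key < std::string(upper_->data, upper_->length);
  }

 private:
  const DirectBound* upper_;
};
```

If the caller overwrites the buffer with a longer key and calls `SetLength`, the scanner sees the new bound without being recreated. Replacing the bound with a brand-new object would not have that effect, because the scanner still points at the old one.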
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7132
      
      Reviewed By: zhichao-cao
      
      Differential Revision: D25840092
      
      Pulled By: jay-zhuang
      
      fbshipit-source-id: 760167baf61568c9a35138145c4bf9b06824cb71
    • S
      Using emplace_back replace push_back (#7568) · 2fb6d933
      shadowlux committed
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/7568
      
      Reviewed By: akankshamahajan15
      
      Differential Revision: D24437383
      
      Pulled By: jay-zhuang
      
      fbshipit-source-id: 7c9b3c4944b959aa7796c53b410c2b1055dc5641
    • T
      S390 Linux is failing tests ColumnFamilyOptionsTest.cfPaths (#7853) · ac956f2b
      Tomas Kolda committed
      Summary:
      Fix ColumnFamilyOptionsTest.cfPaths and OptionsTest.cfPaths in 6.15 branch (and probably other branches including master)
      
      The has_exception variable was not initialized, which caused test failures and incorrect behavior on the s390 platform (and maybe others, as the variable's content is undefined).
      
      adamretter please take a look.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7853
      
      Reviewed By: akankshamahajan15
      
      Differential Revision: D25901639
      
      Pulled By: jay-zhuang
      
      fbshipit-source-id: 151b5db27b495fc6d8ed54c0eccbde2508215ac5
    • A
      Make regression test load options from file for checkpoint (#7864) · 7189ea8f
      anand76 committed
      Summary:
      The regression_test.sh script checkpoints the DB directory before running db_bench on it. Specify --try_load_options when creating the checkpoint in order to load options from the OPTIONS file.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7864
      
      Test Plan: manually run db_bench on the checkpoint dir
      
      Reviewed By: akankshamahajan15
      
      Differential Revision: D25926960
      
      Pulled By: anand1976
      
      fbshipit-source-id: d3442ae24a7044b474dc80efc9c06bdc6ebe0388
  6. 14 Jan, 2021 2 commits
  7. 12 Jan, 2021 7 commits
  8. 10 Jan, 2021 2 commits
    • J
      Fix checkpoint_test hang (#7849) · a3066ee7
      Jay Zhuang committed
      Summary:
      `CheckpointTest.CurrentFileModifiedWhileCheckpointing` could hang
      because creating a checkpoint now triggers a flush twice. The test should
      wait for both flushes to finish.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7849
      
      Test Plan: `gtest-parallel ./checkpoint_test --gtest_filter=CheckpointTest.CurrentFileModifiedWhileCheckpointing -r 100`
      
      Reviewed By: ajkr
      
      Differential Revision: D25860713
      
      Pulled By: jay-zhuang
      
      fbshipit-source-id: e1c2f23037dedc33e205519f4289a25e77816b41
    • A
      Improvements to Env::GetChildren (#7819) · 4926b337
      Adam Retter committed
      Summary:
      The main improvement here is to not include `.` or `..` in the results of `Env::GetChildren`. Whether `.` or `..` appears is non-portable: it depends on the operating system and the file system. See: https://www.gnu.org/software/libc/manual/html_node/Reading_002fClosing-Directory.html
      
      Previously, there were lots of duplicate checks spread through the RocksDB codebase to skip `.` and `..`. This change removes the need for those by filtering at the source.
      
      Also some minor fixes to `Env::GetChildren`:
      * Improve error handling in POSIX implementation
      * Remove unnecessary array allocation on Windows
      * Fix struct name for Windows Non-UTF-8 API
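
Filtering at the source can be sketched as a single helper applied to the raw directory listing. `DropDotEntries` is a hypothetical name for this example and is not part of `Env`; the raw listing is passed in as a vector so the sketch does not depend on any platform API.

```cpp
#include <string>
#include <vector>

// Removes the self and parent entries from a raw directory listing.
// Other names that merely start with a dot are kept.
std::vector<std::string> DropDotEntries(const std::vector<std::string>& raw) {
  std::vector<std::string> children;
  children.reserve(raw.size());
  for (const std::string& name : raw) {
    if (name == "." || name == "..") {
      continue;
    }
    children.push_back(name);
  }
  return children;
}
```

Callers that iterate over the result no longer need their own checks for `.` and `..`, and they behave the same whether or not the underlying platform reports those entries.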
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7819
      
      Reviewed By: ajkr
      
      Differential Revision: D25837394
      
      Pulled By: jay-zhuang
      
      fbshipit-source-id: 1e137e7218d38b450af9c083f73d5357abcbba2e
  9. 09 Jan, 2021 1 commit
  10. 08 Jan, 2021 4 commits
    • C
      Get manifest size again after getting min_log_num during checkpoint (#7836) · b2e30bdb
      Cheng Chang committed
      Summary:
      Currently, the manifest size is determined before getting min_log_num.
      
      Between getting the manifest size and getting min_log_num, a concurrent flush might succeed. The flush writes new records to the manifest that make some WALs outdated, and min_log_num is increased correspondingly. Because the manifest's size was determined before these records were written, they are not copied into the checkpoint, so the newly outdated WALs still exist in the checkpoint's manifest. However, those WALs are not linked/copied to the checkpoint because their log number is < min_log_num, so a missing-WAL corruption is reported when restoring from the checkpoint.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7836
      
      Test Plan: make crash_test
      
      Reviewed By: ajkr
      
      Differential Revision: D25788204
      
      Pulled By: cheng-chang
      
      fbshipit-source-id: a4e5acf30f08270b3c0a95304ff559a9e655252f
    • A
      Store test logs as artifacts if the build fails in CircleCI (#7812) · c22e619f
      Adam Retter committed
      Summary:
      If a workflow fails in CircleCI this will ensure that the `t/` directory is tar'd up and added to the workflow as an artifact. This allows us to download the detailed logs and see what went wrong.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7812
      
      Reviewed By: zhichao-cao
      
      Differential Revision: D25761003
      
      Pulled By: jay-zhuang
      
      fbshipit-source-id: 41cfd16c6385bfcc9fb35fb63df84f97d4b8b80b
    • Z
      Treat File Scope Write IO Error the same as Retryable IO Error (#7840) · 48c0843e
      Zhichao Cao committed
      Summary:
      In RocksDB, when an IO error happens, the flags of the IOStatus can be set. If the IOStatus is marked as a "File Scope IO Error", it indicates that the error is confined to the file level. Since RocksDB does not continue writing data to a file once any IO error has happened, a File Scope IO Error can be treated the same as a Retryable IO Error. This change adds logic to ErrorHandler::SetBGError so that file scope IO errors go through the same error handling as retryable IO errors.
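
The classification rule described above can be sketched as a small predicate. The types below are hypothetical and are not the real `IOStatus` or `ErrorHandler` interfaces; they only model an error that carries a retryable flag and a scope.

```cpp
// Hypothetical scopes an IO error can be limited to.
enum class ErrorScope { kFileSystem, kFile };

// Hypothetical error descriptor for this sketch.
struct IOError {
  bool retryable;
  ErrorScope scope;
};

// Returns true when the error should go through the handling path used
// for retryable IO errors. A file scope error qualifies even if its
// retryable flag is not set, because the failed file is not written to
// again.
bool HandleAsRetryable(const IOError& error) {
  return error.retryable || error.scope == ErrorScope::kFile;
}
```

Keeping the rule in one predicate means the handling code has a single branch for both kinds of error instead of two copies of the same logic.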
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7840
      
      Test Plan: added new unit tests in error_handler_fs_test. make check
      
      Reviewed By: anand1976
      
      Differential Revision: D25820481
      
      Pulled By: zhichao-cao
      
      fbshipit-source-id: 69cabd3d010073e064d6142ce1cabf341b8a6806
    • M
      Add more tests to the ASC pass list (#7834) · cc2a180d
      mrambacher committed
      Summary:
      Fixed the following tests so that they now pass ASC checks:
      * `ttl_test`
      * `blob_db_test`
      * `backupable_db_test`
      * `delete_scheduler_test`
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/7834
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D25795398
      
      Pulled By: ajkr
      
      fbshipit-source-id: a10037817deda4fc7cbb353a2e00b62ed89b6476
  11. 07 Jan, 2021 1 commit