1. 10 11月, 2021 5 次提交
    • A
      Remove invalid RocksJava native entry (#9147) · 1a8eec46
      Adam Retter 提交于
      Summary:
      It seems that an incorrect native source file entry was introduced in https://github.com/facebook/rocksdb/pull/8999. For some reason it appears that CI was not run against that PR, and so the problem was not detected.
      
      This PR fixes the problem by removing the invalid entry, allowing RocksJava to build correctly again.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9147
      
      Reviewed By: pdillinger
      
      Differential Revision: D32300976
      
      fbshipit-source-id: dbd763b806bacf0fc08f4deaf07c63d0a266c4cf
      1a8eec46
    • A
      Fix an error on GCC 4.8.5 where -Werror=unused-parameter fails (#9144) · fc93553e
      Adam Retter 提交于
      Summary:
      Before this fix compilation with GCC 4.8.5 20150623 (Red Hat 4.8.5-36) would fail with the following error:
      ```
      CC       jls/db/db_impl/db_impl.o
      In file included from ./env/file_system_tracer.h:8:0,
                       from ./file/random_access_file_reader.h:15,
                       from ./file/file_prefetch_buffer.h:15,
                       from ./table/format.h:13,
                       from ./table/internal_iterator.h:14,
                       from ./db/pinned_iterators_manager.h:12,
                       from ./db/range_tombstone_fragmenter.h:15,
                       from ./db/memtable.h:22,
                       from ./db/memtable_list.h:16,
                       from ./db/column_family.h:17,
                       from ./db/db_impl/db_impl.h:22,
                       from db/db_impl/db_impl.cc:9:
      ./include/rocksdb/file_system.h:108:8: error: unused parameter 'opts'
      [-Werror=unused-parameter]
       struct FileOptions : EnvOptions {
              ^
      db/db_impl/db_impl.cc: In member function 'virtual rocksdb::Status
      rocksdb::DBImpl::SetDBOptions(const
      std::unordered_map<std::basic_string<char>, std::basic_string<char>
      >&)':
      db/db_impl/db_impl.cc:1230:36: note: synthesized method
      'rocksdb::FileOptions& rocksdb::FileOptions::operator=(const
      rocksdb::FileOptions&)' first required here
             file_options_for_compaction_ = FileOptions(new_db_options);
                                          ^
        CC       jls/db/db_impl/db_impl_compaction_flush.o
      cc1plus: all warnings being treated as errors
      make[1]: *** [jls/db/db_impl/db_impl.o] Error 1
      make[1]: *** Waiting for unfinished jobs....
      make[1]: Leaving directory `/rocksdb-local-build'
      make: *** [rocksdbjavastatic] Error 2
      Makefile:2222: recipe for target 'rocksdbjavastaticdockerarm64v8' failed
      make: *** [rocksdbjavastaticdockerarm64v8] Error 2
      ```
      
      This was detected on both ppc64le and arm64v8, however it does not seem to appear in the same GCC 4.8 version we use for x64 in CircleCI - https://app.circleci.com/pipelines/github/facebook/rocksdb/9691/workflows/c2a94367-14f3-4039-be95-325c34643d41/jobs/227906
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9144
      
      Reviewed By: riversand963
      
      Differential Revision: D32290770
      
      fbshipit-source-id: c90a54ba2a618e1ff3660fff3f3368ab36c3c527
      fc93553e
    • Y
      Fix a bug in timestamp-related GC (#9116) · a113cecf
      Yanqin Jin 提交于
      Summary:
      For multiple versions (ts + seq) of the same user key, if they cross the boundary of `full_history_ts_low_`,
      we should retain the version that is visible to the `full_history_ts_low_`. Namely, we keep the internal key
      with the largest timestamp smaller than `full_history_ts_low`.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9116
      
      Test Plan: make check
      
      Reviewed By: ltamasi
      
      Differential Revision: D32261514
      
      Pulled By: riversand963
      
      fbshipit-source-id: e10f47c254c04c05261440051e4f50cb7d95474e
      a113cecf
    • H
      RAII support for per cache reservation through handle (#9130) · 2fbe32b0
      Hui Xiao 提交于
      Summary:
      Note: This PR is the 3rd PR of a bigger PR stack (https://github.com/facebook/rocksdb/issues/9073) and depends on the second PR (https://github.com/facebook/rocksdb/pull/9071). **See changes from this PR only https://github.com/facebook/rocksdb/pull/9130/commits/00447324d082136b0e777d3ab6a3df3a8452c633**
      
      Context:
      pdillinger brought up a good [point](https://github.com/facebook/rocksdb/pull/9073#discussion_r741478309) about lacking RAII support for per cache reservation in `CacheReservationManager`  when reviewing https://github.com/facebook/rocksdb/pull/9073.
      
      To summarize the discussion, the current API `CacheReservationManager::UpdateCacheReservation()` requires callers to explicitly calculate and pass in a correct`new_mem_used` to release a cache reservation (if they don't want to rely on the clean-up during `CacheReservationManager`'s destruction - such as they want to release it earlier).
      
      While this implementation has convenience in some use-case such as `WriteBufferManager`, where [reservation](https://github.com/facebook/rocksdb/blob/main/memtable/write_buffer_manager.cc#L69-L91) and [release](https://github.com/facebook/rocksdb/blob/main/memtable/write_buffer_manager.cc#L109-L129) amounts do not necessarily correspond symmetrically and thus a flexible `new_mem_used` inputing is needed, it can be prone to caller's calculation error as well as cause a mass of codes in releasing cache in other use-case such as filter construction, where reservation and release amounts do correspond symmetrically and many code paths requiring a cache release, as [pointed](https://github.com/facebook/rocksdb/pull/9073#discussion_r741478309) out by pdillinger.
      
      Therefore we decided to provide a new API in `CacheReservationManager` to update reservation with better RAII support for per cache reservation, using a handle to manage the life time of that particular cache reservation.
      - Added a new class `CacheReservationHandle`
      - Added a new API `CacheReservationManager::MakeCacheReservation()` that outputs a `CacheReservationHandle` for managing the reservation
      - Updated class comments to clarify two different cache reservation methods
      
      Tests:
      - Passing new tests
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9130
      
      Reviewed By: pdillinger
      
      Differential Revision: D32199446
      
      Pulled By: hx235
      
      fbshipit-source-id: 1cba7c636e5ecfb55b0c1e0c2d218cc9b5b30b4e
      2fbe32b0
    • H
      Add new API CacheReservationManager::GetTotalMemoryUsage() (#9071) · ffd6085e
      Hui Xiao 提交于
      Summary:
      Note: This PR is the 2nd PR of a bigger PR stack (https://github.com/facebook/rocksdb/pull/9073).
      
      Context:
      `CacheReservationManager::UpdateCacheReservation(std::size_t new_memory_used)` accepts an accumulated total memory used (e.g, used 10MB so far) instead of usage change (e.g, increase by 5 MB, decrease by 5 MB). It has benefits including consolidating API for increase and decrease as described in https://github.com/facebook/rocksdb/pull/8506.
      
      However, not every `CacheReservationManager` user keeps track of this accumulated total memory usage. For example, Bloom/Ribbon Filter construction (e.g, [here](https://github.com/facebook/rocksdb/blob/822d729fcd9f7af9f371ca7168e52dbdab898e41/table/block_based/filter_policy.cc#L587) in https://github.com/facebook/rocksdb/pull/9073) does not  while WriteBufferManager and compression dictionary buffering do.
      
      Considering future users might or might not keep track of this counter and implementing this counter within `CacheReservationManager` is easy due to the passed-in `std::size_t new_memory_used` in calling `CacheReservationManager::UpdateCacheReservation(std::size_t new_memory_used)`, it is proposed to add a new API `CacheReservationManager::GetTotalMemoryUsage()`.
      
      As noted in the API comments,   since `CacheReservationManager` is NOT thread-safe, external synchronization is
       needed in calling `UpdateCacheReservation()` if you want `GetTotalMemoryUsed()` returns the indeed latest memory used.
      - Added and updated private counter `memory_used_` every time `CacheReservationManager::UpdateCacheReservation(std::size_t new_memory_used)` is called regardless if the call returns non-okay status
      - Added `CacheReservationManager::GetTotalMemoryUsage()` to return `memory_used_`
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9071
      
      Test Plan:
      - Passing new tests
      - Passing existing tests
      
      Reviewed By: ajkr
      
      Differential Revision: D31887813
      
      Pulled By: hx235
      
      fbshipit-source-id: 9a09f0c8683822673260362894c878b61ee60ceb
      ffd6085e
  2. 09 11月, 2021 6 次提交
  3. 07 11月, 2021 2 次提交
  4. 06 11月, 2021 5 次提交
    • J
      Deflake DBBasicTestWithTimestampCompressionSettings.PutAndGetWithComp… (#9136) · 5aad38f2
      Jay Zhuang 提交于
      Summary:
      …action
      ```
      db/db_with_timestamp_basic_test.cc:2643: Failure
      db_->CompactFiles(compact_opt, handles_[cf], collector->GetFlushedFiles(), static_cast<int>(kNumTimestamps - i))
      Invalid argument: A compaction must contain at least one file.
      ```
      Able to be reproduced by run multiple test in parallel.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9136
      
      Test Plan:
      ```
      gtest-parallel ./db_with_timestamp_basic_test --gtest_filter=Timestamp/DBBasicTestWithTimestampCompressionSettings.PutAndGetWithCompaction/12 -r 100 -w 100
      ```
      
      Reviewed By: riversand963
      
      Differential Revision: D32197734
      
      Pulled By: jay-zhuang
      
      fbshipit-source-id: aeb0d6e9b37312f577e203ca81bb7a0f14d4e7ce
      5aad38f2
    • L
      Refactor and unify blob file saving and the logic that finds the oldest live blob file (#9122) · 3fbddb1d
      Levi Tamasi 提交于
      Summary:
      The patch refactors and unifies the logic in `VersionBuilder::SaveBlobFilesTo`
      and `VersionBuilder::GetMinOldestBlobFileNumber` by introducing a generic
      helper that can "merge" the list of `BlobFileMetaData` in the base version with
      the list of `MutableBlobFileMetaData` representing the updated state after
      applying a sequence of `VersionEdit`s. This serves as groundwork for subsequent
      changes that will enable us to determine whether a blob file is live after applying
      a sequence of edits without calling `VersionBuilder::SaveTo`.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9122
      
      Test Plan: `make check`
      
      Reviewed By: riversand963
      
      Differential Revision: D32151472
      
      Pulled By: ltamasi
      
      fbshipit-source-id: 11622b475866de823334b8bc21b0e99d913af97e
      3fbddb1d
    • H
      Minor improvement to CacheReservationManager/WriteBufferManager/CompressionDictBuilding (#9139) · 3018a3e2
      Hui Xiao 提交于
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/9139
      
      Reviewed By: zhichao-cao
      
      Differential Revision: D32211415
      
      Pulled By: hx235
      
      fbshipit-source-id: 39ce036ba34e1fb4a1992a33ac6904a4a943301d
      3018a3e2
    • Y
      Fix assertion error during compaction with write-prepared txn enabled (#9105) · 5237b39d
      Yanqin Jin 提交于
      Summary:
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9105
      
      The user contract of SingleDelete is that: a SingleDelete can only be issued to
      a key that exists and has NOT been updated. For example, application can insert
      one key `key`, and uses a SingleDelete to delete it in the future. The `key`
      cannot be updated or removed using Delete.
      In reality, especially when write-prepared transaction is being used, things
      can get tricky. For example, a prepared transaction already writes `key` to the
      memtable after a successful Prepare(). Afterwards, should the transaction
      rollback, it will insert a Delete into the memtable to cancel out the prior
      Put. Consider the following sequence of operations.
      
      ```
      // operation sequence 1
      Begin txn
      Put(key)
      Prepare()
      Flush()
      
      Rollback txn
      Flush()
      ```
      
      There will be two SSTs resulting from above. One of the contains a PUT, while
      the second one contains a Delete. It is also known that releasing a snapshot
      can lead to an L0 containing only a SD for a particular key. Consider the
      following operations following the above block.
      
      ```
      // operation sequence 2
      db->Put(key)
      db->SingleDelete(key)
      Flush()
      ```
      
      The operation sequence 2 can result in an L0 with only the SD.
      
      Should there be a snapshot for conflict checking created before operation
      sequence 1, then an attempt to compact the db may hit the assertion failure
      below, because ikey_.type is Delete (from a rollback).
      
      ```
      else if (clear_and_output_next_key_) {
        assert(ikey_.type == kTypeValue || ikey_.type == kTypeBlobIndex);
      }
      ```
      
      To fix the assertion failure, we can skip the SingleDelete if we detect an
      earlier Delete in the same snapshot interval.
      
      Reviewed By: ltamasi
      
      Differential Revision: D32056848
      
      fbshipit-source-id: 23620a91e28562d91c45cf7e95f414b54b729748
      5237b39d
    • J
      Fixing more loosely formatted issue (#9140) · 7d74d347
      Jack Feng 提交于
      Summary:
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9140
      
      Upon checking, some jobs still have loosely formatted issue.
      Also, some started failing due to my previous buggy change.
      
      https://fburl.com/scuba/sandcastle/e9mhejsc
      https://fburl.com/scuba/duplo_events/dbxx4215
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D32213703
      
      fbshipit-source-id: 806f872b820afe275cf62459988c9d6f668af35c
      7d74d347
  5. 05 11月, 2021 7 次提交
    • S
      Improve comments on options.writable_file_max_buffer_size (#9131) · 28bab0ef
      Siying Dong 提交于
      Summary:
      Comments of options.writable_file_max_buffer_size mentioned Windows, which is confusing. Remove it.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9131
      
      Reviewed By: anand1976
      
      Differential Revision: D32187003
      
      fbshipit-source-id: 1f134d7ecdc4a9d13825d461ab1da56769b9455f
      28bab0ef
    • S
      Add options.manual_wal_flush to db_bench (#9132) · caadf09d
      sdong 提交于
      Summary:
      It is useful to add options.manual_wal_flush to db_bench
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9132
      
      Test Plan: Run the benchamrk with the option.
      
      Reviewed By: ltamasi
      
      Differential Revision: D32188060
      
      fbshipit-source-id: a70835d3cad0f30095218dfda1daff0a432892e5
      caadf09d
    • J
      Fixing loosely formatted child job specs, double quotes (#9125) · 8be121bb
      Jack Feng 提交于
      Summary:
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9125
      
      See task for more context
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D32157345
      
      fbshipit-source-id: be6631c6400fd66e6891e71dc0235798c594010a
      8be121bb
    • P
      Fix -Werror=type-limits seen in Travis (#9128) · 2a3511a0
      Peter Dillinger 提交于
      Summary:
      Work around annoying compiler warning-as-error from https://github.com/facebook/rocksdb/issues/9113
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9128
      
      Test Plan: CI
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D32181499
      
      Pulled By: pdillinger
      
      fbshipit-source-id: d7e5f7857a29f7ba47c49c3aee7150b5763b65d9
      2a3511a0
    • H
      Deallocate payload of BlockBasedTableBuilder::Rep::FilterBlockBuilder earlier... · 1ababeb7
      Hui Xiao 提交于
      Deallocate payload of BlockBasedTableBuilder::Rep::FilterBlockBuilder earlier for Full/PartitionedFilter (#9070)
      
      Summary:
      Note: This PR is the 1st part of a bigger PR stack (https://github.com/facebook/rocksdb/pull/9073).
      
      Context:
      Previously, the payload (i.e, filter data) within `BlockBasedTableBuilder::Rep::FilterBlockBuilder` object is not deallocated until `BlockBasedTableBuilder` is deallocated, despite it is no longer useful after its related `filter_content` being written.
      - Transferred the payload (i.e, the filter data) out of `BlockBasedTableBuilder::Rep::FilterBlockBuilder` object
      - For PartitionedFilter:
         - Unified `filters` and `filter_gc` lists into one `std::deque<FilterEntry> filters` by adding a new field `last_filter_entry_key` and storing the `std::unique_ptr filter_data` with the `Slice filter` in the same entry
         - Reset `last_filter_data` in the case where `filters` is empty, which should be as by then we would've finish using all the `Slice filter`
      - Deallocated the payload by going out of scope as soon as we're done with using the `filter_content` associated with the payload
      - This is an internal interface change at the level of `FilterBlockBuilder::Finish()`, which leads to touching the inherited interface in `BlockBasedFilterBlockBuilder`. But for that, the payload transferring is ignored.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9070
      
      Test Plan: - The main focus is to catch segment fault error during `FilterBlockBuilder::Finish()` and `BlockBasedTableBuilder::Finish()` and interface mismatch. Relying on existing CI tests is enough as `assert(false)` was temporarily added to verify the new logic of transferring ownership indeed run
      
      Reviewed By: pdillinger
      
      Differential Revision: D31884933
      
      Pulled By: hx235
      
      fbshipit-source-id: f73ecfbea13788d4fc058013ace27230110b52f4
      1ababeb7
    • H
      Sanitize negative request bytes in GenericRateLimiter::Request and clarify API (#9112) · a64c8ca7
      Hui Xiao 提交于
      Summary:
      Context:
      Surprisingly, there isn't any sanitization against negative `int64_t bytes` in `GenericRateLimiter::Request(int64_t bytes, const Env::IOPriority pri, Statistics* stats)`. A negative `bytes` can be passed in and incorrectly increases `available_bytes_` by subtracting the negative `bytes` from `available_bytes_`, such as  [here](https://github.com/facebook/rocksdb/blob/main/util/rate_limiter.cc#L138) and [here](https://github.com/facebook/rocksdb/blob/main/util/rate_limiter.cc#L283), which are incorrect behaviors.
      - Sanitized negative request bytes by rounding it up to 0
      - Added notes to public and internal API
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9112
      
      Test Plan: - Rely on existing tests
      
      Reviewed By: ajkr
      
      Differential Revision: D32085364
      
      Pulled By: hx235
      
      fbshipit-source-id: b1b6066b2dd5ffc7bcbfb07069ca65a33578251b
      a64c8ca7
    • P
      Some checksum code refactoring (#9113) · dfedc74d
      Peter Dillinger 提交于
      Summary:
      To prepare for adding checksum to footer and "context aware"
      checksums. This also brings closely related code much closer together.
      
      Recently added `BlockBasedTableBuilder::ComputeBlockTrailer` for testing
      is made obsolete in the refactoring, as testing the checksums can happen
      at a lower level of abstraction.
      
      Also now checking for unrecognized checksum type on reading footer,
      rather than later on use.
      
      Also removed an obsolete function delcaration.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9113
      
      Test Plan:
      existing tests worked before refactoring to remove
      `ComputeBlockTrailer`. And then refactored+improved tests using it.
      
      Reviewed By: mrambacher
      
      Differential Revision: D32090149
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 2879da683c1498ea85a3b70dace9b6d9f6b47b6e
      dfedc74d
  6. 04 11月, 2021 5 次提交
    • A
      Re-enable 390x+cmake* Travis jobs (#9110) · 312d9c47
      Adam Retter 提交于
      Summary:
      Revert "Temporarily disable s390x+cmake* Travis jobs (https://github.com/facebook/rocksdb/issues/9095)"
      This reverts commit f2d11b3f.
      
      I have now uploaded the CMake deb for s390x provided by jonathan-albrecht-ibm to our S3 bucket.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9110
      
      Reviewed By: ajkr
      
      Differential Revision: D32082903
      
      Pulled By: riversand963
      
      fbshipit-source-id: b7243d19fc133e665a8654e3b528c4f53d5b11d1
      312d9c47
    • Y
      Fixed a bug in CompactionIterator when write-preared transaction is used (#9060) · 9b53f14a
      Yanqin Jin 提交于
      Summary:
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9060
      
      RocksDB bottommost level compaction may zero out an internal key's sequence if
      the key's sequence is in the earliest_snapshot.
      In write-prepared transaction, checking the visibility of a certain sequence in
      a specific released snapshot may return a "snapshot released" result.
      Therefore, it is possible, after a certain sequence of events, a PUT has its
      sequence zeroed out, but a subsequent SingleDelete of the same key will still
      be output with its original sequence. This violates the ascending order of
      keys and leads to incorrect result.
      
      The solution is to use an extra variable `last_key_seq_zeroed_` to track the
      information about visibility in earliest snapshot. With this variable, we can
      know for sure that a SingleDelete is in the earliest snapshot even if the said
      snapshot is released during compaction before processing the SD.
      
      Reviewed By: ltamasi
      
      Differential Revision: D31813016
      
      fbshipit-source-id: d8cff59d6f34e0bdf282614034aaea99be9174e1
      9b53f14a
    • J
      Fixing child jobs of rocksdb for duplo deprecation (#9117) · 56810142
      Jack Feng 提交于
      Summary:
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9117
      
      This fixes the loosely formatted json child job spec. I.e., trailing commas
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D32119606
      
      fbshipit-source-id: 0ef571ccbef0e638df18f17288611d3b5774f129
      56810142
    • J
      Skip directory fsync for filesystem btrfs (#8903) · 29102641
      Jay Zhuang 提交于
      Summary:
      Directory fsync might be expensive on btrfs and it may not be needed.
      Here are 4 directory fsync cases:
      1. creating a new file: dir-fsync is not needed on btrfs, as long as the
         new file itself is synced.
      2. renaming a file: dir-fsync is not needed if the renamed file is
         synced. So an API `FsyncAfterFileRename(filename, ...)` is provided
         to sync the file on btrfs. By default, it just calls dir-fsync.
      3. deleting files: dir-fsync is forced by set
         `IOOptions.force_dir_fsync = true`
      4. renaming multiple files (like backup and checkpoint): dir-fsync is
         forced, the same as above.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8903
      
      Test Plan: run tests on btrfs and non btrfs
      
      Reviewed By: ajkr
      
      Differential Revision: D30885059
      
      Pulled By: jay-zhuang
      
      fbshipit-source-id: dd2730b31580b0bcaedffc318a762d7dbf25de4a
      29102641
    • L
      Refactor the detailed consistency checks and the SST saving logic in VersionBuilder (#9099) · 08172278
      Levi Tamasi 提交于
      Summary:
      The patch refactors the parts of `VersionBuilder` that deal with SST file
      comparisons. Specifically, it makes the following changes:
      * Turns `NewestFirstBySeqNo` and `BySmallestKey` from free-standing
      functions into function objects. Note: `BySmallestKey` has a pointer to the
      `InternalKeyComparator`, while `NewestFirstBySeqNo` is completely
      stateless.
      * Eliminates the wrapper `FileComparator`, which was essentially an
      unnecessary DIY virtual function call mechanism.
      * Refactors `CheckConsistencyDetails` and `SaveSSTFilesTo` using helper
      function templates that take comparator/checker function objects. Using
      static polymorphism eliminates the need to make runtime decisions about
      which comparator to use.
      * Extends some error messages returned by the consistency checks and
      makes them more uniform.
      * Removes some incomplete/redundant consistency checks from `VersionBuilder`
      and `FilePicker`.
      * Improves const correctness in several places.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9099
      
      Test Plan: `make check`
      
      Reviewed By: riversand963
      
      Differential Revision: D32027503
      
      Pulled By: ltamasi
      
      fbshipit-source-id: 621326ae41f4f55f7ad6a91abbd6e666d5c7857c
      08172278
  7. 03 11月, 2021 5 次提交
    • P
      Don't call OnTableFileCreated with OK for empty+deleted file (#9118) · 2b60621f
      Peter Dillinger 提交于
      Summary:
      EventListener::OnTableFileCreated was previously called with OK
      status and file_size==0 in cases of no SST file contents written
      (because there was no content to add) and the empty file deleted before
      calling the listener. This could lead to a stress test assertion failure
      added in https://github.com/facebook/rocksdb/issues/9054.
      
      This changes the status to Aborted, to align with the API doc:
      "... if the file is successfully created. Now it will also be called on
      failure case. User can check info.status to see if it succeeded or not."
      For internal purposes, this case is considered "success" but for
      listener purposes, no SST file is (successfully) created.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9118
      
      Test Plan: test case added + existing db_stress
      
      Reviewed By: ajkr, riversand963
      
      Differential Revision: D32120232
      
      Pulled By: pdillinger
      
      fbshipit-source-id: a804e2e0a52598018d3b182da97804d402ffcdfa
      2b60621f
    • P
      Fix TSAN report on MemPurge test (#9115) · 21f8a57f
      Peter Dillinger 提交于
      Summary:
      TSAN reported data race on count variables in MemPurgeBasic
      test. This suggests the test could fail if mempurges were slow enough
      that they don't complete before the count variables being checked, but
      injecting a long sleep into MemPurge (outside DB mutex) confirms that
      blocked writes ensure enough mempurges/flushes happen to make the test
      pass. All the possible different values on testing should be OK to make
      the test pass.
      
      So this change makes the variables atomic so that up-to-date value is
      always read and TSAN report suppressed. I have also used `.exchange(0)`
      to make the checking less stateful by "popping off" all the accumulated
      counts.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9115
      
      Test Plan: updated test, watch for any flakiness
      
      Reviewed By: riversand963
      
      Differential Revision: D32114432
      
      Pulled By: pdillinger
      
      fbshipit-source-id: c985609d39896a0d8f69ebc87b221e688609bdd8
      21f8a57f
    • A
      Clarify setting `CompressionOptions::max_dict_bytes > 0` will charge block cache (#9119) · 67a7b74b
      Andrew Kryczka 提交于
      Summary:
      Add it to the API comment.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9119
      
      Reviewed By: hx235
      
      Differential Revision: D32124238
      
      Pulled By: ajkr
      
      fbshipit-source-id: d1f82037417d883f2000f2d62995a7708dda77c6
      67a7b74b
    • P
      Some API clarifications (#9080) · 82afa018
      Peter Dillinger 提交于
      Summary:
      * Clarify that RocksDB is not exception safe on many of our callback
      and extension interfaces
      * Clarify FSRandomAccessFile::MultiRead implementations must accept
      non-sorted inputs (see https://github.com/facebook/rocksdb/issues/8953)
      * Clarify ConcurrentTaskLimiter and SstFileManager are not (currently)
      extensible interfaces
      * Mark WriteBufferManager as `final`, so it is then clearly not a
      callback interface, even though it smells like one
      * Clarify TablePropertiesCollector Status returns are mostly ignored
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9080
      
      Test Plan: comments only (except WriteBufferManager final)
      
      Reviewed By: ajkr
      
      Differential Revision: D31968782
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 11b648ce3ce3c5e5bdc02d2eafc7ea4b864bd1d2
      82afa018
    • M
      Make FileSystem a Customizable Class (#8649) · f72c834e
      mrambacher 提交于
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/8649
      
      Reviewed By: zhichao-cao
      
      Differential Revision: D32036059
      
      Pulled By: mrambacher
      
      fbshipit-source-id: 4f1e7557ecac52eb849b83ae02b8d7d232112295
      f72c834e
  8. 02 11月, 2021 5 次提交
    • L
      Mention PR 9100 in HISTORY.md (#9111) · cfc57f55
      Levi Tamasi 提交于
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/9111
      
      Reviewed By: riversand963
      
      Differential Revision: D32076310
      
      Pulled By: ltamasi
      
      fbshipit-source-id: 81e30a02ded87c0f1a42985db42e80b62235ba11
      cfc57f55
    • A
      Regression tests for tickets fixed by previous change. (#9019) · ec9082d6
      Alan Paxton 提交于
      Summary:
      closes https://github.com/facebook/rocksdb/issues/5891
      closes https://github.com/facebook/rocksdb/issues/2001
      
      Java BytewiseComparator is now unsigned compliant, consistent with the default C++ comparator, which has always been thus. Consequently 2 tickets reporting the previous broken state can be closed.
      
       This test confirms that the following issues were in fact resolved
       by a change made between 6.2.2 and 6.22.1,
       to wit https://github.com/facebook/rocksdb/commit/7242dae7
      which as part of its effect, changed the Java bytewise comparators.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9019
      
      Reviewed By: pdillinger
      
      Differential Revision: D31610910
      
      Pulled By: mrambacher
      
      fbshipit-source-id: 664230f1377a1aa270136edd63eea2c206b907e9
      ec9082d6
    • H
      Add new API CacheReservationManager::GetDummyEntrySize() (#9072) · 560fe702
      Hui Xiao 提交于
      Summary:
      Note: it might conflict with another CRM related PR https://github.com/facebook/rocksdb/pull/9071 and so will merge after that's merged.
      
      Context:
      As `CacheReservationManager` being used by more memory users, it is convenient to retrieve the dummy entry size for `CacheReservationManager` instead of hard-coding `256 * 1024` in writing tests. Plus it allows more flexibility to change our implementation on dummy entry size.
      
      A follow-up PR is needed to replace those hard-coded dummy entry size value in `db_test2.cc`, `db_write_buffer_manager_test.cc`, `write_buffer_manager_test.cc`, `table_test.cc` and the ones introduced in https://github.com/facebook/rocksdb/pull/9072#issue-1034326069.
      - Exposed the private static constexpr `kDummyEntrySize` through public static `CacheReservationManager::GetDummyEntrySize()`
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9072
      
      Test Plan:
      - Passing new tests
      - Passing existing tests
      
      Reviewed By: ajkr
      
      Differential Revision: D32043684
      
      Pulled By: hx235
      
      fbshipit-source-id: ddefc6921c052adab6a2cda2394eb26da3076a50
      560fe702
    • S
      Try to start TTL earlier with kMinOverlappingRatio is used (#8749) · a2b9be42
      sdong 提交于
      Summary:
      Right now, when options.ttl is set, compactions are triggered around the time when TTL is reached. This might cause extra compactions which are often bursty. This commit tries to mitigate it by picking those files earlier in normal compaction picking process. This is only implemented using kMinOverlappingRatio with Leveled compaction as it is the default value and it is more complicated to change other styles.
      
      When a file is aged more than ttl/2, RocksDB starts to boost the compaction priority of files in normal compaction picking process, and hope by the time TTL is reached, very few extra compaction is needed.
      
      In order for this to work, another change is made: during a compaction, if an output level file is older than ttl/2, cut output files based on original boundary (if it is not in the last level). This is to make sure that after an old file is moved to the next level, and new data is merged from the upper level, the new data falling into this range isn't reset with old timestamp. Without this change, in many cases, most files from one level will keep having old timestamp, even if they have newer data and we stuck in it.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/8749
      
      Test Plan: Add a unit test to test the boosting logic. Will add a unit test to test it end-to-end.
      
      Reviewed By: jay-zhuang
      
      Differential Revision: D30735261
      
      fbshipit-source-id: 503c2d89250b22911eb99e72b379be154de3428e
      a2b9be42
    • H
      Minor improvement to #8428 (Account for dictionary-building buffer in global memory limit) (#9032) · a5ec5e3e
      hx235 提交于
      Summary:
      Summary/Context:
      - Renamed `cache_rev_mng` to `compression_dict_buffer_cache_res_mgr`
         - It is to distinguish with other potential `cache_res_mgr` in `BlockBasedTableBuilder` and to use correct short-hand for the words "reservation", "manager"
      - Added `table_options.block_cache == nullptr` in additional to `table_options.no_block_cache == true` to be conditions where we don't create a `CacheReservationManager`
         - Theoretically `table_options.no_block_cache == true` is equivalent to `table_options.block_cache == nullptr` by API. But since segment fault will be generated by passing `nullptr` into `CacheReservationManager`'s constructor, it does not hurt to directly verify  `table_options.block_cache != nullptr` before passing in
      - Renamed `is_cache_full` to `exceeds_global_block_cache_limit`
         - It is to hide implementation detail of cache reservation and to emphasize on the concept/design intent of caping memory within global block cache limit
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/9032
      
      Test Plan: - Passing existing tests
      
      Reviewed By: ajkr
      
      Differential Revision: D32005807
      
      Pulled By: hx235
      
      fbshipit-source-id: 619fd17bb924199de3db5924d8ab7dae53b1efa2
      a5ec5e3e