1. 10 10月, 2020 1 次提交
  2. 15 9月, 2020 1 次提交
  3. 19 8月, 2020 1 次提交
  4. 17 7月, 2020 1 次提交
    • M
      Add Support for saving CompressionOptions to Options File (#6817) · ec711b23
      mrambacher 提交于
      Summary:
      This PR does a few things:
      - The "compression_opts" and "bottom_compression_opts" can now be read/written as name/value pairs of options (instead of only a colon-separated list;
      - These options can now be read/written to the Options file;
      - The parallel_threads value can now be set (either in the colon or name-value format).
      
      The compression options are now stored and treated as a OptionTypeInfo::Struct by the options system, meaning they can be read and written like the other structs.  This change allows them to be read/written easily to the options file.
      
      Additionally, the colon-format was extended to allow support for setting parallel threads.  Tests were added to test all of the option settings via the optional parameters in the colon format.
      
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6817
      
      Reviewed By: ajkr
      
      Differential Revision: D22396004
      
      Pulled By: zhichao-cao
      
      fbshipit-source-id: 38bcf74b7e9cd5bc2a84540fac2e9ba4f765b2c8
      ec711b23
  5. 04 6月, 2020 1 次提交
    • M
      Add OptionTypeInfo::Vector to parse/serialize vectors (#6424) · 0a17d953
      mrambacher 提交于
      Summary:
      The OptionTypeInfo::Vector method allows a vector<T> to be converted to/from strings via the options.
      
      The kVectorInt and kVectorCompressionType vectors were replaced with this methodology.
      
      As part of this change, the NextToken method was added to the OptionTypeInfo.  This method was refactored from code within the StringToMap function.
      
      Future types that could use this functionality include the EventListener vectors.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6424
      
      Reviewed By: cheng-chang
      
      Differential Revision: D21832368
      
      Pulled By: zhichao-cao
      
      fbshipit-source-id: e1ca766faff139d54e6e8407a9ec09ece6517439
      0a17d953
  6. 22 5月, 2020 1 次提交
    • M
      Add Struct Type to OptionsTypeInfo (#6425) · 38be6861
      mrambacher 提交于
      Summary:
      Added code for generically handing structs to OptionTypeInfo.  A struct is a collection of variables handled by their own map of OptionTypeInfos.  Examples of structs include Compaction and Cache options.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6425
      
      Reviewed By: siying
      
      Differential Revision: D21668789
      
      Pulled By: zhichao-cao
      
      fbshipit-source-id: 064b110de39dadf82361ed4663f7ac1a535b0b07
      38be6861
  7. 09 5月, 2020 1 次提交
    • A
      prototype status check enforcement (#6798) · 1c846604
      Andrew Kryczka 提交于
      Summary:
      Tried making Status object enforce that it is checked in some way. In cases it is not checked, `PermitUncheckedError()` must be called explicitly.
      
      Added a way to run tests (`ASSERT_STATUS_CHECKED=1 make -j48 check`) on a
      whitelist. The effort appears significant to get each test to pass with
      this assertion, so I only fixed up enough to get one test (`options_test`)
      working and added it to the whitelist.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6798
      
      Reviewed By: pdillinger
      
      Differential Revision: D21377404
      
      Pulled By: ajkr
      
      fbshipit-source-id: 73236f9c8df38f01cf24ecac4a6d1661b72d077e
      1c846604
  8. 06 5月, 2020 1 次提交
    • M
      Add OptionTypeInfo::Enum and related methods (#6423) · 394f2bbd
      mrambacher 提交于
      Summary:
      Add methods and constructors for handling enums to the OptionTypeInfo.  This change allows enums to be converted/compared without adding a special "type" to the OptionType.
      
      This change addresses a couple of issues:
      - It allows new enumerated types to be added to the options without editing the OptionType base class (and related methods)
      - It standardizes the procedure for adding enumerated types to the options, reducing potential mistakes
      - It moves the enum maps to the location where they are used, allowing them to be static file members rather than global values
      - It reduces the number of types and cases that need to be handled in the various OptionType methods
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6423
      
      Reviewed By: siying
      
      Differential Revision: D21408713
      
      Pulled By: zhichao-cao
      
      fbshipit-source-id: fc492af285d011822578b95d186a0fce25d35626
      394f2bbd
  9. 01 5月, 2020 1 次提交
    • S
      Remove the support of setting CompressionOptions.parallel_threads from string for now (#6782) · 6504ae0c
      sdong 提交于
      Summary:
      The current way of implementing CompressionOptions.parallel_threads introduces a format change. We plan to change CompressionOptions's serailization format to a new JSON-like format, which would be another format change. We would like to consolidate the two format changes into one, rather than making some users to change twice. Hold CompressionOptions.parallel_threads from being supported by option string for now. Will add it back after the general CompressionOptions's format change.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6782
      
      Test Plan: Run all existing tests.
      
      Reviewed By: zhichao-cao
      
      Differential Revision: D21338614
      
      fbshipit-source-id: bca2dac3cb37d4e6e64b52cbbe8ea749cd848685
      6504ae0c
  10. 29 4月, 2020 1 次提交
    • M
      Add Functions to OptionTypeInfo (#6422) · 618bf638
      mrambacher 提交于
      Summary:
      Added functions for parsing, serializing, and comparing elements to OptionTypeInfo.  These functions allow all of the special cases that could not be handled directly in the map of OptionTypeInfo to be moved into the map.  Using these functions, every type can be handled via the map rather than special cased.
      
      By adding these functions, the code for handling options can become more standardized (fewer special cases) and (eventually) handled completely by common classes.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6422
      
      Test Plan: pass make check
      
      Reviewed By: siying
      
      Differential Revision: D21269005
      
      Pulled By: zhichao-cao
      
      fbshipit-source-id: 9ba71c721a38ebf9ee88259d60bd81b3282b9077
      618bf638
  11. 22 4月, 2020 1 次提交
  12. 02 4月, 2020 1 次提交
    • Z
      Add pipelined & parallel compression optimization (#6262) · 03a781a9
      Ziyue Yang 提交于
      Summary:
      This PR adds support for pipelined & parallel compression optimization for `BlockBasedTableBuilder`. This optimization makes block building, block compression and block appending a pipeline, and uses multiple threads to accelerate block compression. Users can set `CompressionOptions::parallel_threads` greater than 1 to enable compression parallelism.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6262
      
      Reviewed By: ajkr
      
      Differential Revision: D20651306
      
      fbshipit-source-id: 62125590a9c15b6d9071def9dc72589c1696a4cb
      03a781a9
  13. 21 2月, 2020 1 次提交
    • S
      Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) · fdf882de
      sdong 提交于
      Summary:
      When dynamically linking two binaries together, different builds of RocksDB from two sources might cause errors. To provide a tool for user to solve the problem, the RocksDB namespace is changed to a flag which can be overridden in build time.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6433
      
      Test Plan: Build release, all and jtest. Try to build with ROCKSDB_NAMESPACE with another flag.
      
      Differential Revision: D19977691
      
      fbshipit-source-id: aa7f2d0972e1c31d75339ac48478f34f6cfcfb3e
      fdf882de
  14. 11 2月, 2020 1 次提交
    • S
      Make clang analyze happy with options_test (#6398) · 594e815e
      sdong 提交于
      Summary:
      clang analysis shows following warning:
      
      options/options_test.cc:1554:24: warning: The left operand of '-' is a garbage value
                  (file_size - 1) / readahead_size + 1);
                   ~~~~~~~~~ ^
      
      Explicitly initialize file_size and add an assertion to make clang analysis happy.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6398
      
      Test Plan: Run "make analysis" and see the warning goes away.
      
      Differential Revision: D19819662
      
      fbshipit-source-id: 1589ea91c0c8f78242538f01448e4ad0e5fbc219
      594e815e
  15. 08 2月, 2020 1 次提交
    • S
      Allow readahead when reading option files. (#6372) · 876c2dbf
      sdong 提交于
      Summary:
      Right, when reading from option files, no readahead is used and 8KB buffer is used. It might introduce high latency if the file system provide high latency and doesn't do readahead. Instead, introduce a readahead to the file. When calling inside DB, infer the value from options.log_readahead. Otherwise, a default 512KB readahead size is used.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6372
      
      Test Plan: Add --log_readahead_size in db_bench. Run it with several options and observe read size from option files using strace.
      
      Differential Revision: D19727739
      
      fbshipit-source-id: e6d8053b0a64259abc087f1f388b9cd66fa8a583
      876c2dbf
  16. 14 12月, 2019 1 次提交
    • A
      Introduce a new storage specific Env API (#5761) · afa2420c
      anand76 提交于
      Summary:
      The current Env API encompasses both storage/file operations, as well as OS related operations. Most of the APIs return a Status, which does not have enough metadata about an error, such as whether its retry-able or not, scope (i.e fault domain) of the error etc., that may be required in order to properly handle a storage error. The file APIs also do not provide enough control over the IO SLA, such as timeout, prioritization, hinting about placement and redundancy etc.
      
      This PR separates out the file/storage APIs from Env into a new FileSystem class. The APIs are updated to return an IOStatus with metadata about the error, as well as to take an IOOptions structure as input in order to allow more control over the IO.
      
      The user can set both ```options.env``` and ```options.file_system``` to specify that RocksDB should use the former for OS related operations and the latter for storage operations. Internally, a ```CompositeEnvWrapper``` has been introduced that inherits from ```Env``` and redirects individual methods to either an ```Env``` implementation or the ```FileSystem``` as appropriate. When options are sanitized during ```DB::Open```, ```options.env``` is replaced with a newly allocated ```CompositeEnvWrapper``` instance if both env and file_system have been specified. This way, the rest of the RocksDB code can continue to function as before.
      
      This PR also ports PosixEnv to the new API by splitting it into two - PosixEnv and PosixFileSystem. PosixEnv is defined as a sub-class of CompositeEnvWrapper, and threading/time functions are overridden with Posix specific implementations in order to avoid an extra level of indirection.
      
      The ```CompositeEnvWrapper``` translates ```IOStatus``` return code to ```Status```, and sets the severity to ```kSoftError``` if the io_status is retryable. The error handling code in RocksDB can then recover the DB automatically.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5761
      
      Differential Revision: D18868376
      
      Pulled By: anand1976
      
      fbshipit-source-id: 39efe18a162ea746fabac6360ff529baba48486f
      afa2420c
  17. 27 11月, 2019 1 次提交
    • P
      Allow fractional bits/key in BloomFilterPolicy (#6092) · 57f30322
      Peter Dillinger 提交于
      Summary:
      There's no technological impediment to allowing the Bloom
      filter bits/key to be non-integer (fractional/decimal) values, and it
      provides finer control over the memory vs. accuracy trade-off. This is
      especially handy in using the format_version=5 Bloom filter in place
      of the old one, because bits_per_key=9.55 provides the same accuracy as
      the old bits_per_key=10.
      
      This change not only requires refining the logic for choosing the best
      num_probes for a given bits/key setting, it revealed a flaw in that logic.
      As bits/key gets higher, the best num_probes for a cache-local Bloom
      filter is closer to bpk / 2 than to bpk * 0.69, the best choice for a
      standard Bloom filter. For example, at 16 bits per key, the best
      num_probes is 9 (FP rate = 0.0843%) not 11 (FP rate = 0.0884%).
      This change fixes and refines that logic (for the format_version=5
      Bloom filter only, just in case) based on empirical tests to find
      accuracy inflection points between each num_probes.
      
      Although bits_per_key is now specified as a double, the new Bloom
      filter converts/rounds this to "millibits / key" for predictable/precise
      internal computations. Just in case of unforeseen compatibility
      issues, we round to the nearest whole number bits / key for the
      legacy Bloom filter, so as not to unlock new behaviors for it.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6092
      
      Test Plan: unit tests included
      
      Differential Revision: D18711313
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 1aa73295f152a995328cb846ef9157ae8a05522a
      57f30322
  18. 21 9月, 2019 1 次提交
  19. 19 9月, 2019 1 次提交
  20. 10 9月, 2019 1 次提交
  21. 24 8月, 2019 1 次提交
    • Z
      Refactor trimming logic for immutable memtables (#5022) · 2f41ecfe
      Zhongyi Xie 提交于
      Summary:
      MyRocks currently sets `max_write_buffer_number_to_maintain` in order to maintain enough history for transaction conflict checking. The effectiveness of this approach depends on the size of memtables. When memtables are small, it may not keep enough history; when memtables are large, this may consume too much memory.
      We are proposing a new way to configure memtable list history: by limiting the memory usage of immutable memtables. The new option is `max_write_buffer_size_to_maintain` and it will take precedence over the old `max_write_buffer_number_to_maintain` if they are both set to non-zero values. The new option accounts for the total memory usage of flushed immutable memtables and mutable memtable. When the total usage exceeds the limit, RocksDB may start dropping immutable memtables (which is also called trimming history), starting from the oldest one.
      The semantics of the old option actually works both as an upper bound and lower bound. History trimming will start if number of immutable memtables exceeds the limit, but it will never go below (limit-1) due to history trimming.
      In order the mimic the behavior with the new option, history trimming will stop if dropping the next immutable memtable causes the total memory usage go below the size limit. For example, assuming the size limit is set to 64MB, and there are 3 immutable memtables with sizes of 20, 30, 30. Although the total memory usage is 80MB > 64MB, dropping the oldest memtable will reduce the memory usage to 60MB < 64MB, so in this case no memtable will be dropped.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5022
      
      Differential Revision: D14394062
      
      Pulled By: miasantreble
      
      fbshipit-source-id: 60457a509c6af89d0993f988c9b5c2aa9e45f5c5
      2f41ecfe
  22. 24 7月, 2019 1 次提交
    • M
      The ObjectRegistry class replaces the Registrar and NewCustomObjects.… (#5293) · cfcf045a
      Mark Rambacher 提交于
      Summary:
      The ObjectRegistry class replaces the Registrar and NewCustomObjects.  Objects are registered with the registry by Type (the class must implement the static const char *Type() method).
      
      This change is necessary for a few reasons:
      - By having a class (rather than static template instances), the class can be passed between compilation units, meaning that objects could be registered and shared from a dynamic library with an executable.
      - By having a class with instances, different units could have different objects registered.  This could be useful if, for example, one Option allowed for a dynamic library and one did not.
      
      When combined with some other PRs (being able to load shared libraries, a Configurable interface to configure objects to/from string), this code will allow objects in external shared libraries to be added to a RocksDB image at run-time, rather than requiring every new extension to be built into the main library and called explicitly by every program.
      
      Test plan (on riversand963's  devserver)
      ```
      $COMPILE_WITH_ASAN=1 make -j32 all && sleep 1 && make check
      ```
      All tests pass.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5293
      
      Differential Revision: D16363396
      
      Pulled By: riversand963
      
      fbshipit-source-id: fbe4acb615bfc11103eef40a0b288845791c0180
      cfcf045a
  23. 28 6月, 2019 1 次提交
    • S
      LRU Cache to enable mid-point insertion by default (#5508) · 15fd3be0
      sdong 提交于
      Summary:
      Mid-point insertion is a useful feature and is mature now. Make it default. Also changed cache_index_and_filter_blocks_with_high_priority=true as default accordingly, so that we won't evict index and filter blocks easier after the change, to avoid too many surprises to users.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5508
      
      Test Plan: Run all existing tests.
      
      Differential Revision: D16021179
      
      fbshipit-source-id: ce8456e8d43b3bfb48df6c304b5290a9d19817eb
      15fd3be0
  24. 18 6月, 2019 1 次提交
  25. 07 6月, 2019 1 次提交
  26. 04 6月, 2019 1 次提交
  27. 31 5月, 2019 2 次提交
  28. 04 5月, 2019 1 次提交
    • M
      Refresh snapshot list during long compactions (2nd attempt) (#5278) · 6a40ee5e
      Maysam Yabandeh 提交于
      Summary:
      Part of compaction cpu goes to processing snapshot list, the larger the list the bigger the overhead. Although the lifetime of most of the snapshots is much shorter than the lifetime of compactions, the compaction conservatively operates on the list of snapshots that it initially obtained. This patch allows the snapshot list to be updated via a callback if the compaction is taking long. This should let the compaction to continue more efficiently with much smaller snapshot list.
      For simplicity, to avoid the feature is disabled in two cases: i) When more than one sub-compaction are sharing the same snapshot list, ii) when Range Delete is used in which the range delete aggregator has its own copy of snapshot list.
      This fixes the reverted https://github.com/facebook/rocksdb/pull/5099 issue with range deletes.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5278
      
      Differential Revision: D15203291
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: fa645611e606aa222c7ce53176dc5bb6f259c258
      6a40ee5e
  29. 02 5月, 2019 1 次提交
  30. 26 4月, 2019 2 次提交
    • M
      Refresh snapshot list during long compactions (#5099) · 506e8448
      Maysam Yabandeh 提交于
      Summary:
      Part of compaction cpu goes to processing snapshot list, the larger the list the bigger the overhead. Although the lifetime of most of the snapshots is much shorter than the lifetime of compactions, the compaction conservatively operates on the list of snapshots that it initially obtained. This patch allows the snapshot list to be updated via a callback if the compaction is taking long. This should let the compaction to continue more efficiently with much smaller snapshot list.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5099
      
      Differential Revision: D15086710
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 7649f56c3b6b2fb334962048150142a3bf9c1a12
      506e8448
    • A
      Option string/map/file can set env from object registry (#5237) · 6eb317bb
      Andrew Kryczka 提交于
      Summary:
      - By providing the "env" field in any text-based options (i.e., string, map, or file), we can use `NewCustomObject` to deserialize the text value into an actual `Env` object.
      - Currently factory functions for `Env` registered with object registry should only return pointer to static `Env` objects. That's because `DBOptions::env` is a raw pointer so we cannot easily delegate cleanup.
      - Note I did not add `env` to `db_option_type_info`. It wasn't needed for (de)serialization, and I believe we don't want to do verification on `env`, even by checking name. That's because the user should be able to copy their DB from Linux to Windows, change envs, and not see an option verification error.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5237
      
      Differential Revision: D15056360
      
      Pulled By: siying
      
      fbshipit-source-id: 4b5f0b83297a5058f8949ec955dbf27d98d73d7e
      6eb317bb
  31. 23 4月, 2019 1 次提交
    • A
      Optionally wait on bytes_per_sync to smooth I/O (#5183) · 8272a6de
      Andrew Kryczka 提交于
      Summary:
      The existing implementation does not guarantee bytes reach disk every `bytes_per_sync` when writing SST files, or every `wal_bytes_per_sync` when writing WALs. This can cause confusing behavior for users who enable this feature to avoid large syncs during flush and compaction, but then end up hitting them anyways.
      
      My understanding of the existing behavior is we used `sync_file_range` with `SYNC_FILE_RANGE_WRITE` to submit ranges for async writeback, such that we could continue processing the next range of bytes while that I/O is happening. I believe we can preserve that benefit while also limiting how far the processing can get ahead of the I/O, which prevents huge syncs from happening when the file finishes.
      
      Consider this `sync_file_range` usage: `sync_file_range(fd_, 0, static_cast<off_t>(offset + nbytes), SYNC_FILE_RANGE_WAIT_BEFORE | SYNC_FILE_RANGE_WRITE)`. Expanding the range to start at 0 and adding the `SYNC_FILE_RANGE_WAIT_BEFORE` flag causes any pending writeback (like from a previous call to `sync_file_range`) to finish before it proceeds to submit the latest `nbytes` for writeback. The latest `nbytes` are still written back asynchronously, unless processing exceeds I/O speed, in which case the following `sync_file_range` will need to wait on it.
      
      There is a second change in this PR to use `fdatasync` when `sync_file_range` is unavailable (determined statically) or has some known problem with the underlying filesystem (determined dynamically).
      
      The above two changes only apply when the user enables a new option, `strict_bytes_per_sync`.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5183
      
      Differential Revision: D14953553
      
      Pulled By: siying
      
      fbshipit-source-id: 445c3862e019fb7b470f9c7f314fc231b62706e9
      8272a6de
  32. 29 3月, 2019 1 次提交
  33. 27 3月, 2019 1 次提交
  34. 13 3月, 2019 1 次提交
  35. 02 3月, 2019 1 次提交
  36. 21 2月, 2019 1 次提交
    • Z
      add GetStatsHistory to retrieve stats snapshots (#4748) · c4f5d0aa
      Zhongyi Xie 提交于
      Summary:
      This PR adds public `GetStatsHistory` API to retrieve stats history in the form of an std map. The key of the map is the timestamp in microseconds when the stats snapshot is taken, the value is another std map from stats name to stats value (stored in std string). Two DBOptions are introduced: `stats_persist_period_sec` (default 10 minutes) controls the intervals between two snapshots are taken; `max_stats_history_count` (default 10) controls the max number of history snapshots to keep in memory. RocksDB will stop collecting stats snapshots if `stats_persist_period_sec` is set to 0.
      
      (This PR is the in-memory part of https://github.com/facebook/rocksdb/pull/4535)
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4748
      
      Differential Revision: D13961471
      
      Pulled By: miasantreble
      
      fbshipit-source-id: ac836d401ecb84ea92216bf9966f969dedf4ad04
      c4f5d0aa
  37. 20 2月, 2019 1 次提交
  38. 08 2月, 2019 1 次提交