- 09 6月, 2022 2 次提交
-
-
由 Akanksha Mahajan 提交于
Summary: RocksDB uses WalManager to manage WAL files. In WalManager::ReadFirstLine(), the assumption is that reading the first record of a valid WAL file will return OK status and set the output sequence to non-zero value. This assumption has been broken by WAL compression which writes a `kSetCompressionType` record which is not associated with any sequence number. Consequently, WalManager::GetSortedWalsOfType() will skip these WALs and not return them to caller, e.g. Checkpoint, Backup, causing the operations to fail. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10130 Test Plan: - Newly Added test Reviewed By: riversand963 Differential Revision: D36985744 Pulled By: akankshamahajan15 fbshipit-source-id: dfde7b3be68b6a30b75b49479779748eedf29f7f
-
由 Mark Callaghan 提交于
Summary: A recent diff add a few more fields to one of the db_bench output lines that gets parsed. This diff updates tools/benchmark.sh to handle that. overwrite : 7.939 micros/op 125963 ops/sec; 50.5 MB/s overwrite : 7.854 micros/op 127320 ops/sec 1800.001 seconds 229176999 operations; 51.0 MB/s Pull Request resolved: https://github.com/facebook/rocksdb/pull/10124 Test Plan: Run it Reviewed By: jay-zhuang Differential Revision: D36945137 Pulled By: mdcallag fbshipit-source-id: 9c96f79491411da997e369a3be9c6b921a21d0fa
-
- 08 6月, 2022 4 次提交
-
-
由 Yanqin Jin 提交于
Summary: This PR updates secondary instance testing in stress test by default. A background thread will be started (disabled by default), running a secondary instance tailing the logs of the primary. Periodically (every 1 sec), this thread calls `TryCatchUpWithPrimary()` and uses point lookup or range scan to read some random keys with only very basic verification to make sure no assertion failure is triggered. Thanks to https://github.com/facebook/rocksdb/issues/10061 , we can enable secondary instance when user-defined timestamp is enabled. Also removed a less useful test configuration, `secondary_catch_up_one_in`. This is very similar to the periodic catch-up. In the last commit, I decided not to enable it now, but just update the tests, since secondary instance does not work well when the underlying file is renamed by primary, e.g. SstFileManager. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10121 Test Plan: ``` TEST_TMPDIR=/dev/shm/rocksdb make crash_test TEST_TMPDIR=/dev/shm/rocksdb make crash_test_with_ts TEST_TMPDIR=/dev/shm/rocksdb make crash_test_with_atomic_flush ``` Reviewed By: ajkr Differential Revision: D36939458 Pulled By: riversand963 fbshipit-source-id: 1c065b7efc3690fc341569b9d369a5cbd8ef6b3e
-
由 Andrew Kryczka 提交于
Summary: After https://github.com/facebook/rocksdb/issues/9357 we began seeing the following error attempting to acquire locks for file ingestion: ``` FATAL: ThreadSanitizer CHECK failed: /home/engshare/third-party2/llvm-fb/12/src/llvm/compiler-rt/lib/sanitizer_common/sanitizer_deadlock_detector.h:67 "((n_all_locks_)) < (((sizeof(all_locks_with_contexts_)/sizeof((all_locks_with_contexts_)[0]))))" (0x40, 0x40) ``` The command was using default values for `ingest_external_file_width` (1000) and `log2_keys_per_lock` (2). The expected number of locks needed to update those keys is then (1000 / 2^2) = 250, which is above the 0x40 (64) limit. This PR reduces the default value of `ingest_external_file_width` to 100 so the expected number of locks is 25, which is within the limit. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10131 Reviewed By: ltamasi Differential Revision: D36986307 Pulled By: ajkr fbshipit-source-id: e918cdb2fcc39517d585f1e5fd2539e185ada7c1
-
由 gitbw95 提交于
Summary: **Summary:** Add unit tests to verify that the dynamic priority can be passed from compaction to FS. Compaction reads&writes and other DB reads&writes share the same read&write paths to FSRandomAccessFile or FSWritableFile, so a MockTestFileSystem is added to replace the default filesystem from Env to intercept and verify the io_priority. To prepare the compaction input files, use the default filesystem from Env. To test the io priority of the compaction reads and writes, db_options_.fs is set as MockTestFileSystem. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10088 Test Plan: Add unit tests. Reviewed By: anand1976 Differential Revision: D36882528 Pulled By: gitbw95 fbshipit-source-id: 120adc15801966f2b8c9fc45285f590a3fff96d1
-
由 zczhu 提交于
Summary: The default implementation of Close() function in Directory/FSDirectory classes returns `NotSupported` status. However, we don't want operations that worked in older versions to begin failing after upgrading when run on FileSystems that have not implemented Directory::Close() yet. So we require the upper level that calls Close() function should properly handle "NotSupported" status instead of treating it as an error status. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10127 Reviewed By: ajkr Differential Revision: D36971112 Pulled By: littlepig2013 fbshipit-source-id: 100f0e6ad1191e1acc1ba6458c566a11724cf466
-
- 07 6月, 2022 5 次提交
-
-
由 zczhu 提交于
Summary: As pointed out by [https://github.com/facebook/rocksdb/pull/8351#discussion_r645765422](https://github.com/facebook/rocksdb/pull/8351#discussion_r645765422), check `manual_compaction_paused` and `manual_compaction_canceled` can be reduced by setting `*canceled` to be true in `DisableManualCompaction()` and `*canceled` to be false in the last time calling `EnableManualCompaction()`. Changed Tests: The origin `DBTest2.PausingManualCompaction1` uses a callback function to increase `manual_compaction_paused` and the origin CompactionJob/CompactionIterator with `manual_compaction_paused` can detect this. I changed the callback function so that it sets `*canceled` as true if `canceled` is not `nullptr` (to notify CompactionJob/CompactionIterator the compaction has been canceled). Pull Request resolved: https://github.com/facebook/rocksdb/pull/10070 Test Plan: This change does not introduce new features, but some slight difference in compaction implementation. Run the same manual compaction unit tests as before (e.g., PausingManualCompaction[1-4], CancelManualCompaction[1-2], CancelManualCompactionWithListener in db_test2, and db_compaction_test). Reviewed By: ajkr Differential Revision: D36949133 Pulled By: littlepig2013 fbshipit-source-id: c5dc4c956fbf8f624003a0f5ad2690240063a821
-
由 Yu Zhang 提交于
Summary: With this change, when a given read timestamp is smaller than the column-family's full_history_ts_low, Get(), MultiGet() and iterators APIs will return Status::InValidArgument(). Test plan ``` $COMPILE_WITH_ASAN=1 make -j24 all $./db_with_timestamp_basic_test --gtest_filter=DBBasicTestWithTimestamp.UpdateFullHistoryTsLow $ make -j24 check ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/10109 Reviewed By: riversand963 Differential Revision: D36901126 Pulled By: jowlyzhang fbshipit-source-id: 255feb1a66195351f06c1d0e42acb1ff74527f86
-
由 zczhu 提交于
Summary: As pointed by anand1976 in his [comment](https://github.com/facebook/rocksdb/pull/10049#pullrequestreview-994255819), previous implementation (adding Close() function in Directory/FSDirectory class) is not backward-compatible. And we mistakenly added the default implementation `return Status::NotSupported("Close")` or `return IOStatus::NotSupported("Close")` in WritableFile class in this [pull request](https://github.com/facebook/rocksdb/pull/10101). This pull request fixes the above issue. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10123 Reviewed By: ajkr Differential Revision: D36943661 Pulled By: littlepig2013 fbshipit-source-id: 9dc45f4d2ab3a9d51c30bdfde679f1d13c4d5509
-
由 Guido Tagliavini Ponce 提交于
Summary: There was an overflow bug when computing the variance in the HistogramStat class. This manifests, for instance, when running cache_bench with default arguments. This executes 32M lookups/inserts/deletes in a block cache, and then computes (among other things) the variance of the latencies. The variance is computed as ``variance = (cur_sum_squares * cur_num - cur_sum * cur_sum) / (cur_num * cur_num)``, where ``cum_sum_squares`` is the sum of the squares of the samples, ``cur_num`` is the number of samples, and ``cur_sum`` is the sum of the samples. Because the median latency in a typical run is around 3800 nanoseconds, both the ``cur_sum_squares * cur_num`` and ``cur_sum * cur_sum`` terms overflow as uint64_t. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10100 Test Plan: Added a unit test. Run ``make -j24 histogram_test && ./histogram_test``. Reviewed By: pdillinger Differential Revision: D36942738 Pulled By: guidotag fbshipit-source-id: 0af5fb9e2a297a284e8e74c24e604d302906006e
-
由 Peter Dillinger 提交于
Summary: We have three related concepts: * BlockType: an internal enum conceptually indicating a type of SST file block * CacheEntryRole: a user-facing enum for categorizing block cache entries, which is also involved in associated cache entries with an appropriate deleter. Can include categories for non-block cache entries (e.g. memory reservations). * TBlocklike: a C++ type for the actual type behind a void* cache entry. We had some existing code ugliness because BlockType did not imply TBlocklike, because of various kinds of "filter" block. This refactoring fixes that with new BlockTypes. More clean-up can come in later work. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10098 Test Plan: existing tests Reviewed By: akankshamahajan15 Differential Revision: D36897945 Pulled By: pdillinger fbshipit-source-id: 3ae496b5caa81e0a0ed85e873eb5b525e2d9a295
-
- 06 6月, 2022 1 次提交
-
-
由 Jay Zhuang 提交于
Summary: The script is broken. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10119 Reviewed By: ltamasi Differential Revision: D36928076 Pulled By: jay-zhuang fbshipit-source-id: f325cfd00869c506c64573fe8192cb5b561825d6
-
- 05 6月, 2022 1 次提交
-
-
由 Alan Paxton 提交于
Summary: CircleCI runner based benchmarking. A runner is a dedicate machine configured for CircleCI to perform work on. Our work is a repeatable benchmark, the `benchmark-linux` job in `config.yml` A runner, in CircleCI terminology, is a machine that is managed by the client (us) rather than running on CircleCI resources in the cloud. This means that we define and configure the iron, and that therefore the performance is repeatable and predictable. Which is what we need for performance regression benchmarking. On a time schedule (or on commit, during branch development) benchmarks are set off on the runner, and then a script is run `benchmark_log_tool.py` which parses the benchmark output and pushes it into a pre-configured OpenSearch document connected to an OpenSearch dashboard. Members of the team can examine benchmark performance changes on the dashboard. As time progresses we can add different benchmarks to the suite which gets run. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9723 Reviewed By: pdillinger Differential Revision: D35555626 Pulled By: jay-zhuang fbshipit-source-id: c6a905ca04494495c3784cfbb991f5ab90c807ee
-
- 04 6月, 2022 11 次提交
-
-
由 yite.gu 提交于
Summary: Add a simple example of backup and restore Signed-off-by: NYiteGu <ess_gyt@qq.com> Pull Request resolved: https://github.com/facebook/rocksdb/pull/10054 Reviewed By: jay-zhuang Differential Revision: D36678141 Pulled By: ajkr fbshipit-source-id: 43545356baddb4c2c76c62cd63d7a3238d1f8a00
-
由 Levi Tamasi 提交于
Summary: The patch adds some low-level logic that can be used to serialize/deserialize a sorted vector of wide columns to/from a simple binary searchable string representation. Currently, there is no user-facing API; this will be implemented in subsequent stages. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9915 Test Plan: `make check` Reviewed By: siying Differential Revision: D35978076 Pulled By: ltamasi fbshipit-source-id: 33f5f6628ec3bcd8c8beab363b1978ac047a8788
-
由 Yanqin Jin 提交于
Summary: If caller specifies a non-null `timestamp` argument in `DB::Get()` or a non-null `timestamps` in `DB::MultiGet()`, RocksDB will return the timestamps of the point tombstones. Note: DeleteRange is still unsupported. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10056 Test Plan: make check Reviewed By: ltamasi Differential Revision: D36677956 Pulled By: riversand963 fbshipit-source-id: 2d7af02cc7237b1829cd269086ea895a49d501ae
-
由 Hui Xiao 提交于
Increase ChargeTableReaderTest/ChargeTableReaderTest.Basic error tolerance rate from 1% to 5% (#10113) Summary: **Context:** https://github.com/facebook/rocksdb/pull/9748 added support to charge table reader memory to block cache. In the test `ChargeTableReaderTest/ChargeTableReaderTest.Basic`, it estimated the table reader memory, calculated the expected number of table reader opened based on this estimation and asserted this number with actual number. The expected number of table reader opened calculated based on estimated table reader memory will not be 100% accurate and should have tolerance for error. It was previously set to 1% and recently encountered an assertion failure that `(opened_table_reader_num) <= (max_table_reader_num_capped_upper_bound), actual: 375 or 376 vs 374` where `opened_table_reader_num` is the actual opened one and `max_table_reader_num_capped_upper_bound` is the estimated opened one (=371 * 1.01). I believe it's safe to increase error tolerance from 1% to 5% hence there is this PR. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10113 Test Plan: - CI again succeeds. Reviewed By: ajkr Differential Revision: D36911556 Pulled By: hx235 fbshipit-source-id: 259687dd77b450fea0f5658a5b567a1d31d4b1f7
-
由 Zeyi (Rice) Fan 提交于
Summary: When building RocksDB with getdeps on Windows, `thirdparty.inc` get in the way since `FindXXXX.cmake` are working properly now. This PR adds an option to skip that file when building RocksDB so we can disable it. FB: see [D36905191](https://www.internalfb.com/diff/D36905191). Pull Request resolved: https://github.com/facebook/rocksdb/pull/10110 Reviewed By: siying Differential Revision: D36913882 Pulled By: fanzeyi fbshipit-source-id: 33d36841dc0d4fe87f51e1d9fd2b158a3adab88f
-
由 Levi Tamasi 提交于
Summary: The patch attempts to fix three bugs in `verify_random_db.sh`: 1) https://github.com/facebook/rocksdb/pull/9937 changed the default for `--try_load_options` to true in the script's use case, so we have to explicitly set it to false if the corresponding argument of the script is 0. This should fix the issue we've been seeing with our forward compatibility tests where 7.3 is unable to open a database created by the version on main after adding a new configuration option. 2) The script seems to support two "extra parameters"; however, in practice, if the second one was set, only that one was passed on to `ldb`. Now both get forwarded. 3) When running the `diff` command, the base DB directory was passed as the second argument instead of the file containing the `ldb` output (this actually seems to work, probably accidentally though). Pull Request resolved: https://github.com/facebook/rocksdb/pull/10112 Reviewed By: pdillinger Differential Revision: D36911363 Pulled By: ltamasi fbshipit-source-id: fe29db4e28d373cee51a12322c59050fc50e926d
-
由 Yanqin Jin 提交于
Summary: Closing https://github.com/facebook/rocksdb/issues/10080 When `SyncWAL()` calls `MarkLogsSynced()`, even if there is only one active WAL file, this event should still be added to the MANIFEST. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10087 Test Plan: make check Reviewed By: ajkr Differential Revision: D36797580 Pulled By: riversand963 fbshipit-source-id: 24184c9dd606b3939a454ed41de6e868d1519999
-
由 Guido Tagliavini Ponce 提交于
Summary: cache_bench can now run with FastLRUCache. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10095 Test Plan: - Temporarily add an ``assert(false)`` in the execution path that sets up the FastLRUCache. Run ``make -j24 cache_bench``. Then test the appropriate code is used by running ``./cache_bench -cache_type=fast_lru_cache`` and checking that the assert is called. Repeat for LRUCache. - Verify that FastLRUCache (currently a clone of LRUCache) has similar latency distribution than LRUCache, by comparing the outputs of ``./cache_bench -cache_type=fast_lru_cache`` and ``./cache_bench -cache_type=lru_cache``. Reviewed By: pdillinger Differential Revision: D36875834 Pulled By: guidotag fbshipit-source-id: eb2ad0bb32c2717a258a6ac66ed736e06f826cd8
-
由 zczhu 提交于
Summary: As pointed by anand1976 in his [comment](https://github.com/facebook/rocksdb/pull/10049#pullrequestreview-994255819), previous implementation is not backward-compatible. In this implementation, the default implementation `return Status::NotSupported("Close")` or `return IOStatus::NotSupported("Close")` is added for `Close()` function for `*Directory` classes. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10101 Test Plan: DBBasicTest.DBCloseAllDirectoryFDs Reviewed By: anand1976 Differential Revision: D36899346 Pulled By: littlepig2013 fbshipit-source-id: 430624793362f330cbb8837960f0e8712a944ab9
-
由 Guido Tagliavini Ponce 提交于
Summary: db_bench can now run with FastLRUCache. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10096 Test Plan: - Temporarily add an ``assert(false)`` in the execution path that sets up the FastLRUCache. Run ``make -j24 db_bench``. Then test the appropriate code is used by running ``./db_bench -cache_type=fast_lru_cache`` and checking that the assert is called. Repeat for LRUCache. - Verify that FastLRUCache (currently a clone of LRUCache) produces similar benchmark data than LRUCache, by comparing the outputs of ``./db_bench -benchmarks=fillseq,fillrandom,readseq,readrandom -cache_type=fast_lru_cache`` and ``./db_bench -benchmarks=fillseq,fillrandom,readseq,readrandom -cache_type=lru_cache``. Reviewed By: gitbw95 Differential Revision: D36898774 Pulled By: guidotag fbshipit-source-id: f9f6b6f6da124f88b21b3c8dee742fbb04eff773
-
由 Yanqin Jin 提交于
Summary: Will re-enable after fixing the bug in https://github.com/facebook/rocksdb/issues/10099 and https://github.com/facebook/rocksdb/issues/10097. Right now, the priority is https://github.com/facebook/rocksdb/issues/10087, but the bug in WAL compression prevents the mini crash test from passing. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10108 Reviewed By: pdillinger Differential Revision: D36897214 Pulled By: riversand963 fbshipit-source-id: d64dc52738222d5f66003f7731dc46eaeed812be
-
- 03 6月, 2022 7 次提交
-
-
由 Mark Callaghan 提交于
Summary: … BlobDB for all tests This does two big things: * provides more tuning options * supports universal and integrated BlobDB for all of the benchmarks that are leveled-only It does several smaller things, and I will list a few * sets l0_slowdown_writes_trigger which wasn't set before this diff. * improves readability in report.tsv by using smaller field names in the header * adds more columns to report.tsv report.tsv before this diff: ``` ops_sec mb_sec total_size_gb level0_size_gb sum_gb write_amplification write_mbps usec_op percentile_50 percentile_75 percentile_99 percentile_99.9 percentile_99.99 uptime stall_time stall_percent test_name test_date rocksdb_version job_id 823294 329.8 0.0 21.5 21.5 1.0 183.4 1.2 1.0 1.0 3 6 14 120 00:00:0.000 0.0 fillseq.wal_disabled.v400 2022-03-16T15:46:45.000-07:00 7.0 326520 130.8 0.0 0.0 0.0 0.0 0 12.2 139.8 155.1 170 234 250 60 00:00:0.000 0.0 multireadrandom.t4 2022-03-16T15:48:47.000-07:00 7.0 86313 345.7 0.0 0.0 0.0 0.0 0 46.3 44.8 50.6 75 84 108 60 00:00:0.000 0.0 revrangewhilewriting.t4 2022-03-16T15:50:48.000-07:00 7.0 101294 405.7 0.0 0.1 0.1 1.0 1.6 39.5 40.4 45.9 64 75 103 62 00:00:0.000 0.0 fwdrangewhilewriting.t4 2022-03-16T15:52:50.000-07:00 7.0 258141 103.4 0.0 0.1 1.2 18.2 19.8 15.5 14.3 18.1 28 34 48 62 00:00:0.000 0.0 readwhilewriting.t4 2022-03-16T15:54:51.000-07:00 7.0 334690 134.1 0.0 7.6 18.7 4.2 308.8 12.0 11.8 13.7 21 30 62 62 00:00:0.000 0.0 overwrite.t4.s0 2022-03-16T15:56:53.000-07:00 7.0 ``` report.tsv with this diff: ``` ops_sec mb_sec lsm_sz blob_sz c_wgb w_amp c_mbps c_wsecs c_csecs b_rgb b_wgb usec_op p50 p99 p99.9 p99.99 pmax uptime stall% Nstall u_cpu s_cpu rss test date version job_id 831144 332.9 22GB 0.0GB, 21.7 1.0 185.1 264 262 0 0 1.2 1.0 3 6 14 9198 120 0.0 0 0.4 0.0 0.7 fillseq.wal_disabled.v400 2022-03-16T16:21:23 7.0 325229 130.3 22GB 0.0GB, 0.0 0.0 0 0 0 0 12.3 139.8 170 237 249 572 60 0.0 0 0.4 0.1 1.2 multireadrandom.t4 2022-03-16T16:23:25 7.0 312920 125.3 26GB 0.0GB, 11.1 2.6 189.3 115 113 0 0 12.8 11.8 21 34 1255 6442 60 0.2 1 0.7 0.1 0.6 overwritesome.t4.s0 2022-03-16T16:25:27 7.0 81698 327.2 25GB 0.0GB, 0.0 0.0 0 0 0 0 48.9 46.2 79 246 369 9445 60 0.0 0 0.4 0.1 1.4 revrangewhilewriting.t4 2022-03-16T16:30:21 7.0 92484 370.4 25GB 0.0GB, 0.1 1.5 1.1 1 0 0 0 43.2 42.3 75 103 110 9512 62 0.0 0 0.4 0.1 1.4 fwdrangewhilewriting.t4 2022-03-16T16:32:24 7.0 241661 96.8 25GB 0.0GB, 0.1 1.5 1.1 1 0 0 0 16.5 17.1 30 34 49 9092 62 0.0 0 0.4 0.1 1.4 readwhilewriting.t4 2022-03-16T16:34:27 7.0 305234 122.3 30GB 0.0GB, 12.1 2.7 201.7 127 124 0 0 13.1 11.8 21 128 1934 6339 62 0.0 0 0.7 0.1 0.7 overwrite.t4.s0 2022-03-16T16:36:30 7.0 ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/9704 Test Plan: run it Reviewed By: jay-zhuang Differential Revision: D36864627 Pulled By: mdcallag fbshipit-source-id: d5af1cfc258a16865210163fa6fd1b803ab1a7d3
-
由 Levi Tamasi 提交于
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/10105 Reviewed By: cbi42 Differential Revision: D36891073 Pulled By: ltamasi fbshipit-source-id: 16487ec708fc96add2a1ebc2d98f6439dfc852ca
-
由 Levi Tamasi 提交于
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/10106 Reviewed By: cbi42 Differential Revision: D36891284 Pulled By: ltamasi fbshipit-source-id: 304ffa84549201659feb0b74d6ba54a83f08906b
-
由 zczhu 提交于
Summary: In [close_db_dir](https://github.com/facebook/rocksdb/pull/10049) pull request, some merging conflicts occurred (some comments and one line `s.PermitUncheckedError()` are missing). This pull request aims to put them back. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10093 Reviewed By: ajkr Differential Revision: D36884117 Pulled By: littlepig2013 fbshipit-source-id: 8c0e2a8793fc52804067c511843bd1ff4912c1c3
-
由 Yanqin Jin 提交于
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/10102 Reviewed By: jay-zhuang Differential Revision: D36885468 Pulled By: riversand963 fbshipit-source-id: 6ed5b62dda8fe0f4be4b66d09bdec0134cf4500c
-
由 Gang Liao 提交于
Summary: Currently, if blob files are enabled (i.e. `enable_blob_files` is true), large values are extracted both during flush/recovery (when SST files are written into level 0 of the LSM tree) and during compaction into any LSM tree level. For certain use cases that have a mix of short-lived and long-lived values, it might make sense to support extracting large values only during compactions whose output level is greater than or equal to a specified LSM tree level (e.g. compactions into L1/L2/... or above). This could reduce the space amplification caused by large values that are turned into garbage shortly after being written at the price of some write amplification incurred by long-lived values whose extraction to blob files is delayed. In order to achieve this, we would like to do the following: - Add a new configuration option `blob_file_starting_level` (default: 0) to `AdvancedColumnFamilyOptions` (and `MutableCFOptions` and extend the related logic) - Instantiate `BlobFileBuilder` in `BuildTable` (used during flush and recovery, where the LSM tree level is L0) and `CompactionJob` iff `enable_blob_files` is set and the LSM tree level is `>= blob_file_starting_level` - Add unit tests for the new functionality, and add the new option to our stress tests (`db_stress` and `db_crashtest.py` ) - Add the new option to our benchmarking tool `db_bench` and the BlobDB benchmark script `run_blob_bench.sh` - Add the new option to the `ldb` tool (see https://github.com/facebook/rocksdb/wiki/Administration-and-Data-Access-Tool) - Ideally extend the C and Java bindings with the new option - Update the BlobDB wiki to document the new option. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10077 Reviewed By: ltamasi Differential Revision: D36884156 Pulled By: gangliao fbshipit-source-id: 942bab025f04633edca8564ed64791cb5e31627d
-
由 Jay Zhuang 提交于
Summary: Only used as temperature high bound for current code, may increase with more temperatures added. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10044 Test Plan: ci Reviewed By: siying Differential Revision: D36633410 Pulled By: jay-zhuang fbshipit-source-id: eecdfa7623c31778c31d789902eacf78aad7b482
-
- 02 6月, 2022 8 次提交
-
-
由 Gang Liao 提交于
Summary: Garbage collection is generally controlled by the BlobDB configuration options `enable_blob_garbage_collection` and `blob_garbage_collection_age_cutoff`. However, there might be use cases where we would want to temporarily override these options while performing a manual compaction. (One use case would be doing a full key-space manual compaction with full=100% garbage collection age cutoff in order to minimize the space occupied by the database.) Our goal here is to make it possible to override the configured GC parameters when using the `CompactRange` API to perform manual compactions. This PR would involve: - Extending the `CompactRangeOptions` structure so clients can both force-enable and force-disable GC, as well as use a different cutoff than what's currently configured - Storing whether blob GC should actually be enabled during a certain manual compaction and the cutoff to use in the `Compaction` object (considering the above overrides) and passing it to `CompactionIterator` via `CompactionProxy` - Updating the BlobDB wiki to document the new options. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10073 Test Plan: Adding unit tests and adding the new options to the stress test tool. Reviewed By: ltamasi Differential Revision: D36848700 Pulled By: gangliao fbshipit-source-id: c878ef101d1c612429999f513453c319f75d78e9
-
由 Zichen Zhu 提交于
Summary: Currently, the DB directory file descriptor is left open until the deconstruction process (`DB::Close()` does not close the file descriptor). To verify this, comment out the lines between `db_ = nullptr` and `db_->Close()` (line 512, 513, 514, 515 in ldb_cmd.cc) to leak the ``db_'' object, build `ldb` tool and run ``` strace --trace=open,openat,close ./ldb --db=$TEST_TMPDIR --ignore_unknown_options put K1 V1 --create_if_missing ``` There is one directory file descriptor that is not closed in the strace log. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10049 Test Plan: Add a new unit test DBBasicTest.DBCloseAllDirectoryFDs: Open a database with different WAL directory and three different data directories, and all directory file descriptors should be closed after calling Close(). Explicitly call Close() after a directory file descriptor is not used so that the counter of directory open and close should be equivalent. Reviewed By: ajkr, hx235 Differential Revision: D36722135 Pulled By: littlepig2013 fbshipit-source-id: 07bdc2abc417c6b30997b9bbef1f79aa757b21ff
-
由 Guido Tagliavini Ponce 提交于
Summary: Stress tests can run with the experimental FastLRUCache. Crash tests randomly choose between LRUCache and FastLRUCache. Since only LRUCache supports a secondary cache, we validate the `--secondary_cache_uri` and `--cache_type` flags---when `--secondary_cache_uri` is set, the `--cache_type` is set to `lru_cache`. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10081 Test Plan: - To test that the FastLRUCache is used and the stress test runs successfully, run `make -j24 CRASH_TEST_EXT_ARGS=—duration=960 blackbox_crash_test_with_atomic_flush`. The cache type should sometimes be `fast_lru_cache`. - To test the flag validation, run `make -j24 CRASH_TEST_EXT_ARGS="--duration=960 --secondary_cache_uri=x" blackbox_crash_test_with_atomic_flush` multiple times. The test will always be aborted (which is okay). Check that the cache type is always `lru_cache`. Reviewed By: anand1976 Differential Revision: D36839908 Pulled By: guidotag fbshipit-source-id: ebcdfdcd12ec04c96c09ae5b9c9d1e613bdd1725
-
由 Akanksha Mahajan 提交于
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/10092 Reviewed By: riversand963 Differential Revision: D36832311 Pulled By: akankshamahajan15 fbshipit-source-id: 8fb1cf90b1d4dddebbfbeebeddb15f6905968e9b
-
由 Jay Zhuang 提交于
Summary: `db_impl.alive_log_files_` is used to track the WAL size in `db_impl.logs_`. Get the `LogFileNumberSize` obj in `alive_log_files_` the same time as `log_writer` to keep them consistent. For this issue, it's not safe to do `deque::reverse_iterator::operator*` and `deque::pop_front()` concurrently, so remove the tail cache. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10086 Test Plan: ``` # on Windows gtest-parallel ./db_test --gtest_filter=DBTest.FileCreationRandomFailure -r 1000 -w 100 ``` Reviewed By: riversand963 Differential Revision: D36822373 Pulled By: jay-zhuang fbshipit-source-id: 5e738051dfc7bcf6a15d85ba25e6365df6b6a6af
-
由 Changyu Bi 提交于
Summary: Add to HISTORY.md the bug fixed in https://github.com/facebook/rocksdb/issues/10051 Pull Request resolved: https://github.com/facebook/rocksdb/pull/10091 Reviewed By: ajkr Differential Revision: D36821861 Pulled By: cbi42 fbshipit-source-id: 598812fab88f65c0147ece53cff55cf4ea73aac6
-
由 Peter Dillinger 提交于
Summary: We recently saw a case in crash test in which a WAL file in the middle of the list of live WALs was not included in the backup, so the DB was not openable due to missing WAL. We are not sure why, but this change should at least turn that into a backup-time failure by ensuring all the WAL files expected by the manifest (according to VersionSet) are included in `GetSortedWalFiles()` (used by `GetLiveFilesStorageInfo()`, `BackupEngine`, and `Checkpoint`) Related: to maximize the effectiveness of track_and_verify_wals_in_manifest with GetSortedWalFiles() during checkpoint/backup, we will now sync WAL in GetLiveFilesStorageInfo() when track_and_verify_wals_in_manifest=true. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10083 Test Plan: added new unit test for the check in GetSortedWalFiles() Reviewed By: ajkr Differential Revision: D36791608 Pulled By: pdillinger fbshipit-source-id: a27bcf0213fc7ab177760fede50d4375d579afa6
-
由 Akanksha Mahajan 提交于
Summary: In case of non-TransactionDB and avoid_flush_during_recovery = true, RocksDB won't flush the data from WAL to L0 for all column families if possible. As a result, not all column families can increase their log_numbers, and min_log_number_to_keep won't change. For transaction DB (.allow_2pc), even with the flush, there may be old WAL files that it must not delete because they can contain data of uncommitted transactions and min_log_number_to_keep won't change. If we persist a new MANIFEST with advanced log_numbers for some column families, then during a second crash after persisting the MANIFEST, RocksDB will see some column families' log_numbers larger than the corrupted wal, and the "column family inconsistency" error will be hit, causing recovery to fail. As a solution, RocksDB will persist the new MANIFEST after successfully syncing the new WAL. If a future recovery starts from the new MANIFEST, then it means the new WAL is successfully synced. Due to the sentinel empty write batch at the beginning, kPointInTimeRecovery of WAL is guaranteed to go after this point. If future recovery starts from the old MANIFEST, it means the writing the new MANIFEST failed. We won't have the "SST ahead of WAL" error. Currently, RocksDB DB::Open() may creates and writes to two new MANIFEST files even before recovery succeeds. This PR buffers the edits in a structure and writes to a new MANIFEST after recovery is successful Pull Request resolved: https://github.com/facebook/rocksdb/pull/9922 Test Plan: 1. Update unit tests to fail without this change 2. make crast_test -j Branch with unit test and no fix https://github.com/facebook/rocksdb/pull/9942 to keep track of unit test (without fix) Reviewed By: riversand963 Differential Revision: D36043701 Pulled By: akankshamahajan15 fbshipit-source-id: 5760970db0a0920fb73d3c054a4155733500acd9
-
- 01 6月, 2022 1 次提交
-
-
由 Yanqin Jin 提交于
Summary: This variable is actually not being used for anything meaningful, thus remove it. This can make https://github.com/facebook/rocksdb/issues/7516 slightly simpler by reducing the amount of state that must be made lock-free. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10078 Test Plan: make check Reviewed By: ajkr Differential Revision: D36779817 Pulled By: riversand963 fbshipit-source-id: ffb0d9ad6149616917ae5e02bb28102cb90fc406
-