- 29 9月, 2022 1 次提交
-
-
由 Andrew Kryczka 提交于
Summary: Without this fix, db_crashtest.py could fail with useless output such as: `UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 267: invalid start byte` Pull Request resolved: https://github.com/facebook/rocksdb/pull/10753 Reviewed By: hx235 Differential Revision: D39905809 Pulled By: ajkr fbshipit-source-id: 50ba2cf20d206eeb168309cec137e827a34c8f0b
-
- 28 9月, 2022 3 次提交
-
-
由 Yanqin Jin 提交于
Summary: Older versions of gflags do not have `DEFINE_uint32` and `DECLARE_uint32`. In util/gflag_compat.h, we already add a hack for `DEFINE_uint32`. This PR adds a hack for `DECLARE_uint32`. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10729 Test Plan: ROCKSDB_NO_FBCODE=1 make V=1 -j16 db_stress make check Resolves https://github.com/facebook/rocksdb/issues/10704 Reviewed By: pdillinger Differential Revision: D39789183 Pulled By: riversand963 fbshipit-source-id: a58747e0163dcf55dd762733aa5c40d8f0ae70a6
-
由 Hui Xiao 提交于
Summary: An add-on to https://github.com/facebook/rocksdb/pull/6818 to complete adding single-level universal compaction to stress/crash testing. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10732 Test Plan: - Locally run for 10 min `python3 ./tools/db_crashtest.py whitebox --simple --compaction_style=1 --num_levels=1 -max_key=1000000 -value_size_mult=33 -write_buffer_size=524288 -target_file_size_base=524288 -max_bytes_for_level_base=2097152 --duration=120 --interval=10 --ops_per_thread=1000 --random_kill_odd=887` - Check LOG to confirm single-level universal compaction is called - Manual testing and log checking to ensure destroy_db_initially=1 is correctly set across runs with different compaction styles (i.e, in the second half of whitebox testing). - [ongoing]CI jobs stress test Reviewed By: ajkr Differential Revision: D39797612 Pulled By: ajkr fbshipit-source-id: 16f5c40c3464c57360c06c8305f92118e426149c
-
由 Yanqin Jin 提交于
Summary: Currently, this original behavior should not lead to incorrect result, but will violate the contract of CompareWithTimestamp() that when a_has_ts or b_has_ts is false, the slice does not include timestamp. Resolves https://github.com/facebook/rocksdb/issues/10709 Pull Request resolved: https://github.com/facebook/rocksdb/pull/10742 Test Plan: make check Reviewed By: ltamasi Differential Revision: D39834096 Pulled By: riversand963 fbshipit-source-id: c597600f5a7820734f07d0926cdc224cea5eabe1
-
- 27 9月, 2022 5 次提交
-
-
由 Changyu Bi 提交于
Summary: when a new internal iterator is constructed during iterator refresh, pointer to the previous memtable range tombstone iterator was not cleared. This could cause segfault for future `Refresh()` calls when they try to free the memtable range tombstones. This PR fixes this issue. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10739 Test Plan: added a unit test in db_range_del_test.cc to reproduce this issue. Reviewed By: ajkr, riversand963 Differential Revision: D39825283 Pulled By: cbi42 fbshipit-source-id: 3b59a2b73865aed39e28cdd5c1b57eed7991b94c
-
由 Hui Xiao 提交于
Summary: **Context:** Prior to this PR, correctness testing with un-sync data loss [disabled](https://github.com/facebook/rocksdb/pull/10605) transaction (`use_txn=1`) thus all of the `txn_write_policy` . This PR improved that by adding support for one policy - WriteCommit (`txn_write_policy=0`). **Summary:** They key to this support is (a) handle Mark{Begin, End}Prepare/MarkCommit/MarkRollback in constructing ExpectedState under WriteCommit policy correctly and (b) monitor CI jobs and solve any test incompatibility issue till jobs are stable. (b) will be part of the test plan. For (a) - During prepare (i.e, between `MarkBeginPrepare()` and `MarkEndPrepare(xid)`), `ExpectedStateTraceRecordHandler` will buffer all writes by adding all writes to an internal `WriteBatch`. - On `MarkEndPrepare()`, that `WriteBatch` will be associated with the transaction's `xid`. - During the commit (i.e, on `MarkCommit(xid)`), `ExpectedStateTraceRecordHandler` will retrieve and iterate the internal `WriteBatch` and finally apply those writes to `ExpectedState` - During the rollback (i.e, on `MarkRollback(xid)`), `ExpectedStateTraceRecordHandler` will erase the internal `WriteBatch` from the map. For (b) - one major issue described below: - TransactionsDB in db stress recovers prepared-but-not-committed txns from the previous crashed run by randomly committing or rolling back it at the start of the current run, see a historical [PR](https://github.com/facebook/rocksdb/commit/6d06be22c083ccf185fd38dba49fde73b644b4c1) predated correctness testing. - And we will verify those processed keys in a recovered db against their expected state. - However since now we turn on `sync_fault_injection=1` where the expected state is constructed from the trace instead of using the LATEST.state from previous run. The expected state now used to verify those processed keys won't contain UNKNOWN_SENTINEL as they should - see test 1 for a failed case. - Therefore, we decided to manually update its expected state to be UNKNOWN_SENTINEL as part of the processing. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10624 Test Plan: 1. Test exposed the major issue described above. This test will fail without setting UNKNOWN_SENTINEL in expected state during the processing and pass after ``` db=/dev/shm/rocksdb_crashtest_blackbox exp=/dev/shm/rocksdb_crashtest_expected dbt=$db.tmp expt=$exp.tmp rm -rf $db $exp mkdir -p $exp echo "RUN 1" ./db_stress \ --clear_column_family_one_in=0 --column_families=1 --db=$db --delpercent=10 --delrangepercent=0 --destroy_db_initially=0 --expected_values_dir=$exp --iterpercent=0 --key_len_percent_dist=1,30,69 --max_key=1000000 --max_key_len=3 --prefixpercent=0 --readpercent=0 --reopen=0 --ops_per_thread=100000000 --test_batches_snapshots=0 --value_size_mult=32 --writepercent=90 \ --use_txn=1 --txn_write_policy=0 --sync_fault_injection=1 & pid=$! sleep 0.2 sleep 20 kill $pid sleep 0.2 echo "RUN 2" ./db_stress \ --clear_column_family_one_in=0 --column_families=1 --db=$db --delpercent=10 --delrangepercent=0 --destroy_db_initially=0 --expected_values_dir=$exp --iterpercent=0 --key_len_percent_dist=1,30,69 --max_key=1000000 --max_key_len=3 --prefixpercent=0 --readpercent=0 --reopen=0 --ops_per_thread=100000000 --test_batches_snapshots=0 --value_size_mult=32 --writepercent=90 \ --use_txn=1 --txn_write_policy=0 --sync_fault_injection=1 & pid=$! sleep 0.2 sleep 20 kill $pid sleep 0.2 echo "RUN 3" ./db_stress \ --clear_column_family_one_in=0 --column_families=1 --db=$db --delpercent=10 --delrangepercent=0 --destroy_db_initially=0 --expected_values_dir=$exp --iterpercent=0 --key_len_percent_dist=1,30,69 --max_key=1000000 --max_key_len=3 --prefixpercent=0 --readpercent=0 --reopen=0 --ops_per_thread=100000000 --test_batches_snapshots=0 --value_size_mult=32 --writepercent=90 \ --use_txn=1 --txn_write_policy=0 --sync_fault_injection=1 ``` 2. Manual testing to ensure ExpectedState is constructed correctly during recovery by verifying it against previously crashed TransactionDB's WAL. - Run the following command to crash a TransactionDB with WriteCommit policy. Then `./ldb dump_wal` on its WAL file ``` db=/dev/shm/rocksdb_crashtest_blackbox exp=/dev/shm/rocksdb_crashtest_expected rm -rf $db $exp mkdir -p $exp ./db_stress \ --clear_column_family_one_in=0 --column_families=1 --db=$db --delpercent=10 --delrangepercent=0 --destroy_db_initially=0 --expected_values_dir=$exp --iterpercent=0 --key_len_percent_dist=1,30,69 --max_key=1000000 --max_key_len=3 --prefixpercent=0 --readpercent=0 --reopen=0 --ops_per_thread=100000000 --test_batches_snapshots=0 --value_size_mult=32 --writepercent=90 \ --use_txn=1 --txn_write_policy=0 --sync_fault_injection=1 & pid=$! sleep 30 kill $pid sleep 1 ``` - Run the following command to verify recovery of the crashed db under debugger. Compare the step-wise result with WAL records (e.g, WriteBatch content, xid, prepare/commit/rollback marker) ``` ./db_stress \ --clear_column_family_one_in=0 --column_families=1 --db=$db --delpercent=10 --delrangepercent=0 --destroy_db_initially=0 --expected_values_dir=$exp --iterpercent=0 --key_len_percent_dist=1,30,69 --max_key=1000000 --max_key_len=3 --prefixpercent=0 --readpercent=0 --reopen=0 --ops_per_thread=100000000 --test_batches_snapshots=0 --value_size_mult=32 --writepercent=90 \ --use_txn=1 --txn_write_policy=0 --sync_fault_injection=1 ``` 3. Automatic testing by triggering all RocksDB stress/crash test jobs for 3 rounds with no failure. Reviewed By: ajkr, riversand963 Differential Revision: D39199373 Pulled By: hx235 fbshipit-source-id: 7a1dec0e3e2ee6ea86ddf5dd19ceb5543a3d6f0c
-
由 anand76 提交于
Summary: Update the docker image with OpenSSL, required by the folly build. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10741 Reviewed By: jay-zhuang Differential Revision: D39831081 Pulled By: anand1976 fbshipit-source-id: 900154f70a456d1b6f9e384b8bdbcc227af4adbc
-
由 Yanqin Jin 提交于
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/10737 Reviewed By: cbi42 Differential Revision: D39825386 Pulled By: riversand963 fbshipit-source-id: a3c55f2777e034d6ae6ff44ef0219d9fbbf1cc96
-
由 Levi Tamasi 提交于
Summary: The PR cleans up the logic in `NonBatchedOpsStressTest::VerifyDb` so that the verification method is picked using a single random number generation. It also eliminates some repeated key comparisons and makes some small code hygiene improvements. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10740 Test Plan: Ran a simple blackbox crash test. Reviewed By: riversand963 Differential Revision: D39828646 Pulled By: ltamasi fbshipit-source-id: 60ee5a3bb1851278f62c7d83b0c93b902ed9702e
-
- 24 9月, 2022 2 次提交
-
-
由 Yanqin Jin 提交于
Summary: Currently, without this fix, DBImpl::GetLatestSequenceForKey() may not return the latest sequence number for merge operands of the key. This can cause conflict checking during optimistic transaction commit phase to fail. Fix it by always returning the latest sequence number of the key, also considering range tombstones. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10724 Test Plan: make check Reviewed By: cbi42 Differential Revision: D39756847 Pulled By: riversand963 fbshipit-source-id: 0764c3dd4cb24960b37e18adccc6e7feed0e6876
-
由 Alan Paxton 提交于
Summary: Larger size is necessary to stress levels 2, 3 of LSM tree Pull Request resolved: https://github.com/facebook/rocksdb/pull/10649 Reviewed By: ajkr Differential Revision: D39744515 Pulled By: jay-zhuang fbshipit-source-id: 62ff097bfbfdfc26ff1e6290e1e3b71506b7042c
-
- 23 9月, 2022 5 次提交
-
-
由 Levi Tamasi 提交于
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/10723 Reviewed By: riversand963 Differential Revision: D39749277 Pulled By: ltamasi fbshipit-source-id: 4bda94b4620a0db1fcd4309c7ad03fc23e8718cb
-
由 walter 提交于
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/10693 Reviewed By: riversand963 Differential Revision: D39668399 Pulled By: ajkr fbshipit-source-id: 66d46e5104c49d235a14e8df8eec3af285ab9752
-
由 Anatol Pomozov 提交于
Summary: It fixes "egrep: warning: egrep is obsolescent; using grep -E" warning at the systems with newer gnu grep. https://www.phoronix.com/news/GNU-Grep-3.8-Stop-egrep-fgrep Pull Request resolved: https://github.com/facebook/rocksdb/pull/10701 Reviewed By: ajkr Differential Revision: D39737908 Pulled By: cbi42 fbshipit-source-id: 13f3ccfc1a37d18541d156a6b4c2ba24f6c66589
-
由 dependabot[bot] 提交于
Summary: Bumps [commonmarker](https://github.com/gjtorikian/commonmarker) from 0.23.4 to 0.23.6. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/gjtorikian/commonmarker/releases">commonmarker's releases</a>.</em></p> <blockquote> <h2>v0.23.6</h2> <h2>What's Changed</h2> <p>This release includes two updates from the upstream <code>cmark-gfm</code> library, namely:</p> <ul> <li><a href="https://github.com/github/cmark-gfm/releases">DoS vulnerability in autolink extension</a> per <a href="https://github.com/github/cmark-gfm/security/advisories/GHSA-cgh3-p57x-9q7q">GHSA-cgh3-p57x-9q7q</a></li> <li><a href="https://github.com/github/cmark-gfm/releases/tag/0.29.0.gfm.5">Added <code>xmpp:</code> and <code>mailto:</code> support to the autolink extension</a></li> </ul> </blockquote> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/gjtorikian/commonmarker/blob/main/CHANGELOG.md">commonmarker's changelog</a>.</em></p> <blockquote> <h1>Changelog</h1> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/gjtorikian/commonmarker/commit/a8f8d76fbc8c92ddb2e539a06bd93c5f8326705e"><code>a8f8d76</code></a> Merge pull request <a href="https://github-redirect.dependabot.com/gjtorikian/commonmarker/issues/190">https://github.com/facebook/rocksdb/issues/190</a> from anticomputer/main</li> <li><a href="https://github.com/gjtorikian/commonmarker/commit/ac916346314aef3015a713f92f7b46a8c34e98ed"><code>ac91634</code></a>
💎 release 0.23.6</li> <li><a href="https://github.com/gjtorikian/commonmarker/commit/777fd3054be0e0ba18f2fda18ccc7eeeee82db92"><code>777fd30</code></a> Update cmark-upstream to <a href="https://github.com/github/cmark-gfm/commit/9d57d8a23">https://github.com/github/cmark-gfm/commit/9d57d8a23</a>...</li> <li><a href="https://github.com/gjtorikian/commonmarker/commit/7aaeb37e9780e87a9e9fbdddfe1feba1c9f360f4"><code>7aaeb37</code></a> Merge pull request <a href="https://github-redirect.dependabot.com/gjtorikian/commonmarker/issues/188">https://github.com/facebook/rocksdb/issues/188</a> from stevenlaidlaw/update-to-0290gfm5</li> <li><a href="https://github.com/gjtorikian/commonmarker/commit/795e628a406ec67f169440ed3aa84ba9483e2700"><code>795e628</code></a> Update cmark-upstream to <a href="https://github.com/github/cmark-gfm/commit/0578e1e4f">https://github.com/github/cmark-gfm/commit/0578e1e4f</a>...</li> <li><a href="https://github.com/gjtorikian/commonmarker/commit/39d19d65300c5735efcc77b2a57b65c207c013e7"><code>39d19d6</code></a> Update cmark-upstream to <a href="https://github.com/github/cmark-gfm/commit/766f161ef">https://github.com/github/cmark-gfm/commit/766f161ef</a>...</li> <li><a href="https://github.com/gjtorikian/commonmarker/commit/63b7bf89ee1be857c5a5757d6ea678c5c759b8b9"><code>63b7bf8</code></a> Update FUNDING.yml</li> <li><a href="https://github.com/gjtorikian/commonmarker/commit/558c7275b18a7ae16136c0fc55f444458dd8cc58"><code>558c727</code></a> Bump to 0.23.5</li> <li><a href="https://github.com/gjtorikian/commonmarker/commit/41eee7265f501305834d80e3d045ed6c9df77de2"><code>41eee72</code></a> lint</li> <li><a href="https://github.com/gjtorikian/commonmarker/commit/897e8ed07d04b902a5cc3c928852bafdab4468aa"><code>897e8ed</code></a> Merge pull request <a href="https://github-redirect.dependabot.com/gjtorikian/commonmarker/issues/180">https://github.com/facebook/rocksdb/issues/180</a> from lumaxis/main</li> <li>Additional commits viewable in <a href="https://github.com/gjtorikian/commonmarker/compare/v0.23.4...v0.23.6">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=commonmarker&package-manager=bundler&previous-version=0.23.4&new-version=0.23.6)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `dependabot rebase` will rebase this PR - `dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `dependabot merge` will merge this PR after your CI passes on it - `dependabot squash and merge` will squash and merge this PR after your CI passes on it - `dependabot cancel merge` will cancel a previously requested merge and block automerging - `dependabot reopen` will reopen this PR if it is closed - `dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) - `dependabot use these labels` will set the current labels as the default for future PRs for this repo and language - `dependabot use these reviewers` will set the current reviewers as the default for future PRs for this repo and language - `dependabot use these assignees` will set the current assignees as the default for future PRs for this repo and language - `dependabot use this milestone` will set the current milestone as the default for future PRs for this repo and language You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/facebook/rocksdb/network/alerts). </details> Pull Request resolved: https://github.com/facebook/rocksdb/pull/10722 Reviewed By: riversand963 Differential Revision: D39735507 Pulled By: ajkr fbshipit-source-id: 3a54f9542a3d2b88c2993dd8a125b0e7f990a886 -
由 Peter Dillinger 提交于
Summary: We have a lot of confusing code because of mixed, sometimes completely opposite uses of of the term "raw block" or "raw contents", sometimes within the same source file. For example, in `BlockBasedTableBuilder`, `raw_block_contents` and `raw_size` generally referred to uncompressed block contents and size, while `WriteRawBlock` referred to writing a block that is already compressed if it is going to be. Meanwhile, in `BlockBasedTable`, `raw_block_contents` either referred to a (maybe compressed) block with trailer, or a maybe compressed block maybe without trailer. (Note: left as follow-up work to use C++ typing to better sort out the various kinds of BlockContents.) This change primarily tries to apply some consistent terminology around the kinds of block representations, avoiding the unclear "raw". (Any meaning of "raw" assumes some bias toward the storage layer or toward the logical data layer.) Preferred terminology: * **Serialized block** - bytes that go into storage. For block-based table (usually the case) this includes the block trailer. WART: block `size` may or may not include the trailer; need to be clear about whether it does or not. * **Maybe compressed block** - like a serialized block, but without the trailer (or no promise of including a trailer). Must be accompanied by a CompressionType. * **Uncompressed block** - "payload" bytes that are either stored with no compression, used as input to compression function, or result of decompression function. * **Parsed block** - an in-memory form of a block in block cache, as it is used by the table reader. Different C++ types are used depending on the block type (see block_like_traits.h). Other refactorings: * Misc corrections/improvements of internal API comments * Remove a few misleading / unhelpful / redundant comments. * Use move semantics in some places to simplify contracts * Use better parameter names to indicate which parameters are used for outputs * Remove some extraneous `extern` * Various clean-ups to `CacheDumperImpl` (mostly unnecessary code) Pull Request resolved: https://github.com/facebook/rocksdb/pull/10408 Test Plan: existing tests Reviewed By: akankshamahajan15 Differential Revision: D38172617 Pulled By: pdillinger fbshipit-source-id: ccb99299f324ac5ca46996d34c5089621a4f260c
-
- 22 9月, 2022 7 次提交
-
-
由 Levi Tamasi 提交于
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/10718 Reviewed By: riversand963 Differential Revision: D39707115 Pulled By: ltamasi fbshipit-source-id: 59aec8c732482f063d0abaad4d9200ba57ebf437
-
由 Changyu Bi 提交于
Summary: Mention in HISTORY.md the fix in https://github.com/facebook/rocksdb/issues/10705. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10720 Reviewed By: riversand963 Differential Revision: D39709455 Pulled By: cbi42 fbshipit-source-id: 1a2c9dd8425c73f7095ddb1d0b1cca8ed35b7ef2
-
由 anand76 提交于
Summary: Change the library order in PLATFORM_LDFLAGS to enable fbcode platform 10 build with folly. This PR also has a few fixes for platform 10 compiler errors. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10708 Test Plan: ROCKSDB_FBCODE_BUILD_WITH_PLATFORM010=1 USE_COROUTINES=1 make -j64 check ROCKSDB_FBCODE_BUILD_WITH_PLATFORM010=1 USE_FOLLY=1 make -j64 check Reviewed By: ajkr Differential Revision: D39666590 Pulled By: anand1976 fbshipit-source-id: 256a1127ef561399cd6299a6a392ca29bd68ca44
-
由 Akanksha Mahajan 提交于
Summary: Update io_uring_prep_cancel as it is now backward compatible. Also, io_uring_prep_cancel expects sqe->addr to match with read request submitted. It's being set wrong which is now fixed in this PR. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10644 Test Plan: - Ran internally with lastest liburing package and on RocksDB github repo with older version. - Ran seekrandom regression to confirm there is no regression. Reviewed By: anand1976 Differential Revision: D39284229 Pulled By: akankshamahajan15 fbshipit-source-id: fd52cdf23d676da114896163626b75c8ae09c980
-
由 Changyu Bi 提交于
Summary: Fix potential memory leak in ArenaWrappedDBIter::Refresh() introduced in https://github.com/facebook/rocksdb/issues/10705. See https://github.com/facebook/rocksdb/pull/10705#discussion_r976765905 for detail. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10716 Test Plan: make check Reviewed By: ajkr Differential Revision: D39698561 Pulled By: cbi42 fbshipit-source-id: dc0d0c6e3878eaa84f87623fbe4916b9b08b077a
-
由 Bo Wang 提交于
Summary: As title Pull Request resolved: https://github.com/facebook/rocksdb/pull/10717 Test Plan: Unit Tests CI Reviewed By: riversand963 Differential Revision: D39700707 Pulled By: gitbw95 fbshipit-source-id: 54de27e695535a50159f5f6467da36aaf21bebae
-
由 Changyu Bi 提交于
Summary: when there is a single memtable without range tombstones and no SST files in the database, DBIter should wrap memtable iterator directly. Currently we create a merging iterator on top of the memtable iterator, and have DBIter wrap around it. This causes iterator regression and this PR fixes this issue. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10705 Test Plan: - `make check` - Performance: - Set up: `./db_bench -benchmarks=filluniquerandom -write_buffer_size=$((1 << 30)) -num=10000` - Benchmark: `./db_bench -benchmarks=seekrandom -use_existing_db=true -avoid_flush_during_recovery=true -write_buffer_size=$((1 << 30)) -num=10000 -threads=16 -duration=60 -seek_nexts=$seek_nexts` ``` seek_nexts main op/sec https://github.com/facebook/rocksdb/issues/10705 RocksDB v7.6 0 5746568 5749033 5786180 30 2411690 3006466 2837699 1000 102556 128902 124667 ``` Reviewed By: ajkr Differential Revision: D39644221 Pulled By: cbi42 fbshipit-source-id: 8063ff611ba31b0e5670041da3927c8c54b2097d
-
- 21 9月, 2022 1 次提交
-
-
由 Bo Wang 提交于
Summary: Enable BLACK for internal_repo_rocksdb. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10710 Reviewed By: riversand963, zsol Differential Revision: D39666245 Pulled By: gitbw95 fbshipit-source-id: ef364318d2bbba66e96f3211dd6a975174d52c21
-
- 20 9月, 2022 3 次提交
-
-
由 Jay Zhuang 提交于
Summary: There're 2 knobs to disable blobdb, adding that. Also print call stack when there's assert failure. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10699 Reviewed By: gitbw95 Differential Revision: D39596448 Pulled By: jay-zhuang fbshipit-source-id: 5ce9fd0630d8b6ff1e157a2685a1e80a99997098
-
由 Jay Zhuang 提交于
Summary: The background compaction may still running while the test end, which would cause ASAN stack-use-after-scope error. Explicitly close the DB before test end. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10697 Test Plan: able to reproduce with: ``` gtest-parallel ./compaction_service_test --gtest_filter=CompactionServiceTest.BasicCompactions -r 10000 -w 100 ``` Reviewed By: gitbw95 Differential Revision: D39590974 Pulled By: jay-zhuang fbshipit-source-id: da264b2e6a276afbda7d5ff7adb9d7b8d4213d90
-
由 Yanqin Jin 提交于
Summary: This PR bumps up version number from 7.7 to 7.8 in main branch, indicating that next release will be 7.8. We are going to release 7.7 soon. Since 7.7.fb branch has been created, we can land this to main. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10694 Reviewed By: gitbw95 Differential Revision: D39581577 Pulled By: riversand963 fbshipit-source-id: 84f3fecf25fd9ac96e46b4cd6d50ddb6edc89427
-
- 19 9月, 2022 1 次提交
-
-
由 anand76 提交于
Summary: Fix invalid reference in MultiGet due to resizing of the ```batches``` autovector. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10702 Test Plan: Run asan crash test Reviewed By: riversand963 Differential Revision: D39608753 Pulled By: anand1976 fbshipit-source-id: 7a9e7fc6f436f08eb22003d0e6b0e1e4dcdc1a2a
-
- 17 9月, 2022 3 次提交
-
-
由 gitbw95 提交于
Summary: `enable_custom_split_merge` is added for enabling the custom split and merge feature, which split the compressed value into chunks so that they may better fit jemalloc bins. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10690 Test Plan: Unit Tests Stress Tests Reviewed By: anand1976 Differential Revision: D39567604 Pulled By: gitbw95 fbshipit-source-id: f6d1d46200f365220055f793514601dcb0edc4b7
-
由 anand76 提交于
Summary: The assertion in ```FilePickerMultiGet::ReplaceRange()``` was incorrect. The function should only be called to replace the range after finishing the search in the current level, which is indicated by ```hit_file_ == nullptr``` i.e no more overlapping files in this level. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10695 Reviewed By: gitbw95 Differential Revision: D39583217 Pulled By: anand1976 fbshipit-source-id: d4cedfb2b62fb9f3a083e9848a403ae6342f0519
-
由 Peter Dillinger 提交于
Summary: This change establishes a distinctive name for the experimental new lock-free clock cache (originally developed by guidotag and revamped in PR https://github.com/facebook/rocksdb/issues/10626). A few reasons: * We want to make it clear that this is a fundamentally different implementation vs. the old clock cache, to avoid people saying "I already tried clock cache." * We want to highlight the key feature: it's fast (especially under parallel load) * Because it requires an estimated charge per entry, it is not drop-in API compatible with old clock cache. This estimate might always be required for highest performance, and giving it a distinct name should reduce confusion about the distinct API requirements. * We might develop a variant requiring the same estimate parameter but with LRU eviction. In that case, using the name HyperLRUCache should make things more clear. (FastLRUCache is just a prototype that might soon be removed.) Some API detail: * To reduce copy-pasting parameter lists, etc. as in LRUCache construction, I have a `MakeSharedCache()` function on `HyperClockCacheOptions` instead of `NewHyperClockCache()`. * Changes -cache_type=clock_cache to -cache_type=hyper_clock_cache for applicable tools. I think this is more consistent / sustainable for reasons already stated. For performance tests see https://github.com/facebook/rocksdb/pull/10626 Pull Request resolved: https://github.com/facebook/rocksdb/pull/10684 Test Plan: no interesting functional changes; tests updated Reviewed By: anand1976 Differential Revision: D39547800 Pulled By: pdillinger fbshipit-source-id: 5c0fe1b5cf3cb680ab369b928c8569682b9795bf
-
- 16 9月, 2022 7 次提交
-
-
由 Peter Dillinger 提交于
Summary: * Consolidates most metadata into a single word per slot so that more can be accomplished with a single atomic update. In the common case, Lookup was previously about 4 atomic updates, now just 1 atomic update. Common case Release was previously 1 atomic read + 1 atomic update, now just 1 atomic update. * Eliminate spins / waits / yields, which likely threaten some "lock free" benefits. Compare-exchange loops are only used in explicit Erase, and strict_capacity_limit=true Insert. Eviction uses opportunistic compare- exchange. * Relaxes some aggressiveness and guarantees. For example, * Duplicate Inserts will sometimes go undetected and the shadow duplicate will age out with eviction. * In many cases, the older Inserted value for a given cache key will be kept (i.e. Insert does not support overwrite). * Entries explicitly erased (rather than evicted) might not be freed immediately in some rare cases. * With strict_capacity_limit=false, capacity limit is not tracked/enforced as precisely as LRUCache, but is self-correcting and should only deviate by a very small number of extra or fewer entries. * Use smaller "computed default" number of cache shards in many cases, because benefits to larger usage tracking / eviction pools outweigh the small cost of more lock-free atomic contention. The improvement in CPU and I/O is dramatic in some limit-memory cases. * Even without the sharding change, the eviction algorithm is likely more effective than LRU overall because it's more stateful, even though the "hot path" state tracking for it is essentially free with ref counting. It is like a generalized CLOCK with aging (see code comments). I don't have performance numbers showing a specific improvement, but in theory, for a Poisson access pattern to each block, keeping some state allows better estimation of time to next access (Poisson interval) than strict LRU. The bounded randomness in CLOCK can also reduce "cliff" effect for repeated range scans approaching and exceeding cache size. ## Hot path algorithm comparison Rough descriptions, focusing on number and kind of atomic operations: * Old `Lookup()` (2-5 atomic updates per probe): ``` Loop: Increment internal ref count at slot If possible hit: Check flags atomic (and non-atomic fields) If cache hit: Three distinct updates to 'flags' atomic Increment refs for internal-to-external Return Decrement internal ref count while atomic read 'displacements' > 0 ``` * New `Lookup()` (1-2 atomic updates per probe): ``` Loop: Increment acquire counter in meta word (optimistic) If visible entry (already read meta word): If match (read non-atomic fields): Return Else: Decrement acquire counter in meta word Else if invisible entry (rare, already read meta word): Decrement acquire counter in meta word while atomic read 'displacements' > 0 ``` * Old `Release()` (1 atomic update, conditional on atomic read, rarely more): ``` Read atomic ref count If last reference and invisible (rare): Use CAS etc. to remove Return Else: Decrement ref count ``` * New `Release()` (1 unconditional atomic update, rarely more): ``` Increment release counter in meta word If last reference and invisible (rare): Use CAS etc. to remove Return ``` ## Performance test setup Build DB with ``` TEST_TMPDIR=/dev/shm ./db_bench -benchmarks=fillrandom -num=30000000 -disable_wal=1 -bloom_bits=16 ``` Test with ``` TEST_TMPDIR=/dev/shm ./db_bench -benchmarks=readrandom -readonly -num=30000000 -bloom_bits=16 -cache_index_and_filter_blocks=1 -cache_size=${CACHE_MB}000000 -duration 60 -threads=$THREADS -statistics ``` Numbers on a single socket Skylake Xeon system with 48 hardware threads, DEBUG_LEVEL=0 PORTABLE=0. Very similar story on a dual socket system with 80 hardware threads. Using (every 2nd) Fibonacci MB cache sizes to sample the territory between powers of two. Configurations: base: LRUCache before this change, but with db_bench change to default cache_numshardbits=-1 (instead of fixed at 6) folly: LRUCache before this change, with folly enabled (distributed mutex) but on an old compiler (sorry) gt_clock: experimental ClockCache before this change new_clock: experimental ClockCache with this change ## Performance test results First test "hot path" read performance, with block cache large enough for whole DB: 4181MB 1thread base -> kops/s: 47.761 4181MB 1thread folly -> kops/s: 45.877 4181MB 1thread gt_clock -> kops/s: 51.092 4181MB 1thread new_clock -> kops/s: 53.944 4181MB 16thread base -> kops/s: 284.567 4181MB 16thread folly -> kops/s: 249.015 4181MB 16thread gt_clock -> kops/s: 743.762 4181MB 16thread new_clock -> kops/s: 861.821 4181MB 24thread base -> kops/s: 303.415 4181MB 24thread folly -> kops/s: 266.548 4181MB 24thread gt_clock -> kops/s: 975.706 4181MB 24thread new_clock -> kops/s: 1205.64 (~= 24 * 53.944) 4181MB 32thread base -> kops/s: 311.251 4181MB 32thread folly -> kops/s: 274.952 4181MB 32thread gt_clock -> kops/s: 1045.98 4181MB 32thread new_clock -> kops/s: 1370.38 4181MB 48thread base -> kops/s: 310.504 4181MB 48thread folly -> kops/s: 268.322 4181MB 48thread gt_clock -> kops/s: 1195.65 4181MB 48thread new_clock -> kops/s: 1604.85 (~= 24 * 1.25 * 53.944) 4181MB 64thread base -> kops/s: 307.839 4181MB 64thread folly -> kops/s: 272.172 4181MB 64thread gt_clock -> kops/s: 1204.47 4181MB 64thread new_clock -> kops/s: 1615.37 4181MB 128thread base -> kops/s: 310.934 4181MB 128thread folly -> kops/s: 267.468 4181MB 128thread gt_clock -> kops/s: 1188.75 4181MB 128thread new_clock -> kops/s: 1595.46 Whether we have just one thread on a quiet system or an overload of threads, the new version wins every time in thousand-ops per second, sometimes dramatically so. Mutex-based implementation quickly becomes contention-limited. New clock cache shows essentially perfect scaling up to number of physical cores (24), and then each hyperthreaded core adding about 1/4 the throughput of an additional physical core (see 48 thread case). Block cache miss rates (omitted above) are negligible across the board. With partitioned instead of full filters, the maximum speed-up vs. base is more like 2.5x rather than 5x. Now test a large block cache with low miss ratio, but some eviction is required: 1597MB 1thread base -> kops/s: 46.603 io_bytes/op: 1584.63 miss_ratio: 0.0201066 max_rss_mb: 1589.23 1597MB 1thread folly -> kops/s: 45.079 io_bytes/op: 1530.03 miss_ratio: 0.019872 max_rss_mb: 1550.43 1597MB 1thread gt_clock -> kops/s: 48.711 io_bytes/op: 1566.63 miss_ratio: 0.0198923 max_rss_mb: 1691.4 1597MB 1thread new_clock -> kops/s: 51.531 io_bytes/op: 1589.07 miss_ratio: 0.0201969 max_rss_mb: 1583.56 1597MB 32thread base -> kops/s: 301.174 io_bytes/op: 1439.52 miss_ratio: 0.0184218 max_rss_mb: 1656.59 1597MB 32thread folly -> kops/s: 273.09 io_bytes/op: 1375.12 miss_ratio: 0.0180002 max_rss_mb: 1586.8 1597MB 32thread gt_clock -> kops/s: 904.497 io_bytes/op: 1411.29 miss_ratio: 0.0179934 max_rss_mb: 1775.89 1597MB 32thread new_clock -> kops/s: 1182.59 io_bytes/op: 1440.77 miss_ratio: 0.0185449 max_rss_mb: 1636.45 1597MB 128thread base -> kops/s: 309.91 io_bytes/op: 1438.25 miss_ratio: 0.018399 max_rss_mb: 1689.98 1597MB 128thread folly -> kops/s: 267.605 io_bytes/op: 1394.16 miss_ratio: 0.0180286 max_rss_mb: 1631.91 1597MB 128thread gt_clock -> kops/s: 691.518 io_bytes/op: 9056.73 miss_ratio: 0.0186572 max_rss_mb: 1982.26 1597MB 128thread new_clock -> kops/s: 1406.12 io_bytes/op: 1440.82 miss_ratio: 0.0185463 max_rss_mb: 1685.63 610MB 1thread base -> kops/s: 45.511 io_bytes/op: 2279.61 miss_ratio: 0.0290528 max_rss_mb: 615.137 610MB 1thread folly -> kops/s: 43.386 io_bytes/op: 2217.29 miss_ratio: 0.0289282 max_rss_mb: 600.996 610MB 1thread gt_clock -> kops/s: 46.207 io_bytes/op: 2275.51 miss_ratio: 0.0290057 max_rss_mb: 637.934 610MB 1thread new_clock -> kops/s: 48.879 io_bytes/op: 2283.1 miss_ratio: 0.0291253 max_rss_mb: 613.5 610MB 32thread base -> kops/s: 306.59 io_bytes/op: 2250 miss_ratio: 0.0288721 max_rss_mb: 683.402 610MB 32thread folly -> kops/s: 269.176 io_bytes/op: 2187.86 miss_ratio: 0.0286938 max_rss_mb: 628.742 610MB 32thread gt_clock -> kops/s: 855.097 io_bytes/op: 2279.26 miss_ratio: 0.0288009 max_rss_mb: 733.062 610MB 32thread new_clock -> kops/s: 1121.47 io_bytes/op: 2244.29 miss_ratio: 0.0289046 max_rss_mb: 666.453 610MB 128thread base -> kops/s: 305.079 io_bytes/op: 2252.43 miss_ratio: 0.0288884 max_rss_mb: 723.457 610MB 128thread folly -> kops/s: 269.583 io_bytes/op: 2204.58 miss_ratio: 0.0287001 max_rss_mb: 676.426 610MB 128thread gt_clock -> kops/s: 53.298 io_bytes/op: 8128.98 miss_ratio: 0.0292452 max_rss_mb: 956.273 610MB 128thread new_clock -> kops/s: 1301.09 io_bytes/op: 2246.04 miss_ratio: 0.0289171 max_rss_mb: 788.812 The new version is still winning every time, sometimes dramatically so, and we can tell from the maximum resident memory numbers (which contain some noise, by the way) that the new cache is not cheating on memory usage. IMPORTANT: The previous generation experimental clock cache appears to hit a serious bottleneck in the higher thread count configurations, presumably due to some of its waiting functionality. (The same bottleneck is not seen with partitioned index+filters.) Now we consider even smaller cache sizes, with higher miss ratios, eviction work, etc. 233MB 1thread base -> kops/s: 10.557 io_bytes/op: 227040 miss_ratio: 0.0403105 max_rss_mb: 247.371 233MB 1thread folly -> kops/s: 15.348 io_bytes/op: 112007 miss_ratio: 0.0372238 max_rss_mb: 245.293 233MB 1thread gt_clock -> kops/s: 6.365 io_bytes/op: 244854 miss_ratio: 0.0413873 max_rss_mb: 259.844 233MB 1thread new_clock -> kops/s: 47.501 io_bytes/op: 2591.93 miss_ratio: 0.0330989 max_rss_mb: 242.461 233MB 32thread base -> kops/s: 96.498 io_bytes/op: 363379 miss_ratio: 0.0459966 max_rss_mb: 479.227 233MB 32thread folly -> kops/s: 109.95 io_bytes/op: 314799 miss_ratio: 0.0450032 max_rss_mb: 400.738 233MB 32thread gt_clock -> kops/s: 2.353 io_bytes/op: 385397 miss_ratio: 0.048445 max_rss_mb: 500.688 233MB 32thread new_clock -> kops/s: 1088.95 io_bytes/op: 2567.02 miss_ratio: 0.0330593 max_rss_mb: 303.402 233MB 128thread base -> kops/s: 84.302 io_bytes/op: 378020 miss_ratio: 0.0466558 max_rss_mb: 1051.84 233MB 128thread folly -> kops/s: 89.921 io_bytes/op: 338242 miss_ratio: 0.0460309 max_rss_mb: 812.785 233MB 128thread gt_clock -> kops/s: 2.588 io_bytes/op: 462833 miss_ratio: 0.0509158 max_rss_mb: 1109.94 233MB 128thread new_clock -> kops/s: 1299.26 io_bytes/op: 2565.94 miss_ratio: 0.0330531 max_rss_mb: 361.016 89MB 1thread base -> kops/s: 0.574 io_bytes/op: 5.35977e+06 miss_ratio: 0.274427 max_rss_mb: 91.3086 89MB 1thread folly -> kops/s: 0.578 io_bytes/op: 5.16549e+06 miss_ratio: 0.27276 max_rss_mb: 96.8984 89MB 1thread gt_clock -> kops/s: 0.512 io_bytes/op: 4.13111e+06 miss_ratio: 0.242817 max_rss_mb: 119.441 89MB 1thread new_clock -> kops/s: 48.172 io_bytes/op: 2709.76 miss_ratio: 0.0346162 max_rss_mb: 100.754 89MB 32thread base -> kops/s: 5.779 io_bytes/op: 6.14192e+06 miss_ratio: 0.320399 max_rss_mb: 311.812 89MB 32thread folly -> kops/s: 5.601 io_bytes/op: 5.83838e+06 miss_ratio: 0.313123 max_rss_mb: 252.418 89MB 32thread gt_clock -> kops/s: 0.77 io_bytes/op: 3.99236e+06 miss_ratio: 0.236296 max_rss_mb: 396.422 89MB 32thread new_clock -> kops/s: 1064.97 io_bytes/op: 2687.23 miss_ratio: 0.0346134 max_rss_mb: 155.293 89MB 128thread base -> kops/s: 4.959 io_bytes/op: 6.20297e+06 miss_ratio: 0.323945 max_rss_mb: 823.43 89MB 128thread folly -> kops/s: 4.962 io_bytes/op: 5.9601e+06 miss_ratio: 0.319857 max_rss_mb: 626.824 89MB 128thread gt_clock -> kops/s: 1.009 io_bytes/op: 4.1083e+06 miss_ratio: 0.242512 max_rss_mb: 1095.32 89MB 128thread new_clock -> kops/s: 1224.39 io_bytes/op: 2688.2 miss_ratio: 0.0346207 max_rss_mb: 218.223 ^ Now something interesting has happened: the new clock cache has gained a dramatic lead in the single-threaded case, and this is because the cache is so small, and full filters are so big, that dividing the cache into 64 shards leads to significant (random) imbalances in cache shards and excessive churn in imbalanced shards. This new clock cache only uses two shards for this configuration, and that helps to ensure that entries are part of a sufficiently big pool that their eviction order resembles the single-shard order. (This effect is not seen with partitioned index+filters.) Even smaller cache size: 34MB 1thread base -> kops/s: 0.198 io_bytes/op: 1.65342e+07 miss_ratio: 0.939466 max_rss_mb: 48.6914 34MB 1thread folly -> kops/s: 0.201 io_bytes/op: 1.63416e+07 miss_ratio: 0.939081 max_rss_mb: 45.3281 34MB 1thread gt_clock -> kops/s: 0.448 io_bytes/op: 4.43957e+06 miss_ratio: 0.266749 max_rss_mb: 100.523 34MB 1thread new_clock -> kops/s: 1.055 io_bytes/op: 1.85439e+06 miss_ratio: 0.107512 max_rss_mb: 75.3125 34MB 32thread base -> kops/s: 3.346 io_bytes/op: 1.64852e+07 miss_ratio: 0.93596 max_rss_mb: 180.48 34MB 32thread folly -> kops/s: 3.431 io_bytes/op: 1.62857e+07 miss_ratio: 0.935693 max_rss_mb: 137.531 34MB 32thread gt_clock -> kops/s: 1.47 io_bytes/op: 4.89704e+06 miss_ratio: 0.295081 max_rss_mb: 392.465 34MB 32thread new_clock -> kops/s: 8.19 io_bytes/op: 3.70456e+06 miss_ratio: 0.20826 max_rss_mb: 519.793 34MB 128thread base -> kops/s: 2.293 io_bytes/op: 1.64351e+07 miss_ratio: 0.931866 max_rss_mb: 449.484 34MB 128thread folly -> kops/s: 2.34 io_bytes/op: 1.6219e+07 miss_ratio: 0.932023 max_rss_mb: 396.457 34MB 128thread gt_clock -> kops/s: 1.798 io_bytes/op: 5.4241e+06 miss_ratio: 0.324881 max_rss_mb: 1104.41 34MB 128thread new_clock -> kops/s: 10.519 io_bytes/op: 2.39354e+06 miss_ratio: 0.136147 max_rss_mb: 1050.52 As the miss ratio gets higher (say, above 10%), the CPU time spent in eviction starts to erode the advantage of using fewer shards (13% miss rate much lower than 94%). LRU's O(1) eviction time can eventually pay off when there's enough block cache churn: 13MB 1thread base -> kops/s: 0.195 io_bytes/op: 1.65732e+07 miss_ratio: 0.946604 max_rss_mb: 45.6328 13MB 1thread folly -> kops/s: 0.197 io_bytes/op: 1.63793e+07 miss_ratio: 0.94661 max_rss_mb: 33.8633 13MB 1thread gt_clock -> kops/s: 0.519 io_bytes/op: 4.43316e+06 miss_ratio: 0.269379 max_rss_mb: 100.684 13MB 1thread new_clock -> kops/s: 0.176 io_bytes/op: 1.54148e+07 miss_ratio: 0.91545 max_rss_mb: 66.2383 13MB 32thread base -> kops/s: 3.266 io_bytes/op: 1.65544e+07 miss_ratio: 0.943386 max_rss_mb: 132.492 13MB 32thread folly -> kops/s: 3.396 io_bytes/op: 1.63142e+07 miss_ratio: 0.943243 max_rss_mb: 101.863 13MB 32thread gt_clock -> kops/s: 2.758 io_bytes/op: 5.13714e+06 miss_ratio: 0.310652 max_rss_mb: 396.121 13MB 32thread new_clock -> kops/s: 3.11 io_bytes/op: 1.23419e+07 miss_ratio: 0.708425 max_rss_mb: 321.758 13MB 128thread base -> kops/s: 2.31 io_bytes/op: 1.64823e+07 miss_ratio: 0.939543 max_rss_mb: 425.539 13MB 128thread folly -> kops/s: 2.339 io_bytes/op: 1.6242e+07 miss_ratio: 0.939966 max_rss_mb: 346.098 13MB 128thread gt_clock -> kops/s: 3.223 io_bytes/op: 5.76928e+06 miss_ratio: 0.345899 max_rss_mb: 1087.77 13MB 128thread new_clock -> kops/s: 2.984 io_bytes/op: 1.05341e+07 miss_ratio: 0.606198 max_rss_mb: 898.27 gt_clock is clearly blowing way past its memory budget for lower miss rates and best throughput. new_clock also seems to be exceeding budgets, and this warrants more investigation but is not the use case we are targeting with the new cache. With partitioned index+filter, the miss ratio is much better, and although still high enough that the eviction CPU time is definitely offsetting mutex contention: 13MB 1thread base -> kops/s: 16.326 io_bytes/op: 23743.9 miss_ratio: 0.205362 max_rss_mb: 65.2852 13MB 1thread folly -> kops/s: 15.574 io_bytes/op: 19415 miss_ratio: 0.184157 max_rss_mb: 56.3516 13MB 1thread gt_clock -> kops/s: 14.459 io_bytes/op: 22873 miss_ratio: 0.198355 max_rss_mb: 63.9688 13MB 1thread new_clock -> kops/s: 16.34 io_bytes/op: 24386.5 miss_ratio: 0.210512 max_rss_mb: 61.707 13MB 128thread base -> kops/s: 289.786 io_bytes/op: 23710.9 miss_ratio: 0.205056 max_rss_mb: 103.57 13MB 128thread folly -> kops/s: 185.282 io_bytes/op: 19433.1 miss_ratio: 0.184275 max_rss_mb: 116.219 13MB 128thread gt_clock -> kops/s: 354.451 io_bytes/op: 23150.6 miss_ratio: 0.200495 max_rss_mb: 102.871 13MB 128thread new_clock -> kops/s: 295.359 io_bytes/op: 24626.4 miss_ratio: 0.212452 max_rss_mb: 121.109 Pull Request resolved: https://github.com/facebook/rocksdb/pull/10626 Test Plan: updated unit tests, stress/crash test runs including with TSAN, ASAN, UBSAN Reviewed By: anand1976 Differential Revision: D39368406 Pulled By: pdillinger fbshipit-source-id: 5afc44da4c656f8f751b44552bbf27bd3ca6fef9
-
由 anand76 提交于
Summary: The stats were not accurate for the coroutine version of MultiGet. This PR fixes it. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10673 Reviewed By: akankshamahajan15 Differential Revision: D39492615 Pulled By: anand1976 fbshipit-source-id: b46c04e15ea27e66f4c31f00c66497aa283bf9d3
-
由 Yanqin Jin 提交于
Summary: Hopefully, we can re-enable the combination of user-defined timestamp and subcompactions after https://github.com/facebook/rocksdb/issues/10658. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10689 Test Plan: Make sure the following succeeds on devserver. make crash_test_with_ts Reviewed By: ltamasi Differential Revision: D39556558 Pulled By: riversand963 fbshipit-source-id: 4695f420b1bc9ebf3b24640b693746f4db82c149
-
由 anand76 提交于
Summary: Fix a bug in the async IO/coroutine version of MultiGet that may cause a segfault or assertion failure due to accessing an invalid file index in a LevelFilesBrief. The bug is that when a MultiGetRange is split into two, we may re-process keys in the original range that were already marked to be skipped (in ```current_level_range_```) due to not overlapping the level. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10688 Reviewed By: gitbw95 Differential Revision: D39556131 Pulled By: anand1976 fbshipit-source-id: 65e79438508a283cb19e64eca5c91d0714b81458
-
由 Andrew Kryczka 提交于
Summary: One problem of the previous strategy was `NonBatchedOpsStressTest::TestIngestExternalFile()` could release the lock for `rand_keys[0]` in `rand_column_families[0]`, and then subsequent operations in the same loop iteration (e.g., `TestPut()`) would run without locking. This PR changes the strategy so each `Test*()` function is responsible for acquiring and releasing its own locks. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10678 Reviewed By: hx235 Differential Revision: D39516401 Pulled By: ajkr fbshipit-source-id: bf67f12ebbd293ba8c24fdf8754ff28737bcd758
-
由 Levi Tamasi 提交于
Summary: The patch makes it possible to use the `JemallocNodumpAllocator` with the block/blob caches in `db_bench`. In addition to its stated purpose of excluding cache contents from core dumps, `JemallocNodumpAllocator` also uses a dedicated arena and jemalloc tcaches for cache allocations, which can reduce fragmentation and thus memory usage. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10685 Reviewed By: riversand963 Differential Revision: D39552261 Pulled By: ltamasi fbshipit-source-id: b5c58eab6b7c1baa9a307d9f1248df1d7a77d2b5
-
由 Bo Wang 提交于
Summary: Disable this flaky test since PersistentCache is not used. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10683 Test Plan: Unit Tests Reviewed By: cbi42 Differential Revision: D39545974 Pulled By: gitbw95 fbshipit-source-id: ac53e96f6ba880e7612e325eb5ff22ee2799efed
-
- 15 9月, 2022 2 次提交
-
-
由 Jay Zhuang 提交于
Summary: Disable the tiered storage + BlobDB test. Also enable different hot data setting for Tiered compaction Pull Request resolved: https://github.com/facebook/rocksdb/pull/10681 Reviewed By: ajkr Differential Revision: D39531941 Pulled By: jay-zhuang fbshipit-source-id: aa0595eb38d03f17638d300d2e4cc9061429bf61
-
由 Jay Zhuang 提交于
Summary: Consolidate compaction output cut logic to `ShouldStopBefore()` and move it inside of CompactionOutputs class. Pull Request resolved: https://github.com/facebook/rocksdb/pull/10629 Reviewed By: cbi42 Differential Revision: D39315536 Pulled By: jay-zhuang fbshipit-source-id: 7d81037babbd35c276bbaad02dbc2bb555fdac18
-