提交 · e628f59e87ddad7a859949413421cd1e743d6a2a · kvdb / rocksdb

07 1月, 2021 1 次提交

Create a CustomEnv class; Add WinFileSystem; Make LegacyFileSystemWrapper private (#7703) · e628f59e

由 mrambacher 提交于 1月 06, 2021

Summary:
This PR does the following:
-> Creates a WinFileSystem class.  This class is the Windows equivalent of the PosixFileSystem and will be used on Windows systems.
-> Introduces a CustomEnv class.  A CustomEnv is an Env that takes a FileSystem as constructor argument.  I believe there will only ever be two implementations of this class (PosixEnv and WinEnv).  There is still a CustomEnvWrapper class that takes an Env and a FileSystem and wraps the Env calls with the input Env but uses the FileSystem for the FileSystem calls
-> Eliminates the public uses of the LegacyFileSystemWrapper.

With this change in place, there are effectively the following patterns of Env:
- "Base Env classes" (PosixEnv, WinEnv).  These classes implement the core Env functions (e.g. Threads) and have a hard-coded input FileSystem.  These classes inherit from CompositeEnv, implement the core Env functions (threads) and delegate the FileSystem-like calls to the input file system.
- Wrapped Composite Env classes (MemEnv).  These classes take in an Env and a FileSystem.  The core env functions are re-directed to the wrapped env.  The file system calls are redirected to the input file system
- Legacy Wrapped Env classes.  These classes take in an Env input (but no FileSystem).  The core env functions are re-directed to the wrapped env.  A "Legacy File System" is created using this env and the file system calls directed to the env itself.

With these changes in place, the PosixEnv becomes a singleton -- there is only ever one created.  Any other use of the PosixEnv is via another wrapped env.  This cleans up some of the issues with the env construction and destruction.

Additionally, there were places in the code that required had an Env when they required a FileSystem.  Many of these places would wrap the Env with a LegacyFileSystemWrapper instead of using the env->GetFileSystem().  These places were changed, thereby removing layers of additional redirection (LegacyFileSystem --> Env --> Env::FileSystem).

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7703

Reviewed By: zhichao-cao

Differential Revision: D25762190

Pulled By: anand1976

fbshipit-source-id: 1a088e97fc916f28ac69c149cd1dcad0ab31704b

e628f59e

05 1月, 2021 9 次提交

Make StringEnv, StringSink, StringSource use FS classes (#7786) · c1a65a4d

由 mrambacher 提交于 1月 04, 2021

Summary:
Change the StringEnv and related classes to be based on FileSystem APIs rather than the corresponding Env ones. The StringSink and StringSource classes were changed to be based on the corresponding FS file classes.

Part of a cleanup to use the newer interfaces. This change also eliminates some of the casts/wrappers to LegacyFile classes.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7786

Reviewed By: jay-zhuang

Differential Revision: D25761460

Pulled By: anand1976

fbshipit-source-id: 428ae8e32b3db97dbeeca08c9d3bb0d9d4d3a38f

c1a65a4d

Use mock time for histogram_test (#7799) · 58660bf2

由 Jay Zhuang 提交于 1月 04, 2021

Summary:
`histogram_test` uses real sleep, which depends on the test executing speed, it makes the test unstable.
Switching to using a mock time env, it can also increase the test speed (from 10100ms -> 100ms).

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7799

Test Plan:
run test 10 times, all passed. vs. without fix 3 out 10 test failed:
no fix: https://app.circleci.com/pipelines/github/facebook/rocksdb?branch=pull%2F7797
with fix: https://app.circleci.com/pipelines/github/facebook/rocksdb?branch=pull%2F7799

Reviewed By: pdillinger

Differential Revision: D25676948

Pulled By: jay-zhuang

fbshipit-source-id: 64c273fc299c53283138dbb213386e4b45e8bdc2

58660bf2

Verify file checksum generator name (#7824) · 225abffd

由 Andrew Kryczka 提交于 1月 04, 2021

Summary:
Previously we only had a debug assertion to check the right generator was being used for verification. However a user hit a problem in production where their factory was creating the wrong generator for some files, leading to checksum mismatches. It would have been easier to debug if we verified in optimized builds that the generator with the proper name is used. This PR adds such verification.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7824

Reviewed By: zhichao-cao

Differential Revision: D25740254

Pulled By: ajkr

fbshipit-source-id: a6231521747605021bad3231484b5d4f99f4044f

225abffd

Fix typos in comments (#7790) · 159ea470

由 Dylan Wen 提交于 1月 04, 2021

Summary:
Hi there,

This PR fixes some typos in comments.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7790

Reviewed By: ajkr

Differential Revision: D25684213

Pulled By: zhichao-cao

fbshipit-source-id: b77026018cbdd59c9db25aa73edeb359d9962f3e

159ea470

Support --hex flag in `ldb file_checksum_dump` (#7820) · b8c01ed3

由 Andrew Kryczka 提交于 1月 04, 2021

Summary:
Prior to this PR it prints the raw bytes which can include non-printable
characters. This PR adds the option to print in hex instead.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7820

Test Plan:
try it out

```
$ ./ldb file_checksum_dump --hex --db=/tmp/rocksdbtest-9383//db_basic_test_12281129388755189514/
16, FileChecksumCrc32c, 0xC789D948
```

Reviewed By: jay-zhuang

Differential Revision: D25738072

Pulled By: ajkr

fbshipit-source-id: 8cf2856877971756c0495cfa63a9a1281c414dc7

b8c01ed3

Ignore the OnAddFile Status for SSTFileManager (#7826) · 0bad2b43

由 mrambacher 提交于 1月 04, 2021

Summary:
The returned Status is ignored here as some stress tests are failing, presumably when attempting to add an empty file.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7826

Reviewed By: jay-zhuang

Differential Revision: D25742931

fbshipit-source-id: a1fcd620d9472993a009929306dfc421f93eb43b

0bad2b43

fix thread status synchronization in thread_list_test (#7825) · 61e32442

由 Andrew Kryczka 提交于 1月 04, 2021

Summary:
The test was flaky because the BG threads could increase
`running_count_` up to `job_count_` before applying their thread status
updates. Then the test thread would see non-deterministic results when
counting threads with each status. The fix is to acquire mutex in test
thread so it sees `running_count_` and thread status updated atomically.
I think simply reordering the two updates would have been insufficient
since the thread status update uses `memory_order_relaxed`. This change
happens to also eliminate an undesirable sleep loop.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7825

Test Plan:
injected sleeps to verify the failure repros before this PR and does not
repro after.

Reviewed By: jay-zhuang

Differential Revision: D25742409

Pulled By: ajkr

fbshipit-source-id: 926a2223fe856e20bc4c0c27df6736ee5cb02c97

61e32442

Update RocksJava static compression dependencies (#7804) · bb0f781d

由 Adam Retter 提交于 1月 04, 2021

Summary:
Updates LZ4 and ZStd to the latest versions.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7804

Reviewed By: ajkr

Differential Revision: D25733770

Pulled By: pdillinger

fbshipit-source-id: ea74ef9eecb57fc47934ef1d4ff950c99ddd5158

bb0f781d

Eliminate the creation of ImmutableDBOptions in WBWI::GetFromBatch (#6851) · 81367a46

由 mrambacher 提交于 1月 04, 2021

Summary:
1. Made `WriteBatchWithIndexInternal` into a class that stores the `DB*` or `DBOptions*`.

2. Changed the `GetFromBatch` method to be non-static and use an instance of the class.  Added `MergeKey` methods to perform the merge itself and return any status.

This change unifies the multiple calls to the `MergeHelper` under a single wrapped API.

Closes https://github.com/facebook/rocksdb/issues/6683

Pull Request resolved: https://github.com/facebook/rocksdb/pull/6851

Reviewed By: ajkr

Differential Revision: D21706574

Pulled By: pdillinger

fbshipit-source-id: 6860bd64d62669aaa591846e914eed3b674e68b1

81367a46

31 12月, 2020 4 次提交

Increase the txn lock timeout in stress test (#7823) · 30cd38c6

由 Cheng Chang 提交于 12月 30, 2020

Summary:
We recently encounter two cases of txn lock timeout in stress test. It might be caused due to latencies of resource scheduling in the internal infrastructure. Hopefully increasing the timeout can make the related tests less flaky.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7823

Test Plan: watch internal stress test to pass.

Reviewed By: siying

Differential Revision: D25739233

Pulled By: cheng-chang

fbshipit-source-id: 84a5a8ae820db24dacd0cfc05928b26505fab89d

30cd38c6

rocksdb_transaction_get_for_update now exports (#6293) · edb0b1fb

由 jbosh 提交于 12月 30, 2020

Summary:
Added missing ROCKSDB_LIBRARY_API decorator to rocksdb_transaction_get_for_update.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/6293

Reviewed By: jay-zhuang

Differential Revision: D25234298

Pulled By: ajkr

fbshipit-source-id: 8a4817adaec1f445f338c8d8c59d3392925b5721

edb0b1fb

Attempt to fix build errors around missing compression library includes (#7803) · fd2db79f

由 Adam Retter 提交于 12月 30, 2020

Summary:
This fixes an issue introduced in https://github.com/facebook/rocksdb/pull/7769 that caused many errors about missing compression libraries to be displayed during compilation, although compilation actually succeeded. This PR fixes the compilation so the compression libraries are only introduced where strictly needed.

It likely needs to be merged into the same branches as https://github.com/facebook/rocksdb/pull/7769 which I think are:
1. master
2. 6.15.fb
3. 6.16.fb

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7803

Reviewed By: ramvadiv

Differential Revision: D25733743

Pulled By: pdillinger

fbshipit-source-id: 6c04f6864b2ff4a345841d791a89b19e0e3f5bf7

fd2db79f

Return Status from FilePrefetchBuffer::TryReadFromCache() (#7816) · 01298c8f

由 anand76 提交于 12月 30, 2020

Summary:
Return the Status from TryReadFromCache() in an argument to make it easier to report prefetch errors to the user.

Tests:
make crash_test
make check

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7816

Reviewed By: akankshamahajan15

Differential Revision: D25717222

Pulled By: anand1976

fbshipit-source-id: c320d3c12d4146bda16df78ff6927eee584c1810

01298c8f

29 12月, 2020 1 次提交

Fix db_bench duration for multireadrandom benchmark (#7817) · d7738666

由 anand76 提交于 12月 28, 2020

Summary:
The multireadrandom benchmark, when run for a specific number of reads (--reads argument), should base the duration on the actual number of keys read rather than number of batches.

Tests:
Run db_bench multireadrandom benchmark

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7817

Reviewed By: zhichao-cao

Differential Revision: D25717230

Pulled By: anand1976

fbshipit-source-id: 13f4d8162268cf9a34918655e60302d0aba3864b

d7738666

28 12月, 2020 1 次提交

Disable BasicLockEscalation if cannot determine whether TSAN is enabled (#7814) · 736c6dc5

由 cheng-chang 提交于 12月 27, 2020

Summary:
BasicLockEscalation will cause false-positive warnings under TSAN (this is a known issue in TSAN, see details in https://gist.github.com/spetrunia/77274cf2d5848e0a7e090d622695ed4e), skip this test if TSAN is enabled, or if we are not sure whether TSAN is enabled.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7814

Test Plan: watch the tsan contrun test to pass.

Reviewed By: zhichao-cao

Differential Revision: D25708094

Pulled By: cheng-chang

fbshipit-source-id: 4fc813ff373301d033d086154cc7bb60a5e95889

736c6dc5

27 12月, 2020 1 次提交

Add rate_limiter to GenerateOneFileChecksum (#7811) · 44ebc24d

由 Zhichao Cao 提交于 12月 26, 2020

Summary:
In GenerateOneFileChecksum(), RocksDB reads the file and computes its checksum. A rate limiter can be passed to the constructor of RandomAccessFileReader so that read I/O can be rate limited.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7811

Test Plan: make check

Reviewed By: cheng-chang

Differential Revision: D25699896

Pulled By: zhichao-cao

fbshipit-source-id: e2688bc1126c543979a3bcf91dda784bd7b74164

44ebc24d

26 12月, 2020 1 次提交

fix memory leak in db_stress checkpoint test (#7813) · 601585bc

由 Zhichao Cao 提交于 12月 25, 2020

Summary:
fix memory leak in db_stress checkpoint test. If s is not ok, checkpoint is not deleted, may cause memory leak.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7813

Test Plan: make asan_check

Reviewed By: cheng-chang

Differential Revision: D25702999

Pulled By: zhichao-cao

fbshipit-source-id: 08253b0852835acb8cfd412503cdabf720afb678

601585bc

24 12月, 2020 5 次提交

No elide constructors (#7798) · 55e99688

由 mrambacher 提交于 12月 23, 2020

Summary:
Added "no-elide-constructors to the ASSERT_STATUS_CHECK builds. This flag gives more errors/warnings for some of the Status checks where an inner class checks a Status and later returns it. In this case, without the elide check on, the returned status may not have been checked in the caller, thereby bypassing the checked code.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7798

Reviewed By: jay-zhuang

Differential Revision: D25680451

Pulled By: pdillinger

fbshipit-source-id: c3f14ed9e2a13f0a8c54d839d5fb4d1fc1e93917

55e99688

Update "num_data_read" stat in RetrieveMultipleBlocks (#7770) · 30a5ed9c

由 Akanksha Mahajan 提交于 12月 23, 2020

Summary:
RetrieveMultipleBlocks which is used by MultiGet to read data blocks is not updating num_data_read stat in
GetContextStats.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7770

Test Plan: make check -j64

Reviewed By: anand1976

Differential Revision: D25538982

Pulled By: akankshamahajan15

fbshipit-source-id: e3daedb035b1be8ab6af6f115cb3793ccc7b1ec6

30a5ed9c

Skip WALs according to MinLogNumberToKeep when creating checkpoint (#7789) · bdb7e544

由 cheng-chang 提交于 12月 23, 2020

Summary:
In a stress test failure, we observe that a WAL is skipped when creating checkpoint, although its log number >= MinLogNumberToKeep(). This might happen in the following case:

1. when creating the checkpoint, there are 2 column families: CF0 and CF1, and there are 2 WALs: 1, 2;
2. CF0's log number is 1, CF0's active memtable is empty, CF1's log number is 2, CF1's active memtable is not empty, WAL 2 is not empty, the sequence number points to WAL 2;
2. the checkpoint process flushes CF0, since CF0' active memtable is empty, there is no need to SwitchMemtable, thus no new WAL will be created, so CF0's log number is now 2, concurrently, some data is written to CF0 and WAL 2;
3. the checkpoint process flushes CF1, WAL 3 is created and CF1's log number is now 3, CF0's log number is still 2 because CF0 is not empty and WAL 2 contains its unflushed data concurrently written in step 2;
4. the checkpoint process determines that WAL 1 and 2 are no longer needed according to [live_wal_files[i]->StartSequence() >= *sequence_number](https://github.com/facebook/rocksdb/blob/master/utilities/checkpoint/checkpoint_impl.cc#L388), so it skips linking them to the checkpoint directory;
5. but according to `MinLogNumberToKeep()`, WAL 2 still needs to be kept because CF0's log number is 2.

If the checkpoint is reopened in read-only mode, and only read from the snapshot with the initial sequence number, then there will be no data loss or data inconsistency.

But if the checkpoint is reopened and read from the most recent sequence number, suppose in step 3, there are also data concurrently written to CF1 and WAL 3, then the most recent sequence number refers to the latest entry in WAL 3, so the data written in step 2 should also be visible, but since WAL 2 is discarded, those data are lost.

When tracking WAL in MANIFEST is enabled, when reopening the checkpoint, since WAL 2 is still tracked in MANIFEST as alive, but it's missing from the checkpoint directory, a corruption will be reported.

This PR makes the checkpoint process to only skip a WAL if its log number < `MinLogNumberToKeep`.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7789

Test Plan: watch existing tests to pass.

Reviewed By: ajkr

Differential Revision: D25662346

Pulled By: cheng-chang

fbshipit-source-id: 136471095baa01886cf44809455cf855f24857a0

bdb7e544

Update regression_test.sh to run multireadrandom benchmark (#7802) · bd2645bc

由 anand76 提交于 12月 23, 2020

Summary:
Update the regression_test.sh script to run the multireadrandom benchmark.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7802

Reviewed By: zhichao-cao

Differential Revision: D25685482

Pulled By: anand1976

fbshipit-source-id: ef2973b551a1bbdbce198a0adf29fc277f3e65e2

bd2645bc

Remove flaky, redundant, and dubious DBTest.SparseMerge (#7800) · a727efca

由 Peter Dillinger 提交于 12月 23, 2020

Summary:
This test would occasionally fail like this:

    WARNING: c:\users\circleci\project\db\db_test.cc(1343): error: Expected:
    (dbfull()->TEST_MaxNextLevelOverlappingBytes(handles_[1])) <= (20 * 1048576), actual: 33501540 vs 20971520

And being a super old test, it's not structured in a sound way. And it appears that DBTest2.MaxCompactionBytesTest is a better test of what SparseMerge was intended to test. In fact, SparseMerge fails if I set

    options.max_compaction_bytes = options.target_file_size_base * 1000;

Thus, we are removing this negative-value test.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7800

Test Plan: Q.E.D.

Reviewed By: ajkr

Differential Revision: D25693366

Pulled By: pdillinger

fbshipit-source-id: 9da07d4dce0559547fc938b2163a2015e956c548

a727efca

23 12月, 2020 8 次提交

Add more tests for assert status checked (#7524) · 02418194

由 mrambacher 提交于 12月 22, 2020

Summary:
Added 10 more tests that pass the ASSERT_STATUS_CHECKED test.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7524

Reviewed By: akankshamahajan15

Differential Revision: D24323093

Pulled By: ajkr

fbshipit-source-id: 28d4106d0ca1740c3b896c755edf82d504b74801

02418194

Range Locking: Implementation of range locking (#7506) · daab7603

由 Sergei Petrunia 提交于 12月 22, 2020

Summary:
Range Locking - an implementation based on the locktree library

- Add a RangeTreeLockManager and RangeTreeLockTracker which implement
  range locking using the locktree library.
- Point locks are handled as locks on single-point ranges.
- Add a unit test: range_locking_test

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7506

Reviewed By: akankshamahajan15

Differential Revision: D25320703

Pulled By: cheng-chang

fbshipit-source-id: f86347384b42ba2b0257d67eca0f45f806b69da7

daab7603

Avoid to force PORTABLE mode in tools/regression_test.sh (#7806) · f4db3e41

由 sdong 提交于 12月 22, 2020

Summary:
Right now tools/regression_test.sh always builds RocksDB with PORTABLE=1. There isn't a reason for that. Remove it. Users can always specify PORTABLE through envirionement variable.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7806

Test Plan: Run tools/regression_test.sh and see it still builds.

Reviewed By: ajkr

Differential Revision: D25687911

fbshipit-source-id: 1c0b03e5df890babc8b7d8af48b48774d9a4600c

f4db3e41

Add more tests to ASSERT_STATUS_CHECKED (4) (#7718) · 81592d9f

由 Adam Retter 提交于 12月 22, 2020

Summary:
Fourth batch of adding more tests to ASSERT_STATUS_CHECKED.

* db_range_del_test
* db_write_test
* random_access_file_reader_test
* merge_test
* external_sst_file_test
* write_buffer_manager_test
* stringappend_test
* deletefile_test

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7718

Reviewed By: pdillinger

Differential Revision: D25671608

fbshipit-source-id: 687a794e98a9e0cd5428ead9898ef05ced987c31

81592d9f

SyncWAL shouldn't be supported in compacted db (#7788) · 41ff125a

由 cheng-chang 提交于 12月 22, 2020

Summary:
`CompactedDB` is a kind of read-only DB, so it shouldn't support `SyncWAL`.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7788

Test Plan: watch existing tests to pass.

Reviewed By: akankshamahajan15

Differential Revision: D25661209

Pulled By: cheng-chang

fbshipit-source-id: 9eb2cc3f73736dcc205c8410e5944aa203f002d3

41ff125a

Apply the changes from: PS-5501 : Re-license PerconaFT 'locktree' to Apache V2 (#7801) · 10220909

由 Sergei Petrunia 提交于 12月 22, 2020

Summary:
commit d5178f513c0b4144a5ac9358ec0f6a3b54a28e76
Author: George O. Lorch III <george.lorch@percona.com>
Date:   Tue Mar 19 12:18:40 2019 -0700

    PS-5501 : Re-license PerconaFT 'locktree' to Apache V2

    - Fixed some incomplete relicensed files from previous round.

    - Added missing license text to some.

    - Relicensed more files to Apache V2 that locktree depends on.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7801

Reviewed By: jay-zhuang

Differential Revision: D25682430

Pulled By: cheng-chang

fbshipit-source-id: deb8a0de3e76f3638672997bfbd300e2fffbe5f5

10220909

Minimize Timing Issue in test WALTrashCleanupOnOpen (#7796) · 9057d0a0

由 sdong 提交于 12月 22, 2020

Summary:
We saw DBWALTestWithParam/DBWALTestWithParam.WALTrashCleanupOnOpen sometimes fail with:

db/db_sst_test.cc:575: Failure
Expected: (trash_log_count) >= (1), actual: 0 vs 1

The suspicious is that delete scheduling actually deleted all trash files based on rate, but it is not expected. This can be reproduced if we manually add sleep after DB is closed for serveral seconds. Minimize its chance by setting the delete rate to be lowest possible.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7796

Test Plan: The test doesn't fail with the manual sleeping anymore

Reviewed By: anand1976

Differential Revision: D25675000

fbshipit-source-id: a39fd05e1a83719c41014e48843792e752368e22

9057d0a0

Add tests in ASSERT_STATUS_CHECKED (#7793) · fbac1b3f

由 Akanksha Mahajan 提交于 12月 22, 2020

Summary:
add io_tracer_parser_test and prefetch_test under
ASSERT_STATUS_CHECKED

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7793

Test Plan: ASSERT_STATUS_CHECKED=1 make check -j64

Reviewed By: jay-zhuang

Differential Revision: D25673464

Pulled By: akankshamahajan15

fbshipit-source-id: 50e0b6f17160ddda206a521a7b47ee33e699a2d4

fbac1b3f

22 12月, 2020 3 次提交

Migrate away from Travis+Linux+amd64 (#7791) · 4d897e51

由 Peter Dillinger 提交于 12月 22, 2020

Summary:
This disables Linux/amd64 builds in Travis for PRs, and adds a
gcc-10+c++20 build in CircleCI, which should fill out sufficient coverage
vs. what we had in Travis

Fixed a use of std::is_pod, which is deprecated in c++20

Fixed ++ on a volatile in db_repl_stress.cc, with bigger refactoring.
Although ++ on this volatile was probably ok with one thread writer and
one thread reader, the code was still overly complex. There was a
deadcode check for error
`if (replThread.no_read < dataPump.no_records)` which can be proven
never to happen based on the structure of the code. It infinite loops
instead for the case intended to be checked. I just simplified the code
for what should be the same checking power.

Also most configurations seem to be using make parallelism = 2 * vcores,
so fixing / using that.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7791

Test Plan:
CI
and `while ./db_repl_stress; do echo again; done` for a while

Reviewed By: siying

Differential Revision: D25669834

Pulled By: pdillinger

fbshipit-source-id: b2c688053d0b1d52c989903449d3cd27a04130d6

4d897e51

Fix Windows build in block_cache_tracer_test (#7795) · 861b0d1a

由 Jay Zhuang 提交于 12月 21, 2020

Summary:
The test was added to cmake in https://github.com/facebook/rocksdb/issues/7783

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7795

Reviewed By: akankshamahajan15

Differential Revision: D25671010

Pulled By: jay-zhuang

fbshipit-source-id: 2146ff9559cdd7266c4d78476672488c62654a6d

861b0d1a

Fix block_cache_test failure (#7783) · fd0d35d3

由 Jay Zhuang 提交于 12月 21, 2020

Summary:
`block_cache_tracer_test` and `block_cache_trace_analyzer_test` are using the same test directory. Which causes build failure if these 2 tests are running in parallel, for example: https://app.circleci.com/pipelines/github/facebook/rocksdb/5211/workflows/8639afbe-9fec-43e2-a6a4-6d47ea9cbcbe/jobs/74598/tests

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7783

Reviewed By: akankshamahajan15

Differential Revision: D25656762

Pulled By: jay-zhuang

fbshipit-source-id: 68aa020aa5b4b3bce324315edecb4e1a60cc18e6

fd0d35d3

20 12月, 2020 2 次提交

Update release version to 6.16 (#7782) · a8aeefd0

由 Jay Zhuang 提交于 12月 19, 2020

Summary:
Update release version to 6.8

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7782

Reviewed By: siying

Differential Revision: D25648579

Pulled By: jay-zhuang

fbshipit-source-id: c536d606868b95c5fb2ae8f19c17eb259d67bc51

a8aeefd0

aggregated-table-properties with GetMapProperty (#7779) · 4d1ac19e

由 Peter Dillinger 提交于 12月 19, 2020

Summary:
So that we can more easily get aggregate live table data such
as total filter, index, and data sizes.

Also adds ldb support for getting properties

Also fixed some missing/inaccurate related comments in db.h

For example:

    $ ./ldb --db=testdb get_property rocksdb.aggregated-table-properties
    rocksdb.aggregated-table-properties.data_size: 102871
    rocksdb.aggregated-table-properties.filter_size: 0
    rocksdb.aggregated-table-properties.index_partitions: 0
    rocksdb.aggregated-table-properties.index_size: 2232
    rocksdb.aggregated-table-properties.num_data_blocks: 100
    rocksdb.aggregated-table-properties.num_deletions: 0
    rocksdb.aggregated-table-properties.num_entries: 15000
    rocksdb.aggregated-table-properties.num_merge_operands: 0
    rocksdb.aggregated-table-properties.num_range_deletions: 0
    rocksdb.aggregated-table-properties.raw_key_size: 288890
    rocksdb.aggregated-table-properties.raw_value_size: 198890
    rocksdb.aggregated-table-properties.top_level_index_size: 0
    $ ./ldb --db=testdb get_property rocksdb.aggregated-table-properties-at-level1
    rocksdb.aggregated-table-properties-at-level1.data_size: 80909
    rocksdb.aggregated-table-properties-at-level1.filter_size: 0
    rocksdb.aggregated-table-properties-at-level1.index_partitions: 0
    rocksdb.aggregated-table-properties-at-level1.index_size: 1787
    rocksdb.aggregated-table-properties-at-level1.num_data_blocks: 81
    rocksdb.aggregated-table-properties-at-level1.num_deletions: 0
    rocksdb.aggregated-table-properties-at-level1.num_entries: 12466
    rocksdb.aggregated-table-properties-at-level1.num_merge_operands: 0
    rocksdb.aggregated-table-properties-at-level1.num_range_deletions: 0
    rocksdb.aggregated-table-properties-at-level1.raw_key_size: 238210
    rocksdb.aggregated-table-properties-at-level1.raw_value_size: 163414
    rocksdb.aggregated-table-properties-at-level1.top_level_index_size: 0
    $

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7779

Test Plan: Added a test to ldb_test.py

Reviewed By: jay-zhuang

Differential Revision: D25653103

Pulled By: pdillinger

fbshipit-source-id: 2905469a08a64dd6b5510cbd7be2e64d3234d6d3

4d1ac19e

19 12月, 2020 4 次提交

Track WAL obsoletion when updating empty CF's log number (#7781) · fbce7a38

由 Cheng Chang 提交于 12月 18, 2020

Summary:
In the write path, there is an optimization: when a new WAL is created during SwitchMemtable, we update the internal log number of the empty column families to the new WAL. `FindObsoleteFiles` marks a WAL as obsolete if the WAL's log number is less than `VersionSet::MinLogNumberWithUnflushedData`. After updating the empty column families' internal log number, `VersionSet::MinLogNumberWithUnflushedData` might change, so some WALs might become obsolete to be purged from disk.

For example, consider there are 3 column families: 0, 1, 2:
1. initially, all the column families' log number is 1;
2. write some data to cf0, and flush cf0, but the flush is pending;
3. now a new WAL 2 is created;
4. write data to cf1 and WAL 2, now cf0's log number is 1, cf1's log number is 2, cf2's log number is 2 (because cf1 and cf2 are empty, so their log numbers will be set to the highest log number);
5. now cf0's flush hasn't finished, flush cf1, a new WAL 3 is created, and cf1's flush finishes, now cf0's log number is 1, cf1's log number is 3, cf2's log number is 3, since WAL 1 still contains data for the unflushed cf0, no WAL can be deleted from disk;
6. now cf0's flush finishes, cf0's log number is 2 (because when cf0 was switching memtable, WAL 3 does not exist yet), cf1's log number is 3, cf2's log number is 3, so WAL 1 can be purged from disk now, but WAL 2 still cannot because `MinLogNumberToKeep()` is 2;
7. write data to cf2 and WAL 3, because cf0 is empty, its log number is updated to 3, so now cf0's log number is 3, cf1's log number is 3, cf2's log number is 3;
8. now if the background threads want to purge obsolete files from disk, WAL 2 can be purged because `MinLogNumberToKeep()` is 3. But there are only two flush results written to MANIFEST: the first is for flushing cf1, and the `MinLogNumberToKeep` is 1, the second is for flushing cf0, and the `MinLogNumberToKeep` is 2. So without this PR, if the DB crashes at this point and try to recover, `WalSet` will still expect WAL 2 to exist.

When WAL tracking is enabled, we assume WALs will only become obsolete after a flush result is written to MANIFEST in `MemtableList::TryInstallMemtableFlushResults` (or its atomic flush counterpart). The above situation breaks this assumption.

This PR tracks WAL obsoletion if necessary before updating the empty column families' log numbers.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7781

Test Plan:
watch existing tests and stress tests to pass.
`make -j48 blackbox_crash_test` on devserver

Reviewed By: ltamasi

Differential Revision: D25631695

Pulled By: cheng-chang

fbshipit-source-id: ca7fff967bdb42204b84226063d909893bc0a4ec

fbce7a38

Fix various small build issues, Java API naming (#7776) · 62afa968

由 Adam Retter 提交于 12月 18, 2020

Summary:
* Compatibility with older GCC.
* Compatibility with older jemalloc libraries.
* Remove Docker warning when building i686 binaries.
* Fix case inconsistency in Java API naming (potential update to HISTORY.md deferred)

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7776

Reviewed By: akankshamahajan15

Differential Revision: D25607235

Pulled By: pdillinger

fbshipit-source-id: 7ab0fb7fa7a34e97ed0bec991f5081acb095777d

62afa968

Update code comment for options.ttl (#7775) · 75e4af14

由 sdong 提交于 12月 18, 2020

Summary:
The behavior of options.ttl has been updated long ago but we didn't update the code comments.
Also update the periodic compaction's comment.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7775

Test Plan: See it can still build through CI.

Reviewed By: ajkr

Differential Revision: D25592015

fbshipit-source-id: b1db18b6787e7048ce6aedcbc3bb44493c9fc49b

75e4af14

Support optimize_filters_for_memory for Ribbon filter (#7774) · 239d17a1

由 Peter Dillinger 提交于 12月 18, 2020

Summary:
Primarily this change refactors the optimize_filters_for_memory
code for Bloom filters, based on malloc_usable_size, to also work for
Ribbon filters.

This change also replaces the somewhat slow but general
BuiltinFilterBitsBuilder::ApproximateNumEntries with
implementation-specific versions for Ribbon (new) and Legacy Bloom
(based on a recently deleted version). The reason is to emphasize
speed in ApproximateNumEntries rather than 100% accuracy.

Justification: ApproximateNumEntries (formerly CalculateNumEntry) is
only used by RocksDB for range-partitioned filters, called each time we
start to construct one. (In theory, it should be possible to reuse the
estimate, but the abstractions provided by FilterPolicy don't really
make that workable.) But this is only used as a heuristic estimate for
hitting a desired partitioned filter size because of alignment to data
blocks, which have various numbers of unique keys or prefixes. The two
factors lead us to prioritize reasonable speed over 100% accuracy.

optimize_filters_for_memory adds extra complication, because precisely
calculating num_entries for some allowed number of bytes depends on state
with optimize_filters_for_memory enabled. And the allocator-agnostic
implementation of optimize_filters_for_memory, using malloc_usable_size,
means we would have to actually allocate memory, many times, just to
precisely determine how many entries (keys) could be added and stay below
some size budget, for the current state. (In a draft, I got this
working, and then realized the balance of speed vs. accuracy was all
wrong.)

So related to that, I have made CalculateSpace, an internal-only API
only used for testing, non-authoritative also if
optimize_filters_for_memory is enabled. This simplifies some code.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7774

Test Plan:
unit test updated, and for FilterSize test, range of tested
values is greatly expanded (still super fast)

Also tested `db_bench -benchmarks=fillrandom,stats -bloom_bits=10 -num=1000000 -partition_index_and_filters -format_version=5 [-optimize_filters_for_memory] [-use_ribbon_filter]` with temporary debug output of generated filter sizes.

Bloom+optimize_filters_for_memory:

      1 Filter size: 197 (224 in memory)
    134 Filter size: 3525 (3584 in memory)
    107 Filter size: 4037 (4096 in memory)
    Total on disk: 904,506
    Total in memory: 918,752

Ribbon+optimize_filters_for_memory:

      1 Filter size: 3061 (3072 in memory)
    110 Filter size: 3573 (3584 in memory)
     58 Filter size: 4085 (4096 in memory)
    Total on disk: 633,021 (-30.0%)
    Total in memory: 634,880 (-30.9%)

Bloom (no offm):

      1 Filter size: 261 (320 in memory)
      1 Filter size: 3333 (3584 in memory)
    240 Filter size: 3717 (4096 in memory)
    Total on disk: 895,674 (-1% on disk vs. +offm; known tolerable overhead of offm)
    Total in memory: 986,944 (+7.4% vs. +offm)

Ribbon (no offm):

      1 Filter size: 2949 (3072 in memory)
      1 Filter size: 3381 (3584 in memory)
    167 Filter size: 3701 (4096 in memory)
    Total on disk: 624,397 (-30.3% vs. Bloom)
    Total in memory: 690,688 (-30.0% vs. Bloom)

Note that optimize_filters_for_memory is even more effective for Ribbon filter than for cache-local Bloom, because it can close the unused memory gap even tighter than Bloom filter, because of 16 byte increments for Ribbon vs. 64 byte increments for Bloom.

Reviewed By: jay-zhuang

Differential Revision: D25592970

Pulled By: pdillinger

fbshipit-source-id: 606fdaa025bb790d7e9c21601e8ea86e10541912

239d17a1

kvdb / rocksdb 11 个月 前同步成功

kvdb / rocksdb
11 个月前同步成功