1. 12 Nov 2019 (3 commits)
    • bugfix: MemTableList::RemoveOldMemTables invalid iterator after remov… (#6013) · f29e6b3b
      Committed by 蔡渠棠
      Summary:
      Fix issue https://github.com/facebook/rocksdb/issues/6012.
      
      I found that it may be caused by the following code in the function _RemoveOldMemTables()_ in **db/memtable_list.cc**:
      ```cpp
        for (auto it = memlist.rbegin(); it != memlist.rend(); ++it) {
          MemTable* mem = *it;
          if (mem->GetNextLogNumber() > log_number) {
            break;
          }
          current_->Remove(mem, to_delete);
      ```
      
      The iterator **it** becomes invalid after `current_->Remove(mem, to_delete);`.
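      A minimal sketch of the safe pattern, using a plain `std::list<int>` as a hypothetical stand-in for the memtable list (names and the removal condition are illustrative, not RocksDB's actual code):

      ```cpp
      #include <cassert>
      #include <cstddef>
      #include <iterator>
      #include <list>

      // Remove entries while iterating in reverse, re-seating the reverse
      // iterator after each erase so it never dangles. Returns the number
      // of entries removed.
      size_t RemoveOldEntries(std::list<int>& memlist, int limit) {
        size_t removed = 0;
        for (auto it = memlist.rbegin(); it != memlist.rend();) {
          if (*it > limit) break;  // analogous to the GetNextLogNumber() check
          // std::next(it).base() is the forward iterator to the element *it;
          // erase() returns the following forward iterator, from which a
          // valid reverse iterator is rebuilt.
          it = decltype(it)(memlist.erase(std::next(it).base()));
          ++removed;
        }
        return removed;
      }

      int main() {
        std::list<int> memlist = {9, 8, 3, 2, 1};  // reverse order: 1, 2, 3, 8, 9
        assert(RemoveOldEntries(memlist, 5) == 3);
        assert(memlist.size() == 2);
        return 0;
      }
      ```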
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6013
      
      Test Plan:
      ```
      make check
      ```
      
      Differential Revision: D18401107
      
      Pulled By: riversand963
      
      fbshipit-source-id: bf0da3b868ed70f7aff24cf7b3e2049c0c5c7a4e
    • Cascade TTL Compactions to move expired key ranges to bottom levels faster (#5992) · c17384fe
      Committed by Sagar Vemuri
      Summary:
      When users use Level-Compaction-with-TTL by setting `cf_options.ttl`, the ttl-expired data could take n*ttl time to reach the bottom level (where n is the number of levels) due to how the `creation_time` table property was calculated for the newly created files during compaction. The creation time of new files was set to a max of all compaction-input-files-creation-times which essentially resulted in resetting the ttl as the key range moves across levels. This behavior is now fixed by changing the `creation_time` to be based on minimum of all compaction-input-files-creation-times; this will cause cascading compactions across levels for the ttl-expired data to move to the bottom level, resulting in getting rid of tombstones/deleted-data faster.
      
      This will help start cascading compactions to move the expired key range to the bottom-most level faster.
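      A hypothetical helper illustrating the change (not RocksDB's actual types): taking the minimum of the input files' creation times preserves the age of the oldest data, while the old behavior (maximum) effectively reset the TTL at every level.

      ```cpp
      #include <algorithm>
      #include <cassert>
      #include <cstdint>
      #include <vector>

      // Illustrative only: pick the creation_time for a compaction output
      // file from its input files' creation times. Using the oldest input
      // keeps ttl-expiry cascading down the levels.
      uint64_t OutputCreationTime(const std::vector<uint64_t>& input_creation_times) {
        return *std::min_element(input_creation_times.begin(),
                                 input_creation_times.end());
      }

      int main() {
        std::vector<uint64_t> inputs = {1000, 2000, 3000};
        assert(OutputCreationTime(inputs) == 1000);  // oldest input wins
        return 0;
      }
      ```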
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5992
      
      Test Plan: `make check`
      
      Differential Revision: D18257883
      
      Pulled By: sagar0
      
      fbshipit-source-id: 00df0bb8d0b7e14d9fc239df2cba8559f3e54cbc
    • BlobDB: Maintain mapping between blob files and SSTs (#6020) · 8e7aa628
      Committed by Levi Tamasi
      Summary:
      The patch adds logic to BlobDB to maintain the mapping between blob files
      and SSTs for which the blob file in question is the oldest blob file referenced
      by the SST file. The mapping is initialized during database open based on the
      information retrieved using `GetLiveFilesMetaData`, and updated after
      flushes/compactions based on the information received through the `EventListener`
      interface (or, in the case of manual compactions issued through the `CompactFiles`
      API, the `CompactionJobInfo` object).
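      An illustrative sketch of such a mapping (class and method names are hypothetical, not BlobDB's actual interface): for each blob file, track the set of SSTs for which it is the oldest referenced blob file, updating on flush/compaction events.

      ```cpp
      #include <cassert>
      #include <cstdint>
      #include <map>
      #include <set>

      // Hypothetical sketch: blob file number -> SSTs whose oldest
      // referenced blob file it is.
      class BlobFileToSstMapping {
       public:
        void LinkSst(uint64_t sst, uint64_t oldest_blob_file) {
          blob_to_ssts_[oldest_blob_file].insert(sst);
        }
        void UnlinkSst(uint64_t sst, uint64_t oldest_blob_file) {
          auto it = blob_to_ssts_.find(oldest_blob_file);
          if (it != blob_to_ssts_.end()) {
            it->second.erase(sst);
            if (it->second.empty()) blob_to_ssts_.erase(it);  // now unreferenced
          }
        }
        bool IsReferenced(uint64_t blob_file) const {
          return blob_to_ssts_.count(blob_file) > 0;
        }

       private:
        std::map<uint64_t, std::set<uint64_t>> blob_to_ssts_;
      };

      int main() {
        BlobFileToSstMapping m;
        m.LinkSst(/*sst=*/10, /*oldest_blob_file=*/3);  // e.g. after a flush
        m.LinkSst(11, 3);
        m.UnlinkSst(10, 3);  // e.g. SST 10 compacted away
        assert(m.IsReferenced(3));
        m.UnlinkSst(11, 3);
        assert(!m.IsReferenced(3));  // blob file 3 now a GC candidate
        return 0;
      }
      ```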
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6020
      
      Test Plan: Added a unit test; also tested using the BlobDB mode of `db_bench`.
      
      Differential Revision: D18410508
      
      Pulled By: ltamasi
      
      fbshipit-source-id: dd9e778af781cfdb0d7056298c54ba9cebdd54a5
  2. 09 Nov 2019 (2 commits)
    • Auto-GarbageCollect on PurgeOldBackups and DeleteBackup (#6015) · aa63abf6
      Committed by Peter Dillinger
      Summary:
      If there is a crash, power failure, or I/O error in
      DeleteBackup, shared or private files from the backup might be left
      behind that are not cleaned up by PurgeOldBackups or DeleteBackup, only
      by GarbageCollect. This makes the BackupEngine API "leaky by default."
      Even if it means a modest performance hit, I think we should make
      Delete and Purge do as they say, with ongoing best effort: i.e. future
      calls will attempt to finish any incomplete work from earlier calls.
      
      This change does that by having DeleteBackup and PurgeOldBackups do a
      GarbageCollect, unless (to minimize performance hit) this BackupEngine
      has already done a GarbageCollect and there have been no
      deletion-related I/O errors in that GarbageCollect or since then.
      
      Rejected alternative 1: remove meta file last instead of first. This would in theory turn partially deleted backups into corrupted backups, but code changes would be needed to allow the missing files and consider it acceptably corrupt, rather than failing to open the BackupEngine. This might be a reasonable choice, but I mostly rejected it because it doesn't solve the legacy problem of cleaning up existing lingering files.
      
      Rejected alternative 2: use a deletion marker file. If deletion started with creating a file that marks a backup as flagged for deletion, then we could reliably detect partially deleted backups and efficiently finish removing them. In addition to not solving the legacy problem, this could be precarious if there's a disk full situation, and we try to create a new file in order to delete some files. Ugh.
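      An illustrative sketch of the chosen policy (class and member names are hypothetical, not the real BackupEngine): Delete/Purge trigger a GarbageCollect unless one has already succeeded and no deletion-related I/O error has happened since.

      ```cpp
      #include <cassert>

      // Hypothetical sketch of the described "ongoing best effort" policy.
      class BackupEngineSketch {
       public:
        int gc_runs = 0;
        bool DeleteBackup() {
          bool ok = DoDelete();
          if (!ok) might_need_gc_ = true;  // leftover files are now possible
          if (might_need_gc_ && GarbageCollect()) might_need_gc_ = false;
          return ok;
        }

       private:
        bool DoDelete() { return true; }  // stand-in for real file deletion
        bool GarbageCollect() { ++gc_runs; return true; }
        bool might_need_gc_ = true;  // conservatively true until the first GC
      };

      int main() {
        BackupEngineSketch engine;
        engine.DeleteBackup();  // first call: GarbageCollect runs
        engine.DeleteBackup();  // no errors since: GarbageCollect is skipped
        assert(engine.gc_runs == 1);
        return 0;
      }
      ```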
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6015
      
      Test Plan: Updated unit tests
      
      Differential Revision: D18401333
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 12944e372ce6809f3f5a4c416c3b321a8927d925
    • Fix DBFlushTest::FireOnFlushCompletedAfterCommittedResult hang (#6018) · 72de842a
      Committed by Yi Wu
      Summary:
      The test fires two flushes to let them run in parallel. Previously it waited for the first job to be scheduled before firing the second. It is possible the first job has not started before the second job is scheduled, making the two jobs combine into one. Change it to wait for the first job to be started.
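      The synchronization idea can be sketched with a condition variable (hypothetical code, not the test's actual sync points): the waiter blocks until the background job has actually started running, not merely been scheduled.

      ```cpp
      #include <cassert>
      #include <condition_variable>
      #include <mutex>
      #include <thread>

      std::mutex mu;
      std::condition_variable cv;
      bool job_started = false;

      // Stand-in for the first background flush job.
      void BackgroundJob() {
        {
          std::lock_guard<std::mutex> lk(mu);
          job_started = true;  // signal "started", not just "scheduled"
        }
        cv.notify_one();
      }

      int main() {
        std::thread t(BackgroundJob);  // "schedule" the job
        {
          std::unique_lock<std::mutex> lk(mu);
          cv.wait(lk, [] { return job_started; });  // wait for it to start
        }
        // Only now is it safe to fire the second job in the real test.
        t.join();
        assert(job_started);
        return 0;
      }
      ```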
      
      Fixes https://github.com/facebook/rocksdb/issues/6017
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6018
      
      Test Plan:
      ```
      while ./db_flush_test --gtest_filter=*FireOnFlushCompletedAfterCommittedResult*; do :; done
      ```
      and let it run for a while.
      Signed-off-by: Yi Wu <yiwu@pingcap.com>
      
      Differential Revision: D18405576
      
      Pulled By: riversand963
      
      fbshipit-source-id: 6ebb6262e033d5dc2ef81cb3eb410b314f2de4c9
  3. 08 Nov 2019 (7 commits)
  4. 07 Nov 2019 (3 commits)
    • db_stress: improve TestGet() failure printing (#5989) · 111ebf31
      Committed by sdong
      Summary:
      Right now, in db_stress's CF consistency test's TestGet case, if a failure happens, we do normal string printing rather than hex printing, so some bytes are not printed out, which makes debugging harder. Fix it by printing hex instead.
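      A minimal hex-printing helper of the kind described (hypothetical; RocksDB's `Slice::ToString(bool hex)` provides the real equivalent). Binary keys can contain NUL bytes that truncate naive string output, while hex shows every byte.

      ```cpp
      #include <cassert>
      #include <string>

      // Render a possibly-binary key/value as hex so no bytes are lost
      // in failure output.
      std::string ToHex(const std::string& s) {
        static const char* kDigits = "0123456789ABCDEF";
        std::string out;
        out.reserve(2 * s.size());
        for (unsigned char c : s) {
          out.push_back(kDigits[c >> 4]);
          out.push_back(kDigits[c & 0xF]);
        }
        return out;
      }

      int main() {
        // The embedded NUL would end naive printing early; hex does not.
        std::string key("a\0b", 3);
        assert(ToHex(key) == "610062");
        return 0;
      }
      ```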
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5989
      
      Test Plan: Build db_stress and see that it passes.
      
      Differential Revision: D18363552
      
      fbshipit-source-id: 09d1b8f6fbff37441cbe7e63a1aef27551226cec
    • Workload generator (Mixgraph) based on prefix hotness (#5953) · 8ea087ad
      Committed by Zhichao Cao
      Summary:
      In the previous PR https://github.com/facebook/rocksdb/issues/4788, the user can use the db_bench mix_graph option to generate a workload modeled on the social graph, with keys generated based on key access hotness. In this PR, the user can further model key-range hotness and fit it to a two-term exponential distribution. First, the user cuts the whole key space into small key ranges (e.g., key ranges of equal size, with the number of key ranges equal to the number of SST files). Then, the user calculates the average access count per key of each key range as the key-range hotness. Next, the user fits the key-range hotness to a two-term exponential distribution (f(x) = a*exp(b*x) + c*exp(d*x)) to obtain the values of a, b, c, and d. These correspond to the db_bench parameters keyrange_dist_a, keyrange_dist_b, keyrange_dist_c, and keyrange_dist_d. Finally, the user can run db_bench specifying these parameters.
      For example:
      `./db_bench --benchmarks="mixgraph" -use_direct_io_for_flush_and_compaction=true -use_direct_reads=true -cache_size=268435456 -key_dist_a=0.002312 -key_dist_b=0.3467 -keyrange_dist_a=14.18 -keyrange_dist_b=-2.917 -keyrange_dist_c=0.0164 -keyrange_dist_d=-0.08082 -keyrange_num=30 -value_k=0.2615 -value_sigma=25.45 -iter_k=2.517 -iter_sigma=14.236 -mix_get_ratio=0.85 -mix_put_ratio=0.14 -mix_seek_ratio=0.01 -sine_mix_rate_interval_milliseconds=5000 -sine_a=350 -sine_b=0.0105 -sine_d=50000 --perf_level=2 -reads=1000000 -num=5000000 -key_size=48`
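      The fitted model from the description, evaluated as a sketch (the a, b, c, d values below are taken from the example command's -keyrange_dist_* flags; the function itself is illustrative, not db_bench's code):

      ```cpp
      #include <cassert>
      #include <cmath>

      // Two-term exponential key-range hotness model:
      // f(x) = a*exp(b*x) + c*exp(d*x), where x is the key-range index.
      double KeyRangeHotness(double x, double a, double b, double c, double d) {
        return a * std::exp(b * x) + c * std::exp(d * x);
      }

      int main() {
        // Parameters from the example command line above.
        const double a = 14.18, b = -2.917, c = 0.0164, d = -0.08082;
        // With negative exponents, hotness decays as the key-range index grows.
        assert(KeyRangeHotness(0, a, b, c, d) > KeyRangeHotness(5, a, b, c, d));
        assert(KeyRangeHotness(5, a, b, c, d) > KeyRangeHotness(20, a, b, c, d));
        return 0;
      }
      ```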
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5953
      
      Test Plan: Run db_bench with different parameters and check the results.
      
      Differential Revision: D18053527
      
      Pulled By: zhichao-cao
      
      fbshipit-source-id: 171f8b3142bd76462f1967c58345ad7e4f84bab7
    • Enable write-conflict snapshot in stress tests (#5897) · 50804656
      Committed by Maysam Yabandeh
      Summary:
      DBImpl extends the public GetSnapshot() with GetSnapshotForWriteConflictBoundary() method that takes snapshots specially for write-write conflict checking. Compaction treats such snapshots differently to avoid GCing a value written after that, so that the write conflict remains visible even after the compaction. The patch extends stress tests with such snapshots.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5897
      
      Differential Revision: D17937476
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: bd8b0c578827990302194f63ae0181e15752951d
  5. 06 Nov 2019 (3 commits)
  6. 05 Nov 2019 (1 commit)
    • WritePrepared: Fix flaky test MaxCatchupWithNewSnapshot (#5850) · 52733b44
      Committed by Maysam Yabandeh
      Summary:
      MaxCatchupWithNewSnapshot tests that the snapshot sequence number will be larger than the max sequence number at the time the snapshot was taken. However, since the test does not have access to the max sequence number at the time the snapshot was taken, it uses the max sequence number after that point, which could have advanced past the snapshot by then, making the test flaky.
      The fix is to compare with the max sequence number before the snapshot was taken, which is a lower bound for the value at the time the snapshot was taken.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5850
      
      Test Plan: ~/gtest-parallel/gtest-parallel --repeat=12800 ./write_prepared_transaction_test --gtest_filter="*MaxCatchupWithNewSnapshot*"
      
      Differential Revision: D17608926
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: b122ae5a27f982b290bd60da852e28d3c5eb0136
  7. 02 Nov 2019 (3 commits)
  8. 01 Nov 2019 (6 commits)
    • crash_test: disable periodic compaction in FIFO compaction. (#5993) · 5b656584
      Committed by sdong
      Summary:
      A recent commit made the periodic compaction option valid in FIFO compaction, where it means TTL. But we failed to disable it in the crash test, causing an assert failure. Fix it by keeping it disabled.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5993
      
      Test Plan: Restart "make crash_test" many times and make sure --periodic_compaction_seconds=0 is always the case when --compaction_style=2
      
      Differential Revision: D18263223
      
      fbshipit-source-id: c91a802017d83ae89ac43827d1b0012861933814
    • Add new persistent 64-bit hash (#5984) · 18f57f5e
      Committed by Peter Dillinger
      Summary:
      For upcoming new SST filter implementations, we will use a new
      64-bit hash function (XXH3 preview, slightly modified). This change
      updates hash.{h,cc} for that change, adds unit tests, and out-of-lines
      the implementations to keep hash.h as clean/small as possible.
      
      In developing the unit tests, I discovered that the XXH3 preview always
      returns zero for the empty string. Zero is problematic for some
      algorithms (including an upcoming SST filter implementation) if it
      occurs more often than at the "natural" rate, so it should not be
      returned from trivial values using trivial seeds. I modified our fork
      of XXH3 to return a modest hash of the seed for the empty string.
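      The idea can be sketched with a toy hash (everything here is hypothetical; the real change modifies the XXH3 fork itself): for empty input, return a modest mix of the seed rather than 0, so trivial input with a trivial seed never produces the problematic value.

      ```cpp
      #include <cassert>
      #include <cstdint>
      #include <string>

      // Toy stand-in for the modified hash; constants and mixing are
      // illustrative, not XXH3p's.
      uint64_t FakeHash64(const std::string& data, uint64_t seed) {
        if (data.empty()) {
          // Hypothetical seed mix: odd multipliers keep the result nonzero.
          return (seed + 0x9E3779B97F4A7C15ULL) * 0xBF58476D1CE4E5B9ULL;
        }
        uint64_t h = seed ^ 0x87C37B91114253D5ULL;
        for (unsigned char c : data) h = (h ^ c) * 0x100000001B3ULL;  // FNV-style
        return h;
      }

      int main() {
        assert(FakeHash64("", 0) != 0);  // empty string no longer hashes to 0
        assert(FakeHash64("", 0) != FakeHash64("", 1));  // seed still matters
        return 0;
      }
      ```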
      
      With hash function details out-of-lines in hash.h, it makes sense to
      enable XXH_INLINE_ALL, so that direct calls to XXH64/XXH32/XXH3p
      are inlined. To fix array-bounds warnings on some inline calls, I
      injected some casts to uintptr_t in xxhash.cc. (Issue reported to Yann.)
      Revised: Reverted the use of XXH_INLINE_ALL for now. Some Facebook
      checks are unhappy about #include of the xxhash.cc file. I would
      fix that by renaming it to xxhash_cc.h, but to best preserve history I want
      to do that in a separate commit (PR) from the uintptr_t casts.
      
      Also updated filter_bench for this change, improving the performance
      predictability of dry run hashing and adding support for 64-bit hash
      (for upcoming new SST filter implementations, minor dead code in the
      tool for now).
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5984
      
      Differential Revision: D18246567
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 6162fbf6381d63c8cc611dd7ec70e1ddc883fbb8
    • Prepare filter tests for more implementations (#5967) · 685e8956
      Committed by Peter Dillinger
      Summary:
      This change sets up for alternate implementations underlying
      BloomFilterPolicy:
      
      * Refactor BloomFilterPolicy and expose in internal .h file so that it's easy to iterate over / select implementations for testing, regardless of what the best public interface will look like. Most notably updated db_bloom_filter_test to use this.
      * Hide FullFilterBitsBuilder from unit tests (alternate derived classes planned); expose the part important for testing (CalculateSpace), as abstract class BuiltinFilterBitsBuilder. (Also cleaned up internally exposed interface to CalculateSpace.)
      * Rename BloomTest -> BlockBasedBloomTest for clarity (despite ongoing confusion between block-based table and block-based filter)
      * Assert that block-based filter construction interface is only used on BloomFilterPolicy appropriately constructed. (A couple of tests updated to add ", true".)
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5967
      
      Test Plan: make check
      
      Differential Revision: D18138704
      
      Pulled By: pdillinger
      
      fbshipit-source-id: 55ef9273423b0696309e251f50b8c1b5e9ec7597
    • Disable pre-5.5 versions in the format compatibility test (#5990) · 351e2540
      Committed by Levi Tamasi
      Summary:
      We have updated earlier release branches going back to 5.5 so they are
      built using gcc7 by default. Disabling ancient versions before that
      until we figure out a plan for them.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5990
      
      Test Plan: Ran the script locally.
      
      Differential Revision: D18252386
      
      Pulled By: ltamasi
      
      fbshipit-source-id: a7bbb30dc52ff2eaaf31a29ecc79f7cf4e2834dc
    • Support periodic compaction in universal compaction (#5970) · aa6f7d09
      Committed by sdong
      Summary:
      Previously, periodic compaction was not supported in universal compaction. Add the support using the following approach: if any file is marked as qualified for periodic compaction, trigger a full compaction. If a full compaction is prevented by files being compacted, try to compact the levels above the files currently being compacted. If in this way we can only compact the last sorted run and none of the files to be compacted qualifies for periodic compaction, skip the compaction. This is to prevent the same single-level compaction from being executed again and again.
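      The decision steps described above can be sketched as follows (a simplification with hypothetical names; the real picker works over concrete file and sorted-run state):

      ```cpp
      #include <cassert>

      enum class Action { kFullCompaction, kCompactUpperLevels, kSkip };

      // Illustrative decision logic for periodic compaction in universal style.
      Action PickPeriodicAction(bool any_file_marked, bool full_blocked,
                                bool only_last_sorted_run_free,
                                bool free_files_marked) {
        if (!any_file_marked) return Action::kSkip;
        if (!full_blocked) return Action::kFullCompaction;
        // Full compaction blocked: try the levels above the running
        // compaction, but skip if that would only re-compact the last
        // sorted run with no marked file, to avoid repeating the same
        // single-level compaction forever.
        if (only_last_sorted_run_free && !free_files_marked) return Action::kSkip;
        return Action::kCompactUpperLevels;
      }

      int main() {
        assert(PickPeriodicAction(true, false, false, true) == Action::kFullCompaction);
        assert(PickPeriodicAction(true, true, true, false) == Action::kSkip);
        assert(PickPeriodicAction(true, true, false, true) == Action::kCompactUpperLevels);
        return 0;
      }
      ```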
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5970
      
      Test Plan: Add several test cases.
      
      Differential Revision: D18147097
      
      fbshipit-source-id: 8ecc308154d9aca96fb192c51fbceba3947550c1
    • Make FIFO compaction take default 30 days TTL by default (#5987) · 2a9e5caf
      Committed by sdong
      Summary:
      Right now, by default, FIFO compaction has no TTL. We believe that a default TTL of 30 days will be better. With this patch, the default is changed to 30 days. A default Options.periodic_compaction_seconds will mean the same as options.ttl. If Options.ttl and Options.periodic_compaction_seconds are both left at their defaults, a 30-day TTL will be used. If both options are set, the stricter value of the two will be used.
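      The option interaction can be sketched as a small sanitization helper (constants and names are hypothetical; in particular, "0 means left default" is an assumption of this sketch):

      ```cpp
      #include <algorithm>
      #include <cassert>
      #include <cstdint>

      constexpr uint64_t kDefault = 0;  // stand-in for "left at default"
      constexpr uint64_t k30Days = 30 * 24 * 60 * 60;

      // If both options are default, use 30 days; if both are set, the
      // stricter (smaller) value wins.
      uint64_t EffectiveFifoTtl(uint64_t ttl, uint64_t periodic_compaction_seconds) {
        if (ttl == kDefault && periodic_compaction_seconds == kDefault) return k30Days;
        if (ttl == kDefault) return periodic_compaction_seconds;
        if (periodic_compaction_seconds == kDefault) return ttl;
        return std::min(ttl, periodic_compaction_seconds);  // stricter value wins
      }

      int main() {
        assert(EffectiveFifoTtl(0, 0) == k30Days);
        assert(EffectiveFifoTtl(3600, 0) == 3600);
        assert(EffectiveFifoTtl(3600, 600) == 600);
        return 0;
      }
      ```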
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5987
      
      Test Plan: Add an option sanitize test to cover the case.
      
      Differential Revision: D18237935
      
      fbshipit-source-id: a6dcea1f36c3849e13c0a69e413d73ad8eab58c9
  9. 31 Oct 2019 (3 commits)
    • Turn compaction asserts to runtime check (#5935) · dccaf9f0
      Committed by Maysam Yabandeh
      Summary:
      Compaction iterator has many assert statements that are active only during test runs. Some rare bugs that show up only at runtime could violate the assert conditions but go unnoticed, since assert statements are not compiled in release mode. Turning the assert statements into runtime checks has some pros and cons:
      Pros:
      - A bug that would result into incorrect data would be detected early before the incorrect data is written to the disk.
      
      Cons:
      - Runtime overhead, which should be negligible since compaction is a minority of the overall CPU usage.
      - The assert statements might already be violated at runtime, and turning them into runtime failures might result in reliability issues.
      
      The patch takes a conservative step in this direction by logging the assert violations at runtime. If we see any violations reported in the logs, we investigate; otherwise, we can go ahead and turn them into runtime errors.
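      The conservative step can be sketched as a check that logs instead of aborting (all names, the logging mechanism, and the example condition are hypothetical, not the patch's actual code):

      ```cpp
      #include <cassert>
      #include <string>
      #include <vector>

      // Hypothetical in-memory log standing in for the info log.
      std::vector<std::string> g_log;

      // Instead of assert() (compiled out in release builds) or a hard
      // runtime error, record the violation so it can be investigated.
      void RuntimeCheck(bool cond, const std::string& msg) {
        if (!cond) g_log.push_back("Assert violation: " + msg);  // log, don't abort
      }

      int main() {
        RuntimeCheck(1 + 1 == 2, "arithmetic");            // holds: nothing logged
        RuntimeCheck(false, "example failing condition");  // logged, not fatal
        assert(g_log.size() == 1);
        return 0;
      }
      ```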
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5935
      
      Differential Revision: D18229697
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: f1890eca80ccd7cca29737f1825badb9aa8038a8
    • crash_test: disable atomic flush with pipelined write (#5986) · 0337d87b
      Committed by sdong
      Summary:
      Recently, pipelined write was enabled even when atomic flush is enabled, causing a sanitization failure in db_stress. Revert this change.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5986
      
      Test Plan: Run "make crash_test_with_atomic_flush" and let it run for a while, long enough that the old sanitization error (which used to show up quickly) doesn't show up.
      
      Differential Revision: D18228278
      
      fbshipit-source-id: 27fdf2f8e3e77068c9725a838b9bef4ab25a2553
    • Add more release branches to tools/check_format_compatible.sh (#5985) · 15119f08
      Committed by sdong
      Summary:
      More release branches are created. We should include them in continuous format compatibility checks.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5985
      
      Test Plan: Let's see whether it passes.
      
      Differential Revision: D18226532
      
      fbshipit-source-id: 75d8cad5b03ccea4ce16f00cea1f8b7893b0c0c8
  10. 30 Oct 2019 (3 commits)
    • Move pipeline write waiting logic into WaitForPendingWrites() (#5716) · a3960fc8
      Committed by sdong
      Summary:
      In pipeline writing mode, memtable switching needs to wait for memtable writing to finish to make sure that when memtables are made immutable, inserts are not going to them. This is currently done in DBImpl::SwitchMemtable(). This is done after flush_scheduler_.TakeNextColumnFamily() is called to fetch the list of column families to switch. The function flush_scheduler_.TakeNextColumnFamily() itself, however, is not thread-safe when being called together with flush_scheduler_.ScheduleFlush().
      This change provides a fix, which moves the waiting logic before flush_scheduler_.TakeNextColumnFamily(). WaitForPendingWrites() is a natural place where the logic can happen.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5716
      
      Test Plan: Run all tests with ASAN and TSAN.
      
      Differential Revision: D18217658
      
      fbshipit-source-id: b9c5e765c9989645bf10afda7c5c726c3f82f6c3
    • db_stress: CF Consistency check to use random CF to validate iterator results (#5983) · f22aaf8b
      Committed by sdong
      Summary:
      Right now, in db_stress's iterator tests, we always use the same CF to validate iterator results. This commit changes it so that a randomized CF is used in the CF consistency test, where every CF should have exactly the same data. This will help catch more bugs.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5983
      
      Test Plan: Run "make crash_test_with_atomic_flush".
      
      Differential Revision: D18217643
      
      fbshipit-source-id: 3ac998852a0378bb59790b20c5f236f6a5d681fe
    • Auto enable Periodic Compactions if a Compaction Filter is used (#5865) · 4c9aa30a
      Committed by Sagar Vemuri
      Summary:
      - Periodic compactions are auto-enabled if a compaction filter or a compaction filter factory is set, in Level Compaction.
      - The default value of `periodic_compaction_seconds` is changed to UINT64_MAX, which lets RocksDB auto-tune periodic compactions as needed. An explicit value of 0 will still work as before, i.e. disable periodic compactions completely. For now, on seeing a compaction filter along with a UINT64_MAX value for `periodic_compaction_seconds`, RocksDB will make SST files older than 30 days go through periodic compactions.
      
      Some RocksDB users make use of compaction filters to control when their data can be deleted, usually with a custom TTL logic. But it is occasionally possible that the compactions get delayed by considerable time due to factors like low writes to a key range, data reaching bottom level, etc before the TTL expiry. Periodic Compactions feature was originally built to help such cases. Now periodic compactions are auto enabled by default when compaction filters or compaction filter factories are used, as it is generally helpful to all cases to collect garbage.
      
      `periodic_compaction_seconds` is set to a large value, 30 days, in `SanitizeOptions` when RocksDB sees that a `compaction_filter` or `compaction_filter_factory` is used.
      
      This is done only for Level Compaction style.
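      The sanitization rule described above can be sketched as follows (function name and the bool parameters are hypothetical; the real logic lives in `SanitizeOptions` over full option structs):

      ```cpp
      #include <cassert>
      #include <cstdint>

      constexpr uint64_t kAuto = UINT64_MAX;  // "auto-tune" sentinel
      constexpr uint64_t k30Days = 30ULL * 24 * 60 * 60;

      // With Level compaction, an "auto" periodic_compaction_seconds becomes
      // 30 days when a compaction filter (or factory) is present.
      uint64_t SanitizePeriodicSeconds(uint64_t configured,
                                       bool has_compaction_filter,
                                       bool is_level_style) {
        if (configured != kAuto) return configured;  // explicit value (incl. 0) kept
        if (is_level_style && has_compaction_filter) return k30Days;
        return kAuto;  // stays auto, i.e. effectively off for now
      }

      int main() {
        assert(SanitizePeriodicSeconds(kAuto, true, true) == k30Days);
        assert(SanitizePeriodicSeconds(0, true, true) == 0);  // explicit disable
        assert(SanitizePeriodicSeconds(kAuto, false, true) == kAuto);
        return 0;
      }
      ```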
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5865
      
      Test Plan:
      - Added a new test `DBCompactionTest.LevelPeriodicCompactionWithCompactionFilters` to make sure that `periodic_compaction_seconds` is set if either `compaction_filter` or `compaction_filter_factory` options are set.
      - `COMPILE_WITH_ASAN=1 make check`
      
      Differential Revision: D17659180
      
      Pulled By: sagar0
      
      fbshipit-source-id: 4887b9cf2e53cf2dc93a7b658c6b15e1181217ee
  11. 29 Oct 2019 (2 commits)
  12. 26 Oct 2019 (4 commits)