1. 19 Jun 2019 (6 commits)
2. 18 Jun 2019 (8 commits)
    • fix rocksdb lite and clang contrun test failures (#5477) · ddd088c8
      Zhongyi Xie committed
      Summary:
      The recent commit 671d15cb introduced some test failures:
      ```
      ===== Running stats_history_test
      [==========] Running 9 tests from 1 test case.
      [----------] Global test environment set-up.
      [----------] 9 tests from StatsHistoryTest
      [ RUN      ] StatsHistoryTest.RunStatsDumpPeriodSec
      monitoring/stats_history_test.cc:63: Failure
      dbfull()->SetDBOptions({{"stats_dump_period_sec", "0"}})
      Not implemented: Not supported in ROCKSDB LITE
      
      db/db_options_test.cc:28:11: error: unused variable 'kMicrosInSec' [-Werror,-Wunused-const-variable]
      const int kMicrosInSec = 1000000;
      ```
      This PR fixes these failures.
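      One plausible shape of the fix (a sketch, not necessarily the exact patch): compile the LITE-unsupported test logic and the helper constant only in non-LITE builds, so neither the `SetDBOptions` call nor the unused `kMicrosInSec` constant is built under ROCKSDB_LITE:
      ```cpp
      // Sketch only: guard code that LITE builds cannot support.
      #ifndef ROCKSDB_LITE
      const int kMicrosInSec = 1000000;

      TEST_F(StatsHistoryTest, RunStatsDumpPeriodSec) {
        // SetDBOptions returns NotSupported in ROCKSDB_LITE, so this test
        // is compiled out there.
        ASSERT_OK(dbfull()->SetDBOptions({{"stats_dump_period_sec", "0"}}));
        // ...
      }
      #endif  // !ROCKSDB_LITE
      ```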
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5477
      
      Differential Revision: D15871814
      
      Pulled By: miasantreble
      
      fbshipit-source-id: 0a7023914d2c1784d9d2d3f5bfb47310d4855394
    • Block cache tracing: Fix minor bugs with downsampling and some benchmark results. (#5473) · bcfc53b4
      haoyuhuang committed
      Summary:
      As the code changes for block cache tracing are almost complete, I ran a benchmark to compare performance with block cache tracing enabled and disabled.
      
      With a 1% downsampling ratio, the performance overhead of block cache tracing is negligible. When we trace all block accesses, throughput drops roughly sixfold with 16 threads issuing random reads, all of which are served from the block cache.
      
      Setup:
      ```
      RocksDB:    version 6.2
      Date:       Mon Jun 17 17:11:13 2019
      CPU:        24 * Intel Core Processor (Skylake)
      CPUCache:   16384 KB
      Keys:       20 bytes each
      Values:     100 bytes each (100 bytes after compression)
      Entries:    10000000
      Prefix:     20 bytes
      Keys per prefix:    0
      RawSize:    1144.4 MB (estimated)
      FileSize:   1144.4 MB (estimated)
      Write rate: 0 bytes/second
      Read rate:  0 ops/second
      Compression: NoCompression
      Compression sampling rate: 0
      Memtablerep: skip_list
      Perf Level: 1
      ```
      
      I ran the readrandom workload for 1 minute. Detailed throughput results (ops/second):
      Sample rate 0: no block cache tracing.
      Sample rate 1: trace all block accesses.
      Sample rate 100: trace accesses to 1% of blocks.

      1 thread:

      | Sample rate | 0 | 1 | 100 |
      | -- | -- | -- | -- |
      | 1 MB block cache size | 13,094 | 13,166 | 13,341 |
      | 10 GB block cache size | 202,243 | 188,677 | 229,182 |

      16 threads:

      | Sample rate | 0 | 1 | 100 |
      | -- | -- | -- | -- |
      | 1 MB block cache size | 208,761 | 178,700 | 201,872 |
      | 10 GB block cache size | 2,645,996 | 426,295 | 2,587,605 |
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5473
      
      Differential Revision: D15869479
      
      Pulled By: HaoyuHuang
      
      fbshipit-source-id: 7ae802abe84811281a6af8649f489887cd7c4618
    • Support computing miss ratio curves using sim_cache. (#5449) · 2d1dd5bc
      haoyuhuang committed
      Summary:
      This PR adds a BlockCacheTraceSimulator that reports the miss ratios given different cache configurations. A cache configuration is of the form "cache_name,num_shard_bits,cache_capacities", for example "lru, 1, 1K, 2K, 4M, 4G".
      
      When we replay the trace, we also perform lookups and inserts on the simulated caches.
      At the end, it reports the miss ratio for each tuple <cache_name, num_shard_bits, cache_capacity> in an output file.
      
      This PR also adds a main source file, block_cache_trace_analyzer, so that the analyzer can be run from the command line.
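      For illustration, a hypothetical simulator configuration following the "cache_name,num_shard_bits,cache_capacities" format described above, one configuration per line (the exact file layout the analyzer accepts may differ):
      ```
      lru,1,1K,2K,4M,4G
      lru,6,8G,16G
      ```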
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5449
      
      Test Plan:
      Added tests for block_cache_trace_analyzer.
      COMPILE_WITH_ASAN=1 make check -j32.
      
      Differential Revision: D15797073
      
      Pulled By: HaoyuHuang
      
      fbshipit-source-id: aef0c5c2e7938f3e8b6a10d4a6a50e6928ecf408
    • Override check consistency for DBImplSecondary (#5469) · 7d8d5641
      Yanqin Jin committed
      Summary:
      `DBImplSecondary` calls `CheckConsistency()` during open. In the past, `DBImplSecondary` did not override this function, so `DBImpl::CheckConsistency()` was called.
      The following can happen: the secondary instance performs its consistency check, which calls `GetFileSize(file_path)`, but the file at `file_path` has been deleted by the primary instance. `DBImpl::CheckConsistency` does not account for this and fails the consistency check, which is undesirable. The solution is to call `DBImpl::CheckConsistency()` first. If it passes, we are done. If not, we give it a second chance and handle the case of file(s) being deleted.
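      A minimal sketch of the two-pass check described above (names mirror the commit; `CheckConsistencyTolerateMissingFiles` is a hypothetical helper and the real override is more involved):
      ```cpp
      // Sketch only, not the actual DBImplSecondary code.
      Status DBImplSecondary::CheckConsistency() {
        Status s = DBImpl::CheckConsistency();
        if (s.ok()) {
          return s;  // first pass passed; nothing more to do
        }
        // Second chance: the primary may have deleted SST files while we
        // were checking, so re-verify while tolerating missing files
        // instead of failing the secondary's open.
        return CheckConsistencyTolerateMissingFiles();  // hypothetical helper
      }
      ```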
      
      Test plan (on dev server):
      ```
      $make clean && make -j20 all
      $./db_secondary_test
      ```
      All other existing unit tests must pass as well.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5469
      
      Differential Revision: D15861845
      
      Pulled By: riversand963
      
      fbshipit-source-id: 507d72392508caed3cd003bb2e2aa43f993dd597
    • Persistent Stats: persist stats history to disk (#5046) · 671d15cb
      Zhongyi Xie committed
      Summary:
      This PR continues the work in https://github.com/facebook/rocksdb/pull/4748 and https://github.com/facebook/rocksdb/pull/4535 by adding a new DBOption `persist_stats_to_disk`, which instructs RocksDB to persist stats history to RocksDB itself. When statistics are enabled and both options `stats_persist_period_sec` and `persist_stats_to_disk` are set, RocksDB will periodically write stats to a built-in column family in the following form: key -> (timestamp in microseconds)#(stats name), value -> stats value. The existing API `GetStatsHistory` will detect the current value of `persist_stats_to_disk` and read either from the in-memory data structure or from the hidden column family on disk.
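      As a usage sketch (based on the options and API named above, as of RocksDB 6.x; error handling trimmed):
      ```cpp
      #include <cinttypes>
      #include <cstdint>
      #include <cstdio>
      #include <memory>

      #include "rocksdb/db.h"
      #include "rocksdb/statistics.h"
      #include "rocksdb/stats_history.h"

      int main() {
        rocksdb::Options options;
        options.create_if_missing = true;
        options.statistics = rocksdb::CreateDBStatistics();
        options.stats_persist_period_sec = 5;  // snapshot stats every 5 seconds
        options.persist_stats_to_disk = true;  // persist to the built-in CF

        rocksdb::DB* db = nullptr;
        rocksdb::Status s = rocksdb::DB::Open(options, "/tmp/stats_demo", &db);
        if (!s.ok()) return 1;

        // ... run a workload and wait for a few persist periods ...

        std::unique_ptr<rocksdb::StatsHistoryIterator> it;
        s = db->GetStatsHistory(0 /* start_time */, UINT64_MAX /* end_time */, &it);
        for (; s.ok() && it->Valid(); it->Next()) {
          for (const auto& stat : it->GetStatsMap()) {
            std::printf("%" PRIu64 " %s = %" PRIu64 "\n",
                        it->GetStatsTime(), stat.first.c_str(), stat.second);
          }
        }
        delete db;
        return 0;
      }
      ```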
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5046
      
      Differential Revision: D15863138
      
      Pulled By: miasantreble
      
      fbshipit-source-id: bb82abdb3f2ca581aa42531734ac799f113e931b
    • Make db_bloom_filter_test parallel (#5467) · ee294c24
      Maysam Yabandeh committed
      Summary:
      When run under TSAN the test suite sometimes exceeds 10 minutes and times out. The slowest tests are the six instances of `DBBloomFilterTestWithParam.BloomFilter`. Making the tests run in parallel should take care of the timeout issue.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5467
      
      Differential Revision: D15856912
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 26c43c55312974c1b809c070342dee037d0219f4
    • Integrate block cache tracing into db_bench (#5459) · d43b4cd5
      haoyuhuang committed
      Summary:
      This PR integrates the block cache tracing into db_bench. It adds three command-line arguments:
      ```
      -block_cache_trace_file (Block cache trace file path.) type: string default: ""
      -block_cache_trace_max_trace_file_size_in_bytes (The maximum block cache trace file size in bytes. Block cache accesses will not be logged if the trace file size exceeds this threshold. Default is 64 GB.) type: int64 default: 68719476736
      -block_cache_trace_sampling_frequency (Block cache trace sampling frequency, termed s. It uses spatial downsampling and samples accesses to one out of s blocks.) type: int32 default: 1
      ```
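      For example, a hypothetical invocation that samples one out of every 100 block accesses (tracing flags as listed above; the remaining flags are standard db_bench options):
      ```
      ./db_bench --benchmarks=readrandom --use_existing_db \
          -block_cache_trace_file=/tmp/block_cache_trace \
          -block_cache_trace_max_trace_file_size_in_bytes=68719476736 \
          -block_cache_trace_sampling_frequency=100
      ```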
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5459
      
      Differential Revision: D15832031
      
      Pulled By: HaoyuHuang
      
      fbshipit-source-id: 0ecf2f2686557251fe741a2769b21170777efa3d
    • Switch Travis to Xenial build (#4789) · d1ae67bd
      Adam Retter committed
      Summary:
      I think this should now also run on Travis's new virtualised infrastructure, which affords more memory and CPU.
      
      We also need to think about migrating from travis-ci.org to travis-ci.com.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4789
      
      Differential Revision: D15856272
      
      fbshipit-source-id: 10b41d21924e8a362bc9646a63ccd1a5dfc437c6
3. 15 Jun 2019 (5 commits)
    • Integrate block cache tracer in block based table reader. (#5441) · 7a8d7358
      haoyuhuang committed
      Summary:
      This PR integrates the block cache tracer into the block-based table reader. The tracer writes block cache accesses using the trace_writer. The tracer is null in this PR, so nothing is logged yet.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5441
      
      Differential Revision: D15772029
      
      Pulled By: HaoyuHuang
      
      fbshipit-source-id: a64adb92642cd23222e0ba8b10d86bf522b42f9b
    • Validate CF Options when creating a new column family (#5453) · f1219644
      Sagar Vemuri committed
      Summary:
      It seems like CF Options are not properly validated when creating a new column family with the `CreateColumnFamily` API; only a select few checks are done. Calling `ColumnFamilyData::ValidateOptions`, which is the single source for all CFOptions validations, will help fix this. (`ColumnFamilyData::ValidateOptions` is already called at the time of `DB::Open`.)
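      As an illustrative sketch of the behavior this enables (the specific incompatible setting shown is an assumption; any combination rejected by `ColumnFamilyData::ValidateOptions` would behave the same way):
      ```cpp
      #include <cassert>
      #include "rocksdb/db.h"

      // Sketch: after this change, CreateColumnFamily rejects option
      // combinations that ValidateOptions would reject at DB::Open time.
      void CreateCfExample(rocksdb::DB* db) {
        rocksdb::ColumnFamilyOptions cf_opts;
        // Assumed-incompatible setting for illustration: in-place memtable
        // updates conflict with concurrent memtable writes.
        cf_opts.inplace_update_support = true;

        rocksdb::ColumnFamilyHandle* handle = nullptr;
        rocksdb::Status s = db->CreateColumnFamily(cf_opts, "new_cf", &handle);
        assert(!s.ok());  // expect InvalidArgument, not a misconfigured CF
      }
      ```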
      
      **Test Plan:**
      Added a new test: `DBTest.CreateColumnFamilyShouldFailOnIncompatibleOptions`
      ```
      TEST_TMPDIR=/dev/shm ./db_test --gtest_filter=DBTest.CreateColumnFamilyShouldFailOnIncompatibleOptions
      ```
      Also ran gtest-parallel to make sure the new test is not flaky.
      ```
      TEST_TMPDIR=/dev/shm ~/gtest-parallel/gtest-parallel ./db_test --gtest_filter=DBTest.CreateColumnFamilyShouldFailOnIncompatibleOptions --repeat=10000
      [10000/10000] DBTest.CreateColumnFamilyShouldFailOnIncompatibleOptions (15 ms)
      ```
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5453
      
      Differential Revision: D15816851
      
      Pulled By: sagar0
      
      fbshipit-source-id: 9e702b9850f5c4a7e0ef8d39e1e6f9b81e7fe1e5
    • fix compilation error on MSVC (#5458) · b47cfec5
      Huisheng Liu committed
      Summary:
      "__attribute__((__weak__))" was introduced in port\jemalloc_helper.h. It's not supported by Microsoft VS 2015, resulting in compile error. This fix adds a #if branch to work around the compile issue.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5458
      
      Differential Revision: D15827285
      
      fbshipit-source-id: 8c5f7ad31de1ac677bd96f16c4450767de834beb
    • Set executeLocal on child lego jobs (#5456) · 58c78358
      Maysam Yabandeh committed
      Summary:
      This property is needed to run the child jobs on the same host and thus propagate the child job status back to the parent.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5456
      
      Reviewed By: yancouto
      
      Differential Revision: D15824382
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 42f2efbedaa3a8b399281105f0ce793c1c9a6191
    • Remove unused variable (#5457) · 89695bfb
      haoyuhuang committed
      Summary:
      This PR removes the unused variable that causes the CLANG build to fail.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5457
      
      Differential Revision: D15825027
      
      Pulled By: HaoyuHuang
      
      fbshipit-source-id: 72c847c39ca310560efcbc5938cffa6f31164068
4. 14 Jun 2019 (5 commits)
5. 13 Jun 2019 (4 commits)
6. 12 Jun 2019 (6 commits)
7. 11 Jun 2019 (6 commits)
    • Fix DBTest.DynamicMiscOptions so it passes even with Snappy disabled (#5438) · a94aef65
      Levi Tamasi committed
      Summary:
      This affects our "no compression" automated tests. Since PR #5368, DBTest.DynamicMiscOptions has been failing with:
      ```
      db/db_test.cc:4889: Failure
      dbfull()->SetOptions({{"compression", "kSnappyCompression"}})
      Invalid argument: Compression type Snappy is not linked with the binary.
      ```
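      One plausible shape of such a fix (a sketch; `Snappy_Supported()` is RocksDB's check for compiled-in Snappy support, though the actual patch may differ):
      ```cpp
      // Sketch: only exercise Snappy when it is linked into the binary.
      if (Snappy_Supported()) {
        ASSERT_OK(dbfull()->SetOptions({{"compression", "kSnappyCompression"}}));
      } else {
        ASSERT_OK(dbfull()->SetOptions({{"compression", "kNoCompression"}}));
      }
      ```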
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5438
      
      Differential Revision: D15752100
      
      Pulled By: ltamasi
      
      fbshipit-source-id: 3f19eff7cafc03b333965be0203c5853d2a9cb71
    • Avoid deadlock between mutex_ and log_write_mutex_ (#5437) · c8c1a549
      Maysam Yabandeh committed
      Summary:
      To avoid deadlock, mutex_ should never be acquired before log_write_mutex_. The patch documents this and also fixes one case in ::FlushWAL that acquired mutex_ through ::WriteStatusCheck while already holding a lock on log_write_mutex_.
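      As a generic illustration of why such a lock-ordering rule matters (names mirror the commit, but this is not the actual DBImpl code), two threads that take the same pair of mutexes in opposite orders can deadlock:
      ```cpp
      #include <mutex>

      std::mutex log_write_mutex_;
      std::mutex mutex_;

      // Follows the documented order: log_write_mutex_ before mutex_.
      void ThreadA() {
        std::lock_guard<std::mutex> wal_lock(log_write_mutex_);
        std::lock_guard<std::mutex> db_lock(mutex_);
        // ... work ...
      }

      // Violates it: if ThreadA holds log_write_mutex_ and waits on mutex_
      // while this thread holds mutex_ and waits on log_write_mutex_,
      // both block forever.
      void ThreadB() {
        std::lock_guard<std::mutex> db_lock(mutex_);
        std::lock_guard<std::mutex> wal_lock(log_write_mutex_);  // deadlock-prone
        // ... work ...
      }
      ```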
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5437
      
      Differential Revision: D15749722
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: f57b69c44b4b80cc6d7ddf3d3fdf4a9eb5a5a45a
    • Remove global locks from FlushScheduler (#5372) · b2584577
      Maysam Yabandeh committed
      Summary:
      FlushScheduler's methods are instrumented with debug-time locks that check the scheduler state against a simple container definition. Since https://github.com/facebook/rocksdb/pull/2286, the scope of these locks was widened to the methods' entire bodies. As a result, the concurrency exercised during testing (in debug mode) is stricter than the concurrency level manifested at runtime (in release mode).
      The patch reverts this change to reduce the scope of such locks.
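      A sketch of the narrowed scope (illustrative; the real FlushScheduler tracks column families, not void pointers):
      ```cpp
      #include <mutex>
      #include <set>

      class FlushSchedulerSketch {
       public:
        void ScheduleWork(void* cfd) {
      #ifndef NDEBUG
          {
            // Debug-only check, locked just long enough to touch the
            // container so debug builds do not serialize more of the
            // method than release builds do.
            std::lock_guard<std::mutex> lock(checking_mutex_);
            checking_set_.insert(cfd);
          }
      #endif  // NDEBUG
          // ... the lock-free scheduling work runs without the debug lock ...
        }

       private:
      #ifndef NDEBUG
        std::mutex checking_mutex_;
        std::set<void*> checking_set_;
      #endif  // NDEBUG
      };
      ```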
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5372
      
      Differential Revision: D15545831
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 01d69191afb1dd807d4bdc990fc74813ae7b5426
    • Use CreateLoggerFromOptions function (#5427) · 641cc8d5
      Yanqin Jin committed
      Summary:
      Use `CreateLoggerFromOptions` function to reduce code duplication.
      
      Test plan (on my machine)
      ```
      $make clean && make -j32 db_secondary_test
      $KEEP_DB=1 ./db_secondary_test
      ```
      Verify all info logs of the secondary instance are properly logged.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5427
      
      Differential Revision: D15748922
      
      Pulled By: riversand963
      
      fbshipit-source-id: bad7261df1b8373efc504f141efc7871e375a311
    • Create a BlockCacheLookupContext to enable fine-grained block cache tracing. (#5421) · 5efa0d6b
      haoyuhuang committed
      Summary:
      BlockCacheLookupContext only contains the caller for now (a minimal sketch follows the lists below).
      We will trace block accesses at five places:
      1. BlockBasedTable::GetFilter.
      2. BlockBasedTable::GetUncompressedDict.
      3. BlockBasedTable::MaybeReadAndLoadToCache. (To trace accesses to data, index, and range deletion blocks.)
      4. BlockBasedTable::Get. (To trace the referenced key and whether it exists in a fetched data block.)
      5. BlockBasedTable::MultiGet. (To trace the referenced key and whether it exists in a fetched data block.)
      
      We create the context at:
      1. BlockBasedTable::Get. (kUserGet)
      2. BlockBasedTable::MultiGet. (kUserMGet)
      3. BlockBasedTable::NewIterator. (either kUserIterator, kCompaction, or external SST ingestion calls this function.)
      4. BlockBasedTable::Open. (kPrefetch)
      5. Index/Filter::CacheDependencies. (kPrefetch)
      6. BlockBasedTable::ApproximateOffsetOf. (kCompaction or kUserApproximateSize).
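      A minimal sketch of the context and its callers as described above (the enum and field names are illustrative, not the exact definitions):
      ```cpp
      enum class BlockCacheLookupCaller {
        kUserGet,
        kUserMGet,
        kUserIterator,
        kPrefetch,
        kCompaction,
        kUserApproximateSize,
      };

      struct BlockCacheLookupContext {
        explicit BlockCacheLookupContext(BlockCacheLookupCaller c) : caller(c) {}
        BlockCacheLookupCaller caller;  // the only field for now
      };
      ```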
      
      I loaded 1 million key-value pairs into the database and ran the readrandom benchmark with a single thread. I gave the block cache 10 GB to make sure all reads hit the block cache after warmup. The throughput is comparable.
      Throughput of this PR: 231334 ops/s.
      Throughput of the master branch: 238428 ops/s.
      
      Experiment setup:
      ```
      RocksDB:    version 6.2
      Date:       Mon Jun 10 10:42:51 2019
      CPU:        24 * Intel Core Processor (Skylake)
      CPUCache:   16384 KB
      Keys:       20 bytes each
      Values:     100 bytes each (100 bytes after compression)
      Entries:    1000000
      Prefix:     20 bytes
      Keys per prefix:    0
      RawSize:    114.4 MB (estimated)
      FileSize:   114.4 MB (estimated)
      Write rate: 0 bytes/second
      Read rate:  0 ops/second
      Compression: NoCompression
      Compression sampling rate: 0
      Memtablerep: skip_list
      Perf Level: 1
      ```
      
      Load command:
      ```
      ./db_bench --benchmarks="fillseq" --key_size=20 --prefix_size=20 --keys_per_prefix=0 --value_size=100 --statistics --cache_index_and_filter_blocks --cache_size=10737418240 --disable_auto_compactions=1 --disable_wal=1 --compression_type=none --min_level_to_compress=-1 --compression_ratio=1 --num=1000000
      ```

      Run command:
      ```
      ./db_bench --benchmarks="readrandom,stats" --use_existing_db --threads=1 --duration=120 --key_size=20 --prefix_size=20 --keys_per_prefix=0 --value_size=100 --statistics --cache_index_and_filter_blocks --cache_size=10737418240 --disable_auto_compactions=1 --disable_wal=1 --compression_type=none --min_level_to_compress=-1 --compression_ratio=1 --num=1000000 --duration=120
      ```
      
      TODOs:
      1. Create a caller for external SST file ingestion and differentiate the callers for iterator.
      2. Integrate tracer to trace block cache accesses.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5421
      
      Differential Revision: D15704258
      
      Pulled By: HaoyuHuang
      
      fbshipit-source-id: 4aa8a55f8cb1576ffb367bfa3186a91d8f06d93a
    • Reuse data block iterator in BlockBasedTableReader::MultiGet() (#5314) · 63ace8ef
      anand76 committed
      Summary:
      Instead of creating a new DataBlockIterator for every key in a MultiGet batch, reuse it if the next key is in the same block. This results in a small 1-2% CPU improvement.
      
      ```
      TEST_TMPDIR=/dev/shm/multiget numactl -C 10 ./db_bench.tmp -use_existing_db=true -benchmarks="readseq,multireadrandom" -write_buffer_size=4194304 -target_file_size_base=4194304 -max_bytes_for_level_base=16777216 -num=12000000 -reads=12000000 -duration=90 -threads=1 -compression_type=none -cache_size=4194304000 -batch_size=32 -disable_auto_compactions=true -bloom_bits=10 -cache_index_and_filter_blocks=true -pin_l0_filter_and_index_blocks_in_cache=true -multiread_batched=true -multiread_stride=4
      ```

      Without the change:
      ```
      multireadrandom :       3.066 micros/op 326122 ops/sec; (29375968 of 29375968 found)
      ```

      With the change:
      ```
      multireadrandom :       3.003 micros/op 332945 ops/sec; (29983968 of 29983968 found)
      ```
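      A sketch of the reuse pattern described above (illustrative, not the actual MultiGet code): keep one block iterator alive across consecutive keys that fall into the same data block.
      ```cpp
      #include <cstdint>
      #include <string>
      #include <vector>

      struct BlockIterSketch {
        uint64_t block_offset = 0;  // which data block this iterator is pinned to
        void SeekInBlock(const std::string& /*key*/) { /* ... */ }
      };

      // Assumes the batch keys are sorted, so keys in the same block are adjacent.
      void MultiGetSketch(const std::vector<std::string>& sorted_keys) {
        BlockIterSketch iter;
        uint64_t current_block = UINT64_MAX;
        for (const auto& key : sorted_keys) {
          uint64_t block_for_key = 0;  // stand-in for an index-block lookup
          if (block_for_key != current_block) {
            // Only (re)initialize the iterator when the key crosses into a
            // new block; keys in the same block reuse the live iterator.
            iter = BlockIterSketch{block_for_key};
            current_block = block_for_key;
          }
          iter.SeekInBlock(key);
        }
      }
      ```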
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5314
      
      Differential Revision: D15742108
      
      Pulled By: anand1976
      
      fbshipit-source-id: 220fb0b8eea9a0d602ddeb371528f7af7936d771