提交 · 4ed3c1eb8875f9ba5fe4aa585ddac22a429a1cd9 · kvdb / rocksdb

15 12月, 2018 2 次提交

Fix flaky test DeleteFileRange (#4784) · 4ed3c1eb

由 Maysam Yabandeh 提交于 12月 14, 2018

Summary:
The test fails sporadically expecting the DB to be empty after DeleteFilesInRange(..., nullptr, nullptr) call which is not. Debugging shows cases where the files are skipped since they are being compacted. The patch fixes the test by waiting for the last CompactRange to finish before calling DeleteFilesInRange.
Verified by
```
~/gtest-parallel/gtest-parallel ./db_compaction_test --gtest_filter=DBCompactionTest.DeleteFileRange --repeat=10000
```
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4784

Differential Revision: D13469402

Pulled By: maysamyabandeh

fbshipit-source-id: 3d8f44abe205b82c69f01e7edf27e1f8098248e1

4ed3c1eb

Synchronize ticker and histogram metrics for Java API (#4733) · d6dfe516

由 Adam Singer 提交于 12月 14, 2018

Summary:
Updating the `HistogramType.java` and `TickerType.java` to expose and correct metrics for statistics callbacks.

Moved `NO_ITERATOR_CREATED` to the proper stat name and deprecated `NO_ITERATORS`.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4733

Differential Revision: D13466936

Pulled By: sagar0

fbshipit-source-id: a58d1edcc07c7b68c3525b1aa05828212c89c6c7

d6dfe516

14 12月, 2018 6 次提交

Refine db_stress params for atomic flush (#4781) · 8d2b74d2

由 Andrew Kryczka 提交于 12月 13, 2018

Summary:
Separate flag for enabling option from flag for enabling dedicated atomic stress test. I have found setting the former without setting the latter can detect different problems.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4781

Differential Revision: D13463211

Pulled By: ajkr

fbshipit-source-id: 054f777885b2dc7d5ea99faafa21d6537eee45fd

8d2b74d2

Fix race condition on options_file_number_ (#4780) · 34954233

由 Maysam Yabandeh 提交于 12月 13, 2018

Summary:
options_file_number_ must be written under db::mutex_ sine its read is protected by mutex_ in ::GetLiveFiles(). However currently it is written in ::RenameTempFileToOptionsFile() which according to its contract must be called without holding db::mutex_. The patch fixes the race condition by also acquitting the mutex_ before writing options_file_number_. Also it does that only if the rename of option file is successful.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4780

Differential Revision: D13461411

Pulled By: maysamyabandeh

fbshipit-source-id: 2d5bae96a1f3e969ef2505b737cf2d7ae749787b

34954233

Improve flushing multiple column families (#4708) · 4fce44fc

由 Yanqin Jin 提交于 12月 13, 2018

Summary:
If one column family is dropped, we should simply skip it and continue to flush
other active ones.
Currently we use Status::ShutdownInProgress to notify caller of column families
being dropped. In the future, we should consider using a different Status code.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4708

Differential Revision: D13378954

Pulled By: riversand963

fbshipit-source-id: 42f248cdf2d32d4c0f677cd39012694b8f1328ca

4fce44fc

Reduce runtime of compact_on_deletion_collector_test (#4779) · 67e5b542

由 Maysam Yabandeh 提交于 12月 13, 2018

Summary:
It sometimes times out with it is run with TSAN. The patch reduces the iteration from 50 to 30. This reduces the normal runtime from 5.2 to 3.1 seconds and should similarly address the TSAN timeout problem.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4779

Differential Revision: D13456862

Pulled By: maysamyabandeh

fbshipit-source-id: fdc0ad7d781b1c33b771d2415ff5fa2f1b5e2537

67e5b542

Get `CompactionJobInfo` from CompactFiles · 2670fe8c

由 DorianZheng 提交于 12月 13, 2018

Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/4716

Differential Revision: D13207677

Pulled By: ajkr

fbshipit-source-id: d0ccf5a66df6cbb07288b0c5ebad81fd9df3926b

2670fe8c

Concurrent task limiter for compaction thread control (#4332) · a8b9891f

由 Burton Li 提交于 12月 13, 2018

Summary:
The PR is targeting to resolve the issue of:
https://github.com/facebook/rocksdb/issues/3972#issue-330771918

We have a rocksdb created with leveled-compaction with multiple column families (CFs), some of CFs are using HDD to store big and less frequently accessed data and others are using SSD.
When there are continuously write traffics going on to all CFs, the compaction thread pool is mostly occupied by those slow HDD compactions, which blocks fully utilize SSD bandwidth.
Since atomic write and transaction is needed across CFs, so splitting it to multiple rocksdb instance is not an option for us.

With the compaction thread control, we got 30%+ HDD write throughput gain, and also a lot smooth SSD write since less write stall happening.

ConcurrentTaskLimiter can be shared with multi-CFs across rocksdb instances, so the feature does not only work for multi-CFs scenarios, but also for multi-rocksdbs scenarios, who need disk IO resource control per tenant.

The usage is straight forward:
e.g.:

//
// Enable compaction thread limiter thru ColumnFamilyOptions
//
std::shared_ptr<ConcurrentTaskLimiter> ctl(NewConcurrentTaskLimiter("foo_limiter", 4));
Options options;
ColumnFamilyOptions cf_opt(options);
cf_opt.compaction_thread_limiter = ctl;
...

//
// Compaction thread limiter can be tuned or disabled on-the-fly
//
ctl->SetMaxOutstandingTask(12); // enlarge to 12 tasks
...
ctl->ResetMaxOutstandingTask(); // disable (bypass) thread limiter
ctl->SetMaxOutstandingTask(-1); // Same as above
...
ctl->SetMaxOutstandingTask(0);  // full throttle (0 task)

//
// Sharing compaction thread limiter among CFs (to resolve multiple storage perf issue)
//
std::shared_ptr<ConcurrentTaskLimiter> ctl_ssd(NewConcurrentTaskLimiter("ssd_limiter", 8));
std::shared_ptr<ConcurrentTaskLimiter> ctl_hdd(NewConcurrentTaskLimiter("hdd_limiter", 4));
Options options;
ColumnFamilyOptions cf_opt_ssd1(options);
ColumnFamilyOptions cf_opt_ssd2(options);
ColumnFamilyOptions cf_opt_hdd1(options);
ColumnFamilyOptions cf_opt_hdd2(options);
ColumnFamilyOptions cf_opt_hdd3(options);

// SSD CFs
cf_opt_ssd1.compaction_thread_limiter = ctl_ssd;
cf_opt_ssd2.compaction_thread_limiter = ctl_ssd;

// HDD CFs
cf_opt_hdd1.compaction_thread_limiter = ctl_hdd;
cf_opt_hdd2.compaction_thread_limiter = ctl_hdd;
cf_opt_hdd3.compaction_thread_limiter = ctl_hdd;

...

//
// The limiter is disabled by default (or set to nullptr explicitly)
//
Options options;
ColumnFamilyOptions cf_opt(options);
cf_opt.compaction_thread_limiter = nullptr;
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4332

Differential Revision: D13226590

Pulled By: siying

fbshipit-source-id: 14307aec55b8bd59c8223d04aa6db3c03d1b0c1d

a8b9891f

13 12月, 2018 1 次提交

Fix flaky test DBCompactionTest::DeleteFileRange (#4776) · 0aa17c10

由 Maysam Yabandeh 提交于 12月 12, 2018

Summary:
The test has been failing sporadically probably because the configured compaction options were actually unused. Verified that by the following:
```
~/gtest-parallel/gtest-parallel ./db_compaction_test --gtest_filter=DBCompactionTest.DeleteFileRange --repeat=1000
```
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4776

Differential Revision: D13441052

Pulled By: maysamyabandeh

fbshipit-source-id: d35075b9e6cef9b9c9d0d571f9cd72ade8eda55d

0aa17c10

12 12月, 2018 7 次提交

Expose column family id to `FlushJobInfo` · 4862720e

由 DorianZheng 提交于 12月 11, 2018

Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/4772

Differential Revision: D13428923

Pulled By: ajkr

fbshipit-source-id: e351e9c5eea97816db25429e129357a8af90712a

4862720e

Direct I/O Close() shouldn't rewrite the last block (#4771) · ae25546a

由 Siying Dong 提交于 12月 11, 2018

Summary:
In Direct I/O case, WritableFileWriter::Close() rewrites the last block again, even if there is nothing new. The reason is that, Close() flushes the buffer. For non-direct I/O case, the buffer is empty in this case so it is a no-op. However, in direct I/O case, the partial data in the last block is kept in the buffer because it needs to be rewritten for the next write. This piece of data is flushed again. This commit fixes it by skipping this write out if `pending_sync_` flag shows that there isn't new data sync last sync.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4771

Differential Revision: D13420426

Pulled By: siying

fbshipit-source-id: 9d39ec9a215b1425d4ed40d85e0eba1f5daa75c6

ae25546a

Fix swallowing of exception in Java RocksDB when loading native library (#4728) · 49666d76

由 Tathagata Das 提交于 12月 11, 2018

Summary:
This PR fixes #4721. When an exception is caught and thrown as a different exception, then the original exception should be  inserted as a cause of the new exception. This bug in RocksDB was swallowing the underlying exception from `NativeLibraryLoader` and throwing the following exception
```
...
Caused by: java.lang.RuntimeException: Unable to load the RocksDB shared libraryjava.nio.channels.ClosedByInterruptException
  at org.rocksdb.RocksDB.loadLibrary(RocksDB.java:67)
  at org.rocksdb.RocksDB.<clinit>(RocksDB.java:35)
  ... 73 more
```

The fix is simple and self-explanatory.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4728

Differential Revision: D13418371

Pulled By: sagar0

fbshipit-source-id: d76c25af2a83a0f8ba62cc8d7b721bfddc85fdf1

49666d76

Prepare FragmentedRangeTombstoneIterator for use in compaction (#4740) · cad248f5

由 Abhishek Madan 提交于 12月 11, 2018

Summary:
To support the flush/compaction use cases of RangeDelAggregator
in v2, FragmentedRangeTombstoneIterator now supports dropping tombstones
that cannot be read in the compaction output file. Furthermore,
FragmentedRangeTombstoneIterator supports the "snapshot striping" use
case by allowing an iterator to be split by a list of snapshots.
RangeDelAggregatorV2 will use these changes in a follow-up change.

In the process of making these changes, other miscellaneous cleanups
were also done in these files.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4740

Differential Revision: D13287382

Pulled By: abhimadan

fbshipit-source-id: f5aeb03e1b3058049b80c02a558ee48f723fa48c

cad248f5

RocksJava must compile on JDK7 (#4768) · d3daa0db

由 Adam Retter 提交于 12月 11, 2018

Summary:
Fixes some RocksJava regressions recently introduced, whereby RocksJava would not build on JDK 7.
These should have been visible on Travis-CI!
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4768

Differential Revision: D13418173

Pulled By: sagar0

fbshipit-source-id: 57bf223188887f84d9e072031af2e0d2c8a69c30

d3daa0db

Change directory where ExternalSSTFileBasicTest runs (#4766) · dde3ef11

由 Sagar Vemuri 提交于 12月 11, 2018

Summary:
Change the directory where ExternalSSTFileBasicTest* tests run.

**Problem:**
Without this change, I spent considerable time chasing around a non-existent issue as ExternalSSTFileTest.* and ExternalSSTFileBasicTest.* create similar directories.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4766

Differential Revision: D13409384

Pulled By: sagar0

fbshipit-source-id: c33e1f4d505dfa6efbc788d6c57cdb680053ded3

dde3ef11

Fix issues with RocksJava dropColumnFamily (#4770) · f8943ec0

由 Adam Retter 提交于 12月 11, 2018

Summary:
Closes https://github.com/facebook/rocksdb/issues/4409
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4770

Differential Revision: D13416802

Pulled By: ajkr

fbshipit-source-id: 8a351e9b80dc9eeb6073467fbc67cd2f544917b0

f8943ec0

11 12月, 2018 4 次提交

Promote CompactionFilter* accessors to ColumnFamilyOptionsInterface (#3461) · 8261e002

由 Ben Clay 提交于 12月 10, 2018

Summary:
When adding CompactionFilter and CompactionFilterFactory settings to the Java layer, ColumnFamilyOptions was modified directly instead of ColumnFamilyOptionsInterface. This meant that the old-stye Options monolith was left behind.

This patch fixes that, by:
- promoting the CompactionFilter + CompactionFilterFactory setters from ColumnFamilyOptions -> ColumnFamilyOptionsInterface
- adding getters in ColumnFamilyOptionsInterface
- implementing setters in Options
- implementing getters in both ColumnFamilyOptions and Options
- adding testcases
- reusing a test CompactionFilterFactory by moving it to a common location
Pull Request resolved: https://github.com/facebook/rocksdb/pull/3461

Differential Revision: D13278788

Pulled By: sagar0

fbshipit-source-id: 72602c6eb97dc80734e718abb5e2e9958d3c753b

8261e002

Properly set smallest key of subcompaction output (#4723) · 64aabc91

由 Abhishek Madan 提交于 12月 10, 2018

Summary:
It is possible to see a situation like the following when
subcompactions are enabled:
1. A subcompaction boundary is set to `[b, e)`.
2. The first output file in a subcompaction has `c@20` as its smallest key
3. The range tombstone `[a, d)30` is encountered.
4. The tombstone is written to the range-del meta block and the new
   smallest key is set to `b@0` (since no keys in this subcompaction's
   output can be smaller than `b`).
5. A key `b@10` in a lower level will now reappear, since it is not
   covered by the truncated start key `b@0`.

In general, unless the smallest data key in a file has a seqnum of 0, it
is not safe to truncate a tombstone at the start key to have a seqnum of
0, since it can expose keys with a seqnum greater than 0 but less than
the tombstone's actual seqnum.

To fix this, when the lower bound of a file is from the subcompaction
boundaries, we now set the seqnum of an artificially extended smallest
key to the tombstone's seqnum. This is safe because subcompactions
operate over disjoint sets of keys, and the subcompactions that can
experience this problem are not the first subcompaction (which is
unbounded on the left).

Furthermore, there is now an assertion to detect the described anomalous
case.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4723

Differential Revision: D13236188

Pulled By: abhimadan

fbshipit-source-id: a6da6a113f2de1e2ff307ca72e055300c8fe5692

64aabc91

Reduce javadoc warnings (#4764) · 10e7de77

由 Adam Singer 提交于 12月 10, 2018

Summary:
Compile logs have a bit of noise due to missing javadoc annotations. Updating docs to reduce.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4764

Differential Revision: D13400193

Pulled By: sagar0

fbshipit-source-id: 65c7efb70747cc3bb35a336a6881ea6536ae5ff4

10e7de77

Fix inline comments for assumed_tracked (#4762) · 21fca397

由 Maysam Yabandeh 提交于 12月 10, 2018

Summary:
Fix the definition of assumed_tracked in Transaction that was introduced in #4680
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4762

Differential Revision: D13399150

Pulled By: maysamyabandeh

fbshipit-source-id: 2a30fe49e3c44adacd7e45cd48eae95023ca9dca

21fca397

08 12月, 2018 5 次提交

Enable checkpoint of read-only db (#4681) · f307479b

由 Yanqin Jin 提交于 12月 07, 2018

Summary:
1. DBImplReadOnly::GetLiveFiles should not return NotSupported. Instead, it
   should call DBImpl::GetLiveFiles(flush_memtable=false).
2. In DBImp::Recover, we should also recover the OPTIONS file name and/or
   number so that an immediate subsequent GetLiveFiles will get the correct
   OPTIONS name.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4681

Differential Revision: D13069205

Pulled By: riversand963

fbshipit-source-id: 3e6a0174307d06db5a01feb099b306cea1f7f88a

f307479b

Add PerfContext counters for index/filter block cache stats (#4540) · 1b01d23b

由 Anand Ananthabhotla 提交于 12月 07, 2018

Summary:
Add counters to track block cache index/filter hits and misses. We currently count aggregate hits and misses, which includes index/filter/data blocks.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4540

Differential Revision: D10459652

Pulled By: anand1976

fbshipit-source-id: 0c59eee7f12f5103dcb6686f0e7995babe63d425

1b01d23b

Updated the CentOS 6 Docker build for RocksJava to a newer GCC toolchain (#4756) · 4048762c

由 Adam Retter 提交于 12月 07, 2018

Summary:
Uses a newer build toolchain but the same old GLIBC when building releases of RocksJava for Linux x64 in the Docker Container.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4756

Differential Revision: D13383575

Pulled By: sagar0

fbshipit-source-id: 27c58814876e434d5fa61395e6664cfc5f6830b1

4048762c

Refactor BlockBasedTable::Open (#4636) · 0463f618

由 Sagar Vemuri 提交于 12月 07, 2018

Summary:
Refactored and simplified `BlockBasedTable::Open` to be similar to `BlockBasedTableBuilder::Finish` as both these functions complement each other. Also added `BlockBasedTableBuilder::WriteFooter` along the way.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4636

Differential Revision: D12933319

Pulled By: sagar0

fbshipit-source-id: 1ff1d02f6d80a63b5ba720a1fc75e71c7344137b

0463f618

fix tombstone collectable test (#4755) · c41c60be

由 Pengchao Wang 提交于 12月 07, 2018

Summary:
the original test does not give enough time difference between tombstone write time and the expire time point, which make test flaky.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4755

Reviewed By: maysamyabandeh

Differential Revision: D13369681

Pulled By: wpc

fbshipit-source-id: 22576f354c63cd0b39d8b35c3913303707503ea9

c41c60be

07 12月, 2018 1 次提交

Extend Transaction::GetForUpdate with do_validate (#4680) · b878f93c

由 Maysam Yabandeh 提交于 12月 06, 2018

Summary:
Transaction::GetForUpdate is extended with a do_validate parameter with default value of true. If false it skips validating the snapshot (if there is any) before doing the read. After the read it also returns the latest value (expects the ReadOptions::snapshot to be nullptr). This allows RocksDB applications to use GetForUpdate similarly to how InnoDB does. Similarly ::Merge, ::Put, ::Delete, and ::SingleDelete are extended with assume_exclusive_tracked with default value of false. It true it indicates that call is assumed to be after a ::GetForUpdate(do_validate=false).
The Java APIs are accordingly updated.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4680

Differential Revision: D13068508

Pulled By: maysamyabandeh

fbshipit-source-id: f0b59db28f7f6a078b60844d902057140765e67d

b878f93c

06 12月, 2018 4 次提交

Update HISTORY.md (#4753) · 1d679e35

由 Yanqin Jin 提交于 12月 05, 2018

Summary:
As titled. Update history to include a recent bug fix in
9be3e6b4.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4753

Differential Revision: D13350286

Pulled By: riversand963

fbshipit-source-id: b6324780dee4cb1757bc2209403a08531c150c08

1d679e35

Allow file-ingest-triggered flush to skip waiting for write-stall clear (#4751) · 9be3e6b4

由 Yanqin Jin 提交于 12月 05, 2018

Summary:
When write stall has already been triggered due to number of L0 files reaching
threshold, file ingestion must proceed with its flush without waiting for the
write stall condition to cleared by the compaction because compaction can wait
for ingestion to finish (circular wait).

In order to avoid this wait, we can set `FlushOptions.allow_write_stall` to be
true (default is false). Setting it to false can cause deadlock.

This can happen when the number of compaction threads is low.

Considere the following
```
Time  compaction_thread                        ingestion_thread
 |                                             num_running_ingest_file_++
 |    while(num_running_ingest_file_>0){wait}
 |                                             flush
 V
```
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4751

Differential Revision: D13343037

Pulled By: riversand963

fbshipit-source-id: d3b95938814af46ec4c463feff0b50c70bd8b23f

9be3e6b4

Move a function to critical section (#4752) · b96fccb1

由 Yanqin Jin 提交于 12月 05, 2018

Summary:
Test plan
```
$make clean && make -j32 all check
```
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4752

Differential Revision: D13344705

Pulled By: riversand963

fbshipit-source-id: fc3a43174d09d70ccc2b09decd78e1da1b6ba9d1

b96fccb1

Fix buck dev mode fbcode builds (#4747) · e58d7695

由 anand76 提交于 12月 05, 2018

Summary:
Don't enable ROCKSDB_JEMALLOC unless the build mode is opt and default
allocator is jemalloc. In dev mode, this is causing compile/link errors such as -
```
stderr: buck-out/dev/gen/rocksdb/src/rocksdb_lib#compile-pic-malloc_stats.cc.o4768b59e,gcc-5-glibc-2.23-clang/db/malloc_stats.cc.o:malloc_stats.cc:function rocksdb::DumpMallocStats(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*): error: undefined reference to 'malloc_stats_print'
clang-7.0: error: linker command failed with exit code 1
```
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4747

Differential Revision: D13324840

Pulled By: anand1976

fbshipit-source-id: 45ffbd4f63fe4d9e8a0473d8f066155e4ef64a14

e58d7695

04 12月, 2018 1 次提交

Revert "BaseDeltaIterator: always check valid() before accessing key(… (#4744) · 2f1ca4e8

由 Zhongyi Xie 提交于 12月 03, 2018

Summary:
…) (#4702)"

This reverts commit 3a18bb3e.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4744

Differential Revision: D13311869

Pulled By: miasantreble

fbshipit-source-id: 6300b12cc34828d8b9274e907a3aef1506d5d553

2f1ca4e8

01 12月, 2018 4 次提交

Update History for fast-forwarded 5.18 branch · 55479eb5

由 Fosco Marotto 提交于 11月 30, 2018

Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/4704

Differential Revision: D13283300

Pulled By: gfosco

fbshipit-source-id: cb4fdaa93137e0bba64b781ba7e8fe31b19e5656

55479eb5

BaseDeltaIterator: always check valid() before accessing key() (#4702) · 3a18bb3e

由 Zhongyi Xie 提交于 11月 30, 2018

Summary:
Current implementation of `current_over_upper_bound_` fails to take into consideration that keys might be invalid in either base iterator or delta iterator. Calling key() in such scenario will lead to assertion failure and runtime errors.
This PR addresses the bug by adding check for valid keys before calling `IsOverUpperBound()`, also added test coverage for iterate_upper_bound usage in BaseDeltaIterator
Also recommit https://github.com/facebook/rocksdb/pull/4656 (It was reverted earlier due to bugs)
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4702

Differential Revision: D13146643

Pulled By: miasantreble

fbshipit-source-id: 6d136929da12d0f2e2a5cea474a8038ec5cdf1d0

3a18bb3e

Make NewBloomFilterPolicy() use full filter by default (#4735) · 6e938c90

由 Siying Dong 提交于 11月 30, 2018

Summary:
Full block (use_block_based_builder=false) Bloom filter has clear CPU saving benefits but with limitation of using temp memory when building an SST file proportional to the SST file size. We reduced the chance of having large SST files with multi-level universal compaction. Now we change to a default with better performance.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4735

Differential Revision: D13266674

Pulled By: siying

fbshipit-source-id: 7594a4c3e32568a5a2adce22bb0e46553e55c602

6e938c90

fix unused param "options" error in jemalloc_nodump_allocator.cc (#4738) · b0f3d9b4

由 Zhongyi Xie 提交于 11月 30, 2018

Summary:
Currently tests are failing on master with the following message:
> util/jemalloc_nodump_allocator.cc:132:8: error: unused parameter ‘options’ [-Werror=unused-parameter]
 Status NewJemallocNodumpAllocator(

This PR attempts to fix the issue
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4738

Differential Revision: D13278804

Pulled By: miasantreble

fbshipit-source-id: 64a6204aa685bd85d8b5080655cafef9980fac2f

b0f3d9b4

30 11月, 2018 5 次提交

WritePrepared: followup fix for snapshot double release issue (#4734) · f1b0841f

由 Maysam Yabandeh 提交于 11月 29, 2018

Summary:
The fix in #4727 for double snapshot release was incomplete since it does not properly remove the duplicate entires in the snapshot list after finding that a snapshot is still valid. The patch does that and also improves the unit test to show the issue.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4734

Differential Revision: D13266260

Pulled By: maysamyabandeh

fbshipit-source-id: 351e2c40cca45a87b757774c11af74182314911e

f1b0841f

JemallocNodumpAllocator: option to limit tcache memory usage (#4736) · cf1df5d3

由 Yi Wu 提交于 11月 29, 2018

Summary:
Add option to limit tcache usage by allocation size. This is to reduce total tcache size in case there are many user threads accessing the allocator and incur non-trivial memory usage.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4736

Differential Revision: D13269305

Pulled By: yiwu-arbug

fbshipit-source-id: 95a9b7fc67facd66837c849137e30e137112e19d

cf1df5d3

Move FIFOCompactionPicker to a separate file (#4724) · 70645355

由 Sagar Vemuri 提交于 11月 29, 2018

Summary:
**Summary:**
Simplified the code layout by moving FIFOCompactionPicker to a separate file.
**Why?:**
While trying to add ttl functionality to universal compaction, I found that `FIFOCompactionPicker` class and its impl methods to be interspersed between `LevelCompactionPicker` methods which kind-of made the code a little hard to traverse. So I moved `FIFOCompactionPicker` to a separate compaction_picker_fifo.h/cc file, similar to `UniversalCompactionPicker`.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4724

Differential Revision: D13227914

Pulled By: sagar0

fbshipit-source-id: 89471766ea67fa4d87664a41c057dd7df4b3d4e3

70645355

Fix a flaky test DBFlushTest.SyncFail (#4633) · 8d7bc76f

由 Yanqin Jin 提交于 11月 29, 2018

Summary:
There is a race condition in DBFlushTest.SyncFail, as illustrated below.
```
time         thread1                             bg_flush_thread
  |     Flush(wait=false, cfd)
  |     refs_before=cfd->current()->TEST_refs()   PickMemtable calls cfd->current()->Ref()
  V
```
The race condition between thread1 getting the ref count of cfd's current
version and bg_flush_thread incrementing the cfd's current version makes it
possible for later assertion on refs_before to fail. Therefore, we add test
sync points to enforce the order and assert on the ref count before and after
PickMemtable is called in bg_flush_thread.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4633

Differential Revision: D12967131

Pulled By: riversand963

fbshipit-source-id: a99d2bacb7869ec5d8d03b24ef2babc0e6ae1a3b

8d7bc76f

db/repair: reset Repair::db_lock_ in ctor (#4683) · 7dbee387

由 Kefu Chai 提交于 11月 29, 2018

Summary:
there is chance that

* the caller tries to repair the db when holding the db_lock, in
  that case the env implementation might not set the `lock`
  parameter of Repairer::Run().
* the caller somehow never calls Repairer::Run().

either way, the desctructor of Repair will compare the uninitialized
db_lock_ with nullptr, and tries to unlock it. there is good chance
that the db_lock_ is not nullptr, then boom.
Signed-off-by: NKefu Chai <tchaikov@gmail.com>
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4683

Differential Revision: D13260287

Pulled By: riversand963

fbshipit-source-id: 878a119d2e9f10a0fa17ee62cf3fb24b33d49fa5

7dbee387

kvdb / rocksdb 11 个月 前同步成功

kvdb / rocksdb
11 个月前同步成功