- 08 11月, 2018 1 次提交
-
-
由 Zhongyi Xie 提交于
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/4637 Differential Revision: D12963727 Pulled By: miasantreble fbshipit-source-id: 76053501afbecc6ef388ddc56542fa0185243e3f
-
- 07 11月, 2018 1 次提交
-
-
由 Siying Dong 提交于
Summary: valgrind tests with 1 thread run too long. To make it shorter, black list some long tests. These are already blacklisted in parallel valgrind tests, but they are not in non-parallel mode Pull Request resolved: https://github.com/facebook/rocksdb/pull/4642 Differential Revision: D12945237 Pulled By: siying fbshipit-source-id: 04cf977d435996480fe87aa09f14b17975b74f7d
-
- 06 11月, 2018 1 次提交
-
-
由 Andrew Kryczka 提交于
Summary: This property can help debug why SST files aren't being deleted. Previously we only had the property "rocksdb.is-file-deletions-enabled". However, even when that returned true, obsolete SSTs may still not be deleted due to the coarse-grained mechanism we use to prevent newly created SSTs from being accidentally deleted. That coarse-grained mechanism uses a lower bound file number for SSTs that should not be deleted, and this property exposes that lower bound. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4618 Differential Revision: D12898179 Pulled By: ajkr fbshipit-source-id: fe68acc041ddbcc9276bbd48976524d95aafc776
-
- 03 11月, 2018 2 次提交
-
-
由 Siying Dong 提交于
Summary: ExternalSSTFileTest.IngestNonExistingFile occasionally fail for number of SST files after manual compaction doesn't go down as expected. Although I don't find a reason how this can happen, adding an extra waiting to make sure obsolete file purging has finished before we check the files doesn't hurt. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4625 Differential Revision: D12910586 Pulled By: siying fbshipit-source-id: 2a5ddec6908c99cf3bcc78431c6f93151c2cab59
-
由 Zhongyi Xie 提交于
Summary: fix current failing lite test: > In file included from ./util/testharness.h:15:0, from ./table/mock_table.h:23, from ./db/db_test_util.h:44, from db/db_flush_test.cc:10: db/db_flush_test.cc: In member function ‘virtual void rocksdb::DBFlushTest_ManualFlushFailsInReadOnlyMode_Test::TestBody()’: db/db_flush_test.cc:250:35: error: ‘Properties’ is not a member of ‘rocksdb::DB’ ASSERT_TRUE(db_->GetIntProperty(DB::Properties::kBackgroundErrors, ^ make: *** [db/db_flush_test.o] Error 1 Pull Request resolved: https://github.com/facebook/rocksdb/pull/4619 Differential Revision: D12898319 Pulled By: miasantreble fbshipit-source-id: 72de603b1f2e972fc8caa88611798c4e98e348c6
-
- 02 11月, 2018 3 次提交
-
-
由 Yanqin Jin 提交于
Summary: The new case is directIO = true, write_global_seqno = false in which we no longer write global_seqno to the external SST file. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4614 Differential Revision: D12885001 Pulled By: riversand963 fbshipit-source-id: 7541bdc608b3a0c93d3c3c435da1b162b36673d4
-
由 Bo Hou 提交于
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/4607 Reviewed By: siying Differential Revision: D12836696 Pulled By: jsjhoubo fbshipit-source-id: 7122ccb712d0b0f1cd998aa4477e0da1401bd870
-
由 Andrew Kryczka 提交于
Summary: The logic to wait for stall conditions to clear before beginning a manual flush didn't take into account whether the DB was in read-only mode. In read-only mode the stall conditions would never clear since no background work is happening, so the wait would be never-ending. It's probably better to return an error to the user. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4615 Differential Revision: D12888008 Pulled By: ajkr fbshipit-source-id: 1c474b42a7ac38d9fd0d0e2340ff1d53e684d83c
-
- 01 11月, 2018 1 次提交
-
-
由 Andrew Kryczka 提交于
Summary: A background compaction with pre-picked files (i.e., either a manual compaction or a bottom-pri compaction) fails when the DB is in read-only mode. In the failure handling, we forgot to unregister the compaction and the files it covered. Then subsequent manual compactions could conflict with this zombie compaction (possibly Halloween related) and wait forever for it to finish. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4611 Differential Revision: D12871217 Pulled By: ajkr fbshipit-source-id: 9d24e921d5bbd2ee8c2c9536a30abfa42a220c6e
-
- 31 10月, 2018 3 次提交
-
-
由 Yanqin Jin 提交于
Summary: Add unit tests to demonstrate that `VersionSet::Recover` is able to detect and handle cases in which the MANIFEST has valid atomic group, incomplete trailing atomic group, atomic group mixed with normal version edits and atomic group with incorrect size. With this capability, RocksDB identifies non-valid groups of version edits and do not apply them, thus guaranteeing that the db is restored to a state consistent with the most recent successful atomic flush before applying WAL. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4433 Differential Revision: D10079202 Pulled By: riversand963 fbshipit-source-id: a0e0b8bf4da1cf68e044d397588c121b66c68876
-
由 Abhishek Madan 提交于
Summary: Since the number of range deletions are reported in TableProperties, it is confusing to not report the number of merge operands and point deletions as top-level properties; they are accessible through the public API, but since they are not the "main" properties, they do not appear in aggregated table properties, or the string representation of table properties. This change promotes those two property keys to `rocksdb/table_properties.h`, adds corresponding uint64 members for them, deprecates the old access methods `GetDeletedKeys()` and `GetMergeOperands()` (though they are still usable for now), and removes `InternalKeyPropertiesCollector`. The property key strings are the same as before this change, so this should be able to read DBs written from older versions (though I haven't tested this yet). Pull Request resolved: https://github.com/facebook/rocksdb/pull/4594 Differential Revision: D12826893 Pulled By: abhimadan fbshipit-source-id: 9e4e4fbdc5b0da161c89582566d184101ba8eb68
-
由 Siying Dong 提交于
Summary: EnableFileDeletions() does info logging inside db mutex. This is not recommended in the code base, since there could be I/O involved. Move this outside the DB mutex. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4604 Differential Revision: D12834432 Pulled By: siying fbshipit-source-id: ffe5c2626fcfdb4c54a661a3c3b0bc95054816cf
-
- 30 10月, 2018 3 次提交
-
-
由 Andrew Kryczka 提交于
Summary: When there's a gap between files, we do not need to output tombstones starting at the next output file's begin key to the current output file. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4592 Differential Revision: D12808627 Pulled By: ajkr fbshipit-source-id: 77c8b2e7523a95b1cd6611194144092c06acb505
-
由 Yanqin Jin 提交于
Summary: Since ErrorHandler::RecoverFromNoSpace is no-op in LITE mode, then we should not have this test in LITE mode. If we do keep it, it will cause the test thread to wait on bg_cv_ that will not be signalled. How to reproduce ``` $make clean && git checkout a27fce40 $OPT="-DROCKSDB_LITE -g" make -j20 $./db_io_failure_test --gtest_filter=DBIOFailureTest.NoSpaceCompactRange ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/4596 Differential Revision: D12818516 Pulled By: riversand963 fbshipit-source-id: bc83524f40fff1e29506979017f7f4c2b70322f3
-
由 Yanqin Jin 提交于
Summary: For flush triggered by RocksDB due to memory usage approaching certain threshold (WriteBufferManager or Memtable full), we should cut the memtable only when the current active memtable is not empty, i.e. contains data. This is what we do for non-atomic flush. If we always cut memtable even when the active memtable is empty, we will generate extra, empty immutable memtable. This is not ideal since it may cause write stall. It also causes some DBAtomicFlushTest to fail because cfd->imm()->NumNotFlushed() is different from expectation. Test plan ``` $make clean && make J=1 -j32 all check $make clean && OPT="-DROCKSDB_LITE -g" make J=1 -j32 all check $make clean && TEST_TMPDIR=/dev/shm/rocksdb OPT=-g make J=1 -j32 valgrind_test ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/4595 Differential Revision: D12818520 Pulled By: riversand963 fbshipit-source-id: d867bdbeacf4199fdd642debb085f94703c41a18
-
- 27 10月, 2018 2 次提交
-
-
由 Yanqin Jin 提交于
Summary: Adds a DB option `atomic_flush` to control whether to enable this feature. This PR is a subset of [PR 3752](https://github.com/facebook/rocksdb/pull/3752). Pull Request resolved: https://github.com/facebook/rocksdb/pull/4023 Differential Revision: D8518381 Pulled By: riversand963 fbshipit-source-id: 1e3bb33e99bb102876a31b378d93b0138ff6634f
-
由 Yi Wu 提交于
Summary: Rename the interface, as it is mean to be a generic interface for memory allocation. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4590 Differential Revision: D10866340 Pulled By: yiwu-arbug fbshipit-source-id: 85cb753351a40cb856c046aeaa3f3b369eef3d16
-
- 26 10月, 2018 1 次提交
-
-
由 Abhishek Madan 提交于
Summary: This allows tombstone fragmenting to only be performed when the table is opened, and cached for subsequent accesses. On the same DB used in #4449, running `readrandom` results in the following: ``` readrandom : 0.983 micros/op 1017076 ops/sec; 78.3 MB/s (63103 of 100000 found) ``` Now that Get performance in the presence of range tombstones is reasonable, I also compared the performance between a DB with range tombstones, "expanded" range tombstones (several point tombstones that cover the same keys the equivalent range tombstone would cover, a common workaround for DeleteRange), and no range tombstones. The created DBs had 5 million keys each, and DeleteRange was called at regular intervals (depending on the total number of range tombstones being written) after 4.5 million Puts. The table below summarizes the results of a `readwhilewriting` benchmark (in order to provide somewhat more realistic results): ``` Tombstones? | avg micros/op | stddev micros/op | avg ops/s | stddev ops/s ----------------- | ------------- | ---------------- | ------------ | ------------ None | 0.6186 | 0.04637 | 1,625,252.90 | 124,679.41 500 Expanded | 0.6019 | 0.03628 | 1,666,670.40 | 101,142.65 500 Unexpanded | 0.6435 | 0.03994 | 1,559,979.40 | 104,090.52 1k Expanded | 0.6034 | 0.04349 | 1,665,128.10 | 125,144.57 1k Unexpanded | 0.6261 | 0.03093 | 1,600,457.50 | 79,024.94 5k Expanded | 0.6163 | 0.05926 | 1,636,668.80 | 154,888.85 5k Unexpanded | 0.6402 | 0.04002 | 1,567,804.70 | 100,965.55 10k Expanded | 0.6036 | 0.05105 | 1,667,237.70 | 142,830.36 10k Unexpanded | 0.6128 | 0.02598 | 1,634,633.40 | 72,161.82 25k Expanded | 0.6198 | 0.04542 | 1,620,980.50 | 116,662.93 25k Unexpanded | 0.5478 | 0.0362 | 1,833,059.10 | 121,233.81 50k Expanded | 0.5104 | 0.04347 | 1,973,107.90 | 184,073.49 50k Unexpanded | 0.4528 | 0.03387 | 2,219,034.50 | 170,984.32 ``` After a large enough quantity of range tombstones are written, range tombstone Gets can become faster than reading from an equivalent DB with several point tombstones. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4493 Differential Revision: D10842844 Pulled By: abhimadan fbshipit-source-id: a7d44534f8120e6aabb65779d26c6b9df954c509
-
- 25 10月, 2018 4 次提交
-
-
由 Zhongyi Xie 提交于
Summary: Currently there are two contrun test failures: * rocksdb-contrun-lite: > tools/db_bench_tool.cc: In function ‘int rocksdb::db_bench_tool(int, char**)’: tools/db_bench_tool.cc:5814:5: error: ‘DumpMallocStats’ is not a member of ‘rocksdb’ rocksdb::DumpMallocStats(&stats_string); ^ make: *** [tools/db_bench_tool.o] Error 1 * rocksdb-contrun-unity: > In file included from unity.cc:44:0: db/range_tombstone_fragmenter.cc: In member function ‘void rocksdb::FragmentedRangeTombstoneIterator::FragmentTombstones(std::unique_ptr<rocksdb::InternalIteratorBase<rocksdb::Slice> >, rocksdb::SequenceNumber)’: db/range_tombstone_fragmenter.cc:90:14: error: reference to ‘ParsedInternalKeyComparator’ is ambiguous auto cmp = ParsedInternalKeyComparator(icmp_); This PR will fix them Pull Request resolved: https://github.com/facebook/rocksdb/pull/4587 Differential Revision: D10846554 Pulled By: miasantreble fbshipit-source-id: 8d3358879e105060197b1379c84aecf51b352b93
-
由 Yanqin Jin 提交于
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/4585 Differential Revision: D10841983 Pulled By: riversand963 fbshipit-source-id: 6a7e0b40065bcfbb10a2cac0cec1e8da0750a617
-
由 Abhishek Madan 提交于
Summary: Previously, range tombstones were accumulated from every level, which was necessary if a range tombstone in a higher level covered a key in a lower level. However, RangeDelAggregator::AddTombstones's complexity is based on the number of tombstones that are currently stored in it, which is wasteful in the Get case, where we only need to know the highest sequence number of range tombstones that cover the key from higher levels, and compute the highest covering sequence number at the current level. This change introduces this optimization, and removes the use of RangeDelAggregator from the Get path. In the benchmark results, the following command was used to initialize the database: ``` ./db_bench -db=/dev/shm/5k-rts -use_existing_db=false -benchmarks=filluniquerandom -write_buffer_size=1048576 -compression_type=lz4 -target_file_size_base=1048576 -max_bytes_for_level_base=4194304 -value_size=112 -key_size=16 -block_size=4096 -level_compaction_dynamic_level_bytes=true -num=5000000 -max_background_jobs=12 -benchmark_write_rate_limit=20971520 -range_tombstone_width=100 -writes_per_range_tombstone=100 -max_num_range_tombstones=50000 -bloom_bits=8 ``` ...and the following command was used to measure read throughput: ``` ./db_bench -db=/dev/shm/5k-rts/ -use_existing_db=true -benchmarks=readrandom -disable_auto_compactions=true -num=5000000 -reads=100000 -threads=32 ``` The filluniquerandom command was only run once, and the resulting database was used to measure read performance before and after the PR. Both binaries were compiled with `DEBUG_LEVEL=0`. Readrandom results before PR: ``` readrandom : 4.544 micros/op 220090 ops/sec; 16.9 MB/s (63103 of 100000 found) ``` Readrandom results after PR: ``` readrandom : 11.147 micros/op 89707 ops/sec; 6.9 MB/s (63103 of 100000 found) ``` So it's actually slower right now, but this PR paves the way for future optimizations (see #4493). ---- Pull Request resolved: https://github.com/facebook/rocksdb/pull/4449 Differential Revision: D10370575 Pulled By: abhimadan fbshipit-source-id: 9a2e152be1ef36969055c0e9eb4beb0d96c11f4d
-
由 Zhongyi Xie 提交于
Summary: PR https://github.com/facebook/rocksdb/pull/4226 introduced per-level perf context which allows breaking down perf context by levels. This PR takes advantage of the feature to populate a few counters related to bloom filters Pull Request resolved: https://github.com/facebook/rocksdb/pull/4581 Differential Revision: D10518010 Pulled By: miasantreble fbshipit-source-id: 011244561783ec860d32d5b0fa6bce6e78d70ef8
-
- 24 10月, 2018 2 次提交
-
-
由 Neil Mayhew 提交于
Summary: This fixes three tests that fail with relatively recent tools and libraries: The tests are: * `spatial_db_test` * `table_test` * `db_universal_compaction_test` I'm using: * `gcc` 7.3.0 * `glibc` 2.27 * `snappy` 1.1.7 * `gflags` 2.2.1 * `zlib` 1.2.11 * `bzip2` 1.0.6.0.1 * `lz4` 1.8.2 * `jemalloc` 5.0.1 The versions used in the Travis environment (which is two Ubuntu LTS versions behind the current one and doesn't use `lz4` or `jemalloc`) don't seem to have a problem. However, to be safe, I verified that these tests pass with and without my changes in a trusty Docker container without `lz4` and `jemalloc`. However, I do get an unrelated set of other failures when using a trusty Docker container that uses `lz4` and `jemalloc`: ``` db/db_universal_compaction_test.cc:506: Failure Value of: num + 1 Actual: 3 Expected: NumSortedRuns(1) Which is: 4 [ FAILED ] UniversalCompactionNumLevels/DBTestUniversalCompaction.DynamicUniversalCompactionReadAmplification/0, where GetParam() = (1, false) (1189 ms) [ RUN ] UniversalCompactionNumLevels/DBTestUniversalCompaction.DynamicUniversalCompactionReadAmplification/1 db/db_universal_compaction_test.cc:506: Failure Value of: num + 1 Actual: 3 Expected: NumSortedRuns(1) Which is: 4 [ FAILED ] UniversalCompactionNumLevels/DBTestUniversalCompaction.DynamicUniversalCompactionReadAmplification/1, where GetParam() = (1, true) (1246 ms) [ RUN ] UniversalCompactionNumLevels/DBTestUniversalCompaction.DynamicUniversalCompactionReadAmplification/2 db/db_universal_compaction_test.cc:506: Failure Value of: num + 1 Actual: 3 Expected: NumSortedRuns(1) Which is: 4 [ FAILED ] UniversalCompactionNumLevels/DBTestUniversalCompaction.DynamicUniversalCompactionReadAmplification/2, where GetParam() = (3, false) (1237 ms) [ RUN ] UniversalCompactionNumLevels/DBTestUniversalCompaction.DynamicUniversalCompactionReadAmplification/3 db/db_universal_compaction_test.cc:506: Failure Value of: num + 1 Actual: 3 Expected: NumSortedRuns(1) Which is: 4 [ FAILED ] UniversalCompactionNumLevels/DBTestUniversalCompaction.DynamicUniversalCompactionReadAmplification/3, where GetParam() = (3, true) (1195 ms) [ RUN ] UniversalCompactionNumLevels/DBTestUniversalCompaction.DynamicUniversalCompactionReadAmplification/4 db/db_universal_compaction_test.cc:506: Failure Value of: num + 1 Actual: 3 Expected: NumSortedRuns(1) Which is: 4 [ FAILED ] UniversalCompactionNumLevels/DBTestUniversalCompaction.DynamicUniversalCompactionReadAmplification/4, where GetParam() = (5, false) (1161 ms) [ RUN ] UniversalCompactionNumLevels/DBTestUniversalCompaction.DynamicUniversalCompactionReadAmplification/5 db/db_universal_compaction_test.cc:506: Failure Value of: num + 1 Actual: 3 Expected: NumSortedRuns(1) Which is: 4 [ FAILED ] UniversalCompactionNumLevels/DBTestUniversalCompaction.DynamicUniversalCompactionReadAmplification/5, where GetParam() = (5, true) (1229 ms) ``` I haven't attempted to fix these since I'm not using trusty and Travis doesn't use `lz4` and `jemalloc`. However, the final commit in this PR does at least fix the compilation errors that occur when using trusty's version of `lz4`. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4562 Differential Revision: D10510917 Pulled By: maysamyabandeh fbshipit-source-id: 59534042015ec339270e5fc2f6ac4d859370d189
-
由 Zhongyi Xie 提交于
Summary: clang analyzer currently fails with the following warnings: > db/log_reader.cc:323:9: warning: Undefined or garbage value returned to caller return r; ^~~~~~~~ db/log_reader.cc:344:11: warning: Undefined or garbage value returned to caller return r; ^~~~~~~~ db/log_reader.cc:369:11: warning: Undefined or garbage value returned to caller return r; Pull Request resolved: https://github.com/facebook/rocksdb/pull/4583 Differential Revision: D10523517 Pulled By: miasantreble fbshipit-source-id: 0cc8b8f27657b202bead148bbe7c4aa84fed095b
-
- 23 10月, 2018 2 次提交
-
-
由 Maysam Yabandeh 提交于
Summary: There was a bug that the user comparator would receive the internal key instead of the user key. The bug was due to RangeMightExistAfterSortedRun expecting user key but receiving internal key when called in GenerateBottommostFiles. The patch augment an existing unit test to reproduce the bug and fixes it. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4575 Differential Revision: D10500434 Pulled By: maysamyabandeh fbshipit-source-id: 858346d2fd102cce9e20516d77338c112bdfe366
-
由 Siying Dong 提交于
Summary: Level compaction usually performs poorly when the writes so heavy that the level targets can't be guaranteed. With this improvement, we improve level_compaction_dynamic_level_bytes = true so that in the write heavy cases, the level multiplier can be slightly adjusted based on the size of L0. We keep the behavior the same if number of L0 files is under 2X compaction trigger and the total size is less than options.max_bytes_for_level_base, so that unless write is so heavy that compaction cannot keep up, the behavior doesn't change. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4338 Differential Revision: D9636782 Pulled By: siying fbshipit-source-id: e27fc17a7c29c84b00064cc17536a01dacef7595
-
- 22 10月, 2018 1 次提交
-
-
由 Yi Wu 提交于
Summary: When `MockTimeEnv` is used in test to mock time methods, we cannot use `CondVar::TimedWait` because it is using real time, not the mocked time for wait timeout. On Mac the method can return immediately without awaking other waiting threads, if the real time is larger than `wait_until` (which is a mocked time). When that happen, the `wait()` method will fall into an infinite loop. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4560 Differential Revision: D10472851 Pulled By: yiwu-arbug fbshipit-source-id: 898902546ace7db7ac509337dd8677a527209d19
-
- 20 10月, 2018 1 次提交
-
-
由 Yanqin Jin 提交于
Summary: Current `log::Reader` does not perform retry after encountering `EOF`. In the future, we need the log reader to be able to retry tailing the log even after `EOF`. Current implementation is simple. It does not provide more advanced retry policies. Will address this in the future. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4394 Differential Revision: D9926508 Pulled By: riversand963 fbshipit-source-id: d86d145792a41bd64a72f642a2a08c7b7b5201e1
-
- 19 10月, 2018 1 次提交
-
-
由 Maysam Yabandeh 提交于
Summary: We have already disabled it on Travis since it has been too flaky. The same problem arises in Appveyor as well. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4536 Differential Revision: D10452240 Pulled By: maysamyabandeh fbshipit-source-id: 728f4ecddf780097159dc0a0737d460eb5ce4f09
-
- 18 10月, 2018 2 次提交
-
-
由 Abhishek Madan 提交于
Summary: When there are no range deletions, flush and compaction perform a binary search on an effectively empty map every time they call ShouldDelete. This PR lazily initializes each stripe map entry so that the binary search can be elided in these cases. After this PR, the total amount of time spent in compactions is 52.541331s, and the total amount of time spent in flush is 5.532608s, the former of which is a significant improvement from the results after #4495. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4497 Differential Revision: D10428610 Pulled By: abhimadan fbshipit-source-id: 6f7e1ce3698fac3ef86d1197955e6b72e0931a0f
-
由 Zhongyi Xie 提交于
Summary: Current implementation of perf context is level agnostic. Making it hard to do performance evaluation for the LSM tree. This PR adds `PerfContextByLevel` to decompose the counters by level. This will be helpful when analyzing point and range query performance as well as tuning bloom filter Also replaced __thread with thread_local keyword for perf_context Pull Request resolved: https://github.com/facebook/rocksdb/pull/4226 Differential Revision: D10369509 Pulled By: miasantreble fbshipit-source-id: f1ced4e0de5fcebdb7f9cff36164516bc6382d82
-
- 16 10月, 2018 3 次提交
-
-
由 anand1976 提交于
Summary: When a CompactRange() call for a level is truncated before the end key is reached, because it exceeds max_compaction_bytes, we need to properly set the compaction_end parameter to indicate the stop key. The next CompactRange will use that as the begin key. We set it to the smallest key of the next file in the level after expanding inputs to get a clean cut. Previously, we were setting it before expanding inputs. So we could end up recompacting some files. In a pathological case, where a single key has many entries spanning all the files in the level (possibly due to merge operands without a partial merge operator, thus resulting in compaction output identical to the input), this would result in an endless loop over the same set of files. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4496 Differential Revision: D10395026 Pulled By: anand1976 fbshipit-source-id: f0c2f89fee29b4b3be53b6467b53abba8e9146a9
-
由 Yanqin Jin 提交于
Summary: Leverage existing `FlushJob` to implement atomic flush of multiple column families. This PR depends on other PRs and is a subset of #3752 . This PR itself is not sufficient in fulfilling atomic flush. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4262 Differential Revision: D9283109 Pulled By: riversand963 fbshipit-source-id: 65401f913e4160b0a61c0be6cd02adc15dad28ed
-
由 Andrew Kryczka 提交于
Summary: `CompactionIterator::snapshots_` is ordered by ascending seqnum, just like `DBImpl`'s linked list of snapshots from which it was copied. This PR exploits this ordering to make `findEarliestVisibleSnapshot` do binary search rather than linear scan. This can make flush/compaction significantly faster when many snapshots exist since that function is called on every single key. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4495 Differential Revision: D10386470 Pulled By: ajkr fbshipit-source-id: 29734991631227b6b7b677e156ac567690118a8b
-
- 13 10月, 2018 3 次提交
-
-
由 Yanqin Jin 提交于
Summary: We would like to collect file-system-level statistics including file name, offset, length, return code, latency, etc., which requires to add callbacks to intercept file IO function calls when RocksDB is running. To collect file-system-level statistics, users can inherit the class `EventListener`, as in `TestFileOperationListener `. Note that `TestFileOperationListener::ShouldBeNotifiedOnFileIO()` returns true. Pull Request resolved: https://github.com/facebook/rocksdb/pull/3933 Differential Revision: D10219571 Pulled By: riversand963 fbshipit-source-id: 7acc577a2d31097766a27adb6f78eaf8b1e8ff15
-
由 Yi Wu 提交于
Summary: The "je_" prefix of jemalloc APIs presents only when the macro `JEMALLOC_NO_RENAME` from jemalloc.h presents. With the patch I'm also adding -DROCKSDB_JEMALLOC flag in buck TARGETS. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4488 Differential Revision: D10355971 Pulled By: yiwu-arbug fbshipit-source-id: 03a2d69790a44ac89219c7525763fa937a63d95a
-
由 Chinmay Kamat 提交于
Summary: This commit adds code to acquire lock on the DB LOCK file before starting the repair process. This will prevent multiple processes from performing repair on the same DB simultaneously. Fixes repair_test to work with this change. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4435 Differential Revision: D10361499 Pulled By: riversand963 fbshipit-source-id: 3c512c48b7193d383b2279ccecabdb660ac1cf22
-
- 12 10月, 2018 1 次提交
-
-
由 Abhishek Madan 提交于
Summary: Using `./range_del_aggregator_bench --use_collapsed=false --num_range_tombstones=5000 --num_runs=1000`, here are the results before and after this change: Before: ``` ========================= Results: ========================= AddTombstones: 1822.61 us ShouldDelete (first): 94.5286 us ``` After: ``` ========================= Results: ========================= AddTombstones: 199.26 us ShouldDelete (first): 38.9344 us ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/4487 Differential Revision: D10347288 Pulled By: abhimadan fbshipit-source-id: d44efe3a166d583acfdc3ec1199e0892f34dbfb7
-
- 11 10月, 2018 2 次提交
-
-
由 UncP 提交于
Summary: in `WriteThread::LaunchParallelMemTableWriters`, there is ` write_group->running.store(write_group->size); ` https://github.com/facebook/rocksdb/blob/master/db/write_thread.cc#L510 Pull Request resolved: https://github.com/facebook/rocksdb/pull/4450 Differential Revision: D10201900 Pulled By: yiwu-arbug fbshipit-source-id: 96c8fbbba5aff7ba8a6ceb3117a2bd7cc9b2f34b
-
由 Simon Grätzer 提交于
Summary: Wrong I overwrite `WriteBatch::Handler::Continue` to return _false_ at some point, I always get the `Status::Corruption` error. I don't think this check is used correctly here: The counter in `found` cannot reflect all entries in the WriteBatch when we exit the loop early. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4478 Differential Revision: D10317416 Pulled By: yiwu-arbug fbshipit-source-id: cccae3382805035f9b3239b66682b5fcbba6bb61
-