- 06 Apr, 2017 1 commit
-
-
Committed by Siying Dong
Summary: Move some files under util/ to new directories env/, monitoring/, options/ and cache/ Closes https://github.com/facebook/rocksdb/pull/2090 Differential Revision: D4833681 Pulled By: siying fbshipit-source-id: 2fd8bef
-
- 05 Apr, 2017 1 commit
-
-
Committed by Andrew Kryczka
Summary: Level-based L0->L0 compaction operates on spans of files that aren't currently being compacted. It reduces the number of L0 files, thus making write stall conditions harder to reach. - L0->L0 is triggered when base level is unavailable due to pending compactions - L0->L0 always outputs one file of at most `max_level0_burst_file_size` bytes. - Subcompactions are disabled for L0->L0 since we want to output one file. - Input files are chosen as the longest span of available files that will fit within the size limit. This minimizes number of files in L0. Closes https://github.com/facebook/rocksdb/pull/2027 Differential Revision: D4760318 Pulled By: ajkr fbshipit-source-id: 9d07183
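The input selection described above (the longest span of files that are not being compacted and that fits within the size limit) can be sketched with a sliding window. This is a minimal illustration, not the RocksDB implementation: `L0File` and `PickLongestSpan` are hypothetical names, and the order in which files are scanned is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for per-file metadata; not RocksDB's FileMetaData.
struct L0File {
  uint64_t size_bytes;
  bool being_compacted;
};

// Returns the half-open index range [begin, end) of the longest run of
// consecutive files that are not being compacted and whose total size is at
// most max_bytes. Returns {0, 0} when no file qualifies.
std::pair<std::size_t, std::size_t> PickLongestSpan(
    const std::vector<L0File>& files, uint64_t max_bytes) {
  std::size_t best_begin = 0, best_end = 0;
  std::size_t begin = 0;
  uint64_t total = 0;
  for (std::size_t end = 0; end < files.size(); ++end) {
    if (files[end].being_compacted) {
      // A busy file breaks the span; restart right after it.
      begin = end + 1;
      total = 0;
      continue;
    }
    total += files[end].size_bytes;
    // Shrink from the left until the window fits the size limit.
    while (total > max_bytes) {
      assert(begin <= end);
      total -= files[begin].size_bytes;
      ++begin;
    }
    if (end + 1 - begin > best_end - best_begin) {
      best_begin = begin;
      best_end = end + 1;
    }
  }
  return {best_begin, best_end};
}
```

Because a longer span removes more files from L0 in one pass, maximizing the span length is what keeps the L0 file count low under the single-output-file constraint.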
-
- 04 Apr, 2017 1 commit
-
-
Committed by Siying Dong
Summary: It is confusing for auto_roll_logger to stay under db/, since it has nothing to do with the database. Move filename along with it, as it is a dependency. Closes https://github.com/facebook/rocksdb/pull/2080 Differential Revision: D4821141 Pulled By: siying fbshipit-source-id: ca7d768
-
- 23 Mar, 2017 1 commit
-
-
Committed by Daniel Black
Summary: Errors were: db/version_set.cc:1535:20: error: lambda capture 'this' is not used [-Werror,-Wunused-lambda-capture] [this](const Fsize& f1, const Fsize& f2) -> bool { ^ db/version_set.cc:1541:20: error: lambda capture 'this' is not used [-Werror,-Wunused-lambda-capture] [this](const Fsize& f1, const Fsize& f2) -> bool { ^ db/db_test.cc:2983:27: error: lambda capture 'kNumPutsBeforeWaitForFlush' is not required to be captured for this use [-Werror,-Wunused-lambda-capture] auto gen_l0_kb = [this, kNumPutsBeforeWaitForFlush](int size) { ^ Closes https://github.com/facebook/rocksdb/pull/1972 Differential Revision: D4685991 Pulled By: siying fbshipit-source-id: 9125379
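The kind of change that silences this warning can be illustrated as follows. This is a sketch only: the `Fsize` struct here is a simplified, hypothetical stand-in, and the comparator body is an assumption rather than the code in db/version_set.cc.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical simplified stand-in for the Fsize type named in the errors.
struct Fsize {
  std::size_t index;
  uint64_t file_size;
};

// Sorts files from largest to smallest. The comparator reads only its two
// arguments, so the capture list is left empty. Writing [this] here, or
// capturing a local that the body never needs, is what
// -Wunused-lambda-capture reports.
void SortBySizeDescending(std::vector<Fsize>* files) {
  assert(files != nullptr);
  std::sort(files->begin(), files->end(),
            [](const Fsize& f1, const Fsize& f2) -> bool {
              return f1.file_size > f2.file_size;
            });
}
```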
-
- 16 Mar, 2017 1 commit
-
-
Committed by Islam AbdelRahman
Summary: current logging
```
2017/03/14-14:20:30.393432 7fedde9f5700 (Original Log Time 2017/03/14-14:20:30.393414) [default] Level summary: base level 1 max bytes base 268435456 files[1 0 0 0 0 0 0] max score 0.25
2017/03/14-14:20:30.393438 7fedde9f5700 [JOB 2] Try to delete WAL files size 61417909, prev total WAL file size 73820858, number of live WAL files 2.
2017/03/14-14:20:30.393464 7fedde9f5700 [DEBUG] [JOB 2] Delete /dev/shm/old_logging//MANIFEST-000001 type=3 #1 -- OK
2017/03/14-14:20:30.393472 7fedde9f5700 [DEBUG] [JOB 2] Delete /dev/shm/old_logging//000003.log type=0 #3 -- OK
2017/03/14-14:20:31.427103 7fedd49f1700 [default] New memtable created with log file: #9. Immutable memtables: 0.
2017/03/14-14:20:31.427179 7fedde9f5700 [JOB 3] Syncing log #6
2017/03/14-14:20:31.427190 7fedde9f5700 (Original Log Time 2017/03/14-14:20:31.427170) Calling FlushMemTableToOutputFile with column family [default], flush slots available 1, compaction slots allowed 1, compaction slots scheduled 1
2017/03/14-14:20:31.
```
Closes https://github.com/facebook/rocksdb/pull/1990 Differential Revision: D4708695 Pulled By: IslamAbdelRahman fbshipit-source-id: cb8968f
-
- 14 Mar, 2017 1 commit
-
-
Committed by Maysam Yabandeh
Summary: PinnableSlice Summary: Currently the point lookup values are copied to a string provided by the user. This incurs an extra memcpy cost. This patch allows doing point lookup via a PinnableSlice which pins the source memory location (instead of copying their content) and releases them after the content is consumed by the user. The old API of Get(string) is translated to the new API underneath. Here is the summary for improvements: value 100 byte: 1.8% regular, 1.2% merge values value 1k byte: 11.5% regular, 7.5% merge values value 10k byte: 26% regular, 29.9% merge values The improvement for merge could be more if we extend this approach to pin the merge output and delay the full merge operation until the user actually needs it. We have put that for future work. PS: Sometimes we observe a small decrease in performance when switching from t5452014 to this patch but with the old Get(string) API. The d Closes https://github.com/facebook/rocksdb/pull/1756 Differential Revision: D4391738 Pulled By: maysamyabandeh fbshipit-source-id: 6f3edd3
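The difference between copying a value out and pinning its source memory can be modelled in a few lines. This is a toy model under stated assumptions, not the PinnableSlice class itself: `PinnedValue`, `GetCopy`, and `GetPinned` are hypothetical names, a `std::shared_ptr` stands in for whatever keeps the source buffer alive, and C++17 is assumed for `std::string_view`.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical model of a pinned value: it shares ownership of the buffer
// holding the bytes and exposes a view into it, so no bytes are copied.
class PinnedValue {
 public:
  PinnedValue(std::shared_ptr<const std::string> block, std::size_t offset,
              std::size_t length)
      : block_(std::move(block)) {
    assert(block_ != nullptr && offset + length <= block_->size());
    view_ = std::string_view(block_->data() + offset, length);
  }

  std::string_view data() const { return view_; }

  // Releases the pin once the caller has consumed the content.
  void Reset() {
    view_ = std::string_view();
    block_.reset();
  }

 private:
  std::shared_ptr<const std::string> block_;
  std::string_view view_;
};

// Copying lookup, modelling the older Get(string) style: one extra memcpy.
std::string GetCopy(const std::shared_ptr<const std::string>& block,
                    std::size_t offset, std::size_t length) {
  return block->substr(offset, length);
}

// Pinning lookup: the result refers to the bytes inside the source buffer.
PinnedValue GetPinned(std::shared_ptr<const std::string> block,
                      std::size_t offset, std::size_t length) {
  return PinnedValue(std::move(block), offset, length);
}
```

The saving grows with value size because the copy cost is proportional to the number of bytes, which is consistent with the larger gains reported above for 10k-byte values.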
-
- 24 Feb, 2017 1 commit
-
-
Committed by Siying Dong
Summary: …action The two options, min_partial_merge_operands and verify_checksums_in_compaction, are seldom used. Remove them to reduce the total number of options. Also remove them from the Java and C interfaces. Closes https://github.com/facebook/rocksdb/pull/1902 Differential Revision: D4601219 Pulled By: siying fbshipit-source-id: aad4cb2
-
- 22 Feb, 2017 1 commit
-
-
Committed by Aaron Gao
Summary: Reimplement the compaction expansion on lower level. Consider such a case: input level file: 1[B E] 2[F G] 3[H I] 4 [J M] output level file: 5[A C] 6[D K] 7[L O] If we initially pick file 2, now we will compact file 2 and 6. But we can safely compact 2, 3 and 6 without expanding the output level. The previous code is messy and wrong. In this diff, I first determine the input range [a, b], and output range [c, d], then we get the range [e, f] = [min(a, c), max(b, d)] and put all eligible clean-cut files within [e, f] into this compaction. **Note: clean-cut means the files don't have the same user key on the boundaries of some files that are not chosen in this compaction**. Closes https://github.com/facebook/rocksdb/pull/1760 Differential Revision: D4395564 Pulled By: lightmark fbshipit-source-id: 2dc2c5c
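The example above can be worked through with a small sketch. It is a simplification under stated assumptions: keys are single characters, `FileRange`, `PickedFiles`, and `ExpandInputs` are hypothetical names, "eligible" is taken to mean "lies entirely inside [e, f]", and the clean-cut check on shared boundary user keys is not modelled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical file descriptor: file number plus smallest and largest key.
struct FileRange {
  int number;
  char smallest;
  char largest;
};

struct PickedFiles {
  std::vector<int> inputs;   // file numbers taken from the input level
  std::vector<int> outputs;  // file numbers taken from the output level
};

// Starts from input_level[picked] with range [a, b], collects the output
// files overlapping it, forms [e, f] as the hull of both ranges, and then
// adds every input-level file that lies entirely inside [e, f].
PickedFiles ExpandInputs(const std::vector<FileRange>& input_level,
                         const std::vector<FileRange>& output_level,
                         std::size_t picked) {
  assert(picked < input_level.size());
  const char a = input_level[picked].smallest;
  const char b = input_level[picked].largest;
  PickedFiles result;
  char e = a;
  char f = b;
  for (const FileRange& out : output_level) {
    if (out.largest >= a && out.smallest <= b) {
      result.outputs.push_back(out.number);
      e = std::min(e, out.smallest);
      f = std::max(f, out.largest);
    }
  }
  for (const FileRange& in : input_level) {
    if (in.smallest >= e && in.largest <= f) {
      result.inputs.push_back(in.number);
    }
  }
  return result;
}
```

With the files from the summary and file 2 picked, [a, b] = [F, G], the only overlapping output file is 6, so [e, f] = [D, K]. Files 2 and 3 fall inside that range, while files 1 and 4 stick out of it and would pull in output files 5 or 7.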
-
- 17 Feb, 2017 1 commit
-
-
Committed by Aaron Gao
Summary: Fix lite bugs: disable direct I/O in lite mode Closes https://github.com/facebook/rocksdb/pull/1870 Differential Revision: D4559866 Pulled By: yiwu-arbug fbshipit-source-id: 3761c51
-
- 14 Feb, 2017 1 commit
-
-
Committed by Sagar Vemuri
Summary: Remove disableDataSync, and other similarly named disable_data_sync options. This is being done to simplify options, and also because the performance gains of this feature can be achieved by other methods. Closes https://github.com/facebook/rocksdb/pull/1859 Differential Revision: D4541292 Pulled By: sagar0 fbshipit-source-id: 5b3a6ca
-
- 03 Feb, 2017 1 commit
-
-
Committed by Islam AbdelRahman
Summary: merger.h was always a confusing name for me; simply give the file a better name Closes https://github.com/facebook/rocksdb/pull/1836 Differential Revision: D4505357 Pulled By: IslamAbdelRahman fbshipit-source-id: 07b28d8
-
- 09 Jan, 2017 2 commits
-
-
Committed by Maysam Yabandeh
Summary: This reverts commit 54d94e9c. The pull request was landed by mistake. Closes https://github.com/facebook/rocksdb/pull/1755 Differential Revision: D4391678 Pulled By: maysamyabandeh fbshipit-source-id: 36d5149
-
Committed by Maysam Yabandeh
Summary: Currently the point lookup values are copied to a string provided by the user. This incurs an extra memcpy cost. This patch allows doing point lookup via a PinnableSlice which pins the source memory location (instead of copying their content) and releases them after the content is consumed by the user. The old API of Get(string) is translated to the new API underneath. Here is the summary for improvements: 1. value 100 byte: 1.8% regular, 1.2% merge values 2. value 1k byte: 11.5% regular, 7.5% merge values 3. value 10k byte: 26% regular, 29.9% merge values The improvement for merge could be more if we extend this approach to pin the merge output and delay the full merge operation until the user actually needs it. We have put that for future work. PS: Sometimes we observe a small decrease in performance when switching from t5452014 to this patch but with the old Get(string) API. The difference is small and could be noise. More importantly it is safely cancelled Closes https://github.com/facebook/rocksdb/pull/1732 Differential Revision: D4374613 Pulled By: maysamyabandeh fbshipit-source-id: a077f1a
-
- 16 Nov, 2016 1 commit
-
-
Committed by Andrew Kryczka
Summary: Previously we used TableCache::NewIterator() for multiple purposes (data block iterator and range deletion iterator), and returned non-ok status in the data block iterator. In one case where the caller only used the range deletion block iterator (https://github.com/facebook/rocksdb/blob/9e7cf3469bc626b092ec48366d12873ecab22b4e/db/version_set.cc#L965-L973), we didn't check/free the data block iterator containing non-ok status, which caused a valgrind error. So, this diff decouples creation of data block and range deletion block iterators, and updates the callers accordingly. Both functions can return non-ok status in an InternalIterator. Since the non-ok status is returned in an iterator that the callers will definitely use, it should be more usable/less error-prone. Closes https://github.com/facebook/rocksdb/pull/1513 Differential Revision: D4181423 Pulled By: ajkr fbshipit-source-id: 835b8f5
-
- 05 Nov, 2016 1 commit
-
-
Committed by Andrew Kryczka
Summary: Note: reviewed in https://reviews.facebook.net/D65115 - DBIter maintains a range tombstone accumulator. We don't clean up obsolete tombstones yet, so if the user seeks back and forth, the same tombstones would be added to the accumulator multiple times. - DBImpl::NewInternalIterator() (used to make DBIter's underlying iterator) adds memtable/L0 range tombstones, L1+ range tombstones are added on-demand during NewSecondaryIterator() (see D62205) - DBIter uses ShouldDelete() when advancing to check whether keys are covered by range tombstones Closes https://github.com/facebook/rocksdb/pull/1464 Differential Revision: D4131753 Pulled By: ajkr fbshipit-source-id: be86559
-
- 04 Nov, 2016 1 commit
-
-
Committed by Andrew Kryczka
Summary: During Get()/MultiGet(), build up a RangeDelAggregator with range tombstones as we search through live memtable, immutable memtables, and SST files. This aggregator is then used by memtable.cc's SaveValue() and GetContext::SaveValue() to check whether keys are covered. added tests for Get on memtables/files; end-to-end tests mainly in https://reviews.facebook.net/D64761 Closes https://github.com/facebook/rocksdb/pull/1456 Differential Revision: D4111271 Pulled By: ajkr fbshipit-source-id: 6e388d4
-
- 02 Nov, 2016 1 commit
-
-
Committed by Benoit Girard
Summary: Closes https://github.com/facebook/rocksdb/pull/1427 Differential Revision: D4094732 Pulled By: yiwu-arbug fbshipit-source-id: b9b79e9
-
- 21 Oct, 2016 1 commit
-
-
Committed by Islam AbdelRahman
Summary: Changes in the diff API changes: - Introduce IngestExternalFile to replace AddFile (I think this makes the API clearer) - Introduce IngestExternalFileOptions (This struct will encapsulate the options for ingesting the external file) - Deprecate AddFile() API Logic changes: - If our file overlaps with the memtable we will flush the memtable - We will find the first level in the LSM tree whose keys our file's key range overlaps with - We will find the lowest level in the LSM tree above the level we found in step 2 that our file can fit in and ingest our file in it - We will assign a global sequence number to our new file - Remove AddFile restrictions by using global sequence numbers Other changes: - Refactor all AddFile logic to be encapsulated in ExternalSstFileIngestionJob Test Plan: unit tests (still need to add more) addfile_stress (https://reviews.facebook.net/D65037) Reviewers: yiwu, andrewkr, lightmark, yhchiang, sdong Reviewed By: sdong Subscribers: jkedgar, hcz, andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D65061
-
- 19 Oct, 2016 1 commit
-
-
Committed by Andrew Kryczka
Summary: This diff introduces RangeDelAggregator, which takes ownership of iterators provided to it via AddTombstones(). The tombstones are organized in a two-level map (snapshot stripe -> begin key -> tombstone). Tombstone creation avoids data copy by holding Slices returned by the iterator, which remain valid thanks to pinning. For compaction, we create a hierarchical range tombstone iterator with structure matching the iterator over compaction input data. An aggregator based on that iterator is used by CompactionIterator to determine which keys are covered by range tombstones. In case of merge operand, the same aggregator is used by MergeHelper. Upon finishing each file in the compaction, relevant range tombstones are added to the output file's range tombstone metablock and file boundaries are updated accordingly. To check whether a key is covered by range tombstone, RangeDelAggregator::ShouldDelete() considers tombstones in the key's snapshot stripe. When this function is used outside of compaction, it also checks newer stripes, which can contain covering tombstones. Currently the intra-stripe check involves a linear scan; however, in the future we plan to collapse ranges within a stripe such that binary search can be used. RangeDelAggregator::AddToBuilder() adds all range tombstones in the table's key-range to a new table's range tombstone meta-block. Since range tombstones may fall in the gap between files, we may need to extend some files' key-ranges. The strategy is (1) first file extends as far left as possible and other files do not extend left, (2) all files extend right until either the start of the next file or the end of the last range tombstone in the gap, whichever comes first. One other notable change is adding release/move semantics to ScopedArenaIterator such that it can be used to transfer ownership of an arena-allocated iterator, similar to how unique_ptr is used for malloc'd data. 
Depends on D61473 Test Plan: compaction_iterator_test, mock_table, end-to-end tests in D63927 Reviewers: sdong, IslamAbdelRahman, wanning, yhchiang, lightmark Reviewed By: lightmark Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D62205
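The linear-scan coverage check mentioned above can be sketched in isolation. This is a simplified illustration, not RangeDelAggregator: `RangeTombstone` and this free function `ShouldDelete` are hypothetical, snapshot stripes are left out entirely, and the sketch assumes a tombstone covers the half-open key range [begin, end) and only entries with a smaller sequence number than its own.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical range tombstone: deletes user keys in [begin, end) that were
// written before sequence number seq.
struct RangeTombstone {
  std::string begin;
  std::string end;
  uint64_t seq;
};

// Linear scan over all tombstones. Snapshot stripes are not modelled, so
// this corresponds to checking a single stripe.
bool ShouldDelete(const std::vector<RangeTombstone>& tombstones,
                  const std::string& user_key, uint64_t key_seq) {
  for (const RangeTombstone& t : tombstones) {
    assert(t.begin <= t.end);
    const bool in_range = t.begin <= user_key && user_key < t.end;
    if (in_range && key_seq < t.seq) {
      return true;
    }
  }
  return false;
}
```

Collapsing overlapping ranges into a sorted, non-overlapping list, as the summary proposes for future work, is what would allow replacing this scan with a binary search.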
-
- 08 Oct, 2016 1 commit
-
-
Committed by Islam AbdelRahman
Summary: We always run consistency checks when compiling in debug mode. Allow users to set Options::force_consistency_checks to true to be able to run such checks even when compiling in release mode. Test Plan: make check -j64 make release Reviewers: lightmark, sdong, yiwu Reviewed By: yiwu Subscribers: hermanlee4, andrewkr, yoshinorim, jkedgar, dhruba Differential Revision: https://reviews.facebook.net/D64701
-
- 28 Sep, 2016 1 commit
-
-
Committed by Aaron Gao
Summary: Add new Iterator API, `SeekForPrev`: find the last key that is <= the target key. Supports prefix_extractor, prefix_same_as_start, and upper_bound. Not supported in iterators without Prev(). Also add tests in db_iter_test and db_iterator_test. Passes all tests. Cheers! Test Plan: make all check -j64 Reviewers: andrewkr, yiwu, IslamAbdelRahman, sdong Reviewed By: sdong Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D64149
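The positioning rule of `SeekForPrev` (land on the last key that is <= the target) can be shown on an ordinary sorted container. This is an illustration of the semantics only, using `std::map` in place of a RocksDB iterator; the free function and the `Table` alias below are hypothetical.

```cpp
#include <cassert>
#include <iterator>
#include <map>
#include <string>

using Table = std::map<std::string, std::string>;

// Returns an iterator to the last entry whose key is <= target, or
// table.end() when every key is greater than target.
Table::const_iterator SeekForPrev(const Table& table,
                                  const std::string& target) {
  // upper_bound finds the first key strictly greater than target, so the
  // entry just before it (if any) is the last key <= target.
  Table::const_iterator it = table.upper_bound(target);
  if (it == table.begin()) {
    return table.end();
  }
  return std::prev(it);
}
```

Stepping back from the upper bound is also why the operation depends on backward movement, which matches the note that iterators without Prev() do not support it.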
-
- 24 Sep, 2016 1 commit
-
-
Committed by Yi Wu
Summary: Use ImmutableDBOptions/MutableDBOptions internally and DBOptions only for user-facing APIs. MutableDBOptions is barely a placeholder for now. I'll start to move options to MutableDBOptions in following diffs. Test Plan: make all check Reviewers: yhchiang, IslamAbdelRahman, sdong Reviewed By: sdong Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D64065
-
- 14 Sep, 2016 1 commit
-
-
Committed by Yi Wu
Summary: * Change constructor of MutableCFOptions to depend only on ColumnFamilyOptions. * Move `max_subcompactions`, `compaction_options_fifo` and `compaction_pri` to ImmutableCFOptions to make it clear that they are immutable. Test Plan: existing unit tests. Reviewers: yhchiang, IslamAbdelRahman, sdong Reviewed By: sdong Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D63945
-
- 21 Jul, 2016 2 commits
-
-
Committed by omegaga
Summary: In T8216281 we decided to disable prefetching the index and filter during opening table handlers during startup (max_open_files = -1). Test Plan: Rely on `IndexAndFilterBlocksOfNewTableAddedToCache` to guarantee L0 indexes and filters are still cached and change `PinL0IndexAndFilterBlocksTest` to make sure other levels are not cached (maybe add one more test to test we don't cache other levels?) Reviewers: sdong, andrewkr Reviewed By: andrewkr Subscribers: andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D59913
-
Committed by Islam AbdelRahman
Summary: This diff updates the code to pin the merge operator operands while the merge operation is done, so that we can eliminate the memcpy cost. To do that we need a new public API for FullMerge that replaces the std::deque<std::string> with std::vector<Slice>. This diff is stacked on top of D56493 and D56511.

In this diff we:
* Update FullMergeV2 arguments to be encapsulated in MergeOperationInput and MergeOperationOutput, which will make it easier to add new arguments in the future
* Replace std::deque<std::string> with std::vector<Slice> to pass operands
* Replace MergeContext std::deque with std::vector (based on a simple benchmark I ran https://gist.github.com/IslamAbdelRahman/78fc86c9ab9f52b1df791e58943fb187)
* Allow FullMergeV2 output to be an existing operand

```
[Everything in Memtable | 10K operands | 10 KB each | 1 operand per key]
DEBUG_LEVEL=0 make db_bench -j64 && ./db_bench --benchmarks="mergerandom,readseq,readseq,readseq,readseq,readseq" --merge_operator="max" --merge_keys=10000 --num=10000 --disable_auto_compactions --value_size=10240 --write_buffer_size=1000000000

[FullMergeV2]
readseq : 0.607 micros/op 1648235 ops/sec; 16121.2 MB/s
readseq : 0.478 micros/op 2091546 ops/sec; 20457.2 MB/s
readseq : 0.252 micros/op 3972081 ops/sec; 38850.5 MB/s
readseq : 0.237 micros/op 4218328 ops/sec; 41259.0 MB/s
readseq : 0.247 micros/op 4043927 ops/sec; 39553.2 MB/s

[master]
readseq : 3.935 micros/op 254140 ops/sec; 2485.7 MB/s
readseq : 3.722 micros/op 268657 ops/sec; 2627.7 MB/s
readseq : 3.149 micros/op 317605 ops/sec; 3106.5 MB/s
readseq : 3.125 micros/op 320024 ops/sec; 3130.1 MB/s
readseq : 4.075 micros/op 245374 ops/sec; 2400.0 MB/s
```

```
[Everything in Memtable | 10K operands | 10 KB each | 10 operand per key]
DEBUG_LEVEL=0 make db_bench -j64 && ./db_bench --benchmarks="mergerandom,readseq,readseq,readseq,readseq,readseq" --merge_operator="max" --merge_keys=1000 --num=10000 --disable_auto_compactions --value_size=10240 --write_buffer_size=1000000000

[FullMergeV2]
readseq : 3.472 micros/op 288018 ops/sec; 2817.1 MB/s
readseq : 2.304 micros/op 434027 ops/sec; 4245.2 MB/s
readseq : 1.163 micros/op 859845 ops/sec; 8410.0 MB/s
readseq : 1.192 micros/op 838926 ops/sec; 8205.4 MB/s
readseq : 1.250 micros/op 800000 ops/sec; 7824.7 MB/s

[master]
readseq : 24.025 micros/op 41623 ops/sec; 407.1 MB/s
readseq : 18.489 micros/op 54086 ops/sec; 529.0 MB/s
readseq : 18.693 micros/op 53495 ops/sec; 523.2 MB/s
readseq : 23.621 micros/op 42335 ops/sec; 414.1 MB/s
readseq : 18.775 micros/op 53262 ops/sec; 521.0 MB/s
```

```
[Everything in Block cache | 10K operands | 10 KB each | 1 operand per key]

[FullMergeV2]
$ DEBUG_LEVEL=0 make db_bench -j64 && ./db_bench --benchmarks="readseq,readseq,readseq,readseq,readseq" --merge_operator="max" --num=100000 --db="/dev/shm/merge-random-10K-10KB" --cache_size=1000000000 --use_existing_db --disable_auto_compactions
readseq : 14.741 micros/op 67837 ops/sec; 663.5 MB/s
readseq : 1.029 micros/op 971446 ops/sec; 9501.6 MB/s
readseq : 0.974 micros/op 1026229 ops/sec; 10037.4 MB/s
readseq : 0.965 micros/op 1036080 ops/sec; 10133.8 MB/s
readseq : 0.943 micros/op 1060657 ops/sec; 10374.2 MB/s

[master]
readseq : 16.735 micros/op 59755 ops/sec; 584.5 MB/s
readseq : 3.029 micros/op 330151 ops/sec; 3229.2 MB/s
readseq : 3.136 micros/op 318883 ops/sec; 3119.0 MB/s
readseq : 3.065 micros/op 326245 ops/sec; 3191.0 MB/s
readseq : 3.014 micros/op 331813 ops/sec; 3245.4 MB/s
```

```
[Everything in Block cache | 10K operands | 10 KB each | 10 operand per key]
DEBUG_LEVEL=0 make db_bench -j64 && ./db_bench --benchmarks="readseq,readseq,readseq,readseq,readseq" --merge_operator="max" --num=100000 --db="/dev/shm/merge-random-10-operands-10K-10KB" --cache_size=1000000000 --use_existing_db --disable_auto_compactions

[FullMergeV2]
readseq : 24.325 micros/op 41109 ops/sec; 402.1 MB/s
readseq : 1.470 micros/op 680272 ops/sec; 6653.7 MB/s
readseq : 1.231 micros/op 812347 ops/sec; 7945.5 MB/s
readseq : 1.091 micros/op 916590 ops/sec; 8965.1 MB/s
readseq : 1.109 micros/op 901713 ops/sec; 8819.6 MB/s

[master]
readseq : 27.257 micros/op 36687 ops/sec; 358.8 MB/s
readseq : 4.443 micros/op 225073 ops/sec; 2201.4 MB/s
readseq : 5.830 micros/op 171526 ops/sec; 1677.7 MB/s
readseq : 4.173 micros/op 239635 ops/sec; 2343.8 MB/s
readseq : 4.150 micros/op 240963 ops/sec; 2356.8 MB/s
```

Test Plan: COMPILE_WITH_ASAN=1 make check -j64 Reviewers: yhchiang, andrewkr, sdong Reviewed By: sdong Subscribers: lovro, andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D57075
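Why letting the merge output be an existing operand helps can be seen with a "max"-style merge over non-owning views. This is a toy sketch, not the FullMergeV2 signature: `MaxFullMerge` is a hypothetical function, `std::string_view` (C++17) stands in for Slice, and lexicographic comparison is an assumption about what "max" means.

```cpp
#include <cassert>
#include <string_view>
#include <vector>

// Hypothetical "max" full merge. Operands arrive as non-owning views, and
// the result is a view of whichever input was largest, so the winning bytes
// are never copied into a new string.
std::string_view MaxFullMerge(const std::string_view* existing_value,
                              const std::vector<std::string_view>& operands) {
  assert(existing_value != nullptr || !operands.empty());
  bool have_best = existing_value != nullptr;
  std::string_view best = have_best ? *existing_value : std::string_view();
  for (std::string_view op : operands) {
    if (!have_best || op > best) {
      best = op;
      have_best = true;
    }
  }
  return best;
}
```

In this model the caller must keep the operands' memory alive while it uses the result, which is the role pinning plays in the change described above.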
-
- 20 Jul, 2016 1 commit
-
-
Committed by John Alexander
* Added new statistics and refactored to allow ioptions to be passed around as required to access environment and statistics pointers (and, as a convenient side effect, info_log pointer). * Prevent incrementing compression counter when compression is turned off in options. * Prevent incrementing compression counter when compression is turned off in options. * Added two more supported compression types to test code in db_test.cc * Prevent incrementing compression counter when compression is turned off in options. * Added new StatsLevel that excludes compression timing. * Fixed casting error in coding.h * Fixed CompressionStatsTest for new StatsLevel. * Removed unused variable that was breaking the Linux build
-
- 09 Jul, 2016 1 commit
-
-
Committed by Yi Wu
Summary: Fixing errors reported by clang static analyzer. * Removing some unused variables. * Adding assertions to fix false positives reported by clang analyzer. * Adding `__clang_analyzer__` macro to suppress false positive warnings. Test Plan: USE_CLANG=1 OPT=-g make analyze -j64 Reviewers: andrewkr, sdong Reviewed By: sdong Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D60549
-
- 07 Jul, 2016 1 commit
-
-
Committed by Gunnar Kudrjavets
Summary: MyRocks release integration build breaks because we treat warnings caused by unused variables as errors. Variable `edit` is only used in debug builds. Therefore we need to guard it using `#ifndef NDEBUG` check. Test Plan: - `[p]arc diff --preview` for the default validation. - Verify that release build fails before this fix and passes after applying it. Reviewers: andrewkr, sdong Reviewed By: sdong Subscribers: andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D60423
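The guard pattern described above looks roughly like this. The function and variable names are hypothetical and unrelated to the actual `edit` variable; the point is only that a variable read solely by `assert` must itself be declared only in debug builds, otherwise release builds that treat unused variables as errors will fail.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example function.
int SumValues(const std::vector<int>& values) {
  int sum = 0;
#ifndef NDEBUG
  // Read only by the assert below, so it exists only in debug builds. In a
  // release build (NDEBUG defined) the assert expands to nothing and an
  // unguarded declaration would trigger an unused-variable warning.
  const std::size_t count_before = values.size();
#endif
  for (int v : values) {
    sum += v;
  }
  assert(values.size() == count_before);
  return sum;
}
```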
-
- 06 Jul, 2016 2 commits
-
-
Committed by sdong
Summary: Add option write_buffer_manager to help users control total memory spent on memtables across multiple DB instances. Test Plan: Add a new unit test. Reviewers: yhchiang, IslamAbdelRahman Reviewed By: IslamAbdelRahman Subscribers: adela, benj, sumeet, muthu, leveldb, andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D59925
-
Committed by Aaron Gao
Summary: Currently, if several flush outputs are committed together, we issue each manifest write per batch (1 batch = 1 flush = 1 sst file = 1+ continuous memtables). Each manifest write requires one fsync and one fsync to parent directory. In some cases, it becomes the bottleneck of write. We should batch them and write in one manifest write when possible.

Test Plan: ` ./db_bench -benchmarks="fillseq" -max_write_buffer_number=16 -max_background_flushes=16 -disable_auto_compactions=true -min_write_buffer_number_to_merge=1 -write_buffer_size=65536 -level0_stop_writes_trigger=10000 -level0_slowdown_writes_trigger=10000`

**Before**
```
Initializing RocksDB Options from the specified file
Initializing RocksDB Options from command-line flags
RocksDB: version 4.9
Date: Fri Jul 1 15:38:17 2016
CPU: 32 * Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz
CPUCache: 20480 KB
Keys: 16 bytes each
Values: 100 bytes each (50 bytes after compression)
Entries: 1000000
Prefix: 0 bytes
Keys per prefix: 0
RawSize: 110.6 MB (estimated)
FileSize: 62.9 MB (estimated)
Write rate: 0 bytes/second
Compression: Snappy
Memtablerep: skip_list
Perf Level: 1
WARNING: Assertions are enabled; benchmarks unnecessarily slow
------------------------------------------------
Initializing RocksDB Options from the specified file
Initializing RocksDB Options from command-line flags
DB path: [/tmp/rocksdbtest-112628/dbbench]
fillseq : 166.277 micros/op 6014 ops/sec; 0.7 MB/s
```

**After**
```
Initializing RocksDB Options from the specified file
Initializing RocksDB Options from command-line flags
RocksDB: version 4.9
Date: Fri Jul 1 15:35:05 2016
CPU: 32 * Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz
CPUCache: 20480 KB
Keys: 16 bytes each
Values: 100 bytes each (50 bytes after compression)
Entries: 1000000
Prefix: 0 bytes
Keys per prefix: 0
RawSize: 110.6 MB (estimated)
FileSize: 62.9 MB (estimated)
Write rate: 0 bytes/second
Compression: Snappy
Memtablerep: skip_list
Perf Level: 1
WARNING: Assertions are enabled; benchmarks unnecessarily slow
------------------------------------------------
Initializing RocksDB Options from the specified file
Initializing RocksDB Options from command-line flags
DB path: [/tmp/rocksdbtest-112628/dbbench]
fillseq : 52.328 micros/op 19110 ops/sec; 2.1 MB/s
```

Reviewers: andrewkr, IslamAbdelRahman, yhchiang, sdong Reviewed By: sdong Subscribers: igor, andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D60075
-
- 14 Jun, 2016 1 commit
-
-
Committed by Islam AbdelRahman
Summary: We have a lot of code duplication: whenever we call FullMerge we keep duplicating the instrumentation and statistics code. This is a simple diff to refactor the code to use TimedFullMerge instead of FullMerge. Test Plan: COMPILE_WITH_ASAN=1 make check -j64 Reviewers: andrewkr, yhchiang, sdong Reviewed By: sdong Subscribers: andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D59577
-
- 24 May, 2016 1 commit
-
-
Committed by Ashish Shenoy
Summary: Make CompactionOptionsFIFO a part of mutable_cf_options Test Plan: UT Reviewers: sdong Reviewed By: sdong Subscribers: andrewkr, lgalanis, dhruba Differential Revision: https://reviews.facebook.net/D58653
-
- 20 May, 2016 1 commit
-
-
Committed by Aaron Orenstein
Summary: Reduce use of argument-dependent name lookup in RocksDB. Test Plan: 'make check' passed. Reviewers: andrewkr Reviewed By: andrewkr Subscribers: leveldb, andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D58203
-
- 11 May, 2016 1 commit
-
-
Committed by Islam AbdelRahman
Summary: GetObsoleteFiles() and LogAndApply() functions modify the obsolete_manifests_ vector, so we need to make sure that the mutex is held when we modify obsolete_manifests_. Test Plan: run the test under TSAN Reviewers: andrewkr Reviewed By: andrewkr Subscribers: andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D58011
-
- 10 May, 2016 1 commit
-
-
Committed by sdong
Summary: Currently we estimate bytes needed for compaction by assuming fanout value to be level multiplier. It overestimates when the size of a level exceeds the target by a large margin. We estimate by the ratio of actual sizes in levels instead. Test Plan: Fix existing test cases and add a new one. Reviewers: IslamAbdelRahman, igor, yhchiang Reviewed By: yhchiang Subscribers: MarkCallaghan, leveldb, andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D57789
-
- 21 Apr, 2016 1 commit
-
-
Committed by Andrew Kryczka
Summary: This is needed so we can measure compression ratio improvements achieved by D52287. The property compares raw data size against the total file size for a given level. If the level is empty it should return 0.0. Test Plan: new unit test Reviewers: IslamAbdelRahman, yhchiang, sdong Reviewed By: sdong Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D56967
-
- 07 Apr, 2016 1 commit
-
-
Committed by Igor Canadi
Summary: We use object `v` in the error message, which is not initialized if the edit is column family manipulation. This doesn't provide much useful info, so this diff is removing it. Instead, it dumps actual VersionEdit contents. Test Plan: compiles. would be great to get tests in version_set_test.cc that cover cases where a file write fails Reviewers: sdong, yhchiang, andrewkr Reviewed By: andrewkr Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D56349
-
- 02 Apr, 2016 2 commits
-
-
Committed by Aaron Gao
Summary: There is a hardcoded constraint in our statistics collection that prevents reading properties from more than 20 SST files. This means our statistics will be very inaccurate for databases with > 20 files since additional files are just ignored. The purpose of constraining the number of files used is to bound the I/O performed during statistics collection, since these statistics need to be recomputed every time the database is reopened. However, this constraint doesn't take into account the case where option "max_open_files" is -1. In that case, all the file metadata has already been read, so MaybeInitializeFileMetaData() won't incur any I/O cost. So this diff gets rid of the 20-file constraint in case max_open_files == -1. Test Plan: write into unit test db/db_properties_test.cc - "ValidateSampleNumber". We generate 20 files with 2 rows and 10 files with 1 row. If max_open_files != -1, the `rocksdb.estimate-num-keys` should be (10*1 + 10*2)/20 * 30 = 45. Otherwise, it should be the ground truth, 50. {F1089153} Reviewers: andrewkr Reviewed By: andrewkr Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D56253
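The arithmetic in the test plan can be reproduced with a small sketch of sample-and-extrapolate estimation. `EstimateNumKeys` is a hypothetical function, not the RocksDB code path, and which files get sampled is an assumption: here the first `max_sampled_files` entries of the input are used, ordered so that the sample holds ten 1-row files and ten 2-row files as in the formula above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sums the row counts of at most max_sampled_files files and scales the
// total up to the full file count. When every file is sampled, the result
// is the exact total.
uint64_t EstimateNumKeys(const std::vector<uint64_t>& rows_per_file,
                         std::size_t max_sampled_files) {
  assert(max_sampled_files > 0);
  if (rows_per_file.empty()) {
    return 0;
  }
  const std::size_t sampled =
      std::min(max_sampled_files, rows_per_file.size());
  uint64_t sampled_rows = 0;
  for (std::size_t i = 0; i < sampled; ++i) {
    sampled_rows += rows_per_file[i];
  }
  return sampled_rows * rows_per_file.size() / sampled;
}
```

With a 20-file cap the sample holds 10*1 + 10*2 = 30 rows, which extrapolates to 30 / 20 * 30 = 45. Without the cap all 30 files are read and the result is the true total, 10*1 + 20*2 = 50.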
-
Committed by Marton Trencseni
Summary: When a block based table file is opened, if prefetch_index_and_filter is true, it will prefetch the index and filter blocks, putting them into the block cache. What this feature adds: when an L0 block based table file is opened, if pin_l0_filter_and_index_blocks_in_cache is true in the options (and prefetch_index_and_filter is true), then the filter and index blocks aren't released back to the block cache at the end of BlockBasedTableReader::Open(). Instead the table reader takes ownership of them, hence pinning them, i.e. the LRU cache will never push them out. Meanwhile in the table reader, further accesses will not hit the block cache, thus avoiding lock contention. Test Plan: 'export TEST_TMPDIR=/dev/shm/ && DISABLE_JEMALLOC=1 OPT=-g make all valgrind_check -j32' is OK. I didn't run the Java tests, I don't have Java set up on my devserver. Reviewers: sdong Reviewed By: sdong Subscribers: andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D56133
-
- 26 Mar, 2016 1 commit
-
-
Committed by Islam AbdelRahman
Summary: We were not measuring the time spent in merge_operator when called from Version::Get() Test Plan: added a unittest Reviewers: sdong, yhchiang Reviewed By: yhchiang Subscribers: andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D55905
-