- 24 Apr, 2019 (2 commits)
-
By Adam Retter
Summary: I needed this change to be able to build the v6.0.1 release on Windows.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5227
Differential Revision: D15033815
Pulled By: sagar0
fbshipit-source-id: 579f3b8e694c34c0d43527eb2fa37175e37f5911
-
By Siying Dong
Summary: It's hard to get DBIter to directly use InternalIterator::NextAndGetResult() because the code change would be complicated. Instead, use IteratorWrapper, where Next() is already using NextAndGetResult(). The performance gain is hard to measure because it is small and there is variation. I ran readseq many times, and there seems to be a 1% gain.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5214
Differential Revision: D15003635
Pulled By: siying
fbshipit-source-id: 17af1965c409c2fe90cd85037fbd2c5a1364f82a
-
- 23 Apr, 2019 (3 commits)
-
By Yuchi Chen
Summary: When I build RocksDB for a 32-bit/LITE/iOS environment, errors like the following occur:

```
table/block_based_table_reader.cc:971:44: error: implicit conversion loses integer precision: 'uint64_t' (aka 'unsigned long long') to 'size_t' (aka 'unsigned long') [-Werror,-Wshorten-64-to-32]
  size_t block_size = props_block_handle.size();
         ~~~~~~~~~~   ~~~~~~~~~~~~~~~~~~~^~~~~~
./util/file_reader_writer.h:177:8: error: private field 'env_' is not used [-Werror,-Wunused-private-field]
  Env* env_;
       ^
```

Pull Request resolved: https://github.com/facebook/rocksdb/pull/5220
Differential Revision: D15023481
Pulled By: siying
fbshipit-source-id: 1b5d121d3016f2b0a8a9a2cc1bd638479357f9f7
-
By Sagar Vemuri
Summary: Log the file_creation_time table property when a new table file is created.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5232
Differential Revision: D15033069
Pulled By: sagar0
fbshipit-source-id: aaac56a4c03a8f96c338cad1b0cdb7fbfb887647
-
By Andrew Kryczka
Summary: The existing implementation does not guarantee bytes reach disk every `bytes_per_sync` when writing SST files, or every `wal_bytes_per_sync` when writing WALs. This can cause confusing behavior for users who enable this feature to avoid large syncs during flush and compaction, but then end up hitting them anyway.

My understanding of the existing behavior is we used `sync_file_range` with `SYNC_FILE_RANGE_WRITE` to submit ranges for async writeback, such that we could continue processing the next range of bytes while that I/O is happening. I believe we can preserve that benefit while also limiting how far the processing can get ahead of the I/O, which prevents huge syncs from happening when the file finishes.

Consider this `sync_file_range` usage: `sync_file_range(fd_, 0, static_cast<off_t>(offset + nbytes), SYNC_FILE_RANGE_WAIT_BEFORE | SYNC_FILE_RANGE_WRITE)`. Expanding the range to start at 0 and adding the `SYNC_FILE_RANGE_WAIT_BEFORE` flag causes any pending writeback (like from a previous call to `sync_file_range`) to finish before it proceeds to submit the latest `nbytes` for writeback. The latest `nbytes` are still written back asynchronously, unless processing exceeds I/O speed, in which case the following `sync_file_range` will need to wait on it.

There is a second change in this PR to use `fdatasync` when `sync_file_range` is unavailable (determined statically) or has some known problem with the underlying filesystem (determined dynamically).

The above two changes only apply when the user enables a new option, `strict_bytes_per_sync`.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5183
Differential Revision: D14953553
Pulled By: siying
fbshipit-source-id: 445c3862e019fb7b470f9c7f314fc231b62706e9
-
- 22 Apr, 2019 (1 commit)
-
By Mike Kolupaev
Summary: Introduce BlockBasedTableOptions::index_shortening to give users control over which key-shortening techniques are used in building index blocks. Before this patch, both separators and successor keys were shortened in indexes. With this patch, the default is set to kShortenSeparators to only shorten the separators. Since each index block has many separators and only one successor (last key), the change should not have a negative impact on index block size. However, it should prevent many unnecessary block loads where, due to the approximation introduced by the shortened successor, a seek would land us on the previous block and then fix it by moving to the next one.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5174
Differential Revision: D14884185
Pulled By: al13n321
fbshipit-source-id: 1b08bc8c03edcf09b6b8c16e9a7eea08ad4dd534
-
- 20 Apr, 2019 (5 commits)
-
By jsteemann
Summary: Savepoints are assumed to be used in a stack-wise fashion (only the top element should be used), so they were stored by `WriteBatch` in a member variable `save_points` using an std::stack. Conceptually this is fine, but the implementation had a few issues:

- The `save_points_` instance variable was a plain pointer to a heap-allocated `SavePoints` struct. The destructor of `WriteBatch` simply deleted this pointer. However, the copy constructor of WriteBatch just copied that pointer, meaning that copying a WriteBatch with active savepoints would very likely have crashed before. Now a proper copy of the savepoints is made in the copy constructor, and not just a copy of the pointer.
- `save_points_` was an std::stack, which defaults to `std::deque` for the underlying container. A deque is a bit over the top here, as we only need access to the most recent savepoint (i.e. stack.top()) but never any elements at the front. std::deque is rather expensive to initialize in common environments. For example, the STL implementation shipped with GNU g++ will perform a heap allocation of more than 500 bytes to create an empty deque object. Although the `save_points_` container is created lazily by RocksDB, moving from a deque to a plain `std::vector` is much more memory-efficient. So `save_points_` is now a vector.
- `save_points_` was changed from a plain pointer to an `std::unique_ptr`, making ownership more explicit.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/5192
Differential Revision: D15024074
Pulled By: maysamyabandeh
fbshipit-source-id: 5b128786d3789cde94e46465c9e91badd07a25d7
-
By Sagar Vemuri
Summary: Fix HISTORY.md by removing a few items from the 6.1.1 history, as they did not make it into the 6.1.fb branch.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5224
Differential Revision: D15017030
Pulled By: sagar0
fbshipit-source-id: 090724d326d29168952e06dc1a5090c03fdd739e
-
By Yanqin Jin
Summary: Setting read_opts.total_order_seek achieves this, even with a different prefix extractor.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5209
Differential Revision: D14980388
Pulled By: riversand963
fbshipit-source-id: 16527989a3d6b3e3ae8241c894d011326429d66e
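A minimal sketch of the setting referred to here (ReadOptions::total_order_seek is a standard field; the helper function is illustrative):

```
#include <rocksdb/options.h>

rocksdb::ReadOptions MakeTotalOrderReadOptions() {
  rocksdb::ReadOptions read_opts;
  // Iterate in full key order, ignoring the prefix extractor, so scans
  // behave correctly even if data was written under a different extractor.
  read_opts.total_order_seek = true;
  return read_opts;
}
```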
-
By anand76
Summary: Clean up a couple of stray includes left by #5011.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5219
Differential Revision: D15007244
Pulled By: anand1976
fbshipit-source-id: 15ca1d4f977b5b60e99df3bfb8fc3db217d19bdd
-
By Siying Dong
Summary: My compiler doesn't inline DBIter::Next() into the arena-wrapped iterator, even though it is a direct forward. Adding this annotation makes it inlined. It might not always work, but inlining this function into the arena-wrapped iterator always feels like the right decision.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5217
Differential Revision: D15004086
Pulled By: siying
fbshipit-source-id: a4cffd79c6fb092669a3a90633c9aa5e494f8a66
-
- 19 Apr, 2019 (7 commits)
-
By Sagar Vemuri
Summary: We found an issue in Periodic Compactions (introduced in #5166) where files were not being picked up for compactions, as all the SST files created with older versions of RocksDB have `file_creation_time` as 0. (Note that `file_creation_time` is a new table property introduced in #5166.) To address this, Periodic Compactions now fall back to looking at the `creation_time` table property or the file's modification time (as given by the Env) when the `file_creation_time` table property is found to be 0. Here is how the file's modification time (and, in turn, the file age) is computed now:

1. Use the `file_creation_time` table property if it is > 0.
2. If not, use the `creation_time` table property if it is > 0.
3. If not, use the file's mtime stat metadata given by the underlying Env.

Don't consider the file at all for compaction if the modification time cannot be correctly determined based on the above conditions.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5184
Differential Revision: D14907795
Pulled By: sagar0
fbshipit-source-id: 4bb2f3631f9a3e04470c674a1d13544584e1e56c
-
By Zhongyi Xie
Summary:
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5216
Differential Revision: D15003749
Pulled By: miasantreble
fbshipit-source-id: a52c264e694cd7c55813be33ee22b4f3046b545a
-
By Siying Dong
Summary: Reorganize the code so that no function call into ReadRangeDelAggregator is needed if there is no range tombstone.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5202
Differential Revision: D14968155
Pulled By: siying
fbshipit-source-id: 0bd61911293c7a27b4e1b8d57c66d0c4ad6a6a5f
-
By Siying Dong
Summary: Several small changes for Next():

1. Reduce branching by always updating local_stats_.next_count_++ even if statistics is null. This should be faster than branching.
2. Replace ResetInternalKeysSkippedCounter() in Next(), because the valid_ check is not needed in this case.
3. iter_->Valid() should always be true for the non-merge case; remove this check.
4. Add an inline annotation. It ends up not being picked up by my compiler, but it shouldn't hurt.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/5200
Differential Revision: D15000391
Pulled By: siying
fbshipit-source-id: be97f61c708968234fb8e5cf272b5c2ac07dc4dd
-
By Siying Dong
Summary: In long scans, the virtual function calls of Next(), Valid(), key() and value() are not trivial. By introducing NextAndGetResult(), some of the Next(), Valid() and key() calls are consolidated into one virtual function call to reduce CPU. Also did some inlining tricks and added some "final" annotations in some functions. Even without the "final" annotation, most Next() calls are inlined with -O3, but sometimes with a "final" it is inlined by -O2 too. It doesn't hurt to add those "final" annotations.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5197
Differential Revision: D14945977
Pulled By: siying
fbshipit-source-id: 7003969f9a5f1d5717f0bda503b91d19ba75ed88
-
By Fosco Marotto
Summary: Internal task: T35568575.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5199
Differential Revision: D14962794
Pulled By: gfosco
fbshipit-source-id: 93838ede6d0235eaecff90d200faed9a8515bbbe
-
By Yanqin Jin
Summary: As title.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5211
Differential Revision: D14992018
Pulled By: riversand963
fbshipit-source-id: b5720ea4742029e2fb47ff6d9f8d9de006db4ed4
-
- 18 Apr, 2019 (2 commits)
-
By JiYou
Summary: `GetOverlappingInputsRangeBinarySearch` first uses binary search to find an index in the given range `[begin, end]`. But after finding the index, it uses linear search to find the `start_index` and `end_index`, so the search process degrades to linear time. This optimizes the search with the following changes:

- Use `std::lower_bound` and `std::upper_bound` to get `lg(n)` search complexity.
- Use uniform lambdas for the search process.
- Simplify the process for `within_interval` true or false.
- Remove the functions `ExtendFileRangeWithinInterval` and `ExtendFileRangeOverlappingInterval`.

Signed-off-by: JiYou <jiyou09@gmail.com>
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4987
Differential Revision: D14984192
Pulled By: riversand963
fbshipit-source-id: fae4b8e59a21b7e350718d60cdc94dd55ac81e89
-
By Zhongyi Xie
Summary: This PR fixes the following compile warning:

```
db/memtable.cc: In member function ‘virtual void rocksdb::MemTableIterator::Seek(const rocksdb::Slice&)’:
db/memtable.cc:321:22: error: declaration of ‘user_key’ shadows a member of 'this' [-Werror=shadow]
   Slice user_key(ExtractUserKey(k));
                      ^
db/memtable.cc: In member function ‘virtual void rocksdb::MemTableIterator::SeekForPrev(const rocksdb::Slice&)’:
db/memtable.cc:338:22: error: declaration of ‘user_key’ shadows a member of 'this' [-Werror=shadow]
   Slice user_key(ExtractUserKey(k));
                      ^
```

Pull Request resolved: https://github.com/facebook/rocksdb/pull/5204
Differential Revision: D14970160
Pulled By: miasantreble
fbshipit-source-id: 388eb089f90c4528cc6d615dd4607fb53ceac705
-
- 17 Apr, 2019 (4 commits)
-
By Zhongyi Xie
Summary: Depending on the config, manual compaction (leveled compaction style) does the following compactions:

L0 -> L1
L1 -> L2
...
Ln-1 -> Ln
Ln -> Ln

The final Ln -> Ln compaction is partly unnecessary, as it recompacts all the files that were just generated by the Ln-1 -> Ln compaction. We should avoid recompacting such files. This rule should be applied to Lmax only.
Resolves issue https://github.com/facebook/rocksdb/issues/4995
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5138
Differential Revision: D14940106
Pulled By: miasantreble
fbshipit-source-id: 8d3cf5507a17e76f3333cfd4bac5256d005636e5
-
By Yanqin Jin
Summary: #4905 removed the implementation of `NewEmptyIterator` but kept its declaration in the public header. This breaks systems that depend on RocksDB and use `NewEmptyIterator`. Therefore, add it back to fix. cc maysamyabandeh please remind me if I miss anything here. Thanks
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5203
Differential Revision: D14968382
Pulled By: riversand963
fbshipit-source-id: 5fb86e99c8cfaf9f7a9473cdb1355d7558ff6e01
-
By Siying Dong
Summary: A dummy cache size of 1MB is too large for small block sizes. Our GetDefaultCacheShardBits() uses min_shard_size = 512L * 1024L to determine the number of shards, so a 1MB dummy entry exceeds the size of a whole shard and makes the cache exceed its budget. Change it to 256KB accordingly. There shouldn't be an obvious performance impact, since inserting a cache entry every 256KB of memtable inserts is still infrequent enough.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5175
Differential Revision: D14954289
Pulled By: siying
fbshipit-source-id: 2c275255c1ac3992174e06529e44c55538325c94
-
By yiwu-arbug
Summary: This is the second attempt at #5101. Original commit message:

`BlockBasedTableIterator` avoids reading the next block on `Next()` if it detects the iterator will be out of bound, by checking against the index key. The optimization was added in #2239, and at that time it only checked the bound once per block. It seems a later change made it a per-key check, which introduced unnecessary key comparisons.

This patch comes with two fixes:

Fix 1: To optimize checking for bounds, we need to compare the bounds with the index key as well. However, BlockBasedTableIterator doesn't know whether its index iterator is internally using user keys or internal keys. The patch fixes that by extending InternalIterator with a user_key() function that is overridden in IndexBlockIter.

Fix 2: In #5101 we returned `IsOutOfBound()=true` when the block index key is out of bound. But the index key can be larger than the smallest key of the next file on the level. That file can be within the upper bound and should not be filtered out.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/5142
Differential Revision: D14907113
Pulled By: siying
fbshipit-source-id: ac95775c5b4e7b700f76ab43e39f45402c98fbfb
-
- 16 Apr, 2019 (5 commits)
-
By Vijay Nadimpalli
Consolidating WAL creation which currently has duplicate logic in db_impl_write.cc and db_impl_open.cc (#5188)

Summary: Right now, two separate pieces of code are used to create WAL files: one in the DBImpl::Open function of db_impl_open.cc and one in the DBImpl::SwitchMemtable function of db_impl_write.cc. This change simply creates one function, DBImpl::CreateWAL, in db_impl_open.cc, which replaces the existing WAL creation logic in DBImpl::Open and DBImpl::SwitchMemtable.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5188
Differential Revision: D14942832
Pulled By: vjnadimpalli
fbshipit-source-id: d49230e04c36176015c8c1b422575872f92157fb
-
By Yi Zhang
Summary: Found this when test-driving the new MultiGet. If you pass an unsorted result with sorted_result = false, you'll incorrectly trigger the ASSERT, even though we sort down below. I've also added a simple test covering the sorted_result=true/false scenarios, copied from MultiGetSimple. anand1976
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5195
Differential Revision: D14935475
Pulled By: yizhang82
fbshipit-source-id: 1d2af5e3a003847d965066a16e3b19da68acf170
-
By Yi Wu
Summary: Add a `--seek_missing_prefix` flag to db_bench to allow benchmarking seeks to a non-existing prefix. Usage example:

```
./db_bench --db=/dev/shm/db_bench --use_existing_db=false --benchmarks=fillrandom --num=100000000 --prefix_size=9 --keys_per_prefix=10
./db_bench --db=/dev/shm/db_bench --use_existing_db=true --benchmarks=seekrandom --disable_auto_compactions=true --num=100000000 --prefix_size=9 --keys_per_prefix=10 --reads=1000 --prefix_same_as_start=true --seek_missing_prefix=true
```

Also adding `--total_order_seek` and `--prefix_same_as_start` flags.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5163
Differential Revision: D14935724
Pulled By: riversand963
fbshipit-source-id: 7c41023f007febe373eb1589861f215432a9e18a
-
By Fosco Marotto
Summary: Including latest fixes.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5171
Differential Revision: D14875157
Pulled By: gfosco
fbshipit-source-id: 86ec7ee3553a9b25ab71ed98966ce08a16322e2c
-
By jsteemann
Summary: This branch contains two small improvements:

* Create `LockMap` entries using `std::make_shared`. This saves one heap allocation per LockMap entry and also locates the control block and the LockMap object closely together in memory, which can help with caching.
* Reorder the members of `TrackedTrxInfo`, so that the resulting struct uses less memory (at least on 64-bit systems).

Pull Request resolved: https://github.com/facebook/rocksdb/pull/5193
Differential Revision: D14934536
Pulled By: maysamyabandeh
fbshipit-source-id: f7b49812bb4b6029eef9d131e7cd56260df5b28e
-
- 13 Apr, 2019 (7 commits)
-
By anand76
Summary: Add a bounds check when looping through empty levels in FilePickerMultiGet.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5189
Differential Revision: D14925334
Pulled By: anand1976
fbshipit-source-id: 65d53247cf443153e28ce2b8b753fa51c6ae4566
-
By yiwu-arbug
Summary: Before using the prefix extractor, `InDomain()` should be checked. All uses in memtable.cc failed to check `InDomain()`.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5190
Differential Revision: D14923773
Pulled By: miasantreble
fbshipit-source-id: b3ad60bcca5f3a1a2b929a6eb34b0b7ba6326f04
-
By Manuel Ung
Summary: In `PessimisticTransaction::TryLock`, we were calling `TrackKey` even when assume_tracked=true, which defeats the purpose of assume_tracked. Remove this.

For keys that are already tracked, TrackKey will actually bump some counters (num_reads/num_writes) which are consumed in `TransactionBaseImpl::GetTrackedKeysSinceSavePoint`, and this is used to determine which keys were tracked since the last savepoint. I believe this functionality should still work, since I think the user should not call GetForUpdate/Put(assume_tracked=true) across savepoints, and if they do, they should not expect the Put(assume_tracked=true) to show up as a tracked key in the second savepoint. This is another 2-3% cpu improvement.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5173
Differential Revision: D14883809
Pulled By: lth
fbshipit-source-id: 7d09f0772da422384af0519773e310c22b0cbca3
-
By Maysam Yabandeh
Summary: When ReadOptions doesn't specify a snapshot, WritePrepared::Get used kMaxSequenceNumber to avoid the cost of creating a new snapshot object (which requires sync over db_mutex). This creates a race condition if it is reading from the writes of a transaction that had duplicate keys: each instance of a duplicate key is inserted with a different sequence number, and depending on the ordering the ::Get might skip the newer one and read the older one that is obsolete.

The patch fixes that by using the last published seq as the snapshot sequence number. It also adds a check after the read is done to ensure that max_evicted_seq has not advanced past the aforementioned seq, which is a very unlikely event. If it did, then the read is not valid since the seq is not backed by an actual snapshot to let IsInSnapshot handle it properly when an overlapping commit is evicted from the commit cache. A unit test is added to reproduce the race condition with duplicate keys.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5147
Differential Revision: D14758815
Pulled By: maysamyabandeh
fbshipit-source-id: a56915657132cf6ba5e3f5ea1b5d78c803407719
-
By ableegoldman
Summary: I would like to be able to read out the current Filter that has been set (or not) for a BlockBasedTableConfig. Added one public method to BlockBasedTableConfig:

```
public Filter filterPolicy() {
  return filterPolicy;
}
```

Pull Request resolved: https://github.com/facebook/rocksdb/pull/5186
Differential Revision: D14921415
Pulled By: siying
fbshipit-source-id: 2a63c8685480197862b49fc48916c757cd6daf95
-
By Siying Dong
Summary: Since Statistics::measureTime() is deprecated, StatisticsImpl::measureTime() is not implemented. We realized that users might have a wrapped Statistics implementation in which measureTime() is implemented by forwarding to StatisticsImpl, which causes an assert failure. In order to make the change less intrusive, we implement StatisticsImpl::measureTime(). We will revisit whether we need to remove it after several releases. Also, add a test to make sure that a Statistics implementation using the old interface still works.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5181
Differential Revision: D14907089
Pulled By: siying
fbshipit-source-id: 29b6202fd04e30ed6f6adcaeb1000e87f10d1e1a
-
By Yanqin Jin
Summary: As titled. A false positive is included, but fixed anyway to make the check pass.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5185
Differential Revision: D14909384
Pulled By: riversand963
fbshipit-source-id: dc5177e72b1929ccfd6175a60e2cd7bdb9bd80f3
-
- 12 Apr, 2019 (3 commits)
-
By vijaynadimpalli
Summary: When a new SST file is created via flush or compaction, we dump out the table properties; however, only a few table properties were logged. The change here is to log all of the table properties.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5168
Differential Revision: D14876928
Pulled By: vjnadimpalli
fbshipit-source-id: 1aca42ad00f9f650761d39e187f8beeb8700149b
-
By anand76
Summary: This PR introduces a new MultiGet() API, with the underlying implementation grouping keys based on SST file and batching lookups in a file. The reason for the new API is twofold: the definition allows callers to allocate storage for status and values on the stack instead of std::vector, as well as return values as PinnableSlices in order to avoid copying; and it keeps the original MultiGet() implementation intact while we experiment with batching.

Batching is useful when there is some spatial locality to the keys being queried, as well as larger batch sizes. The main benefits are due to:

1. Fewer function calls, especially to BlockBasedTableReader::MultiGet() and FullFilterBlockReader::KeysMayMatch()
2. Bloom filter cachelines can be prefetched, hiding the cache miss latency

The next step is to optimize the binary searches in the level_storage_info, index blocks and data blocks, since we could reduce the number of key comparisons if the keys are relatively close to each other. The batching optimizations also need to be extended to other formats, such as PlainTable and filter formats. This also needs to be added to db_stress.

Benchmark results from db_bench for various batch size/locality of reference combinations are given below (values are micros/op). Locality was simulated by offsetting the keys in a batch by a stride length. Each SST file is about 8.6MB uncompressed and key/value size is 16/100 uncompressed. To focus on the cpu benefit of batching, the runs were single threaded and bound to the same cpu to eliminate interference from other system events. The results show a 10-25% improvement in micros/op from smaller to larger batch sizes (4 - 32).

Batch sizes: 1 | 2 | 4 | 8 | 16 | 32

Random pattern (stride length 0):
Get: 4.158 | 4.109 | 4.026 | 4.05 | 4.1 | 4.074
MultiGet (no batching): 4.438 | 4.302 | 4.165 | 4.122 | 4.096 | 4.075
MultiGet (w/ batching): 4.461 | 4.256 | 4.277 | 4.11 | 4.182 | 4.14

Good locality (stride length 16):
Get: 4.048 | 3.659 | 3.248 | 2.99 | 2.84 | 2.753
MultiGet (no batching): 4.429 | 3.728 | 3.406 | 3.053 | 2.911 | 2.781
MultiGet (w/ batching): 4.452 | 3.45 | 2.833 | 2.451 | 2.233 | 2.135

Good locality (stride length 256):
Get: 4.066 | 3.786 | 3.581 | 3.447 | 3.415 | 3.232
MultiGet (no batching): 4.406 | 4.005 | 3.644 | 3.49 | 3.381 | 3.268
MultiGet (w/ batching): 4.393 | 3.649 | 3.186 | 2.882 | 2.676 | 2.62

Medium locality (stride length 4096):
Get: 4.012 | 3.922 | 3.768 | 3.61 | 3.582 | 3.555
MultiGet (no batching): 4.364 | 4.057 | 3.791 | 3.65 | 3.57 | 3.465
MultiGet (w/ batching): 4.479 | 3.758 | 3.316 | 3.077 | 2.959 | 2.891

db_bench command used (on a DB with 4 levels, 12 million keys):

```
TEST_TMPDIR=/dev/shm numactl -C 10 ./db_bench.tmp -use_existing_db=true -benchmarks="readseq,multireadrandom" -write_buffer_size=4194304 -target_file_size_base=4194304 -max_bytes_for_level_base=16777216 -num=12000000 -reads=12000000 -duration=90 -threads=1 -compression_type=none -cache_size=4194304000 -batch_size=32 -disable_auto_compactions=true -bloom_bits=10 -cache_index_and_filter_blocks=true -pin_l0_filter_and_index_blocks_in_cache=true -multiread_batched=true -multiread_stride=4
```

Pull Request resolved: https://github.com/facebook/rocksdb/pull/5011
Differential Revision: D14348703
Pulled By: anand1976
fbshipit-source-id: 774406dab3776d979c809522a67bedac6c17f84b
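A hedged usage sketch of the batched API as described (stack-allocated statuses and PinnableSlice values; the exact overload may differ between RocksDB releases):

```
#include <cstddef>
#include <rocksdb/db.h>

void BatchedLookup(rocksdb::DB* db) {
  constexpr size_t kNumKeys = 3;
  rocksdb::Slice keys[kNumKeys] = {"k1", "k2", "k3"};
  rocksdb::PinnableSlice values[kNumKeys];  // values pinned, not copied
  rocksdb::Status statuses[kNumKeys];       // per-key status, on the stack
  db->MultiGet(rocksdb::ReadOptions(), db->DefaultColumnFamily(), kNumKeys,
               keys, values, statuses, /*sorted_input=*/false);
  for (size_t i = 0; i < kNumKeys; ++i) {
    if (statuses[i].ok()) {
      // use values[i] while it is in scope; it pins the backing block
    }
  }
}
```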
-
By Siying Dong
Summary: Change the behavior of OptimizeForSmallDb() so that it is less likely to go out of memory. Change the behavior of OptimizeForPointLookup() to take advantage of the new memtable whole key filter, and move away from prefix extractor as well as hash-based indexing, as they are prone to misuse.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5165
Differential Revision: D14880709
Pulled By: siying
fbshipit-source-id: 9af30e3c9e151eceea6d6b38701a58f1f9fb692d
-
- 11 Apr, 2019 (1 commit)
-
By Sagar Vemuri
Summary: Introducing Periodic Compactions. This feature allows all the files in a CF to be periodically compacted. It could help in proactively catching any corruptions that could creep into the DB, as every file is constantly getting re-compacted. And also, of course, it helps to clean up data older than a certain threshold.

- Introduced a new option `periodic_compaction_time` to control how long a file can live without being compacted in a CF.
- This works across all levels.
- The files are put in the same level after going through the compaction. (Related files in the same level are picked up, as `ExpandInputstoCleanCut` is used.)
- Compaction filters, if any, are invoked as usual.
- A new table property, `file_creation_time`, is introduced to implement this feature. This property is set to the time at which the SST file was created (and that time is given by the underlying Env/OS).

This feature can be enabled on its own, or in conjunction with `ttl`. It is possible to set a different time threshold for the bottom level when used in conjunction with ttl. Since `ttl` works only on levels 0 through the last-but-one, you could set `ttl` to, say, 1 day, and `periodic_compaction_time` to, say, 7 days: since `ttl < periodic_compaction_time`, all files in the levels up to the last-but-one keep getting picked up based on ttl, and almost never based on periodic_compaction_time, while the files in the bottom level get picked up for compaction based on `periodic_compaction_time`. A configuration sketch follows below.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5166
Differential Revision: D14884441
Pulled By: sagar0
fbshipit-source-id: 408426cbacb409c06386a98632dcf90bfa1bda47
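A minimal configuration sketch of the ttl + periodic compaction combination just described (option name as given in this commit message; later releases may spell it differently, and the units are assumed to be seconds):

```
#include <rocksdb/options.h>

rocksdb::Options MakePeriodicCompactionOptions() {
  rocksdb::Options options;
  // ttl drives levels 0 through the last-but-one; assumed to be seconds.
  options.ttl = 1 * 24 * 60 * 60;  // 1 day
  // The bottom level falls back to the periodic compaction threshold.
  options.periodic_compaction_time = 7 * 24 * 60 * 60;  // 7 days
  return options;
}
```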
-