- 09 9月, 2020 2 次提交
-
-
由 Jay Zhuang 提交于
Summary: gcc-4.8 returns error when using the constructor. Not sure if it's a compiler bug/limitation or code issue: ``` table/block_based/block_based_table_reader.cc:3183:67: error: use of deleted function ‘rocksdb::WritableFileStringStreamAdapter::WritableFileStringStreamAdapter(rocksdb::WritableFileStringStreamAdapter&&)’ ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/7358 Reviewed By: pdillinger Differential Revision: D23577651 Pulled By: jay-zhuang fbshipit-source-id: b0197e3d3538da61a6f3866410d88d2047fb9695
-
由 Akanksha Mahajan 提交于
Summary: Replace FSWritableFile pointer with FSWritableFilePtr object in WritableFileWriter. This new object wraps FSWritableFile pointer. Objective: If tracing is enabled, FSWritableFile Ptr returns FSWritableFileTracingWrapper pointer that includes all necessary information in IORecord and calls underlying FileSystem and invokes IOTracer to dump that record in a binary file. If tracing is disabled then, underlying FileSystem pointer is returned directly. FSWritableFilePtr wrapper class is added to bypass the FSWritableFileWrapper when tracing is disabled. Test Plan: make check -j64 Pull Request resolved: https://github.com/facebook/rocksdb/pull/7193 Reviewed By: anand1976 Differential Revision: D23355915 Pulled By: akankshamahajan15 fbshipit-source-id: e62a27a13c1fd77e36a6dbafc7006d969bed25cf
-
- 05 9月, 2020 1 次提交
-
-
由 Jay Zhuang 提交于
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/7315 Test Plan: `ASSERT_STATUS_CHECKED=1 make sst_dump_test && ./sst_dump_test` And manually run `./sst_dump --file=*.sst` before and after the change. Reviewed By: pdillinger Differential Revision: D23361669 Pulled By: jay-zhuang fbshipit-source-id: 5bf51a2a90ee35c8c679e5f604732ec2aef5949a
-
- 04 9月, 2020 1 次提交
-
-
由 Andrew Kryczka 提交于
Summary: In block-based table builder, the cut-over from buffered to unbuffered mode involves sampling the buffered blocks and generating a dictionary. There was a bug where `SstFileWriter` passed zero as the `target_file_size` causing the cutover to happen immediately, so there were no samples available for generating the dictionary. This PR changes the meaning of `target_file_size == 0` to mean buffer the whole file before cutting over. It also adds dictionary compression support to `sst_dump --command=recompress` for easy evaluation. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7323 Reviewed By: cheng-chang Differential Revision: D23412158 Pulled By: ajkr fbshipit-source-id: 3b232050e70ef3c2ee85a4b5f6fadb139c569873
-
- 03 9月, 2020 1 次提交
-
-
由 Bingyi Sun 提交于
Summary: Fix typo in comment for SeekForGetImpl(). Rename "bounary" to "boundary" Pull Request resolved: https://github.com/facebook/rocksdb/pull/7328 Reviewed By: riversand963 Differential Revision: D23439748 Pulled By: zhichao-cao fbshipit-source-id: 83a34c417c71a3210ce54a090d76c4d5571313f3
-
- 28 8月, 2020 1 次提交
-
-
由 Jay Zhuang 提交于
Summary: A new file interface `SupportPrefetch()` is added. When the user overrides it to `false`, an internal prefetch buffer will be used for readahead. Useful for non-directIO but FS doesn't have readahead support. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7312 Reviewed By: anand1976 Differential Revision: D23329847 Pulled By: jay-zhuang fbshipit-source-id: 71cd4ce6f4a820840294e4e6aec111ab76175527
-
- 26 8月, 2020 1 次提交
-
-
由 sdong 提交于
Summary: Right now all I/O failures under PartitionIndexReader::CacheDependencies() is swallowed. This doesn't impact correctness but we've made a decision that any I/O error in read path now should be returned to users for awareness. Return errors in those cases instead. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7297 Test Plan: Add a new unit test that ingest errors in this code path and see Get() fails. Only one I/O path is hit in PartitionIndexReader::CacheDependencies(). Several option changes are attempt but not able to got other pread paths triggered. Not sure whether other failure cases would be even possible. Would rely on continuous stress test to validate it. Reviewed By: anand1976 Differential Revision: D23257950 fbshipit-source-id: 859dbc92fa239996e1bb378329344d3d54168c03
-
- 25 8月, 2020 1 次提交
-
-
由 mrambacher 提交于
Summary: More tests now pass. When in doubt, I added a TODO comment to check what should happen with an ignored error. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7305 Reviewed By: akankshamahajan15 Differential Revision: D23301262 Pulled By: ajkr fbshipit-source-id: 5f120edc7393560aefc0633250277bbc7e8de9e6
-
- 21 8月, 2020 1 次提交
-
-
由 mrambacher 提交于
Summary: This test uses database functionality and required more extensive work to get it to pass than the other tests. The DB functionality required for this test now passes the check. When it was unclear what the proper behavior was for unchecked status codes, a TODO was added. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7283 Reviewed By: akankshamahajan15 Differential Revision: D23251497 Pulled By: ajkr fbshipit-source-id: 52b79629bdafa0a58de8ead1d1d66f141b331523
-
- 13 8月, 2020 1 次提交
-
-
由 Levi Tamasi 提交于
Summary: The patch cleans up and refactors `CompressBlock` and `CompressBlockInternal` a bit. In particular, it does the following: * It renames `CompressBlockInternal` to `CompressData` and moves it to `util/compression.h`, where other general compression-related utilities are located. This will facilitate reuse in the BlobDB write path. * The signature of the method is changed so it now takes `compression_format_version` (similarly to the compression library specific methods) instead of `format_version` (which is specific to the block based table). * `GetCompressionFormatForVersion` no longer takes `compression_type` as a parameter. This parameter was only used in a (not entirely up-to-date) assertion; also, removing it eliminates the need to ensure this precondition holds at all call sites. * Does some minor cleanup in `CompressBlock`, for instance, it is now possible to pass only one of `sampled_output_fast` and `sampled_output_slow`. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7249 Test Plan: `make check` Reviewed By: riversand963 Differential Revision: D23087278 Pulled By: ltamasi fbshipit-source-id: e6316e45baed8b4e7de7c1780c90501c2a3439b3
-
- 11 8月, 2020 1 次提交
-
-
由 Yuhong Guo 提交于
Summary: 1. `std::random_shuffle` is deprecated and now we can use `std::shuffle` ``` /rocksdb/db/prefix_test.cc:590:12: error: 'random_shuffle<std::__1::__wrap_iter<unsigned long long *> >' is deprecated [-Werror,-Wdeprecated-declarations] std::random_shuffle(prefixes.begin(), prefixes.end()); ^ /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/algorithm:2982:1: note: 'random_shuffle<std::__1::__wrap_iter<unsigned long long *> >' has been explicitly marked deprecated here _LIBCPP_DEPRECATED_IN_CXX14 void ^ /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/__config:1107:39: note: expanded from macro '_LIBCPP_DEPRECATED_IN_CXX14' # define _LIBCPP_DEPRECATED_IN_CXX14 _LIBCPP_DEPRECATED ^ /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/__config:1090:48: note: expanded from macro '_LIBCPP_DEPRECATED' # define _LIBCPP_DEPRECATED __attribute__ ((deprecated)) ``` 2. `c_test` link error with `-DROCKSDB_BUILD_SHARED=OFF`: ``` [ 7%] Linking CXX executable c_test ld: library not found for -lrocksdb-shared clang: error: linker command failed with exit code 1 (use -v to see invocation) make[5]: *** [c_test] Error 1 make[4]: *** [CMakeFiles/c_test.dir/all] Error 2 make[4]: *** Waiting for unfinished jobs.... ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/7205 Reviewed By: ajkr Differential Revision: D23030641 Pulled By: pdillinger fbshipit-source-id: f270e50fc0b824ca1a0876ec5c65d33f55a72dd0
-
- 08 8月, 2020 1 次提交
-
-
由 anand76 提交于
Summary: Introduce io_timeout in ReadOptions and enabled deadline/io_timeout for Iterators. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7161 Test Plan: New unit tests in db_basic_test Reviewed By: riversand963 Differential Revision: D22687352 Pulled By: anand1976 fbshipit-source-id: 67bbb0e6d7ae80b256589244468494292538c6ec
-
- 06 8月, 2020 1 次提交
-
-
由 sdong 提交于
Summary: IteratorIterator::IsOutOfBound() and IteratorIterator::MayBeOutOfUpperBound() are two functions that related to upper bound check. It is hard for users to reason about this complexity. Consolidate the two functions into one and assign an enum as results to improve readability. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7200 Test Plan: Run all existing test. Would run crash test with atomic for a while. Reviewed By: anand1976 Differential Revision: D22833181 fbshipit-source-id: a0c724267056adbd0476bde74650e6c7226077e6
-
- 05 8月, 2020 1 次提交
-
-
由 sdong 提交于
Summary: https://github.com/facebook/rocksdb/pull/5289 introduces a performance regression that caused an upper bound check within every BlockBasedTableIterator::Next(). This is unnecessary if we've checked the boundary key for current block and it is within upper bound. Fix the bug. Also rename the boolean to a enum so that the code is slightly better readable. The original regression was probably to fix a bug that the block upper bound check status is not reset after a new block is created. Fix it bug so that the regression can be avoided without hitting the bug. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7209 Test Plan: Run all existing tests. Will run atomic black box crash test for a while. Reviewed By: anand1976 Differential Revision: D22859246 fbshipit-source-id: cbdad1f5e656c55fd8b71726d5a4f6cb53ff9140
-
- 04 8月, 2020 1 次提交
-
-
由 Andrew Kryczka 提交于
Summary: Previously, a `ReadOptions` object was stored in every `BlockBasedTableIterator` and every `LevelIterator`. This redundancy consumes extra memory, resulting in the `Arena` making more allocations, and iteration observing worse cache performance. This PR migrates callers of `NewInternalIterator()` and `MakeInputIterator()` to provide a `ReadOptions` object guaranteed to outlive the returned iterator. When the iterator's lifetime will be managed by the user, this lifetime guarantee is achieved by storing the `ReadOptions` value in `ArenaWrappedDBIter`. Then, sub-iterators of `NewInternalIterator()` and `MakeInputIterator()` can hold a reference-to-const `ReadOptions`. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7210 Test Plan: - `make check` under ASAN and valgrind - benchmark: on a DB with 2 L0 files and 3 L1+ levels, this PR reduced `Arena` allocation 4792 -> 4160 bytes. Reviewed By: anand1976 Differential Revision: D22861323 Pulled By: ajkr fbshipit-source-id: 54aebb3e89c872eeab0f5793b4b6e42878d093ce
-
- 30 7月, 2020 1 次提交
-
-
由 sdong 提交于
Summary: NextAndGetResult() is not implemented in memtable and is very simply implemented in level iterator. The result is that for a normal leveled iterator, performance regression will be observed for calling PrepareValue() for most iterator Next(). Mitigate the problem by implementing the function for both iterators. In level iterator, the implementation cannot be perfect as when calling file iterator's SeekToFirst() we don't have information about whether the value is prepared. Fortunately, the first key should not cause a big portion of the CPu. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7179 Test Plan: Run normal crash test for a while. Reviewed By: anand1976 Differential Revision: D22783840 fbshipit-source-id: c19f45cdf21b756190adef97a3b66ccde3936e05
-
- 23 7月, 2020 1 次提交
-
-
由 mrambacher 提交于
Summary: When paraoid_files_checks=true, a rolling key-value hash is generated and compared to what is written to the file. If the values do not match, the SST file is rejected. Code put in place for the check for both flush and compaction jobs. Corresponding test added to corruption_test. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7134 Reviewed By: cheng-chang Differential Revision: D22646149 fbshipit-source-id: 8fde1984a1a11edd3bd82a413acffc5ea7aa683f
-
- 21 7月, 2020 1 次提交
-
-
由 Andrew Kryczka 提交于
Summary: PR https://github.com/facebook/rocksdb/issues/6944 transitioned `BlockIter` from using `Comparator*` to using concrete `UserComparatorWrapper` and `InternalKeyComparator`. However, adding them as instance variables to `BlockIter` was not optimal. Bloating `BlockIter` caused the `ArenaWrappedDBIter`'s arena allocator to do more heap allocations (in certain cases) which harmed performance of `DB::NewIterator()`. This PR pushes down the concrete comparator objects to the point of usage, which forces them to be on the stack. As a result, the `BlockIter` is back to its original size prior to https://github.com/facebook/rocksdb/issues/6944 (actually a bit smaller since there were two `Comparator*` before). Pull Request resolved: https://github.com/facebook/rocksdb/pull/7149 Test Plan: verified our internal `DB::NewIterator()`-heavy regression test no longer reports regression. Reviewed By: riversand963 Differential Revision: D22623189 Pulled By: ajkr fbshipit-source-id: f6d69accfe5de51e0bd9874a480b32b29909bab6
-
- 10 7月, 2020 2 次提交
-
-
由 mrambacher 提交于
Summary: Cleans up some of the dependencies on test code in the Makefile while building tools: - Moves the test::RandomString, DBBaseTest::RandomString into Random - Moves the test::RandomHumanReadableString into Random - Moves the DestroyDir method into file_utils - Moves the SetupSyncPointsToMockDirectIO into sync_point. - Moves the FaultInjection Env and FS classes under env These changes allow all of the tools to build without dependencies on test_util, thereby simplifying the build dependencies. By moving the FaultInjection code, the dependency in db_stress on different libraries for debug vs release was eliminated. Tested both release and debug builds via Make and CMake for both static and shared libraries. More work remains to clean up how the tools are built and remove some unnecessary dependencies. There is also more work that should be done to get the Makefile and CMake to align in their builds -- what is in the libraries and the sizes of the executables are different. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7097 Reviewed By: riversand963 Differential Revision: D22463160 Pulled By: pdillinger fbshipit-source-id: e19462b53324ab3f0b7c72459dbc73165cc382b2
-
由 Andrew Kryczka 提交于
Summary: This is a followup to https://github.com/facebook/rocksdb/issues/6646. In that PR, for simplicity I just appended a comparison against the 0th restart key in case `BinarySeek()`'s binary search landed at index 0. As a result there were `2/(N+1) + log_2(N)` key comparisons. This PR does it differently. Now we expand the binary search range by one so it also covers the case where target is at or before the restart key at index 0. As a result, it involves `log_2(N+1)` key comparisons. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7068 Test Plan: ran readrandom with mostly default settings and counted key comparisons using `PerfContext`. before: `user_key_comparison_count = 28881965` after: `user_key_comparison_count = 27823245` setup command: ``` $ TEST_TMPDIR=/dev/shm/dbbench ./db_bench -benchmarks=fillrandom,compact -write_buffer_size=1048576 -target_file_size_base=1048576 -max_bytes_for_level_base=4194304 -max_background_jobs=12 -level_compaction_dynamic_level_bytes=true -num=10000000 ``` benchmark command: ``` $ TEST_TMPDIR=/dev/shm/dbbench/ ./db_bench -use_existing_db=true -benchmarks=readrandom -disable_auto_compactions=true -num=10000000 -compression_type=none -reads=1000000 -perf_level=3 ``` Reviewed By: anand1976 Differential Revision: D22357032 Pulled By: ajkr fbshipit-source-id: 8b01e9c1c2a4e9d02fc9dfe16c1cc0327f8bdf24
-
- 09 7月, 2020 3 次提交
-
-
由 Zitan Chen 提交于
Summary: Although PR https://github.com/facebook/rocksdb/issues/7032 fixes the construction of the `SstFileDumper` in `GetFileDbIdentities` by setting a proper `Env` of the `Options` passed in the constructor, the file path was not corrected accordingly. This actually disables backup engine to use db session ids in the file names since the `db_session_id` is always empty. Now it is fixed by setting the correct path in the construction of `SstFileDumper`. Furthermore, to preserve the Direct IO property that backup engine already has, parameter `EnvOptions` is added to `GetFileDbIdentities` and `SstFileDumper`. The `BackupUsingDirectIO` test is updated accordingly. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7104 Test Plan: backupable_db_test and some manual tests. Reviewed By: ajkr Differential Revision: D22443245 Pulled By: gg814 fbshipit-source-id: 056a9bb8b82947c5e73d7c3fbb62bfe23af5e562
-
由 Akanksha Mahajan 提交于
Update Flush policy in PartitionedIndexBuilder on switching from user-key to internal-key mode (#7096) Summary: When format_version is high enough to support user-key and there are index entries for same user key that spans multiple data blocks then it changes from user-key mode to internal-key mode. But the flush policy is not reset to point to Block Builder of internal-keys. After this switch, no entries are added to user key index partition result, thus it never triggers flushing the block. Fix: 1. After adding the entry in sub_builder_index_, if there is a switch from user-key to internal-key, then flush policy is updated to point to Block Builder of internal-keys index partition. 2. Set sub_builder_index_->seperator_is_key_plus_seq_ = true if seperator_is_key_plus_seq_ is set to true so that subsequent partitions can also use internal key mode. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7096 Test Plan: make check -j64 Reviewed By: ajkr Differential Revision: D22416598 Pulled By: akankshamahajan15 fbshipit-source-id: 01fc2dc07ea1b32f8fb803995ebe6e9a3fbe67ac
-
由 rockeet 提交于
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/7080 Reviewed By: riversand963 Differential Revision: D22412352 Pulled By: ajkr fbshipit-source-id: 1d7f4c1621040a0130245139b52c3f4d3deac865
-
- 08 7月, 2020 1 次提交
-
-
由 Andrew Kryczka 提交于
Summary: Replace `BlockIter::comparator_` and `IndexBlockIter::user_comparator_wrapper_` with a concrete `UserComparatorWrapper` and `InternalKeyComparator`. The motivation for this change was the inconvenience of not knowing the concrete type of `BlockIter::comparator_`, which prevented calling specialized internal key comparison functions to optimize comparison of keys with global seqno applied. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6944 Test Plan: benchmark setup -- single file DBs, in-memory, no compression. "normal_db" created by regular flush; "ingestion_db" created by ingesting a file. Both DBs have same contents. ``` $ TEST_TMPDIR=/dev/shm/normal_db/ ./db_bench -benchmarks=fillrandom,compact -write_buffer_size=10485760000 -disable_auto_compactions=true -compression_type=none -num=1000000 $ ./ldb write_extern_sst ./tmp.sst --db=/dev/shm/ingestion_db/dbbench/ --compression_type=no --hex --create_if_missing < <(./sst_dump --command=scan --output_hex --file=/dev/shm/normal_db/dbbench/000007.sst | awk 'began {print "0x" substr($1, 2, length($1) - 2), "==>", "0x" $5} ; /^Sst file format: block-based/ {began=1}') $ ./ldb ingest_extern_sst ./tmp.sst --db=/dev/shm/ingestion_db/dbbench/ ``` benchmark run command: ``` $ TEST_TMPDIR=/dev/shm/$DB/ ./db_bench -benchmarks=seekrandom -seek_nexts=$SEEK_NEXT -use_existing_db=true -cache_index_and_filter_blocks=false -num=1000000 -cache_size=0 -threads=1 -reads=200000000 -mmap_read=1 -verify_checksum=false ``` results: perf improved marginally for ingestion_db and did not change significantly for normal_db: SEEK_NEXT | DB | code | ops/sec | % change -- | -- | -- | -- | -- 0 | normal_db | master | 350880 | 0 | normal_db | PR6944 | 351040 | 0.0 0 | ingestion_db | master | 343255 | 0 | ingestion_db | PR6944 | 349424 | 1.8 10 | normal_db | master | 218711 | 10 | normal_db | PR6944 | 217892 | -0.4 10 | ingestion_db | master | 220334 | 10 | ingestion_db | PR6944 | 226437 | 2.8 Reviewed By: pdillinger Differential Revision: D21924676 Pulled By: ajkr fbshipit-source-id: ea4288a2eefa8112eb6c651a671c1de18c12e538
-
- 03 7月, 2020 1 次提交
-
-
由 Peter Dillinger 提交于
Summary: Even though local bisection gave me a clear signal (and still does) that reverting https://github.com/facebook/rocksdb/issues/7049 would fix the failures in MultiThreadedDBTest, https://github.com/facebook/rocksdb/issues/7022 seems to be the root cause. Reverting https://github.com/facebook/rocksdb/issues/7022 and keeping https://github.com/facebook/rocksdb/issues/7049 seems to fix the issue in local reproducer also. (Had these landed in opposite order, bisection would have found the root cause.) Pull Request resolved: https://github.com/facebook/rocksdb/pull/7071 Reviewed By: akankshamahajan15 Differential Revision: D22362857 Pulled By: pdillinger fbshipit-source-id: ed63df3d74e9d4ce1604de8fe43b216166c7a3f0
-
- 02 7月, 2020 1 次提交
-
-
由 Akanksha Mahajan 提交于
Update Flush policy in PartitionedIndexBuilder on switching from user-key to internal-key mode (#7022) Summary: When format_version is high enough to support user-key and there are index entries for same user key that spans multiple data blocks then it changes from user-key mode to internal-key mode. But the flush policy is not reset to point to Block Builder of internal-keys. After this switch, no entries are added to user key index partition result, thus it never triggers flushing the block. Fix: After adding the entry in sub_builder_index_, if there is a switch from user-key to internal-key, then flush policy is updated to point to Block Builder of internal-keys index partition. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7022 Test Plan: 1. make check -j64 2. Added one unit test case Reviewed By: ajkr Differential Revision: D22197734 Pulled By: akankshamahajan15 fbshipit-source-id: d87e9e46bccab8e896ee6979d6b79c51f73d479e
-
- 01 7月, 2020 1 次提交
-
-
由 Andrew Kryczka 提交于
Summary: With mmap enabled on an uncompressed file, we were previously always doing a heap allocation to obtain the scratch buffer for `RandomAccessFileReader::Read()`. However, that allocation was unnecessary as the underlying file reader returned a pointer into its mapped memory, not the provided scratch buffer. This PR makes passes the `BlockFetcher`'s inline buffer as the scratch buffer if the data block is small enough (less than `kDefaultStackBufferSize` bytes, currently 5000). Ideally we would not pass a scratch buffer at all for an mmap read; however, the `RandomAccessFile::Read()` API guarantees such a buffer is provided, and non-standard implementations may be relying on it even when `Options::allow_mmap_reads == true`. In that case, this PR still works but introduces an extra copy from the inline buffer to a heap buffer. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7043 Reviewed By: cheng-chang Differential Revision: D22320606 Pulled By: ajkr fbshipit-source-id: ad964dd23df34e07d979c6032c2dfe5454c98b52
-
- 30 6月, 2020 1 次提交
-
-
由 Anand Ananthabhotla 提交于
Summary: Current implementation of the ```read_options.deadline``` option only checks the deadline for random file reads during point lookups. This PR extends the checks to file opens, prefetches and preloads as part of table open. The main changes are in the ```BlockBasedTable```, partitioned index and filter readers, and ```TableCache``` to take ReadOptions as an additional parameter. In ```BlockBasedTable::Open```, in order to retain existing behavior w.r.t checksum verification and block cache usage, we filter out most of the options in ```ReadOptions``` except ```deadline```. However, having the ```ReadOptions``` gives us more flexibility to honor other options like verify_checksums, fill_cache etc. in the future. Additional changes in callsites due to function signature changes in ```NewTableReader()``` and ```FilePrefetchBuffer```. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6982 Test Plan: Add new unit tests in db_basic_test Reviewed By: riversand963 Differential Revision: D22219515 Pulled By: anand1976 fbshipit-source-id: 8a3b92f4a889808013838603aa3ca35229cd501b
-
- 27 6月, 2020 1 次提交
-
-
由 sdong 提交于
Summary: We are still keeping unity build working. So it's a good idea to add to a pre-commit CI. A latest GCC docker image just to get a little bit more coverage. Fix three small issues to make it pass. Also make unity_test to run db_basic_test rather than db_test to cut the test time. There is no point to run expensive tests here. It was set to run db_test before db_basic_test was separated out. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7026 Test Plan: watch tests to pass. Reviewed By: zhichao-cao Differential Revision: D22223197 fbshipit-source-id: baa3b6cbb623bf359829b63ce35715c75bcb0ed4
-
- 25 6月, 2020 2 次提交
-
-
由 Zitan Chen 提交于
Add a new option for BackupEngine to store table files under shared_checksum using DB session id in the backup filenames (#6997) Summary: `BackupableDBOptions::new_naming_for_backup_files` is added. This option is false by default. When it is true, backup table filenames under directory shared_checksum are of the form `<file_number>_<crc32c>_<db_session_id>.sst`. Note that when this option is true, it comes into effect only when both `share_files_with_checksum` and `share_table_files` are true. Three new test cases are added. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6997 Test Plan: Passed make check. Reviewed By: ajkr Differential Revision: D22098895 Pulled By: gg814 fbshipit-source-id: a1d9145e7fe562d71cde7ac995e17cb24fd42e76
-
由 sdong 提交于
Summary: It's useful to build RocksDB using a more recent clang version in CI. Add a CircleCI build and fix some issues with it. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7025 Test Plan: See all tests pass. Reviewed By: pdillinger Differential Revision: D22215700 fbshipit-source-id: 914a729c2cd3f3ac4a627cc0ac58d4691dca2168
-
- 23 6月, 2020 1 次提交
-
-
由 Peter Dillinger 提交于
Summary: New experimental option BBTO::optimize_filters_for_memory builds filters that maximize their use of "usable size" from malloc_usable_size, which is also used to compute block cache charges. Rather than always "rounding up," we track state in the BloomFilterPolicy object to mix essentially "rounding down" and "rounding up" so that the average FP rate of all generated filters is the same as without the option. (YMMV as heavily accessed filters might be unluckily lower accuracy.) Thus, the option near-minimizes what the block cache considers as "memory used" for a given target Bloom filter false positive rate and Bloom filter implementation. There are no forward or backward compatibility issues with this change, though it only works on the format_version=5 Bloom filter. With Jemalloc, we see about 10% reduction in memory footprint (and block cache charge) for Bloom filters, but 1-2% increase in storage footprint, due to encoding efficiency losses (FP rate is non-linear with bits/key). Why not weighted random round up/down rather than state tracking? By only requiring malloc_usable_size, we don't actually know what the next larger and next smaller usable sizes for the allocator are. We pick a requested size, accept and use whatever usable size it has, and use the difference to inform our next choice. This allows us to narrow in on the right balance without tracking/predicting usable sizes. Why not weight history of generated filter false positive rates by number of keys? This could lead to excess skew in small filters after generating a large filter. Results from filter_bench with jemalloc (irrelevant details omitted): (normal keys/filter, but high variance) $ ./filter_bench -quick -impl=2 -average_keys_per_filter=30000 -vary_key_count_ratio=0.9 Build avg ns/key: 29.6278 Number of filters: 5516 Total size (MB): 200.046 Reported total allocated memory (MB): 220.597 Reported internal fragmentation: 10.2732% Bits/key stored: 10.0097 Average FP rate %: 0.965228 $ ./filter_bench -quick -impl=2 -average_keys_per_filter=30000 -vary_key_count_ratio=0.9 -optimize_filters_for_memory Build avg ns/key: 30.5104 Number of filters: 5464 Total size (MB): 200.015 Reported total allocated memory (MB): 200.322 Reported internal fragmentation: 0.153709% Bits/key stored: 10.1011 Average FP rate %: 0.966313 (very few keys / filter, optimization not as effective due to ~59 byte internal fragmentation in blocked Bloom filter representation) $ ./filter_bench -quick -impl=2 -average_keys_per_filter=1000 -vary_key_count_ratio=0.9 Build avg ns/key: 29.5649 Number of filters: 162950 Total size (MB): 200.001 Reported total allocated memory (MB): 224.624 Reported internal fragmentation: 12.3117% Bits/key stored: 10.2951 Average FP rate %: 0.821534 $ ./filter_bench -quick -impl=2 -average_keys_per_filter=1000 -vary_key_count_ratio=0.9 -optimize_filters_for_memory Build avg ns/key: 31.8057 Number of filters: 159849 Total size (MB): 200 Reported total allocated memory (MB): 208.846 Reported internal fragmentation: 4.42297% Bits/key stored: 10.4948 Average FP rate %: 0.811006 (high keys/filter) $ ./filter_bench -quick -impl=2 -average_keys_per_filter=1000000 -vary_key_count_ratio=0.9 Build avg ns/key: 29.7017 Number of filters: 164 Total size (MB): 200.352 Reported total allocated memory (MB): 221.5 Reported internal fragmentation: 10.5552% Bits/key stored: 10.0003 Average FP rate %: 0.969358 $ ./filter_bench -quick -impl=2 -average_keys_per_filter=1000000 -vary_key_count_ratio=0.9 -optimize_filters_for_memory Build avg ns/key: 30.7131 Number of filters: 160 Total size (MB): 200.928 Reported total allocated memory (MB): 200.938 Reported internal fragmentation: 0.00448054% Bits/key stored: 10.1852 Average FP rate %: 0.963387 And from db_bench (block cache) with jemalloc: $ ./db_bench -db=/dev/shm/dbbench.no_optimize -benchmarks=fillrandom -format_version=5 -value_size=90 -bloom_bits=10 -num=2000000 -threads=8 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=false $ ./db_bench -db=/dev/shm/dbbench -benchmarks=fillrandom -format_version=5 -value_size=90 -bloom_bits=10 -num=2000000 -threads=8 -optimize_filters_for_memory -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=false $ (for FILE in /dev/shm/dbbench.no_optimize/*.sst; do ./sst_dump --file=$FILE --show_properties | grep 'filter block' ; done) | awk '{ t += $4; } END { print t; }' 17063835 $ (for FILE in /dev/shm/dbbench/*.sst; do ./sst_dump --file=$FILE --show_properties | grep 'filter block' ; done) | awk '{ t += $4; } END { print t; }' 17430747 $ #^ 2.1% additional filter storage $ ./db_bench -db=/dev/shm/dbbench.no_optimize -use_existing_db -benchmarks=readrandom,stats -statistics -bloom_bits=10 -num=2000000 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=false -duration=10 -cache_index_and_filter_blocks -cache_size=1000000000 rocksdb.block.cache.index.add COUNT : 33 rocksdb.block.cache.index.bytes.insert COUNT : 8440400 rocksdb.block.cache.filter.add COUNT : 33 rocksdb.block.cache.filter.bytes.insert COUNT : 21087528 rocksdb.bloom.filter.useful COUNT : 4963889 rocksdb.bloom.filter.full.positive COUNT : 1214081 rocksdb.bloom.filter.full.true.positive COUNT : 1161999 $ #^ 1.04 % observed FP rate $ ./db_bench -db=/dev/shm/dbbench -use_existing_db -benchmarks=readrandom,stats -statistics -bloom_bits=10 -num=2000000 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=false -optimize_filters_for_memory -duration=10 -cache_index_and_filter_blocks -cache_size=1000000000 rocksdb.block.cache.index.add COUNT : 33 rocksdb.block.cache.index.bytes.insert COUNT : 8448592 rocksdb.block.cache.filter.add COUNT : 33 rocksdb.block.cache.filter.bytes.insert COUNT : 18220328 rocksdb.bloom.filter.useful COUNT : 5360933 rocksdb.bloom.filter.full.positive COUNT : 1321315 rocksdb.bloom.filter.full.true.positive COUNT : 1262999 $ #^ 1.08 % observed FP rate, 13.6% less memory usage for filters (Due to specific key density, this example tends to generate filters that are "worse than average" for internal fragmentation. "Better than average" cases can show little or no improvement.) Pull Request resolved: https://github.com/facebook/rocksdb/pull/6427 Test Plan: unit test added, 'make check' with gcc, clang and valgrind Reviewed By: siying Differential Revision: D22124374 Pulled By: pdillinger fbshipit-source-id: f3e3aa152f9043ddf4fae25799e76341d0d8714e
-
- 20 6月, 2020 1 次提交
-
-
由 Peter Dillinger 提交于
Summary: Although RocksDB falls over in various other ways with KVs around 4GB or more, this change fixes how XXH32 and XXH64 were being called by the block checksum code to support >= 4GB in case that should ever happen, or the code copied for other uses. This change is not a schema compatibility issue because the checksum verification code would checksum the first (block_size + 1) mod 2^32 bytes while the checksum construction code would checksum the first block_size mod 2^32 plus the compression type byte, meaning the XXH32/64 checksums for >=4GB block would not match about 255/256 times. While touching this code, I refactored to consolidate redundant implementations, improving diagnostics and performance tracking in some cases. Also used less confusing language in those diagnostics. Makes https://github.com/facebook/rocksdb/issues/6875 obsolete. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6978 Test Plan: I was able to write a test for this using an SST file writer and VerifyChecksum in a reader. The test fails before the fix, though I'm leaving the test disabled because I don't think it's worth the expense of running regularly. Reviewed By: gg814 Differential Revision: D22143260 Pulled By: pdillinger fbshipit-source-id: 982993d16134e8c50bea2269047f901c1783726e
-
- 18 6月, 2020 2 次提交
-
-
由 sdong 提交于
Summary: Compressed block cache is disabled in https://github.com/facebook/rocksdb/pull/4650 for no good reason. Re-enable it. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6990 Test Plan: Add a unit test to make sure a general function works with read-only DB + compressed block cache. Reviewed By: ltamasi Differential Revision: D22072755 fbshipit-source-id: 2a55df6363de23a78979cf6c747526359e5dc7a1
-
由 Zitan Chen 提交于
Summary: `db_id` and `db_session_id` are now part of the table properties for all formats and stored in SST files. This adds about 99 bytes to each new SST file. The `TablePropertiesNames` for these two identifiers are `rocksdb.creating.db.identity` and `rocksdb.creating.session.identity`. In addition, SST files generated from SstFileWriter and Repairer have DB identity “SST Writer” and “DB Repairer”, respectively. Their DB session IDs are generated in the same way as `DB::GetDbSessionId`. A table property test is added. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6983 Test Plan: make check and some manual tests. Reviewed By: zhichao-cao Differential Revision: D22048826 Pulled By: gg814 fbshipit-source-id: afdf8c11424a6f509b5c0b06dafad584a80103c9
-
- 16 6月, 2020 1 次提交
-
-
由 Levi Tamasi 提交于
Summary: When using parameterized tests, `gtest` sometimes prints the test parameters. If no other printing method is available, it essentially produces a hex dump of the object. This can cause issues with valgrind with types like `TestArgs` in `table_test`, where the object layout has gaps (with uninitialized contents) due to the members' alignment requirements. The patch fixes the uninitialized reads by providing an `operator<<` for `TestArgs` and also makes sure all members are initialized (in a consistent order) on all code paths. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6980 Test Plan: `valgrind --leak-check=full ./table_test` Reviewed By: siying Differential Revision: D22045536 Pulled By: ltamasi fbshipit-source-id: 6f5920ac28c712d0aa88162fffb80172ed769c32
-
- 14 6月, 2020 1 次提交
-
-
由 Zhen Li 提交于
Summary: Persistent cache feature caused rocks db crash on windows. I posted a issue for it, https://github.com/facebook/rocksdb/issues/6919. I found this is because no "persistent_cache_key_prefix" is generated for persistent cache. Looking repo history, "GetUniqueIdFromFile" is not implemented on Windows. So my fix is adding "NewId()" function in "persistent_cache" and using it to generate prefix for persistent cache. In this PR, i also re-enable related test cases defined in "db_test2" and "persistent_cache_test" for windows. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6932 Test Plan: 1. run related test cases in "db_test2" and "persistent_cache_test" on windows and see it passed. 2. manually run db_bench.exe with "read_cache_path" and verified. Reviewed By: riversand963 Differential Revision: D21911608 Pulled By: cheng-chang fbshipit-source-id: cdfd938d54a385edbb2836b13aaa1d39b0a6f1c2
-
- 13 6月, 2020 1 次提交
-
-
由 Levi Tamasi 提交于
Summary: `HarnessTest` in `table_test.cc` currently tests many parameter combinations sequentially in a loop. This is problematic from a testing perspective, since if the test fails, we have no way of knowing how many/which combinations have failed. It can also cause timeouts on our test system due to the sheer number of combinations tested. (Specifically, the parallel compression threads parameter added by https://github.com/facebook/rocksdb/pull/6262 seems to have been the last straw.) There is some DIY code there that splits the load among eight test cases but that does not appear to be sufficient anymore. Instead, the patch turns `HarnessTest` into a parameterized test, so all the parameter combinations can be tested separately and potentially concurrently. It also cleans up the tests a little, fixes `RandomizedLongDB`, which did not get updated when the parallel compression threads parameter was added, and turns `FooterTests` into a standalone test case (since it does not actually need a fixture class). Pull Request resolved: https://github.com/facebook/rocksdb/pull/6974 Test Plan: `make check` Reviewed By: siying Differential Revision: D22029572 Pulled By: ltamasi fbshipit-source-id: 51baea670771c33928f2eb3902bd69dcf540aa41
-
- 11 6月, 2020 1 次提交
-
-
由 Andrew Kryczka 提交于
Summary: This saves up to two key comparisons in block seeks. The first key comparison saved is a redundant key comparison against the restart key where the linear scan starts. This comparison is saved in all cases except when the found key is in the first restart interval. The second key comparison saved is a redundant key comparison against the restart key where the linear scan ends. This is only saved in cases where all keys in the restart interval are less than the target (probability roughly `1/restart_interval`). Pull Request resolved: https://github.com/facebook/rocksdb/pull/6646 Test Plan: ran a benchmark with mostly default settings and counted key comparisons before: `user_key_comparison_count = 19399529` after: `user_key_comparison_count = 18431498` setup command: ``` $ TEST_TMPDIR=/dev/shm/dbbench ./db_bench -benchmarks=fillrandom,compact -write_buffer_size=1048576 -target_file_size_base=1048576 -max_bytes_for_level_base=4194304 -max_background_jobs=12 -level_compaction_dynamic_level_bytes=true -num=10000000 ``` benchmark command: ``` $ TEST_TMPDIR=/dev/shm/dbbench/ ./db_bench -use_existing_db=true -benchmarks=readrandom -disable_auto_compactions=true -num=10000000 -compression_type=none -reads=1000000 -perf_level=3 ``` Reviewed By: pdillinger Differential Revision: D20849707 Pulled By: ajkr fbshipit-source-id: 1f01c5cd99ea771fd27974046e37b194f1cdcfac
-
- 10 6月, 2020 1 次提交
-
-
由 Andrew Kryczka 提交于
Summary: Memory pinned by `pin_l0_filter_and_index_blocks_in_cache` needs to be predictable based on user config. This PR makes sure we do not pin extra memory for large files generated by intra-L0 (see https://github.com/facebook/rocksdb/issues/6889). Pull Request resolved: https://github.com/facebook/rocksdb/pull/6911 Test Plan: unit test Reviewed By: siying Differential Revision: D21835818 Pulled By: ajkr fbshipit-source-id: a11a088549d06bed8aacc2548d266e5983f0ead4
-