- 12 Aug, 2021 3 commits
-
-
Committed by Baptiste Lemaire
Summary: Current internal regression tests pass an old option flag, `experimental_allow_mempurge`, to a more recently built db. This flag was retired and removed in a recent PR (https://github.com/facebook/rocksdb/issues/8628), so the following error comes up: `Failed: Invalid argument: Could not find option: : experimental_allow_mempurge`. In this PR, I reintroduce the two flags retired in https://github.com/facebook/rocksdb/issues/8628, `experimental_allow_mempurge` and `experimental_mempurge_policy`, in `db_options.cc` and mark them both as `kDeprecated`. This is a temporary fix to buy us time to find a long-term solution, which hopefully will consist of ignoring options prefixed with `experimental_` that are no longer recognized. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8650 Reviewed By: pdillinger Differential Revision: D30257307 Pulled By: bjlemaire fbshipit-source-id: 35303655fd2dd9789fd9e3c450e9d8009f3c1f54
-
Committed by Peter Dillinger
Summary: The last few releases overlooked adding to this test. This change fixes that. This change also fixes the problem of older branches not understanding ROCKSDB_NO_FBCODE and referencing compilers no longer supported. During the test, build_detect_platform is patched to force no FBCODE compiler usage. (We should not need to update old branches perpetually.) Pull Request resolved: https://github.com/facebook/rocksdb/pull/8651 Test Plan: local run reproduces regression described in https://github.com/facebook/rocksdb/issues/8650 Reviewed By: jay-zhuang, zhichao-cao Differential Revision: D30261872 Pulled By: pdillinger fbshipit-source-id: 02b447d224d7e0eb8613c63185437ded146713bc
-
Committed by Jay Zhuang
Summary: Add a comment for `options.allow_fallocate` noting that on btrfs preallocated space is not freed, along with a suggestion to disable preallocation. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8646 Test Plan: No code change Reviewed By: ajkr Differential Revision: D30240050 Pulled By: jay-zhuang fbshipit-source-id: 75b7190bc8276ce8d8ac2d0cb9064b386cbf4768
-
- 11 Aug, 2021 2 commits
-
-
Committed by Baptiste Lemaire
Summary: Changes the API of the MemPurge process: the `bool experimental_allow_mempurge` and `experimental_mempurge_policy` flags have been replaced by a `double experimental_mempurge_threshold` option. This change of API reflects another major change introduced in this PR: the MemPurgeDecider() function now works by sampling the memtables being flushed to estimate the overall amount of useful payload (payload minus the garbage), and then comparing this useful payload estimate with the `double experimental_mempurge_threshold` value. Therefore, when the value of this flag is `0.0` (default value), mempurge is simply deactivated. On the other hand, a value of `DBL_MAX` would be equivalent to always going through a mempurge regardless of the garbage ratio estimate. At the moment, a `double experimental_mempurge_threshold` value other than 0.0 or `DBL_MAX` is only supported with the `SkipList` memtable representation. Regarding the sampling, this PR includes the introduction of a `MemTable::UniqueRandomSample` function that collects (approximately) random entries from the memtable by using the new `SkipList::Iterator::RandomSeek()` under the hood, or by iterating through each memtable entry, depending on the target sample size and the total number of entries. The unit tests have been adapted to support this new API. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8628 Reviewed By: pdillinger Differential Revision: D30149315 Pulled By: bjlemaire fbshipit-source-id: 1feef5390c95db6f4480ab4434716533d3947f27
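The threshold semantics described in this commit can be illustrated with a minimal sketch. It is not the actual `MemPurgeDecider()` code: `SampledEntry` and `ShouldMemPurge` are hypothetical names, and the sketch assumes the decision compares the useful fraction of the sampled bytes against the threshold, which the commit message does not spell out.

```cpp
#include <cassert>
#include <cfloat>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one sampled memtable entry.
struct SampledEntry {
  std::size_t bytes;  // size of the entry
  bool is_garbage;    // assumed known here (e.g. overwritten or deleted)
};

// Returns true if the flush should go through a mempurge.
// Assumption: the estimate is the useful fraction of the sampled bytes,
// compared directly against the threshold.
bool ShouldMemPurge(const std::vector<SampledEntry>& sample, double threshold) {
  if (threshold <= 0.0) return false;     // 0.0 (default) deactivates mempurge
  if (threshold == DBL_MAX) return true;  // always mempurge
  std::size_t total = 0;
  std::size_t useful = 0;
  for (const SampledEntry& e : sample) {
    total += e.bytes;
    if (!e.is_garbage) useful += e.bytes;
  }
  if (total == 0) return false;  // nothing sampled, nothing to decide on
  const double useful_ratio =
      static_cast<double>(useful) / static_cast<double>(total);
  return useful_ratio < threshold;
}
```

With a sample that is 25% useful payload, a threshold of `0.5` would select mempurge in this sketch, while `0.1` would fall back to a regular flush.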
-
Committed by Levi Tamasi
Summary: The patch attempts to deflake `DBTestXactLogIterator.TransactionLogIteratorCorruptedLog` by disabling file deletions while retrieving the list of WAL files and truncating the first WAL file. This is to prevent the `PurgeObsoleteFiles` call triggered by `GetSortedWalFiles` from invalidating the result of `GetSortedWalFiles`. The patch also cleans up the test case a bit and changes it to use `test::TruncateFile` instead of calling the `truncate` syscall directly. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8627 Test Plan: `make check` Reviewed By: akankshamahajan15 Differential Revision: D30147002 Pulled By: ltamasi fbshipit-source-id: db11072a4ad8900a2f859cb5294e22b1888c23f6
-
- 10 Aug, 2021 4 commits
-
-
Committed by Andrew Kryczka
Summary: `GenericRateLimiter` slow path handles requests that cannot be satisfied immediately. Such requests enter a queue, and their thread stays in `Request()` until they are granted or the rate limiter is stopped. These threads are responsible for unblocking themselves. The work to do so is split into two main duties. (1) Waiting for the next refill time. (2) Refilling the bytes and granting requests. Prior to this PR, the slow path logic involved a leader election algorithm to pick one thread to perform (1) followed by (2). It elected the thread whose request was at the front of the highest priority non-empty queue since that request was most likely to be granted. This algorithm was efficient in terms of reducing intermediate wakeups, which is a thread waking up only to resume waiting after finding its request is not granted. However, the conceptual complexity of this algorithm was too high. It took me a long time to draw a timeline to understand how it works for just one edge case yet there were so many. This PR drops the leader election to reduce conceptual complexity. Now, the two duties can be performed by whichever thread acquires the lock first. The risk of this change is increasing the number of intermediate wakeups, however, we took steps to mitigate that. - `wait_until_refill_pending_` flag ensures only one thread performs (1). This prevents the thundering herd problem at the next refill time. The remaining threads wait on their condition variable with an unbounded duration -- thus we must remember to notify them to ensure forward progress. - (1) is typically done by a thread at the front of a queue. This is trivial when the queues are initially empty as the first choice that arrives must be the only entry in its queue. When queues are initially non-empty, we achieve this by having (2) notify a thread at the front of a queue (preferring higher priority) to perform the next duty. - We do not require any additional wakeup for (2). Typically it will just be done by the thread that finished (1). Combined, the second and third bullet points above suggest the refill/granting will typically be done by a request at the front of its queue. This is important because one wakeup is saved when a granted request happens to be in an already running thread. Note there are a few cases that still lead to intermediate wakeup, however. The first two are existing issues that also apply to the old algorithm, however, the third (including both subpoints) is new. - No request may be granted (only possible when rate limit dynamically decreases). - Requests from a different queue may be granted. - (2) may be run by a non-front request thread causing it to not be granted even if some requests in that same queue are granted. It can happen for a couple (unlikely) reasons. - A new request may sneak in and grab the lock at the refill time, before the thread finishing (1) can wake up and grab it. - A new request may sneak in and grab the lock and execute (1) before (2)'s chosen candidate can wake up and grab the lock. Then that non-front request thread performing (1) can carry over to perform (2). Pull Request resolved: https://github.com/facebook/rocksdb/pull/8602 Test Plan: - Use existing tests. The edge cases listed in the comment are all performance related; I could not really think of any related to correctness. The logic looks the same whether a thread wakes up/finishes its work early/on-time/late, or whether the thread is chosen vs. "steals" the work. 
- Verified write throughput and CPU overhead are basically the same with and without this change, even in a rate limiter heavy workload: Test command: ``` $ rm -rf /dev/shm/dbbench/ && TEST_TMPDIR=/dev/shm /usr/bin/time ./db_bench -benchmarks=fillrandom -num_multi_db=64 -num_low_pri_threads=64 -num_high_pri_threads=64 -write_buffer_size=262144 -target_file_size_base=262144 -max_bytes_for_level_base=1048576 -rate_limiter_bytes_per_sec=16777216 -key_size=24 -value_size=1000 -num=10000 -compression_type=none -rate_limiter_refill_period_us=1000 ``` Results before this PR: ``` fillrandom : 108.463 micros/op 9219 ops/sec; 9.0 MB/s 7.40user 8.84system 1:26.20elapsed 18%CPU (0avgtext+0avgdata 256140maxresident)k ``` Results after this PR: ``` fillrandom : 108.108 micros/op 9250 ops/sec; 9.0 MB/s 7.45user 8.23system 1:26.68elapsed 18%CPU (0avgtext+0avgdata 255688maxresident)k ``` Reviewed By: hx235 Differential Revision: D30048013 Pulled By: ajkr fbshipit-source-id: 6741bba9d9dfbccab359806d725105817fef818b
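Duty (2) in the rate limiter commit above, refilling the bytes and then granting queued requests front-first with a preference for higher priority, can be pictured with a toy single-threaded model. This is a sketch under strong assumptions and not the `GenericRateLimiter` code: `RefillModel` and `PendingRequest` are hypothetical names, there is no locking or condition-variable signalling, and the rule of stopping at the first request that does not fit is a simplification chosen for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <initializer_list>
#include <vector>

// Hypothetical queued request: an id and the number of bytes it asks for.
struct PendingRequest {
  int id;
  std::int64_t bytes;
};

// Toy, single-threaded model of "refill, then grant from the queue fronts".
class RefillModel {
 public:
  explicit RefillModel(std::int64_t refill_bytes)
      : refill_bytes_(refill_bytes), available_(0) {}

  void Enqueue(bool high_pri, PendingRequest r) {
    (high_pri ? high_ : low_).push_back(r);
  }

  // Resets the byte budget, then grants requests from the front of the
  // high priority queue first and the low priority queue second.
  // Returns the ids granted in this round.
  std::vector<int> RefillAndGrant() {
    available_ = refill_bytes_;
    std::vector<int> granted;
    for (std::deque<PendingRequest>* q : {&high_, &low_}) {
      while (!q->empty() && q->front().bytes <= available_) {
        available_ -= q->front().bytes;
        granted.push_back(q->front().id);
        q->pop_front();
      }
      // Assumed policy: if a front request does not fit, stop this round
      // so lower priority requests cannot overtake it.
      if (!q->empty()) break;
    }
    return granted;
  }

 private:
  std::int64_t refill_bytes_;
  std::int64_t available_;
  std::deque<PendingRequest> high_;
  std::deque<PendingRequest> low_;
};
```

In the real limiter each granted request belongs to a waiting thread that must be notified; the model only shows the order in which a single refill round would hand out bytes.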
-
Committed by Lucian Grijincu
Summary: UBSAN revealed a pointer underflow when `LZ4HC_init_internal` is called with a null `start`. Reviewed By: ajkr Differential Revision: D30181874 fbshipit-source-id: ca9bbac1a85c58782871d7f153af733b000cc66c
-
Committed by Jay Zhuang
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/8631 Reviewed By: siying Differential Revision: D30200385 Pulled By: jay-zhuang fbshipit-source-id: 0fa2bb15e74ff81762d767f234078e0fe0106c55
-
Committed by sdong
Summary: Some FIFO users want to keep the data for longer, but the old data is rarely accessed. This feature allows users to configure FIFO compaction so that data older than a threshold is moved to a warm storage tier. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8310 Test Plan: Add several unit tests. Reviewed By: ajkr Differential Revision: D28493792 fbshipit-source-id: c14824ea634814dee5278b449ab5c98b6e0b5501
-
- 08 Aug, 2021 1 commit
-
-
Committed by Akanksha Mahajan
Summary: FaultInjectionTestFS injects errors into the Rename operation. When an injected error makes the rename fail, info.log cannot be created and info_log is set to nullptr, which leads to this assertion. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8632 Test Plan: run the db_stress job locally Reviewed By: ajkr Differential Revision: D30167387 Pulled By: akankshamahajan15 fbshipit-source-id: 8d08c4c33e8f0cabd368bbb498d21b9de0660067
-
- 07 Aug, 2021 7 commits
-
-
Committed by Roy Crihfield
Summary: * `rocksdb_optimistictransactiondb_checkpoint_object_create` * `rocksdb_optimistictransactiondb_write` Pull Request resolved: https://github.com/facebook/rocksdb/pull/8526 Reviewed By: ajkr Differential Revision: D30076822 Pulled By: jay-zhuang fbshipit-source-id: a59956a8d5449e75d39a8087fbb2bad148cf697d
-
Committed by Andrew Kryczka
Summary: This draining mechanism should not be run during `JoinThreads()` because it can detach threads that will be joined. Joining detached threads would throw an exception. With this PR, we skip draining when `JoinThreads()` has already decided what threads to `join()`, so the threads will exit naturally once the work queue empties. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8635 Test Plan: verified it unblocked using `WaitForJobsAndJoinAllThreads()` in https://github.com/facebook/rocksdb/issues/8611. Reviewed By: riversand963 Differential Revision: D30174587 Pulled By: ajkr fbshipit-source-id: 144966398a607987e0763c7152a0f653fdbf3c8b
-
Committed by Levi Tamasi
Summary: `CompareKeyContext::operator()` on the trunk has a bug: when comparing column family IDs, `lhs` is used for both sides of the comparison. This results in the `KeyContext`s getting sorted solely based on key, which in turn means that keys with the same column family do not necessarily form a single range in the sorted list. This violates an assumption of the batched `MultiGet` logic, leading to the same column family showing up multiple times in the list of `MultiGetColumnFamilyData`. The end result is the code attempting to check out the thread-local `SuperVersion` for the same CF multiple times, causing an assertion violation in debug builds and memory corruption/crash in release builds. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8633 Test Plan: `make check` Reviewed By: riversand963 Differential Revision: D30169182 Pulled By: ltamasi fbshipit-source-id: a47710652df7e95b14b40fb710924c11a8478023
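The ordering property the `CompareKeyContext` fix restores can be shown with a small standalone sketch. The types here (`KeyContextSketch`, `CompareByCfThenKey`, `CountCfRuns`) are hypothetical and much simpler than RocksDB's `KeyContext`; keys are compared bytewise rather than with the column family's comparator. The point is only that comparing `lhs` against `rhs` on the column family ID first makes each column family form a single contiguous range after sorting.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a MultiGet key context.
struct KeyContextSketch {
  std::uint32_t cf_id;
  std::string key;
};

// Orders by column family ID first, then by key. The bug described above
// amounts to comparing lhs.cf_id with lhs.cf_id, which never differs and
// so falls through to the key comparison every time.
struct CompareByCfThenKey {
  bool operator()(const KeyContextSketch& lhs,
                  const KeyContextSketch& rhs) const {
    if (lhs.cf_id != rhs.cf_id) return lhs.cf_id < rhs.cf_id;
    return lhs.key < rhs.key;
  }
};

// Counts contiguous runs of equal cf_id. After a correct sort this equals
// the number of distinct column families.
std::size_t CountCfRuns(const std::vector<KeyContextSketch>& v) {
  std::size_t runs = 0;
  for (std::size_t i = 0; i < v.size(); ++i) {
    if (i == 0 || v[i].cf_id != v[i - 1].cf_id) ++runs;
  }
  return runs;
}
```

Sorting the keys `(2,"a"), (1,"c"), (2,"b"), (1,"a")` with this comparator yields two runs, one per column family; sorting by key alone would interleave them.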
-
Committed by Zaorang Yang
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/8533 Reviewed By: ajkr Differential Revision: D29718067 fbshipit-source-id: b4b91c9271362e7a7d47ddbaf28f56fb537cc668
-
Committed by Peter (Stig) Edwards
Summary: Unused since https://github.com/facebook/rocksdb/commit/ab718b415fc9b2a66a2ed642c18803f764839d7b . Noticed on https://lgtm.com/projects/g/facebook/rocksdb/snapshot/b215f1a83226f111ff52305987af93564272b7d3/files/tools/db_crashtest.py?sort=name&dir=ASC&mode=heatmap#xf254f528ad18f108:1 Pull Request resolved: https://github.com/facebook/rocksdb/pull/8599 Reviewed By: ajkr Differential Revision: D30057041 Pulled By: zhichao-cao fbshipit-source-id: e80438cf9717086d2bf67461e19393d426a7676e
-
Committed by HappyUncle
Summary: Fix help message. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8615 Reviewed By: siying Differential Revision: D30136092 Pulled By: mrambacher fbshipit-source-id: edf4112570514d709560baaf96a47c5f36f00665
-
Committed by Peter Dillinger
Summary: Guarantees that if a restore is interrupted, DB::Open will fail. This works by restoring CURRENT first to CURRENT.tmp then as a final step renaming to CURRENT. Also makes restore respect BackupEngineOptions::sync (default true). When set, the restore is guaranteed persisted by the time it returns OK. Also makes the above atomicity guarantee work in case the interruption is power loss or OS crash (not just process interruption or crash). Fixes https://github.com/facebook/rocksdb/issues/8500 Pull Request resolved: https://github.com/facebook/rocksdb/pull/8568 Test Plan: added to backup mini-stress unit test. Passes with gtest_repeat=100 (whereas fails 7 times without the CURRENT.tmp) Reviewed By: akankshamahajan15 Differential Revision: D29812605 Pulled By: pdillinger fbshipit-source-id: 24e9a993b305b1835ca95558fa7a7152e54cda8e
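The write-to-temporary-then-rename pattern behind the `CURRENT.tmp` step can be sketched with the C++ standard library alone. This is an illustration, not the BackupEngine code: `PublishAtomically` and `ReadAll` are hypothetical helpers, the fsync of the file and its directory that a `sync` option would require is left out because standard C++ has no portable call for it, and the replace-on-rename behavior assumed here is the POSIX one.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <string>

// Writes `contents` to "<path>.tmp" and only then renames it to `path`,
// so a reader never sees a partially written `path`.
// Returns true on success.
bool PublishAtomically(const std::string& path, const std::string& contents) {
  const std::string tmp = path + ".tmp";
  {
    std::ofstream out(tmp, std::ios::binary | std::ios::trunc);
    if (!out) return false;
    out << contents;
    out.flush();
    if (!out) return false;
  }  // the stream is closed here, before the rename
  // Omitted: fsync of the file and its parent directory, which is what
  // would make the result survive power loss.
  return std::rename(tmp.c_str(), path.c_str()) == 0;
}

// Reads a whole file into a string (empty if the file cannot be opened).
std::string ReadAll(const std::string& path) {
  std::ifstream in(path, std::ios::binary);
  return std::string((std::istreambuf_iterator<char>(in)),
                     std::istreambuf_iterator<char>());
}
```

If the process stops before the final rename, only the `.tmp` file exists and the target name is absent, which is the state the commit relies on to make an interrupted restore detectable.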
-
- 06 Aug, 2021 4 commits
-
-
Committed by Brendan MacDonell
Summary: By default, the low priority pool is not the flush pool, so calling `Env#setBackgroundThreads` without providing a priority will not do what the caller expected. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8576 Reviewed By: ajkr Differential Revision: D29925154 Pulled By: mrambacher fbshipit-source-id: cd7211fc374e7d9929a9b88ea0a5ba8134b76099
-
Committed by mrambacher
Summary: - Changed MergeOperator, CompactionFilter, and CompactionFilterFactory into Customizable classes. - Added Options/Configurable/Object Registration for TTL and Cassandra variants - Changed the StringAppend MergeOperators to accept a string delimiter rather than a simple char. Made the delimiter into a configurable option - Added tests for new functionality Pull Request resolved: https://github.com/facebook/rocksdb/pull/8481 Reviewed By: zhichao-cao Differential Revision: D30136050 Pulled By: mrambacher fbshipit-source-id: 271d1772835935b6773abaf018ee71e42f9491af
-
Committed by Akanksha Mahajan
Summary: Dynamically configure BlockBasedTableOptions.prepopulate_block_cache using DB::SetOptions. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8620 Test Plan: Added new unit test Reviewed By: anand1976 Differential Revision: D30091319 Pulled By: akankshamahajan15 fbshipit-source-id: fb586d1848a8dd525bba7b2f9eeac34f2fc6d82c
-
Committed by Levi Tamasi
Summary: We've been seeing occasional crashes on CI while inserting into the vectors in `ObsoleteFilesTest.DeleteObsoleteOptionsFile`. The crashes don't reproduce locally (could be either a race or an object lifecycle issue) but the good news is that the vectors in question are not really used for anything meaningful by the test. (The assertion about the sizes of the two vectors being equal is guaranteed to hold, since the two sync points where they are populated are right after each other.) The patch simply removes the vectors from the test, alongside the associated callbacks and sync points. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8624 Test Plan: `make check` Reviewed By: akankshamahajan15 Differential Revision: D30118485 Pulled By: ltamasi fbshipit-source-id: 0a4c3d06584e84cd2b1dcc212d274fa1b89cb647
-
- 05 Aug, 2021 5 commits
-
-
Committed by Yanqin Jin
Summary: Update HISTORY.md for PR https://github.com/facebook/rocksdb/issues/8585 . Pull Request resolved: https://github.com/facebook/rocksdb/pull/8623 Reviewed By: ltamasi Differential Revision: D30121910 Pulled By: riversand963 fbshipit-source-id: 525af43fad908a498f22ed4f934ec5cbf60e6d25
-
Committed by Andrew Kryczka
Summary: Previously we attempted to rename "LOG" to "LOG.old.*" without checking its existence first. "LOG" had no reason to exist in a new DB. Errors in renaming a non-existent "LOG" were swallowed via `PermitUncheckedError()` so things worked. However the storage service's error monitoring was detecting all these benign rename failures. So it is better to fix it. Also with this PR we can now distinguish rename failure for other reasons and return them. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8622 Test Plan: new unit test Reviewed By: akankshamahajan15 Differential Revision: D30115189 Pulled By: ajkr fbshipit-source-id: e2f337ffb2bd171be0203172abc8e16e7809b170
-
Committed by Akanksha Mahajan
Summary: Fixed clang failure because of memory leak Pull Request resolved: https://github.com/facebook/rocksdb/pull/8621 Test Plan: CircleCI clang job Reviewed By: pdillinger Differential Revision: D30114337 Pulled By: akankshamahajan15 fbshipit-source-id: 16572b9bcbaa053c2ab7bc1c344148d0e6f8039c
-
Committed by anand76
Summary: `FaultInjectionTestFS` injects various types of read errors in `FileSystem` APIs. One type of error is corruption errors, where data is intentionally corrupted or truncated. There is corresponding validation in db_stress to verify that an injected error results in a user visible Get/MultiGet error. However, for corruption errors, it's hard to know when a corruption is supposed to be detected by the user request, due to prefetching and, in case of direct IO, padding. This results in false positives. So remove that functionality. Block checksum validation for Get/MultiGet is confined to `BlockFetcher`, so we don't lose a lot by disabling this since it's a small surface area to test. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8616 Reviewed By: zhichao-cao Differential Revision: D30074422 Pulled By: anand1976 fbshipit-source-id: 6a61fac18f95514c15364b75013799ddf83294df
-
Committed by hx235
Summary: Context: As the need arises for new resource-management features built on RocksDB's rate limiter, such as [https://github.com/facebook/rocksdb/issues/8595](https://github.com/facebook/rocksdb/pull/8595), it is about time to re-learn our rate limiter and make this learning process easier for others by improving its readability. The comments, assertion, and one extra else-branch are added based on my best understanding of rate_limiter.cc and rate_limiter_test.cc to date, after giving them a hard read. - Add code comments/assertion/one extra else-branch (that is not affecting existing behavior, see PR comment) to describe how leader-election works under multi-thread settings in GenericRateLimiter::Request() - Add code comments to describe a non-obvious trick during clean-up of rate limiter destructor - Add code comments to explain more about the starvation being fixed in GenericRateLimiter::Refill() through partial byte-granting - Add code comments to the rate limiter's setup in a complicated unit test in rate_limiter_test Pull Request resolved: https://github.com/facebook/rocksdb/pull/8596 Test Plan: - passed existing rate_limiter_test.cc Reviewed By: ajkr Differential Revision: D29982590 Pulled By: hx235 fbshipit-source-id: c3592986bb5b0c90d8229fe44f425251ec7e8a0a
-
- 04 Aug, 2021 3 commits
-
-
Committed by Levi Tamasi
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/8619 Reviewed By: riversand963 Differential Revision: D30081937 Pulled By: ltamasi fbshipit-source-id: 57505957ae2c22d4b194aa28cb3fd261b3b39919
-
Committed by Yanqin Jin
Summary: PR https://github.com/facebook/rocksdb/issues/5908 added `flush_jobs_info_` to `FlushJob` to make sure `OnFlushCompleted()` is called after committing flush results to MANIFEST. However, `flush_jobs_info_` is not updated in atomic flush, causing `NotifyOnFlushCompleted()` to skip `OnFlushCompleted()`. This PR fixes this, in a similar way to https://github.com/facebook/rocksdb/issues/5908 that handles regular flush. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8585 Test Plan: make check Reviewed By: jay-zhuang Differential Revision: D29913720 Pulled By: riversand963 fbshipit-source-id: 4ff023c98372fa2c93188d4a5c8a4e9ffa0f4dda
-
Committed by Akanksha Mahajan
Summary: Insert warm blocks (data, uncompressed dict, index and filter blocks) into the block cache during flush when the option BlockBasedTableOptions.prepopulate_block_cache is enabled. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8561 Test Plan: Added unit test Reviewed By: anand1976 Differential Revision: D29773411 Pulled By: akankshamahajan15 fbshipit-source-id: 6631123c10134340ef0bd7e90baafaa6deba0e66
-
- 03 Aug, 2021 5 commits
-
-
Committed by Baptiste Lemaire
Summary: The db_stress crash was caused by a call to `IsFlushPending()` made by a stats function which triggered an `assert([false])`, which I didn't plan for when I created the `trigger_flush` bool. It turns out that this bool variable is not useful: I created it because I thought the `imm_flush_needed` atomic bool would actually trigger a flush. It turns out that this bool is only checked in `IsFlushPending` - this is its only use - and a flush is triggered by either a background thread checking on the imm array, or by an explicit call to `SchedulePendingFlush` which creates a flush request, that is then added to a flush request queue. In this PR, I reverted the MemtableList::Add function to what it was before my changes. I tested the fix by running the exact command line that deterministically triggered the assert error (see below), which confirmed that this is where the error was coming from. I also ran `db_crashtest.py whitebox` and `blackbox` for a couple hours locally before committing this PR. 
Experiment run: ```./db_stress --acquire_snapshot_one_in=0 --allow_concurrent_memtable_write=1 --avoid_flush_during_recovery=0 --avoid_unnecessary_blocking_io=1 --backup_max_size=104857600 --backup_one_in=100000 --batch_protection_bytes_per_key=0 --block_size=16384 --bloom_bits=76.90653425292307 --bottommost_compression_type=disable --cache_index_and_filter_blocks=1 --cache_size=1048576 --checkpoint_one_in=1000000 --checksum_type=kCRC32c --clear_column_family_one_in=0 --column_families=1 --compact_files_one_in=1000000 --compact_range_one_in=0 --compaction_ttl=2 --compression_max_dict_buffer_bytes=0 --compression_max_dict_bytes=0 --compression_parallel_threads=1 --compression_type=zstd --compression_zstd_max_train_bytes=0 --continuous_verification_interval=0 --db=/dev/shm/rocksdb/rocksdb_crashtest_blackbox --db_write_buffer_size=0 --delpercent=4 --delrangepercent=1 --destroy_db_initially=0 --enable_compaction_filter=1 --enable_pipelined_write=0 --expected_values_path=/dev/shm/rocksdb/rocksdb_crashtest_expected --experimental_allow_mempurge=1 --experimental_mempurge_policy=kAlternate --fail_if_options_file_error=1 --file_checksum_impl=none --flush_one_in=1000000 --format_version=2 --get_current_wal_file_one_in=0 --get_live_files_one_in=1000000 --get_property_one_in=1000000 --get_sorted_wal_files_one_in=0 --index_block_restart_interval=14 --index_type=0 --iterpercent=0 --key_len_percent_dist=1,30,69 --level_compaction_dynamic_level_bytes=False --long_running_snapshots=1 --mark_for_compaction_one_file_in=10 --max_background_compactions=1 --max_bytes_for_level_base=67108864 --max_key=100000000 --max_key_len=3 --max_manifest_file_size=1073741824 --max_write_batch_group_size_bytes=64 --max_write_buffer_number=3 --max_write_buffer_size_to_maintain=0 --memtablerep=skip_list --mmap_read=0 --mock_direct_io=True --nooverwritepercent=1 --open_files=-1 --open_metadata_write_fault_one_in=8 --open_read_fault_one_in=32 --open_write_fault_one_in=16 --ops_per_thread=100000000 
--optimize_filters_for_memory=1 --paranoid_file_checks=0 --partition_filters=0 --partition_pinning=0 --pause_background_one_in=1000000 --periodic_compaction_seconds=1000 --prefix_size=-1 --prefixpercent=0 --progress_reports=0 --read_fault_one_in=0 --readpercent=60 --recycle_log_file_num=1 --reopen=20 --set_options_one_in=0 --snapshot_hold_ops=100000 --sst_file_manager_bytes_per_sec=104857600 --sst_file_manager_bytes_per_truncate=0 --subcompactions=3 --sync=1 --sync_fault_injection=False --target_file_size_base=16777216 --target_file_size_multiplier=1 --test_batches_snapshots=0 --top_level_index_pinning=1 --unpartitioned_pinning=3 --use_clock_cache=0 --use_direct_io_for_flush_and_compaction=1 --use_direct_reads=0 --use_full_merge_v1=1 --use_merge=0 --use_multiget=0 --use_ribbon_filter=1 --user_timestamp_size=0 --verify_checksum=1 --verify_checksum_one_in=1000000 --verify_db_one_in=100000 --write_buffer_size=33554432 --write_dbid_to_manifest=1 --writepercent=35``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/8604 Reviewed By: pdillinger Differential Revision: D30047295 Pulled By: bjlemaire fbshipit-source-id: b9e379bfa3d6b9bd2b275725fb0bca4bd81a3dbe
-
Committed by Merlin Mao
Summary: PR https://github.com/facebook/rocksdb/pull/8572 loses custom types in the options file. More API changes are needed to fix this issue. Revert this PR. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8607 Reviewed By: ajkr Differential Revision: D30058289 Pulled By: autopear fbshipit-source-id: 78f5a154c0bf193e8441bae4a36fa79b95277fd4
-
Committed by Levi Tamasi
Summary: The `ColumnFamilyData::UnrefAndTryDelete` code currently on the trunk unlocks the DB mutex before destroying the `ThreadLocalPtr` holding the per-thread `SuperVersion` pointers when the only remaining reference is the back reference from `super_version_`. The idea behind this was to break the circular dependency between `ColumnFamilyData` and `SuperVersion`: when the penultimate reference goes away, `ColumnFamilyData` can clean up the `SuperVersion`, which can in turn clean up `ColumnFamilyData`. (Assuming there is a `SuperVersion` and it is not referenced by anything else.) However, unlocking the mutex throws a wrench in this plan by making it possible for another thread to jump in and take another reference to the `ColumnFamilyData`, keeping the object alive in a zombie `ThreadLocalPtr`-less state. This can cause issues like https://github.com/facebook/rocksdb/issues/8440 , https://github.com/facebook/rocksdb/issues/8382 , and might also explain the `was_last_ref` assertion failures from the `ColumnFamilySet` destructor we sometimes observe during close in our stress tests. Digging through the archives, this unlocking goes way back to 2014 (or earlier). The original rationale was that `SuperVersionUnrefHandle` used to lock the mutex so it can call `SuperVersion::Cleanup`; however, this logic turned out to be deadlock-prone. https://github.com/facebook/rocksdb/pull/3510 fixed the deadlock but left the unlocking in place. https://github.com/facebook/rocksdb/pull/6147 then introduced the circular dependency and associated cleanup logic described above (in order to enable iterators to keep the `ColumnFamilyData` for dropped column families alive), and moved the unlocking-relocking snippet to its present location in `UnrefAndTryDelete`. Finally, https://github.com/facebook/rocksdb/pull/7749 fixed a memory leak but apparently exacerbated the race by (otherwise correctly) switching to `UnrefAndTryDelete` in `SuperVersion::Cleanup`. 
The patch simply eliminates the unlocking and relocking, which has been unnecessary ever since https://github.com/facebook/rocksdb/issues/3510 made `SuperVersionUnrefHandle` lock-free. This closes the window during which another thread could increase the reference count, and hopefully fixes the issues above. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8605 Test Plan: Ran `make check` and stress tests locally. Reviewed By: pdillinger Differential Revision: D30051035 Pulled By: ltamasi fbshipit-source-id: 8fe559e4b4ad69fc142579f8bc393ef525918528
-
Committed by yangzaorang
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/8537 Reviewed By: ajkr Differential Revision: D29838132 Pulled By: jay-zhuang fbshipit-source-id: e3e78d5f85f240a1800ace417a8b634f74488e41
-
Committed by Mikhail Golubev
Summary: An arbitrary string can be used as a delimiter in StringAppend merge operator flavor. In particular, it allows using an empty string, combining binary values for the same key byte-to-byte one next to another. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8536 Reviewed By: mrambacher Differential Revision: D29962120 Pulled By: zhichao-cao fbshipit-source-id: 4ef5d846a47835cf428a11200409e30e2dbffc4f
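The effect of a string delimiter, including the empty one, can be seen in a few lines. This is a minimal sketch of the concatenation rule only, not the StringAppend merge operator itself: `AppendWithDelimiter` is a hypothetical helper, and a null `existing` pointer stands in for "no existing value for the key".

```cpp
#include <cassert>
#include <string>

// Combines an existing value (may be null, meaning "no value yet") with a
// new operand, placing `delimiter` between them. An empty delimiter joins
// the two values directly, byte after byte.
std::string AppendWithDelimiter(const std::string* existing,
                                const std::string& operand,
                                const std::string& delimiter) {
  if (existing == nullptr) return operand;
  return *existing + delimiter + operand;
}
```

For example, merging `"y"` into `"x"` gives `"x::y"` with delimiter `"::"` and `"xy"` with an empty delimiter.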
-
- 31 Jul, 2021 2 commits
-
-
Committed by mrambacher
Summary: Prior to this change, the "wal_dir" DBOption would always be set (defaults to dbname) when the DBOptions were sanitized. Because of this setting in the options file, it was not possible to rename/relocate a database directory after it had been created and use the existing options file. After this change, the "wal_dir" option is only set under specific circumstances. Methods were added to the ImmutableDBOptions class to see if it is set and if it is set to something other than the dbname. Additionally, a method was added to retrieve the effective value of the WAL dir (either the option or the dbname/path). Tests were added to the core and ldb to test that a database could be created and renamed without issue. Additional tests for various permutations of wal_dir were also added. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8582 Reviewed By: pdillinger, autopear Differential Revision: D29881122 Pulled By: mrambacher fbshipit-source-id: 67d3d033dc8813d59917b0a3fba2550c0efd6dfb
-
Committed by Yanqin Jin
Summary: This PR tries to remove some unnecessary checks as well as unreachable code blocks to improve readability. An obvious non-public API method naming typo is also corrected. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8565 Test Plan: make check Reviewed By: lth Differential Revision: D29963984 Pulled By: riversand963 fbshipit-source-id: cc96e8f09890e5cfe9b20eadb63bdca5484c150a
-
- 30 Jul, 2021 3 commits
-
-
Committed by Peter Dillinger
Summary: Calling the GetImpl function could leave reference to a local callback function in a field of a parameter struct. As this is performance-critical code, I'm not going to attempt to sanitize this code too much, but make the existing hack a bit cleaner by reverting what it overwrites in the input struct. Added SaveAndRestore utility class to make that easier. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8590 Test Plan: added unit test for SaveAndRestore; existing tests for GetImpl Reviewed By: riversand963 Differential Revision: D29947983 Pulled By: pdillinger fbshipit-source-id: 2f608853f970bc06724e834cc84dcc4b8599ddeb
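The idea of a save-and-restore guard can be sketched as a small RAII class. This is an illustration of the pattern, not the utility added by the PR: the name `SaveAndRestoreSketch` and its two constructors are assumptions made for the example.

```cpp
#include <cassert>
#include <utility>

// RAII guard: remembers the current value of *object and writes it back
// when the guard goes out of scope, whatever happened to *object meanwhile.
template <typename T>
class SaveAndRestoreSketch {
 public:
  // Saves the current value and leaves *object unchanged.
  explicit SaveAndRestoreSketch(T* object)
      : object_(object), saved_(*object) {}

  // Saves the current value, then overwrites *object with new_value.
  SaveAndRestoreSketch(T* object, T new_value)
      : object_(object), saved_(std::move(*object)) {
    *object_ = std::move(new_value);
  }

  ~SaveAndRestoreSketch() { *object_ = std::move(saved_); }

  // Copying a guard would restore the value twice, so forbid it.
  SaveAndRestoreSketch(const SaveAndRestoreSketch&) = delete;
  SaveAndRestoreSketch& operator=(const SaveAndRestoreSketch&) = delete;

 private:
  T* object_;
  T saved_;
};
```

Applied to the situation in the commit, a guard like this around the overwritten field of the input struct would put the caller's original value back on every return path, so no pointer to a local callback outlives the call.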
-
Committed by Baptiste Lemaire
Summary: Introduction of a new `fillanddeleteuniquerandom` benchmark (`db_bench`) with 5 new option flags to simulate a benchmark where the following sequence is repeated multiple times: "A set of keys S1 is inserted ('`disposable entries`'), then after some delay another set of keys S2 is inserted ('`persistent entries`') and the first set of keys S1 is deleted. S2 artificially represents the insertion of hypothetical results from some undefined computation done on the first set of keys S1. The next sequence can start as soon as the last disposable entry in the set S1 of this sequence is inserted, if the `delay` is non negligible." New flags: - `disposable_entries_delete_delay`: minimum delay in microseconds between insertion of the last `disposable` entry, and the start of the insertion of the first `persistent` entry. - `disposable_entries_batch_size`: number of `disposable` entries inserted at the beginning of each sequence. - `disposable_entries_value_size`: size of the random `value` string for the `disposable` entries. - `persistent_entries_batch_size`: number of `persistent` entries inserted at the end of each sequence, right before the deletion of the `disposable` entries starts. - `persistent_entries_value_size`: size of the random value string for the `persistent` entries. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8593 Reviewed By: pdillinger Differential Revision: D29974436 Pulled By: bjlemaire fbshipit-source-id: f578033e5b45e8268ba6fa6f38f4770c2e6e801d
-
Committed by sdong
Summary: If DB::GetSortedWalFiles() runs without file deletion disabled, a file might get deleted in the middle and an error is returned to users. This makes the function hard to use. Fix it by disabling file deletion if that has not already been done. Fix another minor issue of logging within DB mutex, which should not be done unless a major failure happens. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8591 Test Plan: Run all existing tests Reviewed By: pdillinger Differential Revision: D29969412 fbshipit-source-id: d5f42b5271608a35b9b07687ce18157d7447b0de
-
- 29 Jul, 2021 1 commit
-
-
Committed by Peter Dillinger
Summary: * Basic handling of SST file with just range tombstones rather than failing assertion about smallest_seqno <= largest_seqno * Adds --verbose option so that there exists a way to see the INFO output from Repairer. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8544 Test Plan: unit test added, manual testing for --verbose Reviewed By: ajkr Differential Revision: D29954805 Pulled By: pdillinger fbshipit-source-id: 696af25805fc36cc178b04ba6045922a22625fd9
-