- 23 2月, 2018 2 次提交
-
-
由 Igor Sugak 提交于
Reviewed By: igorsugak fbshipit-source-id: 4a93675cc1931089ddd574cacdb15d228b1e5f37
-
由 David Lai 提交于
Reviewed By: everiq, igorsugak Differential Revision: D7046710 fbshipit-source-id: 8e10b1f1e2aecebbfb229c742e214db887e5a461
-
- 22 2月, 2018 2 次提交
-
-
由 Andrew Kryczka 提交于
Summary: Use the rsync tempfile naming convention in our `BackupEngine`. The temp file follows the format, `.<filename>.<suffix>`, which is later renamed to `<filename>`. We fix `tmp` as the `<suffix>` as we don't need to use random bytes for now. The benefit is gluster treats this tempfile naming convention specially and applies hashing only to `<filename>`, so the file won't need to be linked or moved when it's renamed. Our gluster team suggested this will make things operationally easier. Closes https://github.com/facebook/rocksdb/pull/3463 Differential Revision: D6893333 Pulled By: ajkr fbshipit-source-id: fd7622978f4b2487fce33cde40dd3124f16bcaa8
-
由 Maysam Yabandeh 提交于
Summary: Under a certain sequence of accessing PreparedHeap, there was a bug that would not successfully empty the heap. This would result in performance issues when the heap content is moved to old_prepared_ after max_evicted_seq_ advances the orphan prepared sequence numbers. The patch fixed the bug and add more unit tests. It also does more logging when the unlikely scenarios are faced Closes https://github.com/facebook/rocksdb/pull/3526 Differential Revision: D7038486 Pulled By: maysamyabandeh fbshipit-source-id: f1e40bea558f67b03d2a29131fcb8734c65fce97
-
- 21 2月, 2018 4 次提交
-
-
由 Sagar Vemuri 提交于
Summary: Added a new iterator property: `rocksdb.iterator.internal-key` to get the internal-key (converted to user key) at which the iterator stopped. Closes https://github.com/facebook/rocksdb/pull/3525 Differential Revision: D7033694 Pulled By: sagar0 fbshipit-source-id: d51e6c00f5e9d766c6276ef79774b81c6c5216f8
-
由 jsteemann 提交于
Summary: In case it is found that a key is already marked as locked in a stripe's map of locked keys, it is not necessary to look it up again using `std::unordered_map<std::string, ...>::at(size_t)`. Instead, we can use the already found position using the iterator produced by the previous `find` operation. Reusing the iterator will avoid having to hash the key again and do additional "random" memory lookups in the map of keys (though the data will very likely sit available in caches here already due to the previous find operation) Closes https://github.com/facebook/rocksdb/pull/3505 Differential Revision: D7036446 Pulled By: sagar0 fbshipit-source-id: cced51547b2bd2d49394f6bc8c5896f09fa80f68
-
由 Andrew Kryczka 提交于
Summary: - made `CreateCheckpoint` properly return `InvalidArgument` when called with an empty directory. Previously it triggered an assertion failure due to a bug in the logic. - made `ldb` set empty `checkpoint_dir` if that's what the user specifies, so that we can use it to properly test `CreateCheckpoint` in the future. Differential Revision: D6874562 fbshipit-source-id: dcc1bd41768261d9338987fa7711444289707ed7
-
由 Igor Sugak 提交于
Summary: Add a static cast to perform the left shift as with an unsigned type. make ubsan_check Closes https://github.com/facebook/rocksdb/pull/3517 Reviewed By: sagar0 Differential Revision: D7016044 Pulled By: igorsugak fbshipit-source-id: baf72f6197edd8f7220d010b15a23d6de6a72c49
-
- 17 2月, 2018 3 次提交
-
-
由 Po-Chuan Hsieh 提交于
Summary: utilities/column_aware_encoding_util.cc:61:23: error: cannot use dynamic_cast with -fno-rtti table_reader_.reset(dynamic_cast<BlockBasedTable*>(table_reader.release())); ^ 1 error generated. It was added as a [local patch](https://svnweb.freebsd.org/ports/head/databases/rocksdb/files/patch-utilities-column_aware_encoding_util.cc) on FreeBSD since RocksDB 5.8. It also fixes #2707. Closes https://github.com/facebook/rocksdb/pull/3514 Differential Revision: D7005571 Pulled By: siying fbshipit-source-id: 351a9055d21d0accdd7a932e8e7bfcd3c8e22068
-
由 Maysam Yabandeh 提交于
Summary: These are optimization that we applied to improve sysbech's update_noindex performance. 1. Make use of LIKELY compiler hint 2. Move std::atomic so the subclass 3. Make use of skip_prepared in non-2pc transactions. Closes https://github.com/facebook/rocksdb/pull/3512 Differential Revision: D7000075 Pulled By: maysamyabandeh fbshipit-source-id: 1ab8292584df1f6305a4992973fb1b7933632181
-
由 Mike Kolupaev 提交于
Summary: Deadlock: a memtable flush holds DB::mutex_ and calls ThreadLocalPtr::Scrape(), which locks ThreadLocalPtr mutex; meanwhile, a thread exit handler locks ThreadLocalPtr mutex and calls SuperVersionUnrefHandle, which tries to lock DB::mutex_. This deadlock is hit all the time on our workload. It blocks our release. In general, the problem is that ThreadLocalPtr takes an arbitrary callback and calls it while holding a lock on a global mutex. The same global mutex is (at least in some cases) locked by almost all ThreadLocalPtr methods, on any instance of ThreadLocalPtr. So, there'll be a deadlock if the callback tries to do anything to any instance of ThreadLocalPtr, or waits for another thread to do so. So, probably the only safe way to use ThreadLocalPtr callbacks is to do only do simple and lock-free things in them. This PR fixes the deadlock by making sure that local_sv_ never holds the last reference to a SuperVersion, and therefore SuperVersionUnrefHandle never has to do any nontrivial cleanup. I also searched for other uses of ThreadLocalPtr to see if they may have similar bugs. There's only one other use, in transaction_lock_mgr.cc, and it looks fine. Closes https://github.com/facebook/rocksdb/pull/3510 Reviewed By: sagar0 Differential Revision: D7005346 Pulled By: al13n321 fbshipit-source-id: 37575591b84f07a891d6659e87e784660fde815f
-
- 16 2月, 2018 6 次提交
-
-
由 Andrew Kryczka 提交于
Summary: Calling `std::vector::reserve()` causes memory to be reallocated and then data to be moved. It was called prior to adding every block. This reallocation could be done a huge amount of times, e.g., for users with large index blocks. Instead, we can simply use `std::vector::emplace_back()` in such a way that preserves the no-memory-leak guarantee, while letting the vector decide when to reallocate space. Now I see reallocation/moving happen O(logN) times, rather than O(N) times, where N is the final size of vector. Closes https://github.com/facebook/rocksdb/pull/3508 Differential Revision: D6994228 Pulled By: ajkr fbshipit-source-id: ab7c11e13ff37c8c6c8249be7a79566a4068cd27
-
由 Yi Wu 提交于
Summary: Add a legocastle job to continuously build the last 10 commits every 4 hours and report lite build binary size to scuba. Closes https://github.com/facebook/rocksdb/pull/3511 Differential Revision: D7001730 Pulled By: yiwu-arbug fbshipit-source-id: 7c8ca87c46d663c786a0d32be69ebbe7b19a5eb9
-
由 Maysam Yabandeh 提交于
Summary: The MemTableRep API was broken by this commit: 813719e9 This patch reverts the changes and instead adds InsertKey (and etc.) overloads to extend the MemTableRep API without breaking the existing classes that inherit from it. Closes https://github.com/facebook/rocksdb/pull/3513 Differential Revision: D7004134 Pulled By: maysamyabandeh fbshipit-source-id: e568d91fe1e17dd76c0c1f6c7dd51a18633b1c4f
-
由 jsteemann 提交于
Summary: - removed a few unneeded variables - fused some variable declarations and their assignments - fixed right-trimming code in string_util.cc to not underflow - simplifed an assertion - move non-nullptr check assertion before dereferencing of that pointer - pass an std::string function parameter by const reference instead of by value (avoiding potential copy) Closes https://github.com/facebook/rocksdb/pull/3507 Differential Revision: D7004679 Pulled By: sagar0 fbshipit-source-id: 52944952d9b56dfcac3bea3cd7878e315bb563c4
-
由 Zhongyi Xie 提交于
Summary: Right now it is possible that a file gets assigned to L0 but also assigned the seqno from a higher level which it doesn't fit Under the current impl, it is possibe that seqno in lower levels (Ln) can be equal to smallest seqno of higher levels (Ln-1), which is undesirable from universal compaction's point of view. This should fix the intermittent failure of `ExternalSSTFileBasicTest.IngestFileWithGlobalSeqnoPickedSeqno` Closes https://github.com/facebook/rocksdb/pull/3411 Differential Revision: D6813802 Pulled By: miasantreble fbshipit-source-id: 693d0462fa94725ccfb9d8858743e6d2d9992d14
-
由 jsteemann 提交于
Summary: Somehow the indentation was incorrect in this file. The only change in this PR is to get it right again in order to make the code more readable. Please reject if you think it's not worth it. Closes https://github.com/facebook/rocksdb/pull/3504 Differential Revision: D6996011 Pulled By: miasantreble fbshipit-source-id: 060514a3a8c910d34bad795b36eb4d278512b154
-
- 15 2月, 2018 1 次提交
-
-
由 Fosco Marotto 提交于
Summary: As in #3425 Closes https://github.com/facebook/rocksdb/pull/3497 Differential Revision: D6979588 Pulled By: gfosco fbshipit-source-id: e9fb32d04ad45575dfe9de1d79348d158e474197
-
- 14 2月, 2018 5 次提交
-
-
由 Siying Dong 提交于
Summary: We don't do fsync() after truncate in direct I/O writeable file (in fact we don't do any fsync ever). This can cause metadata not persistent to disk after the file is generated. We call it instead. Closes https://github.com/facebook/rocksdb/pull/3500 Differential Revision: D6981482 Pulled By: siying fbshipit-source-id: 7e2b591b7e5dd1b96fc0775515b8b9e6092980ef
-
由 Igor Sugak 提交于
Summary: This fixes shift and signed-integer-overflow UBSAN checks in fault_injection_test by using a larger and unsigned type. Closes https://github.com/facebook/rocksdb/pull/3498 Reviewed By: siying Differential Revision: D6981116 Pulled By: igorsugak fbshipit-source-id: 3688f62cce570534b161e9b5f42109ebc9ae5a2c
-
由 Siying Dong 提交于
Summary: A new LevelIterator was recently created. Rename the old one to make unity build happy. It's also not a good idea to have two classes in the same name anyway. Closes https://github.com/facebook/rocksdb/pull/3499 Differential Revision: D6979325 Pulled By: siying fbshipit-source-id: 3a032d93fe205650a08e92e5262594731ec726bb
-
由 Siying Dong 提交于
Summary: Now we suppress alignment UBSAN error as a whole. Suppressing 3-way CRC and murmurhash feels a better idea than turning off alignment check as a whole. Closes https://github.com/facebook/rocksdb/pull/3495 Differential Revision: D6971273 Pulled By: siying fbshipit-source-id: 080b59fed6df494b9f622ef7cb5d42d39e6a8cdf
-
由 Fosco Marotto 提交于
Summary: Closes https://github.com/facebook/rocksdb/pull/3464 Differential Revision: D6906184 Pulled By: gfosco fbshipit-source-id: 415934d7b1dd8dd226b6619bfb71781184d55cd9
-
- 13 2月, 2018 5 次提交
-
-
由 Siying Dong 提交于
Summary: Use a customzied BlockBasedTableIterator and LevelIterator to replace current implementations leveraging two-level-iterator. Hope the customized logic will make code easier to understand. As a side effect, BlockBasedTableIterator reduces the allocation for the data block iterator object, and avoid the virtual function call to it, because we can directly reference BlockIter, a final class. Similarly, LevelIterator reduces virtual function call to the dummy iterator iterating the file metadata. It also enabled further optimization. The upper bound check is also moved from index block to data block. This implementation fits this iterator better. After the change, forwared iterator is slightly optimized to ensure we trim those iterators. The two-level-iterator now is only used by partitioned index, so it is simplified. Closes https://github.com/facebook/rocksdb/pull/3406 Differential Revision: D6809041 Pulled By: siying fbshipit-source-id: 7da3b9b1d3c8e9d9405302c15920af1fcaf50ffa
-
由 Maysam Yabandeh 提交于
Summary: TransactionDB::Write can receive some optimization hints from the user. One is to skip the concurrency control mechanism. WritePreparedTxnDB is currently ignoring such hints. This patch optimizes WritePreparedTxnDB::Write for skip_concurrency_control and skip_duplicate_key_check hints. Closes https://github.com/facebook/rocksdb/pull/3496 Differential Revision: D6971784 Pulled By: maysamyabandeh fbshipit-source-id: cbab10ad538fa2b8bcb47e37c77724afe6e30f03
-
由 Andrew Kryczka 提交于
Summary: - Refactored logic for checking write stall condition to a helper function: `GetWriteStallConditionAndCause`. Now it is decoupled from the logic for updating WriteController / stats in `RecalculateWriteStallConditions`, so we can reuse it for predicting whether write stall will occur. - Updated `CompactRange` to first check whether the one additional immutable memtable / L0 file would cause stalling before it flushes. If so, it waits until that is no longer true. - Updated `bg_cv_` to be signaled on `SetOptions` calls. The stall conditions `CompactRange` cares about can change when (1) flush finishes, (2) compaction finishes, or (3) options dynamically change. The cv was already signaled for (1) and (2) but not yet for (3). Closes https://github.com/facebook/rocksdb/pull/3381 Differential Revision: D6754983 Pulled By: ajkr fbshipit-source-id: 5613e03f1524df7192dc6ae885d40fd8f091d972
-
由 Andrew Kryczka 提交于
Summary: Some workloads (like my current benchmarking) may want partitioned indexes without partitioned filters. Particularly, when `-optimize_filters_for_hits=true`, the total index size may be larger than the total filter size, so it can make sense to hold all filters in-memory but not all indexes. Closes https://github.com/facebook/rocksdb/pull/3492 Differential Revision: D6970092 Pulled By: ajkr fbshipit-source-id: b7fa1828e1d13829339aefb90fd56eb7c5337f61
-
由 Zhongyi Xie 提交于
Summary: Closes https://github.com/facebook/rocksdb/pull/3487 Differential Revision: D6967098 Pulled By: miasantreble fbshipit-source-id: 48e0accf2e3b3f589ddb797ff8083c8520269bf0
-
- 10 2月, 2018 5 次提交
-
-
由 Siying Dong 提交于
Summary: Right now, users will encounter unexpected bahavior if they use key or value larger than 4GB. We should explicitly fail the queriers. Closes https://github.com/facebook/rocksdb/pull/3484 Differential Revision: D6953895 Pulled By: siying fbshipit-source-id: b60491e1af064fc5d52971956661f6c18ceac24f
-
由 Yi Wu 提交于
Summary: CompactionIterator invoke MergeHelper::MergeUntil() to do partial merge between snapshot boundaries. Previously it only depend on sequence number to tell snapshot boundary, but we also need to make use of snapshot_checker to verify visibility of the merge operands to the snapshots. For example, say there is a snapshot with seq = 2 but only can see data with seq <= 1. There are three merges, each with seq = 1, 2, 3. A correct compaction output would be (1),(2+3). Without taking snapshot_checker into account when generating merge result, compaction will generate output (1+2),(3). By filtering uncommitted keys with read callback, the read path already take care of merges well and don't need additional updates. Closes https://github.com/facebook/rocksdb/pull/3475 Differential Revision: D6926087 Pulled By: yiwu-arbug fbshipit-source-id: 8f539d6f897cfe29b6dc27a8992f68c2a629d40a
-
由 Chinmay Kamat 提交于
Summary: This pull request contains miscellaneous compilation fixes. Thanks, Chinmay Closes https://github.com/facebook/rocksdb/pull/3462 Differential Revision: D6941424 Pulled By: sagar0 fbshipit-source-id: fe9c26507bf131221f2466740204bff40a15614a
-
由 jsteemann 提交于
Summary: Closes https://github.com/facebook/rocksdb/pull/3481 Differential Revision: D6939089 Pulled By: siying fbshipit-source-id: ccce3ae10cc5ff50a74b85804afd044b21a3c3e2
-
由 Zhongyi Xie 提交于
Summary: It's always a mystery from the logs why flush was triggered -- user triggered it manually, WriteBufferManager triggered it, logs were full, write buffer was full, etc. This PR logs Flush reason whenever a flush is scheduled. Closes https://github.com/facebook/rocksdb/pull/3401 Differential Revision: D6788142 Pulled By: miasantreble fbshipit-source-id: a867e54d493c06adf5172bd36a180fb3faae3511
-
- 08 2月, 2018 4 次提交
-
-
由 Andrew Kryczka 提交于
Summary: `ReadBlockFromFile` uses a stack buffer to hold small data blocks before passing them to the compression library, which outputs uncompressed data in a heap buffer. In the case of `kNoCompression` there is a `memcpy` to copy from stack buffer to heap buffer. This PR optimizes `ReadBlockFromFile` to skip the stack buffer for files whose blocks are known to be uncompressed. We determine this using the SST file property, "compression_name", if it's available. Closes https://github.com/facebook/rocksdb/pull/3472 Differential Revision: D6920848 Pulled By: ajkr fbshipit-source-id: 5c753e804efc178b9229ae5dbe6a4adc32031f07
-
由 Siying Dong 提交于
Summary: WritePreparedTransactionTest has the UBSAN error because the wrong order of its parent class construction. Fix it. Closes https://github.com/facebook/rocksdb/pull/3478 Differential Revision: D6928975 Pulled By: siying fbshipit-source-id: 13edfd5cb9cf73f1ac5ae3b6f53061d32783733d
-
由 Siying Dong 提交于
Summary: …db_test options_settable_test won't pass UBSAN so disable it. blob_db_test fails in UBSAN as SnapshotList doesn't initialize all the fields in dummy snapshot. Fix it. I don't understand why only blob_db_test fails though. Closes https://github.com/facebook/rocksdb/pull/3477 Differential Revision: D6928681 Pulled By: siying fbshipit-source-id: e31dd300fcdecdfd4f6af279a0987fd0cdec5122
-
由 Siying Dong 提交于
Summary: Disable alignment check in UBSAN for now. Now we can't get signals to meaningful failures. We can reenable it after we figure out how we can suppress failures in finer grain manner. Closes https://github.com/facebook/rocksdb/pull/3473 Differential Revision: D6925971 Pulled By: siying fbshipit-source-id: a0f1a242cde866abbc5c1eeee9ff8d1d7d582ac4
-
- 07 2月, 2018 3 次提交
-
-
由 Maysam Yabandeh 提交于
Summary: Compared to DB::Write, TransactionDB::Write has the additional overhead of creating and initializing an internal transaction object, as well as the overhead of locking/unlocking the keys. This patch extends the TransactionDB::Write with an skip_cc option to allow the users to indicate that the write batch do not conflict with others and the concurrency control and its overhead can be skipped. TransactionDB::Write by default calls DB::Write when skip_cc is set, which works for WriteCommitted WritePolicy. Any other flavor of TransactionDB that is not compatible with this default behavior (such as WritePreparedTxnDB) can extend ::Write and implement their own approach for taking into account the skip_cc optimization. Closes https://github.com/facebook/rocksdb/pull/3457 Differential Revision: D6877318 Pulled By: maysamyabandeh fbshipit-source-id: 56f4e21db87ff71492db4e376fb7c2b03dfeab6b
-
由 Maysam Yabandeh 提交于
Summary: Deletes the transaction object at the end of the test. Verified by: - COMPILE_WITH_ASAN=1 make -j32 transaction_test - ./transaction_test --gtest_filter="DBA**Duplicate*" Closes https://github.com/facebook/rocksdb/pull/3470 Differential Revision: D6916473 Pulled By: maysamyabandeh fbshipit-source-id: 8303df25408635d5d3ac2b25f309a3d15957c937
-
由 Yi Wu 提交于
Summary: Update compaction_iterator_test with write-prepared transaction DB related tests. Transaction related tests are group in CompactionIteratorWithSnapshotCheckerTest. The existing test are duplicated to make them also test with dummy SnapshotChecker that will say every key is visible to every snapshot (this is okay, we still compare sequence number to verify visibility). Merge related tests are disabled and will be revisit in another PR. Existing db_iterator_tests are also duplicated to test with dummy read_callback that will say every key is committed. Closes https://github.com/facebook/rocksdb/pull/3466 Differential Revision: D6909253 Pulled By: yiwu-arbug fbshipit-source-id: 2ae4656b843a55e2e9ff8beecf21f2832f96cd25
-