1. 16 Feb, 2018 1 commit
    • J
      fix wrong indentation · 6a30b98f
      jsteemann committed
      Summary:
      Somehow the indentation was incorrect in this file.
      The only change in this PR is to get it right again in order to make the code more readable.
      Please reject if you think it's not worth it.
      Closes https://github.com/facebook/rocksdb/pull/3504
      
      Differential Revision: D6996011
      
      Pulled By: miasantreble
      
      fbshipit-source-id: 060514a3a8c910d34bad795b36eb4d278512b154
  2. 15 Feb, 2018 1 commit
  3. 14 Feb, 2018 5 commits
  4. 13 Feb, 2018 5 commits
    • S
      Customized BlockBasedTableIterator and LevelIterator · b555ed30
      Siying Dong committed
      Summary:
      Use a customized BlockBasedTableIterator and LevelIterator to replace the current implementations, which leverage the two-level iterator. The hope is that the customized logic makes the code easier to understand. As a side effect, BlockBasedTableIterator reduces the allocation for the data block iterator object and avoids the virtual function call to it, because it can directly reference BlockIter, a final class. Similarly, LevelIterator reduces virtual function calls to the dummy iterator that iterates over the file metadata. It also enables further optimization.
      
      The upper bound check is also moved from the index block to the data block, which fits this iterator better. After the change, the forward iterator is slightly optimized to ensure we trim those iterators.
      
      The two-level iterator is now only used by the partitioned index, so it is simplified.
      Closes https://github.com/facebook/rocksdb/pull/3406
      
      Differential Revision: D6809041
      
      Pulled By: siying
      
      fbshipit-source-id: 7da3b9b1d3c8e9d9405302c15920af1fcaf50ffa
    • M
      WritePrepared Txn: use TransactionDBWriteOptimizations (2nd attempt) · 8a04ee4f
      Maysam Yabandeh committed
      Summary:
      TransactionDB::Write can receive some optimization hints from the user. One is to skip the concurrency control mechanism. WritePreparedTxnDB is currently ignoring such hints. This patch optimizes WritePreparedTxnDB::Write for skip_concurrency_control and skip_duplicate_key_check hints.
      Closes https://github.com/facebook/rocksdb/pull/3496
      
      Differential Revision: D6971784
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: cbab10ad538fa2b8bcb47e37c77724afe6e30f03
    • A
      Add delay before flush in CompactRange to avoid write stalling · ee1c8026
      Andrew Kryczka committed
      Summary:
      - Refactored logic for checking write stall condition to a helper function: `GetWriteStallConditionAndCause`. Now it is decoupled from the logic for updating WriteController / stats in `RecalculateWriteStallConditions`, so we can reuse it for predicting whether write stall will occur.
      - Updated `CompactRange` to first check whether the one additional immutable memtable / L0 file would cause stalling before it flushes. If so, it waits until that is no longer true.
      - Updated `bg_cv_` to be signaled on `SetOptions` calls. The stall conditions `CompactRange` cares about can change when (1) flush finishes, (2) compaction finishes, or (3) options dynamically change. The cv was already signaled for (1) and (2) but not yet for (3).
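The wait-before-flush idea in the bullets above can be sketched roughly as follows. This is a minimal illustration, not RocksDB code: `StallState`, `WouldStall`, `FlushGate`, and the stall rule (stall once the number of unflushed memtables reaches `max_write_buffer_number`) are assumptions made for the example. It shows a side-effect-free predicate being reused to predict a stall, and a condition variable that is signaled both when a flush finishes and when an option changes.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical, simplified stand-in for the state a stall check inspects.
struct StallState {
  int num_unflushed_memtables = 0;  // immutable memtables not yet flushed
  int max_write_buffer_number = 2;  // assumed rule: stall once this is reached
};

// Side-effect-free predicate, in the spirit of a helper that only computes
// the stall condition. The rule itself is an assumption for illustration.
inline bool WouldStall(const StallState& s) {
  return s.num_unflushed_memtables >= s.max_write_buffer_number;
}

class FlushGate {
 public:
  // True if one additional immutable memtable would not cause a stall.
  bool FlushIsSafe() const {
    std::lock_guard<std::mutex> lock(mu_);
    return FlushIsSafeLocked();
  }

  // Blocks until one additional immutable memtable would not cause a stall.
  void WaitUntilFlushIsSafe() {
    std::unique_lock<std::mutex> lock(mu_);
    cv_.wait(lock, [this] { return FlushIsSafeLocked(); });
  }

  void AddImmutableMemtable() {
    std::lock_guard<std::mutex> lock(mu_);
    ++state_.num_unflushed_memtables;
  }

  // A flush finished: one fewer immutable memtable, so wake any waiter.
  void OnFlushFinished() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      if (state_.num_unflushed_memtables > 0) --state_.num_unflushed_memtables;
    }
    cv_.notify_all();
  }

  // An option changed dynamically: the predicate may flip, so wake any waiter.
  void SetMaxWriteBufferNumber(int n) {
    {
      std::lock_guard<std::mutex> lock(mu_);
      state_.max_write_buffer_number = n;
    }
    cv_.notify_all();
  }

 private:
  // Predicts the state after one more memtable and reuses the predicate.
  bool FlushIsSafeLocked() const {
    StallState predicted = state_;
    ++predicted.num_unflushed_memtables;
    return !WouldStall(predicted);
  }

  mutable std::mutex mu_;
  std::condition_variable cv_;
  StallState state_;
};
```

Without the notification in `SetMaxWriteBufferNumber`, a thread blocked in `WaitUntilFlushIsSafe` would not notice that raising the limit had already made the flush safe.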
      Closes https://github.com/facebook/rocksdb/pull/3381
      
      Differential Revision: D6754983
      
      Pulled By: ajkr
      
      fbshipit-source-id: 5613e03f1524df7192dc6ae885d40fd8f091d972
    • A
      db_bench separate options for partition index and filters · 0a0fad44
      Andrew Kryczka committed
      Summary:
      Some workloads (like my current benchmarking) may want partitioned indexes without partitioned filters. Particularly, when `-optimize_filters_for_hits=true`, the total index size may be larger than the total filter size, so it can make sense to hold all filters in-memory but not all indexes.
      Closes https://github.com/facebook/rocksdb/pull/3492
      
      Differential Revision: D6970092
      
      Pulled By: ajkr
      
      fbshipit-source-id: b7fa1828e1d13829339aefb90fd56eb7c5337f61
    • Z
      make flush_reason_ atomic to keep TSAN happy · 3f1bb073
      Zhongyi Xie committed
      Summary: Closes https://github.com/facebook/rocksdb/pull/3487
      
      Differential Revision: D6967098
      
      Pulled By: miasantreble
      
      fbshipit-source-id: 48e0accf2e3b3f589ddb797ff8083c8520269bf0
  5. 10 Feb, 2018 5 commits
  6. 08 Feb, 2018 4 commits
    • A
      Eliminate a memcpy for uncompressed blocks · e78715c2
      Andrew Kryczka committed
      Summary:
      `ReadBlockFromFile` uses a stack buffer to hold small data blocks before passing them to the compression library, which outputs uncompressed data in a heap buffer. In the case of `kNoCompression` there is a `memcpy` to copy from stack buffer to heap buffer.
      
      This PR optimizes `ReadBlockFromFile` to skip the stack buffer for files whose blocks are known to be uncompressed. We determine this using the SST file property, "compression_name", if it's available.
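The saved copy can be illustrated with a small sketch. Everything here is hypothetical and simplified: the "file" is an in-memory string, and `ReadAt`, `BlockContents`, `ReadUncompressedBlock`, and `kStackBufSize` are names invented for the example, not RocksDB's. The point is only that when a block is known up front to be uncompressed, it can be read straight into its final heap buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>

constexpr size_t kStackBufSize = 4096;  // hypothetical small-block threshold

struct BlockContents {
  std::unique_ptr<char[]> data;
  size_t size = 0;
  int extra_copies = 0;  // copies made after the initial file read
};

// Stand-in for a positional file read; the caller guarantees
// offset + n <= file.size().
inline void ReadAt(const std::string& file, size_t offset, size_t n,
                   char* dst) {
  std::memcpy(dst, file.data() + offset, n);
}

// Reads an uncompressed block of n bytes. If the caller already knows the
// block is uncompressed, the read goes straight into the heap buffer.
// Otherwise a small block goes through a stack buffer first and is then
// copied, which is the copy the summary above describes.
BlockContents ReadUncompressedBlock(const std::string& file, size_t offset,
                                    size_t n, bool known_uncompressed) {
  BlockContents out;
  out.size = n;
  out.data.reset(new char[n]);
  if (known_uncompressed || n > kStackBufSize) {
    ReadAt(file, offset, n, out.data.get());
    return out;
  }
  char stack_buf[kStackBufSize];
  ReadAt(file, offset, n, stack_buf);
  std::memcpy(out.data.get(), stack_buf, n);  // the avoidable copy
  out.extra_copies = 1;
  return out;
}
```

Both paths return the same bytes; only the number of intermediate copies differs.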
      Closes https://github.com/facebook/rocksdb/pull/3472
      
      Differential Revision: D6920848
      
      Pulled By: ajkr
      
      fbshipit-source-id: 5c753e804efc178b9229ae5dbe6a4adc32031f07
    • S
      Fix UBSAN Error in WritePreparedTransactionTest · a0931b31
      Siying Dong committed
      Summary:
      WritePreparedTransactionTest has a UBSAN error because of the wrong order of its parent class construction. Fix it.
      Closes https://github.com/facebook/rocksdb/pull/3478
      
      Differential Revision: D6928975
      
      Pulled By: siying
      
      fbshipit-source-id: 13edfd5cb9cf73f1ac5ae3b6f53061d32783733d
    • S
      Disable options_settable_test in UBSAN and fix UBSAN failure in blob_… · 821e0b16
      Siying Dong committed
      Summary:
      …db_test
      
      options_settable_test won't pass UBSAN so disable it.
      blob_db_test fails in UBSAN as SnapshotList doesn't initialize all the fields in dummy snapshot. Fix it. I don't understand why only blob_db_test fails though.
      Closes https://github.com/facebook/rocksdb/pull/3477
      
      Differential Revision: D6928681
      
      Pulled By: siying
      
      fbshipit-source-id: e31dd300fcdecdfd4f6af279a0987fd0cdec5122
    • S
      Disable alignment check in UBSAN · 1336a774
      Siying Dong committed
      Summary:
      Disable the alignment check in UBSAN for now. At the moment we cannot get a signal for meaningful failures. We can re-enable it after we figure out how to suppress failures in a finer-grained manner.
      Closes https://github.com/facebook/rocksdb/pull/3473
      
      Differential Revision: D6925971
      
      Pulled By: siying
      
      fbshipit-source-id: a0f1a242cde866abbc5c1eeee9ff8d1d7d582ac4
  7. 07 Feb, 2018 4 commits
    • M
      Add skip_cc option to TransactionDB::Write · 8feee280
      Maysam Yabandeh committed
      Summary:
      Compared to DB::Write, TransactionDB::Write has the additional overhead of creating and initializing an internal transaction object, as well as the overhead of locking/unlocking the keys. This patch extends TransactionDB::Write with a skip_cc option that lets users indicate that the write batch does not conflict with others, so the concurrency control and its overhead can be skipped. TransactionDB::Write by default calls DB::Write when skip_cc is set, which works for the WriteCommitted WritePolicy. Any other flavor of TransactionDB that is not compatible with this default behavior (such as WritePreparedTxnDB) can extend ::Write and implement its own approach for taking the skip_cc optimization into account.
      Closes https://github.com/facebook/rocksdb/pull/3457
      
      Differential Revision: D6877318
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 56f4e21db87ff71492db4e376fb7c2b03dfeab6b
    • M
      Fix leak report by asan on DuplicateKeys test · 8f8eb4f1
      Maysam Yabandeh committed
      Summary:
      Deletes the transaction object at the end of the test.
      Verified by:
      - COMPILE_WITH_ASAN=1 make -j32 transaction_test
      - ./transaction_test --gtest_filter="DBA**Duplicate*"
      Closes https://github.com/facebook/rocksdb/pull/3470
      
      Differential Revision: D6916473
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 8303df25408635d5d3ac2b25f309a3d15957c937
    • Y
      WritePrepared Txn: update compaction_iterator_test and db_iterator_test · 81736d8a
      Yi Wu committed
      Summary:
      Update compaction_iterator_test with write-prepared transaction DB related tests. Transaction-related tests are grouped in CompactionIteratorWithSnapshotCheckerTest. The existing tests are duplicated so that they also run with a dummy SnapshotChecker that says every key is visible to every snapshot (this is okay, we still compare sequence numbers to verify visibility). Merge-related tests are disabled and will be revisited in another PR.
      
      Existing db_iterator_tests are also duplicated to test with dummy read_callback that will say every key is committed.
      Closes https://github.com/facebook/rocksdb/pull/3466
      
      Differential Revision: D6909253
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: 2ae4656b843a55e2e9ff8beecf21f2832f96cd25
    • Z
      split RandomizedHarnessTest more ways · 2f299917
      Zhongyi Xie committed
      Summary:
      RandomizedHarnessTest enumerates different combinations of test type, compression type, restart interval, etc. For some combinations it takes very long to finish, causing the test to time out in test infrastructure.
      This PR splits the test input into smaller chunks in the hope that they will fit in the timeout window. Another possibility is, of course, to reduce `num_entries`.
      Closes https://github.com/facebook/rocksdb/pull/3467
      
      Differential Revision: D6910235
      
      Pulled By: miasantreble
      
      fbshipit-source-id: 717246ee5d21a8a48ad82d4d9c04f9051a66f07f
  8. 06 Feb, 2018 2 commits
    • M
      WritePrepared Txn: Duplicate Keys, Txn Part · 88d8b2a2
      Maysam Yabandeh committed
      Summary:
      This patch takes advantage of the memtable being able to detect a duplicate <key,seq> and return TryAgain to handle duplicate keys in WritePrepared Txns. Through WriteBatchWithIndex's index it detects the existence of at least one duplicate key in the write batch. If a duplicate key was reported, it then pays the cost of counting the number of sub-batches by iterating over the write batch and passes it to DBImpl::Write. DB will make use of the provided batch_count to assign proper sequence numbers before sending them to the WAL. When later inserting the batch into the memtable, it increases the seq each time the memtable reports a duplicate (a sub-batch in our counting) and tries again.
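The counting step described above can be sketched as follows, under simplifying assumptions: the write batch is reduced to a plain list of keys, and `CountSubBatches` is a hypothetical helper rather than the function the patch adds. A new sub-batch starts whenever a key repeats within the current one, so every sub-batch contains each key at most once and can be given its own sequence number.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Counts how many duplicate-free sub-batches a sequence of keys splits into.
// A batch without any repeated key (including an empty one) counts as one.
size_t CountSubBatches(const std::vector<std::string>& keys) {
  size_t sub_batches = 1;
  std::unordered_set<std::string> seen;  // keys in the current sub-batch
  for (const std::string& key : keys) {
    if (!seen.insert(key).second) {
      // Duplicate within the current sub-batch: start a new one that
      // begins with this key.
      ++sub_batches;
      seen.clear();
      seen.insert(key);
    }
  }
  return sub_batches;
}
```

For the keys `a, b, a, c, b, b` this yields three sub-batches: `a b`, `a c b`, and `b`.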
      Closes https://github.com/facebook/rocksdb/pull/3455
      
      Differential Revision: D6873699
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: db8487526c3a5dc1ddda0ea49f0f979b26ae648d
    • A
      Handle error return from WriteBuffer() · 4b124fb9
      Anand Ananthabhotla committed
      Summary:
      There are a couple of places where we swallow any error from
      WriteBuffer() - in SwitchMemtable() and DBImpl::CloseImpl(). Propagate
      the error up in those cases rather than ignoring it.
      Closes https://github.com/facebook/rocksdb/pull/3404
      
      Differential Revision: D6879954
      
      Pulled By: anand1976
      
      fbshipit-source-id: 2ef88b554be5286b0a8bad7384ba17a105395bdb
  9. 04 Feb, 2018 1 commit
  10. 03 Feb, 2018 3 commits
  11. 02 Feb, 2018 4 commits
    • P
      options: Fix coverity issues · 6e5b341e
      Prashant D committed
      Summary:
      options/cf_options.cc:
       77      memtable_insert_with_hint_prefix_extractor(
      
      CID 1396208 (#1 of 1): Uninitialized scalar field (UNINIT_CTOR)
      2. uninit_member: Non-static class member info_log_level is not initialized in this constructor nor in any functions that it calls.
      Closes https://github.com/facebook/rocksdb/pull/3106
      
      Differential Revision: D6874689
      
      Pulled By: sagar0
      
      fbshipit-source-id: b5cd2d13915fd86d87260050f9c5d117615bbe30
    • J
      crc32: suppress -Wimplicit-fallthrough warnings · e502839e
      Jun Wu committed
      Summary:
      Work around a bunch of "implicit-fallthrough" compiler errors, like:
      
      ```
      util/crc32c.cc:533:7: error: this statement may fall through [-Werror=implicit-fallthrough=]
         crc = _mm_crc32_u64(crc, *(uint64_t*)(buf + offset));
             ^
      util/crc32c.cc:1016:9: note: in expansion of macro ‘CRCsinglet’
               CRCsinglet(crc0, next, -2 * 8);
               ^~~~~~~~~~
      util/crc32c.cc:1017:7: note: here
             case 1:
      ```
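For illustration only, one common way to mark an intentional fallthrough so that `-Wimplicit-fallthrough` stays quiet is the standard C++17 `[[fallthrough]]` attribute. Whether this commit uses the attribute, a comment, or a macro is not stated in the summary, so treat the mechanism as an assumption; `SumTail` is a hypothetical function that mimics the "handle the remaining bytes" switch pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Sums the first n (0..3) bytes of buf using deliberate fallthrough, with
// each fallthrough marked explicitly so the compiler does not warn.
uint32_t SumTail(const uint8_t* buf, size_t n) {
  uint32_t sum = 0;
  switch (n) {
    case 3:
      sum += buf[2];
      [[fallthrough]];
    case 2:
      sum += buf[1];
      [[fallthrough]];
    case 1:
      sum += buf[0];
      break;
    default:
      break;
  }
  return sum;
}
```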
      Closes https://github.com/facebook/rocksdb/pull/3339
      
      Reviewed By: sagar0
      
      Differential Revision: D6874736
      
      Pulled By: quark-zju
      
      fbshipit-source-id: eec9f3bc135e12fca336928d01711006d5c3cb16
    • F
      Upgrade Appveyor to VS2017 · ba8aa8fd
      Fosco Marotto committed
      Summary:
      Per some discussions, this will switch our Appveyor testing to use Visual Studio 2017.
      Closes https://github.com/facebook/rocksdb/pull/3445
      
      Differential Revision: D6874918
      
      Pulled By: gfosco
      
      fbshipit-source-id: c5a0032ca9f37f0d3baeae35c59d850d528c3176
    • A
      fix ReadaheadRandomAccessFile/iterator prefetch bug · b78ed046
      Andrew Kryczka committed
      Summary:
      `ReadaheadRandomAccessFile` is used by iterators for file reads in several cases, like in compaction when `compaction_readahead_size > 0` or `use_direct_io_for_flush_and_compaction == true`, or in user iterator when `ReadOptions::readahead_size > 0`. `ReadaheadRandomAccessFile` maintains an internal buffer for readahead data. It assumes that, if the buffer's length is less than `ReadaheadRandomAccessFile::readahead_size_`, which is fixed in the constructor, then EOF has been reached so it doesn't try reading further.
      
      Recently, d938226a started calling `RandomAccessFile::Prefetch` with various lengths: 8KB, 16KB, etc. When the `RandomAccessFile` is a `ReadaheadRandomAccessFile`, it triggers the above condition and incorrectly determines EOF. If a block is partially in the readahead buffer and EOF is incorrectly decided, the result is a truncated data block.
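The failure mode described above can be reproduced in a toy model. `ReadaheadReaderSketch` below is a hypothetical, heavily simplified reader over an in-memory string, not RocksDB's class. Its `Read` treats a buffer shorter than `readahead_size_` as proof of EOF, so a short `Prefetch` followed by a read that straddles the end of the buffer returns truncated data. The `ignore_short_prefetch` flag shows one possible guard (skipping prefetches smaller than the readahead size); the summary does not say how the PR itself resolves the bug, so the guard is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

class ReadaheadReaderSketch {
 public:
  ReadaheadReaderSketch(std::string file, size_t readahead_size,
                        bool ignore_short_prefetch)
      : file_(std::move(file)),
        readahead_size_(readahead_size),
        ignore_short_prefetch_(ignore_short_prefetch) {}

  // Loads up to n bytes starting at offset into the internal buffer.
  void Prefetch(size_t offset, size_t n) {
    if (ignore_short_prefetch_ && n < readahead_size_) return;  // guard
    FillBuffer(offset, n);
  }

  // Returns up to n bytes starting at offset.
  std::string Read(size_t offset, size_t n) {
    std::string result;
    if (offset >= buffer_offset_ &&
        offset < buffer_offset_ + buffer_.size()) {
      result = buffer_.substr(offset - buffer_offset_, n);
      if (result.size() == n) return result;  // fully served from buffer
      // Flawed assumption: a buffer shorter than readahead_size_ means the
      // previous fill hit EOF, so there is nothing more to read.
      if (buffer_.size() < readahead_size_) return result;
      offset += result.size();
      n -= result.size();
    }
    FillBuffer(offset, std::max(n, readahead_size_));
    result += buffer_.substr(0, n);
    return result;
  }

 private:
  void FillBuffer(size_t offset, size_t n) {
    buffer_offset_ = offset;
    buffer_ = offset < file_.size() ? file_.substr(offset, n) : std::string();
  }

  std::string file_;
  size_t readahead_size_;
  bool ignore_short_prefetch_;
  size_t buffer_offset_ = 0;
  std::string buffer_;
};
```

With a 32-byte readahead size, calling `Prefetch(0, 8)` and then `Read(4, 10)` on the unguarded reader returns only 4 bytes, even though the file is much longer. The guarded reader returns all 10.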
      
      The problem is reproducible:
      
      ```
      TEST_TMPDIR=/data/compaction_bench ./db_bench -benchmarks=fillrandom -write_buffer_size=1048576 -target_file_size_base=1048576 -block_size=18384 -use_direct_io_for_flush_and_compaction=true
      ...
      put error: Corruption: truncated block read from /data/compaction_bench/dbbench/000014.sst offset 20245, expected 10143 bytes, got 8427
      ```
      Closes https://github.com/facebook/rocksdb/pull/3454
      
      Differential Revision: D6869405
      
      Pulled By: ajkr
      
      fbshipit-source-id: 87001c299e7600a37c0dcccbd0368e0954c929cf
  12. 01 Feb, 2018 5 commits