1. 10 May, 2018 1 commit
  2. 09 May, 2018 2 commits
    • Add missing options in BuildColumnfamilyOptions · cee138c7
      Committed by Huachao Huang
      Summary:
      soft_pending_compaction_bytes_limit and hard_pending_compaction_bytes_limit are added to BuildColumnfamilyOptions.
      Closes https://github.com/facebook/rocksdb/pull/3823
      
      Differential Revision: D7909246
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 89032efbf6b5bd302ea50cbd7a234977984a1fca
    • Disable readahead when using mmap for reads · 4bf169f0
      Committed by Andrew Kryczka
      Summary:
      `ReadaheadRandomAccessFile` had an unwritten assumption: its wrapped file's `Read()` function always copies into the provided scratch buffer. This was not true when the wrapped file was `PosixMmapReadableFile`, whose `Read()` implementation does no copying and instead returns a `Slice` pointing directly into the `mmap`'d memory region. This PR:

      - prevents `ReadaheadRandomAccessFile` from ever wrapping mmap readable files
      - adds an assert for the assumption `ReadaheadRandomAccessFile` makes about the wrapped file's use of the scratch buffer
      Closes https://github.com/facebook/rocksdb/pull/3813
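
The scratch-buffer assumption in this summary can be illustrated with a small standalone sketch. It does not use RocksDB; `SliceView`, `Reader`, `CopyingReader`, `MappedReader`, `CanWrapWithReadahead`, and `ReadUsedScratch` are hypothetical names chosen for illustration, and callers are assumed to pass a scratch buffer of at least `n` bytes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>

// Hypothetical stand-in for a slice: a pointer plus a length.
struct SliceView {
  const char* data;
  size_t size;
};

// Hypothetical reader interface. Read() may or may not use `scratch`.
class Reader {
 public:
  virtual ~Reader() = default;
  virtual SliceView Read(size_t offset, size_t n, char* scratch) const = 0;
  // True when Read() hands out pointers into a mapped region.
  virtual bool UsesMmap() const = 0;
};

// Copies the requested bytes into `scratch`, as a pread-style file would.
class CopyingReader : public Reader {
 public:
  explicit CopyingReader(std::string contents) : contents_(std::move(contents)) {}
  SliceView Read(size_t offset, size_t n, char* scratch) const override {
    if (offset >= contents_.size()) return {scratch, 0};
    n = std::min(n, contents_.size() - offset);
    std::memcpy(scratch, contents_.data() + offset, n);
    return {scratch, n};
  }
  bool UsesMmap() const override { return false; }

 private:
  std::string contents_;
};

// Ignores `scratch` and points straight into its own storage, the way an
// mmap-backed file points into the mapped region.
class MappedReader : public Reader {
 public:
  explicit MappedReader(std::string contents) : contents_(std::move(contents)) {}
  SliceView Read(size_t offset, size_t n, char* /*scratch*/) const override {
    if (offset >= contents_.size()) return {contents_.data() + contents_.size(), 0};
    n = std::min(n, contents_.size() - offset);
    return {contents_.data() + offset, n};
  }
  bool UsesMmap() const override { return true; }

 private:
  std::string contents_;
};

// A readahead wrapper should only be applied to readers that copy.
bool CanWrapWithReadahead(const Reader& reader) { return !reader.UsesMmap(); }

// The assumption the wrapper relies on: non-empty results live in `scratch`.
bool ReadUsedScratch(const Reader& reader, size_t offset, size_t n, char* scratch) {
  SliceView result = reader.Read(offset, n, scratch);
  return result.size == 0 || result.data == scratch;
}
```

In this sketch, `ReadUsedScratch` is the condition an assert would check, and `CanWrapWithReadahead` is the guard that keeps the wrapper away from mapped readers.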
      
      Differential Revision: D7891513
      
      Pulled By: ajkr
      
      fbshipit-source-id: dc64a55222d6af280c39a1852ee39e9e9d7cde7d
  3. 08 May, 2018 4 commits
    • Link jemalloc · 1d9f24dc
      Committed by Tongliang Liao
      Summary:
      Fix undefined reference to `malloc_*` linking errors on Linux.
      Closes https://github.com/facebook/rocksdb/pull/3817
      
      Differential Revision: D7899066
      
      Pulled By: ajkr
      
      fbshipit-source-id: 18c46569a59608388d6240f1b8ec20c2d2557dec
    • Allows other cmake-specific "true" for USE_RTTI. · 9470ee45
      Committed by Tongliang Liao
      Summary:
      People also use ON/OFF, TRUE/FALSE and other switch options that are allowed by CMake.
      Closes https://github.com/facebook/rocksdb/pull/3814
      
      Differential Revision: D7899032
      
      Pulled By: ajkr
      
      fbshipit-source-id: b71511af59e0a78eedafb639b5002c47050bf3c2
    • Search paths provided by intel's "tbbvars.sh". · 6d6e01cd
      Committed by Tongliang Liao
      Summary:
      TBBROOT and LIBRARY_PATH are set in env by the script.
      
      With TBB 2018 the library path is $TBBROOT/lib/intel64/gcc4.7 for anything above gcc 4.7, which is both compiler and architecture related. We cannot simply do ${TBB_ROOT_DIR}/lib.
      Closes https://github.com/facebook/rocksdb/pull/3815
      
      Differential Revision: D7899006
      
      Pulled By: ajkr
      
      fbshipit-source-id: 159ab1f6a5c40452ed6aa8d79300206953d916c2
    • Split FaultInjectionTest.FaultTest to avoid timeout · d72a51e9
      Committed by Maysam Yabandeh
      Summary:
      The tsan flavor of this test occasionally times out in our test infra. The patch splits the test in two, with each part working on half of the option range.
      Before:
      [       OK ] FaultTest/FaultInjectionTest.FaultTest/0 (5918 ms)
      [       OK ] FaultTest/FaultInjectionTest.FaultTest/1 (5336 ms)
      After:
      [       OK ] FaultTest/FaultInjectionTestSplitted.FaultTest/0 (2930 ms)
      [       OK ] FaultTest/FaultInjectionTestSplitted.FaultTest/1 (2676 ms)
      [       OK ] FaultTest/FaultInjectionTestSplitted.FaultTest/2 (2759 ms)
      [       OK ] FaultTest/FaultInjectionTestSplitted.FaultTest/3 (2546 ms)
      Closes https://github.com/facebook/rocksdb/pull/3819
      
      Differential Revision: D7894975
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 809f1411cbcc27f8aa71a6b29a16b039f51b67c9
  4. 05 May, 2018 6 commits
    • Recommit "Avoid adding tombstones of the same file to RangeDelAggregator multiple times" · 72942ad7
      Committed by daheiantian
      Summary:
      The original commit #3635 hurt performance for users who aren't using range deletions because of unneeded std::set operations, so it was reverted by commit 44653c7b (see #3672).
      
      To fix this, move the set to  and add a check in , i.e., file will be added only if  is non-nullptr.
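
The general idea (track which files have already contributed tombstones, and skip the bookkeeping entirely when no aggregator is present) might look like the following standalone sketch. It is not the code from the commit: `TombstoneAggregator`, `AddFile`, and `MaybeAddTombstones` are hypothetical names, and the sketch makes no claim about where the set actually lives in RocksDB.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

// Hypothetical aggregator that remembers which files it has already seen.
class TombstoneAggregator {
 public:
  // Returns true if this call added the file, false if it was a repeat.
  bool AddFile(uint64_t file_number) {
    return added_files_.insert(file_number).second;
  }
  size_t NumFiles() const { return added_files_.size(); }

 private:
  std::set<uint64_t> added_files_;
};

// Hypothetical caller: with a null aggregator (no range deletions in use)
// no set operation happens at all, which is the point of the null check.
bool MaybeAddTombstones(TombstoneAggregator* aggregator, uint64_t file_number) {
  if (aggregator == nullptr) return false;
  return aggregator->AddFile(file_number);
}
```

Because the set is only touched behind the null check, callers that never use range deletions pay nothing for the deduplication.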
      
      The db_bench command that exposed the performance regression:
      > ./db_bench --benchmarks=fillrandom,seekrandomwhilewriting --threads=1 --num=1000000 --reads=150000 --key_size=66 --value_size=1262 --statistics=0 --compression_ratio=0.5 --histogram=1 --seek_nexts=1 --stats_per_interval=1 --stats_interval_seconds=600 --max_background_flushes=4 --num_multi_db=1 --max_background_compactions=16 --seed=1522388277 -write_buffer_size=1048576 --level0_file_num_compaction_trigger=10000 --compression_type=none
      
      I re-ran this command on the machine before and after the modification; the results are as follows:

      **fillrandom**

      | Version | P50 | P75 | P99 | P99.9 | P99.99 |
      | ------- | --- | --- | --- | ----- | ------ |
      | before commit | 5.92 | 8.57 | 19.63 | 980.97 | 12196.00 |
      | after commit | 5.91 | 8.55 | 19.34 | 965.56 | 13513.56 |

      **seekrandomwhilewriting**

      | Version | P50 | P75 | P99 | P99.9 | P99.99 |
      | ------- | --- | --- | --- | ----- | ------ |
      | before commit | 1418.62 | 1867.01 | 3823.28 | 4980.99 | 9240.00 |
      | after commit | 1450.54 | 1880.61 | 3962.87 | 5429.60 | 7542.86 |
      Closes https://github.com/facebook/rocksdb/pull/3800
      
      Differential Revision: D7874245
      
      Pulled By: ajkr
      
      fbshipit-source-id: 2e8bec781b3f7399246babd66395c88619534a17
    • Fix db_stress memory leak ASAN error · 4c5a3232
      Committed by Andrew Kryczka
      Summary:
      In case `--expected_values_path` is unset, we allocate a buffer internally to hold the expected DB state. This PR makes sure it is freed.
      Closes https://github.com/facebook/rocksdb/pull/3804
      
      Differential Revision: D7874694
      
      Pulled By: ajkr
      
      fbshipit-source-id: a8f7655e009507c4e639ceebfc3525d69c856e3b
    • Evenly split HarnessTest.Randomized · fc522bdb
      Committed by Maysam Yabandeh
      Summary:
      HarnessTest.Randomized is already split, but some of the splits are faster than others. The reason is that each split takes a contiguous range of the generated args, and tests with later args take longer to finish. The patch splits the args evenly among the splits in a round-robin fashion.
      Before:
      ```
      [       OK ] HarnessTest.Randomized1n2 (2278 ms)
      [       OK ] HarnessTest.Randomized3n4 (1095 ms)
      [       OK ] HarnessTest.Randomized5 (658 ms)
      [       OK ] HarnessTest.Randomized6 (1258 ms)
      [       OK ] HarnessTest.Randomized7 (6476 ms)
      [       OK ] HarnessTest.Randomized8 (8182 ms)
      ```
      After:
      ```
      [       OK ] HarnessTest.Randomized1 (2649 ms)
      [       OK ] HarnessTest.Randomized2 (2645 ms)
      [       OK ] HarnessTest.Randomized3 (2577 ms)
      [       OK ] HarnessTest.Randomized4 (2490 ms)
      [       OK ] HarnessTest.Randomized5 (2553 ms)
      [       OK ] HarnessTest.Randomized6 (2560 ms)
      [       OK ] HarnessTest.Randomized7 (2501 ms)
      [       OK ] HarnessTest.Randomized8 (2574 ms)
      ```
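
The difference between the two partitioning schemes described in this summary can be sketched in isolation. This is an illustration only, not the test code itself; `SplitContiguous` and `SplitRoundRobin` are hypothetical helpers, and the args are modeled as plain integers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Splits = std::vector<std::vector<int>>;

// Each split takes a contiguous range, so costly late args cluster together.
Splits SplitContiguous(const std::vector<int>& args, size_t num_splits) {
  assert(num_splits > 0);
  Splits splits(num_splits);
  // Ceiling division: the size of every chunk except possibly the last.
  const size_t chunk = (args.size() + num_splits - 1) / num_splits;
  for (size_t i = 0; i < args.size(); ++i) {
    splits[i / chunk].push_back(args[i]);
  }
  return splits;
}

// Arg i goes to split i % num_splits, so cheap and costly args are mixed.
Splits SplitRoundRobin(const std::vector<int>& args, size_t num_splits) {
  assert(num_splits > 0);
  Splits splits(num_splits);
  for (size_t i = 0; i < args.size(); ++i) {
    splits[i % num_splits].push_back(args[i]);
  }
  return splits;
}
```

If the cost of an arg grows with its position, the contiguous scheme puts the most expensive args in the last split, while the round-robin scheme gives every split a mix of cheap and expensive ones.
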
      Closes https://github.com/facebook/rocksdb/pull/3808
      
      Differential Revision: D7882663
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 09b749a9684b6d7d65466aa4b00c5334a49e833e
    • Rename vars to satisfy unity build · 171f415b
      Committed by Maysam Yabandeh
      Summary:
      Tested by "make unity_test"
      Closes https://github.com/facebook/rocksdb/pull/3807
      
      Differential Revision: D7882657
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 84862c18d7f2fc762bd96ad070eaeb6936e45159
    • Add USE_RTTI and default behavior to CMakeLists · 4d40b10e
      Committed by Fosco Marotto
      Summary:
      Proposed fix for #3701
      Closes https://github.com/facebook/rocksdb/pull/3801
      
      Differential Revision: D7868264
      
      Pulled By: gfosco
      
      fbshipit-source-id: 013963ed3d172c8dc2abd1dd5982580082ca5d2d
    • Fix crash test allocation error under TSAN · 6fc1bcce
      Committed by Andrew Kryczka
      Summary:
      We were seeing the following error: "ThreadSanitizer: DenseSlabAllocator overflow. Dying."
      
      It is fixable by mmap'ing a smaller region for keys' expected values, which this PR achieves by reducing the number of keys.
      Closes https://github.com/facebook/rocksdb/pull/3803
      
      Differential Revision: D7874478
      
      Pulled By: ajkr
      
      fbshipit-source-id: 433939f5cb92410ab4777d540cb0cc2ee0fe6c2e
  5. 04 May, 2018 4 commits
    • MaxFileSizeForLevel: adjust max_file_size for dynamic level compaction · a7034328
      Committed by Zhongyi Xie
      Summary:
      `MutableCFOptions::RefreshDerivedOptions` always assumes the base level is L1, which is not true when `level_compaction_dynamic_level_bytes=true` and level-based compaction is used.
      This PR fixes this by recomputing `max_file_size` at query time (in `MaxFileSizeForLevel`).
      Fixes https://github.com/facebook/rocksdb/issues/3229
      
      In master:
      
      ```
      Level Files Size(MB)
      --------------------
        0       14      846
        1        0        0
        2        0        0
        3        0        0
        4        0        0
        5       15      366
        6       11      481
      Cumulative compaction: 3.83 GB write, 2.27 GB read
      ```
      In branch:
      ```
      Level Files Size(MB)
      --------------------
        0        9      544
        1        0        0
        2        0        0
        3        0        0
        4        0        0
        5        0        0
        6      445      935
      Cumulative compaction: 2.91 GB write, 1.46 GB read
      ```
      
      db_bench command used:
      ```
      ./db_bench --benchmarks="fillrandom,deleterandom,fillrandom,levelstats,stats" --statistics -deletes=5000 -db=tmp -compression_type=none --num=20000 -value_size=100000 -level_compaction_dynamic_level_bytes=true -target_file_size_base=2097152 -target_file_size_multiplier=2
      ```
      Closes https://github.com/facebook/rocksdb/pull/3755
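
One way to picture the query-time recomputation is to keep a per-level table built as if the base level were L1 and shift the lookup by the actual base level. The sketch below is an assumption about the approach, not the code from the PR: `BuildMaxFileSizes` and `MaxFileSizeForLevelSketch` are hypothetical names, and overflow handling is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Per-level targets computed as if the base level were L1:
// L0 and L1 use the base size, and each deeper level multiplies it again.
std::vector<uint64_t> BuildMaxFileSizes(uint64_t target_file_size_base,
                                        uint64_t multiplier, int num_levels) {
  assert(num_levels > 0);
  std::vector<uint64_t> sizes(static_cast<size_t>(num_levels));
  for (size_t i = 0; i < sizes.size(); ++i) {
    sizes[i] = (i <= 1) ? target_file_size_base : sizes[i - 1] * multiplier;
  }
  return sizes;
}

// With dynamic level bytes, shift the lookup so the real base level reads
// the entry that was computed for L1.
uint64_t MaxFileSizeForLevelSketch(const std::vector<uint64_t>& sizes, int level,
                                   bool dynamic_level_bytes, int base_level) {
  assert(level >= 0 && static_cast<size_t>(level) < sizes.size());
  if (!dynamic_level_bytes) return sizes[static_cast<size_t>(level)];
  assert(base_level >= 1);
  int index = level - base_level + 1;
  if (index < 0) index = 0;  // levels above the base level use the L0 entry
  return sizes[static_cast<size_t>(index)];
}
```

With `target_file_size_base=2097152` and `target_file_size_multiplier=2` as in the db_bench command above, this sketch yields 2 MiB for L6 when L6 is the base level, instead of the 64 MiB that the unshifted table would give.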
      
      Differential Revision: D7721381
      
      Pulled By: miasantreble
      
      fbshipit-source-id: 39afb8503190bac3b466adf9bbf2a9b3655789f8
    • Better destroydb · 934f96de
      Committed by Dmitri Smirnov
      Summary:
      Delete the archive directory before the WAL folder, since the archive may be contained in it as a subfolder.
      Also improve loop readability.
      Closes https://github.com/facebook/rocksdb/pull/3797
      
      Differential Revision: D7866378
      
      Pulled By: riversand963
      
      fbshipit-source-id: 0c45d97677ce6fbefa3f8d602ef5e2a2a925e6f5
    • Speedup ManualCompactionTest.Test · a8d77ca3
      Committed by Maysam Yabandeh
      Summary:
      ManualCompactionTest.Test occasionally times out in the tsan flavor of our test infra. The patch reduces the number of keys to make the test run faster. The change does not seem to negatively impact the coverage of the test.
      Closes https://github.com/facebook/rocksdb/pull/3802
      
      Differential Revision: D7865596
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: b4f60e32c3ae1677e25506f71c766e33fa985785
    • Skip deleted WALs during recovery · d5954929
      Committed by Siying Dong
      Summary:
      This patch records the min log number to keep in the manifest while flushing SST files, so that any WAL older than it is ignored during recovery. This avoids scenarios where there is a gap between the WAL files fed to the recovery procedure. Such a gap can arise from, for example, out-of-order WAL deletion. It can cause problems in 2PC recovery, where the prepare and commit entries are placed in two separate WALs: a gap could result in the WAL with the commit entry not being processed, breaking the 2PC recovery logic.

      Before the commit, for the 2PC case, we determined which log number to keep in FindObsoleteFiles(). We looked at the earliest logs with outstanding prepare entries, or prepare entries whose respective commit or abort are in the memtable. With the commit, the same calculation is done while we apply the SST flush. Just before installing the flush file, we precompute the earliest log file to keep after the flush finishes using the same logic (but skipping the memtables just flushed), and record this information in the manifest entry for the newly flushed SST file. This precomputed value is also remembered in memory, and is later used to determine whether a log file can be deleted. The value is unlikely to change until the next flush because the commit entry will stay in the memtable. (In WritePrepared, we could have removed the older log files as soon as all prepared entries are committed. That is not done yet. Even if we did it, the only thing we lose with this new approach is earlier log deletion between two flushes, which is not guaranteed to happen anyway because the obsolete file clean-up function is only executed after a flush or compaction.)

      This min log number to keep is stored in the manifest using a safely ignorable custom field of the AddFile entry, which guarantees that a DB generated by a newer release can be opened by previous releases no older than 4.2.
      Closes https://github.com/facebook/rocksdb/pull/3765
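
The recovery-side effect of the recorded value can be sketched as a simple filter over the WAL numbers found on disk. This is a simplified illustration, not the recovery code itself; `WalsToReplay` is a hypothetical helper, and the WAL numbers in the usage note are made up.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Returns the WAL numbers to replay, in ascending order. Any WAL older than
// the recorded minimum is skipped, even if its file still exists on disk.
std::vector<uint64_t> WalsToReplay(std::vector<uint64_t> wal_numbers,
                                   uint64_t min_log_number_to_keep) {
  std::sort(wal_numbers.begin(), wal_numbers.end());
  std::vector<uint64_t> to_replay;
  for (uint64_t number : wal_numbers) {
    if (number >= min_log_number_to_keep) {
      to_replay.push_back(number);
    }
  }
  return to_replay;
}
```

For example, if WALs 9, 10, 12, and 15 are on disk (11 was deleted out of order) and the manifest records 12 as the minimum to keep, only 12 and 15 are replayed, so the gap at 11 never reaches the recovery procedure.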
      
      Differential Revision: D7747618
      
      Pulled By: siying
      
      fbshipit-source-id: d00c92105b4f83852e9754a1b70d6b64cb590729
  6. 03 May, 2018 2 commits
    • WritePrepared Txn: enable rollback in stress test · cfb86659
      Committed by Maysam Yabandeh
      Summary:
      Rollback was disabled in the stress test because there was a concurrency issue in the WritePrepared rollback algorithm. The issue is fixed by caching the column family handles in WritePrepared, which avoids fetching them from the db when they are needed for rollback.
      
      Tested by running transaction stress test under tsan.
      Closes https://github.com/facebook/rocksdb/pull/3785
      
      Differential Revision: D7793727
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: d81ab6fda0e53186ca69944cfe0712ce4869451e
    • WritePrepared Txn: split SeqAdvanceConcurrentTest · 5bed8a00
      Committed by Maysam Yabandeh
      Summary:
      The tsan flavor of SeqAdvanceConcurrentTest times out in our test infra. The patch splits it into 10 tests.
      On my vm before:
      [       OK ] WritePreparedTransactionTest/WritePreparedTransactionTest.SeqAdvanceConcurrentTest/0 (5194 ms)
      after:
      [       OK ] OneWriteQueue/SeqAdvanceConcurrentTest.SeqAdvanceConcurrentTest/0 (1906 ms)
      Closes https://github.com/facebook/rocksdb/pull/3799
      
      Differential Revision: D7854515
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 4fbac42a1f974326cbc237f8cb9d6232d379c431
  7. 02 May, 2018 3 commits
  8. 01 May, 2018 2 commits
    • Second attempt at db_stress crash-recovery verification · 46152d53
      Committed by Andrew Kryczka
      Summary:
      - Original commit: a4fb1f8c
      - Revert commit (we reverted as a quick fix to get crash tests passing): 6afe22db
      
      This PR includes the contents of the original commit plus two bug fixes, which are:
      
      - In whitebox crash test, only set `--expected_values_path` for `db_stress` runs in the first half of the crash test's duration. In the second half, a fresh DB is created for each `db_stress` run, so we cannot maintain expected state across `db_stress` runs.
      - Made `Exists()` return true for `UNKNOWN_SENTINEL` values. I previously had an assert in `Exists()` that value was not `UNKNOWN_SENTINEL`. But it is possible for post-crash-recovery expected values to be `UNKNOWN_SENTINEL` (i.e., if the crash happens in the middle of an update), in which case this assertion would be tripped. The effect of returning true in this case is there may be cases where a `SingleDelete` deletes no data. But if we had returned false, the effect would be calling `SingleDelete` on a key with multiple older versions, which is not supported.
      Closes https://github.com/facebook/rocksdb/pull/3793
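
The `Exists()` behavior described in the second bug fix can be sketched with a tiny expected-state table. This is a standalone illustration rather than the db_stress code: `ExpectedState` and its methods are hypothetical, and the two sentinel values are placeholders that may differ from the real constants.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Placeholder sentinel values; the real db_stress constants may differ.
constexpr uint32_t kUnknownSentinel = 0xfffffffe;
constexpr uint32_t kDeletionSentinel = 0xffffffff;

// Hypothetical table of expected values, one slot per key.
class ExpectedState {
 public:
  explicit ExpectedState(size_t num_keys) : values_(num_keys, kDeletionSentinel) {}

  void Put(size_t key, uint32_t value) { values_[key] = value; }
  void Delete(size_t key) { values_[key] = kDeletionSentinel; }
  // Models a crash in the middle of an update: the outcome is not known.
  void MarkUnknown(size_t key) { values_[key] = kUnknownSentinel; }

  // Only a definite deletion counts as "does not exist". An unknown value is
  // treated as existing, which is the conservative choice for SingleDelete.
  bool Exists(size_t key) const { return values_[key] != kDeletionSentinel; }

 private:
  std::vector<uint32_t> values_;
};
```

Treating an unknown slot as existing means a later `SingleDelete` may delete nothing, which is harmless, whereas treating it as absent could lead to `SingleDelete` on a key with multiple older versions.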
      
      Differential Revision: D7811671
      
      Pulled By: ajkr
      
      fbshipit-source-id: 67e0295bfb1695ff9674837f2e05bb29c50efc30
    • fix missing perfcontext destroy declare in C API · 282099fc
      Committed by Vincent Lee
      Summary:
      The declaration of `rocksdb_perfcontext_destroy` is missing from the C API.
      Closes https://github.com/facebook/rocksdb/pull/3787
      
      Differential Revision: D7816490
      
      Pulled By: ajkr
      
      fbshipit-source-id: 3a488607bfc897c7ce846a1b3c2b7af693134d0d
  9. 28 Apr, 2018 5 commits
  10. 27 Apr, 2018 7 commits
    • Rename pending_flush_ to queued_for_flush_. · 513b5ce6
      Committed by Yanqin Jin
      Summary:
      With ColumnFamilyData::pending_flush_, we have the following code snippet in DBImpl::SchedulePendingFlush:
      
      ```
      if (!cfd->pending_flush() && cfd->imm()->IsFlushPending()) {
      ...
      }
      ```
      
      `Pending` is ambiguous, and I feel `queued_for_flush` is a better name,
      especially for the sake of readability.
      Closes https://github.com/facebook/rocksdb/pull/3777
      
      Differential Revision: D7783066
      
      Pulled By: riversand963
      
      fbshipit-source-id: f1bd8c8bfe5eafd2c94da0d8566c9b2b6bb57229
    • Add virtual Truncate method to Env · 37cd617b
      Committed by Nathan VanBenschoten
      Summary:
      This change adds a virtual `Truncate` method to `Env`, which truncates
      the named file to the specified size. At the moment, this is only
      supported for `MockEnv`, but other `Env`s could be extended to override
      the method too. This is the same approach that methods like `LinkFile` and
      `AreSameFile` have taken.
      
      This is useful for any user of the in-memory `Env`. The implementation's
      header is not exported, so before this change, it was impossible to
      access its already existing `Truncate` method.
      Closes https://github.com/facebook/rocksdb/pull/3779
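
The pattern described here (a virtual method on the base class that defaults to "not supported" and is overridden only where an implementation exists) can be sketched without RocksDB. `StatusCode`, `EnvSketch`, and `InMemoryEnvSketch` are hypothetical stand-ins, not the real `Status`, `Env`, or `MockEnv` types, and the choice to ignore requests that would grow the file is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical status codes standing in for a richer status type.
enum class StatusCode { kOk, kNotSupported, kNotFound };

// Base environment: Truncate exists on the interface but does nothing useful.
class EnvSketch {
 public:
  virtual ~EnvSketch() = default;
  virtual StatusCode Truncate(const std::string& /*fname*/, size_t /*size*/) {
    return StatusCode::kNotSupported;
  }
};

// In-memory environment that overrides Truncate with a real implementation.
class InMemoryEnvSketch : public EnvSketch {
 public:
  void WriteFile(const std::string& fname, std::string contents) {
    files_[fname] = std::move(contents);
  }

  StatusCode Truncate(const std::string& fname, size_t size) override {
    auto it = files_.find(fname);
    if (it == files_.end()) return StatusCode::kNotFound;
    // Only shrink; a size beyond the current length leaves the file as is.
    if (size < it->second.size()) it->second.resize(size);
    return StatusCode::kOk;
  }

  size_t FileSize(const std::string& fname) const {
    auto it = files_.find(fname);
    return it == files_.end() ? 0 : it->second.size();
  }

 private:
  std::map<std::string, std::string> files_;
};
```

Callers that hold only a base-class pointer can call `Truncate` without knowing the concrete type, and they get a clear "not supported" result from environments that have not implemented it.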
      
      Differential Revision: D7785789
      
      Pulled By: ajkr
      
      fbshipit-source-id: 3bcdaeea7b7180529f7d9b496dc67b791a00bbf0
    • Allow options file in db_stress and db_crashtest · db36f222
      Committed by Andrew Kryczka
      Summary:
      - When an options file is provided to db_stress, take supported options from the file instead of from flags
      - Call `BuildOptionsTable` after `Open` so it can use `options_` once it has been populated either from flags or from file
      - Allow options filename to be passed via `db_crashtest.py`
      Closes https://github.com/facebook/rocksdb/pull/3768
      
      Differential Revision: D7755331
      
      Pulled By: ajkr
      
      fbshipit-source-id: 5205cc5deb0d74d677b9832174153812bab9a60a
    • Remove block-based table assertion for non-empty filter block · 7004e454
      Committed by Andrew Kryczka
      Summary:
      7a6353bd prevents empty filter blocks from being written for SST files containing range deletions only. However the assertion this PR removes is still a problem as we could be reading from a DB generated by a RocksDB build without the 7a6353bd patch. So remove the assertion. We already don't do this check when `cache_index_and_filter_blocks=false`, so it should be safe.
      Closes https://github.com/facebook/rocksdb/pull/3773
      
      Differential Revision: D7769964
      
      Pulled By: ajkr
      
      fbshipit-source-id: 7285762446f2cd2ccf16efd7a988a106fbb0d8d3
    • Sync parent directory after deleting a file in delete scheduler · 63c965cd
      Committed by Siying Dong
      Summary:
      Sync the parent directory after deleting a file in the delete scheduler. Otherwise, trim speed may not be as smooth as we want.
      Closes https://github.com/facebook/rocksdb/pull/3767
      
      Differential Revision: D7760136
      
      Pulled By: siying
      
      fbshipit-source-id: ec131d53b61953f09c60d67e901e5eeb2716b05f
    • Fix the bloom filter skipping empty prefixes · 7e4e3814
      Committed by Maysam Yabandeh
      Summary:
      bc0da4b5 optimized bloom filters by skipping duplicate entries when the whole key and prefixes are both added to the bloom. However, it used the empty string as the initial value of the last entry added to the bloom. This is incorrect since an empty key or prefix is a valid entry by itself. This patch fixes that.
      Closes https://github.com/facebook/rocksdb/pull/3776
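
The pitfall is easy to reproduce outside of a bloom filter. The sketch below compares a deduplicating collector that uses the empty string as its "nothing added yet" marker with one that tracks that state in a separate flag. Both classes are hypothetical illustrations, not the filter builder code, and using a flag is just one possible way to avoid the sentinel.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Buggy variant: "" doubles as the initial value of the last entry, so a
// genuinely empty first entry looks like a duplicate and is dropped.
class DedupWithEmptySentinel {
 public:
  void Add(const std::string& entry) {
    if (entry != last_) {
      entries_.push_back(entry);
      last_ = entry;
    }
  }
  size_t NumEntries() const { return entries_.size(); }

 private:
  std::string last_;  // starts out as ""
  std::vector<std::string> entries_;
};

// Fixed variant: a flag records whether anything has been added yet.
class DedupWithFlag {
 public:
  void Add(const std::string& entry) {
    if (!has_last_ || entry != last_) {
      entries_.push_back(entry);
      last_ = entry;
      has_last_ = true;
    }
  }
  size_t NumEntries() const { return entries_.size(); }

 private:
  bool has_last_ = false;
  std::string last_;
  std::vector<std::string> entries_;
};
```

Adding `""` and then `"a"` leaves one entry in the sentinel-based variant and two in the flag-based one, which is the difference between a filter that misses the empty prefix and one that contains it.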
      
      Differential Revision: D7778803
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: d5a065daebee17f9403cac51e9d5626aac87bfbc
    • WritePrepared Txn: disable rollback in stress test · e5a4dacf
      Committed by Maysam Yabandeh
      Summary:
      The WritePrepared rollback implementation is not ready to be invoked in the middle of a workload. This is due to the lack of synchronization when obtaining the cf handle from the db. Temporarily disabling this until the problem with rollback is fixed.
      Closes https://github.com/facebook/rocksdb/pull/3772
      
      Differential Revision: D7769041
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 0e3b0ce679bc2afba82e653a40afa3f045722754
  11. 26 Apr, 2018 4 commits