1. 03 Oct, 2017 2 commits
  2. 30 Sep, 2017 2 commits
  3. 29 Sep, 2017 7 commits
    • Fix for when block.cache_handle is nullptr · ab0542f5
      Committed by Maysam Yabandeh
      Summary:
      When a compressed cache is in use, the status can be OK even though the block was not actually added to the block cache. This patch takes that case into account.
      Closes https://github.com/facebook/rocksdb/pull/2945
      
      Differential Revision: D5937613
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 5428cf1115e5046b3d01ab78d26cb181122af4c6
    • fix deletion-triggered compaction in table builder · 5df172da
      Committed by Andrew Kryczka
      Summary:
      Deletion-triggered compaction broke when `NotifyCollectTableCollectorsOnFinish` was introduced. That function called `Finish` on each of the `TablePropertiesCollector`s, and `CompactOnDeletionCollector::Finish()` was resetting all its internal state. Then, when we checked whether compaction was necessary, the flag had already been cleared.
      
      Fixed the issue by no longer resetting internal state in `Finish()`. Multiple calls to `Finish()` are allowed, but callers must not invoke `AddUserKey()` on the collector after any call to `Finish()`.
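
The contract described above can be sketched roughly as follows. This is a simplified stand-in, not RocksDB's `CompactOnDeletionCollector`: the class name `DeletionCollectorSketch` and the single deletion threshold are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified deletion-counting collector (not RocksDB code).
class DeletionCollectorSketch {
 public:
  explicit DeletionCollectorSketch(std::size_t deletion_trigger)
      : deletion_trigger_(deletion_trigger) {}

  // Record one entry. Must not be called after Finish().
  void AddUserKey(bool is_deletion) {
    assert(!finished_);
    if (is_deletion) {
      ++num_deletions_;
    }
    if (num_deletions_ >= deletion_trigger_) {
      need_compaction_ = true;
    }
  }

  // Safe to call more than once: it only marks the collector as finished
  // and leaves the counter and the compaction flag untouched.
  void Finish() { finished_ = true; }

  // Still reports the correct answer after any number of Finish() calls.
  bool NeedCompact() const { return need_compaction_; }

 private:
  std::size_t deletion_trigger_;
  std::size_t num_deletions_ = 0;
  bool need_compaction_ = false;
  bool finished_ = false;
};
```

Because `Finish()` in this sketch does not clear anything, a caller that checks `NeedCompact()` after the table has been finished still sees the flag.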
      Closes https://github.com/facebook/rocksdb/pull/2936
      
      Differential Revision: D5918659
      
      Pulled By: ajkr
      
      fbshipit-source-id: 4f05e9d80e50ee762ba1e611d8d22620029dca6b
    • WritePrepared Txn: Recovery · 385049ba
      Committed by Maysam Yabandeh
      Summary:
      Recover txns from the WAL. Also added some unit tests.
      Closes https://github.com/facebook/rocksdb/pull/2901
      
      Differential Revision: D5859596
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 6424967b231388093b4effffe0a3b1b7ec8caeb0
    • Default one to rocksdb:x64-windows · 8c724f5c
      Committed by Yu Shu
      Summary:
      The default tries to install rocksdb:x86-windows, which fails at the last build step (CMake Error: RocksDB only supports x64). It installs a series of x86 packages, and the build cannot proceed with those to rocksdb:x86-windows. Using rocksdb:x64-windows makes sure the x64 version is installed.
      Tested on Win10 x64.
      Closes https://github.com/facebook/rocksdb/pull/2941
      
      Differential Revision: D5937139
      
      Pulled By: sagar0
      
      fbshipit-source-id: 15637fe23df59326a0e607bd4d5c48733e20bae3
    • Introduce conditional merge-operator invocation in point lookups · 93c2b917
      Committed by Sagar Vemuri
      Summary:
      For every merge operand encountered for a key in the read path, we can now decide whether to look further (to retrieve more merge operands for the key) or to stop and invoke the merge operator to return the value. To use this facility, override the `ShouldMerge()` method so that it returns true when the search should terminate.
      
      This has a couple of advantages:
      1. It limits the number of merge operands that are examined to compute a value during a user Get operation.
      2. It allows peeking at a merge key-value to decide whether further merge operands need to be examined.
      
      Example: Limiting the number of merge operands that are looked at: Let's say you have 10 merge operands for a key spread over various levels. If you only want RocksDB to look at the latest two merge operands instead of all 10 to compute the value, it is now possible with this PR. You can set the condition in `ShouldMerge()` to return true when the size of the operand list is 2. Look at the example implementation in the unit test. Without this PR, a Get might look at all 10 merge operands in different levels before invoking the merge operator.
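
The early-stop behavior in this example can be modeled with a small toy lookup. This is only a sketch of the idea, not the RocksDB read path: `GetWithEarlyStop`, the callback type, and the summing merge are assumptions made for illustration, with the callback standing in for a user-overridden `ShouldMerge()`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Toy model of a point lookup that walks merge operands from newest to
// oldest. `should_merge` plays the role of the user-overridden hook.
// Returns how many operands were visited; the merged value goes to *result.
std::size_t GetWithEarlyStop(
    const std::vector<std::uint64_t>& operands_newest_first,
    const std::function<bool(const std::vector<std::uint64_t>&)>& should_merge,
    std::uint64_t* result) {
  assert(result != nullptr);
  std::vector<std::uint64_t> collected;
  for (std::uint64_t operand : operands_newest_first) {
    collected.push_back(operand);
    if (should_merge(collected)) {
      break;  // stop searching older data and merge what we have
    }
  }
  std::uint64_t sum = 0;  // stand-in for an add-style merge operator
  for (std::uint64_t value : collected) {
    sum += value;
  }
  *result = sum;
  return collected.size();
}
```

With a callback that returns true once two operands have been collected, the lookup in this model touches two operands instead of ten; with a callback that always returns false, it visits all of them, which corresponds to the behavior before the change.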
      
      Added a new unit test.
      Made sure that there is no perf regression by running benchmarks.
      
      Command line to Load data:
      ```
      TEST_TMPDIR=/dev/shm ./db_bench --benchmarks="mergerandom" --merge_operator="uint64add" --num=10000000
      ...
      mergerandom  :      12.861 micros/op 77757 ops/sec;    8.6 MB/s ( updates:10000000)
      ```
      
      **ReadRandomMergeRandom benchmark results:**
      Command line:
      ```
      TEST_TMPDIR=/dev/shm ./db_bench --benchmarks="readrandommergerandom" --merge_operator="uint64add" --num=10000000
      ```
      
      Base -- Without this code change (on commit fc7476be):
      ```
      readrandommergerandom :      38.586 micros/op 25916 ops/sec; (reads:3001599 merges:6998401 total:10000000 hits:842235 maxlength:8)
      ```
      
      With this code change:
      ```
      readrandommergerandom :      38.653 micros/op 25870 ops/sec; (reads:3001599 merges:6998401 total:10000000 hits:842235 maxlength:8)
      ```
      Closes https://github.com/facebook/rocksdb/pull/2923
      
      Differential Revision: D5898239
      
      Pulled By: sagar0
      
      fbshipit-source-id: daefa325019f77968639a75c851d46352c2303ef
    • Use RAII instead of pointers in cf_info_map · a48a398e
      Committed by Aliaksei Sandryhaila
      Summary:
      There is no need for smart pointers in cf_info_map, so use RAII. This should also placate valgrind.
      Closes https://github.com/facebook/rocksdb/pull/2943
      
      Differential Revision: D5932941
      
      Pulled By: asandryh
      
      fbshipit-source-id: 2c37df88573a9df2557880a31193926e4425e054
    • Blog post for 5.8 release · c7058662
      Committed by Maysam Yabandeh
      Summary: Closes https://github.com/facebook/rocksdb/pull/2942
      
      Differential Revision: D5932858
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: e11f52a0b08d65149bb49d99d1dbc82cb5a96fa0
  4. 28 Sep, 2017 3 commits
  5. 27 Sep, 2017 2 commits
  6. 26 Sep, 2017 1 commit
  7. 23 Sep, 2017 3 commits
    • Add test kPointInTimeRecoveryCFConsistency · 1d6700f9
      Committed by Zhongyi Xie
      Summary:
      Context/problem:
      
      - CFs may be flushed at different times
      - A WAL can only be deleted after all CFs have flushed beyond the end of that WAL.
      - Point-in-time recovery might stop upon reaching the first corruption.
      - Some CFs may have already flushed beyond that point, while others haven't. We should fail the Open() instead of proceeding with inconsistent CFs.
      Closes https://github.com/facebook/rocksdb/pull/2900
      
      Differential Revision: D5863281
      
      Pulled By: miasantreble
      
      fbshipit-source-id: 180dbaf83d96c804cff49b3c406312a4ae61313e
    • Fix WritePreparedTransactionTest::SeqAdvanceTest ASAN failure · be97dbb1
      Committed by Yi Wu
      Summary: Closes https://github.com/facebook/rocksdb/pull/2922
      
      Differential Revision: D5895310
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: 52c635a25d22478ec1eca49b6817551202babac2
    • Repair DBs with trailing slash in name · 4708a687
      Committed by Andrew Kryczka
      Summary:
      Problem:
      
      - `DB::SanitizeOptions` strips trailing slash from `wal_dir` but not `dbname`
      - We check whether `wal_dir` and `dbname` refer to the same directory using string equality: https://github.com/facebook/rocksdb/blob/master/db/repair.cc#L258
      - Providing `dbname` with trailing slash causes default `wal_dir` to be misidentified as a separate directory.
      - The repair then tries to add all SST files to the `VersionEdit` twice (once for the `dbname` dir, once for `wal_dir`) and fails with a core dump.
      
      Solution:
      
      - Add a new `Env` function, `AreFilesSame`, which uses device and inode number to check whether files are the same. It's currently only implemented in `PosixEnv`.
      - Migrate repair to use `AreFilesSame` to check whether `dbname` and `wal_dir` are the same. If unsupported, it falls back to string comparison.
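
The device-and-inode comparison described in the solution can be sketched with plain POSIX `stat()`. This is a minimal illustration, not the `PosixEnv` implementation: the helper name `SameFileSketch` and its choice to return false on a `stat()` failure are assumptions.

```cpp
#include <sys/stat.h>

#include <cassert>
#include <string>

// Hypothetical helper, POSIX only: true if both paths resolve to the same
// file or directory, judged by device number and inode number.
// Returns false if either path cannot be stat'ed.
bool SameFileSketch(const std::string& first, const std::string& second) {
  struct stat a;
  struct stat b;
  if (stat(first.c_str(), &a) != 0 || stat(second.c_str(), &b) != 0) {
    return false;
  }
  return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}
```

Unlike string equality, this check treats `"."` and `"./"` as the same directory, which is the trailing-slash case that tripped up repair.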
      Closes https://github.com/facebook/rocksdb/pull/2827
      
      Differential Revision: D5761349
      
      Pulled By: ajkr
      
      fbshipit-source-id: c839d548678b742af1166d60b09abd94e5476238
  8. 22 Sep, 2017 6 commits
  9. 21 Sep, 2017 1 commit
  10. 20 Sep, 2017 3 commits
  11. 19 Sep, 2017 2 commits
    • collecting kValue type tombstone · e4234fbd
      Committed by Pengchao Wang
      Summary:
      In our testing cluster, we found that a large number of tombstones had been promoted from kMerge to kValue type after reaching the top level of compaction. Since we used to collect tombstones only in the merge operator, those tombstones could never be collected.
      
      This PR addresses the issue by adding a GC step in the compaction filter, which applies only to kValue type records. Since those records have already reached the top level of compaction (no earlier data exists), we can safely remove them in the compaction filter without worrying that old data will reappear.
      
      This PR also removes an old optimization in the Cassandra merge operator for single merge operands. We need to do GC even on a single operand, so the optimization no longer makes sense.
      Closes https://github.com/facebook/rocksdb/pull/2855
      
      Reviewed By: sagar0
      
      Differential Revision: D5806445
      
      Pulled By: wpc
      
      fbshipit-source-id: 6eb25629d4ce917eb5e8b489f64a6aa78c7d270b
    • WritePrepared Txn: Advance seq one per batch · 60beefd6
      Committed by Maysam Yabandeh
      Summary:
      By default, the seq number in the DB is increased once per written key. WritePrepared txns require the seq to be increased once per batch, so that the seq can be used as the prepare timestamp by which the transaction is identified. We also need to increase the seq for the commit marker, since that gives a unique id to the commit timestamp of each transaction.
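
The two counting rules can be contrasted with a toy allocator. This only models the arithmetic described above and is not RocksDB's write path; `SeqAllocatorSketch` and its method names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy sequence-number allocator (not RocksDB code).
struct SeqAllocatorSketch {
  std::uint64_t last_seq = 0;

  // Default policy: one sequence number per written key.
  std::uint64_t WritePerKey(std::size_t num_keys) {
    last_seq += num_keys;
    return last_seq;
  }

  // WritePrepared policy: the whole batch consumes a single sequence
  // number, which doubles as the prepare timestamp of the transaction.
  std::uint64_t WritePerBatch(std::size_t num_keys) {
    assert(num_keys > 0);
    return ++last_seq;
  }

  // The commit marker consumes its own sequence number, which serves as
  // the commit timestamp.
  std::uint64_t WriteCommitMarker() { return ++last_seq; }
};
```

In this model, a three-key batch advances the counter by 3 under the default policy, but by 1 for the prepare plus 1 for the commit marker under the per-batch policy.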
      
      Two unit tests are added to verify our understanding of how the seq should be increased. The recovery path requires much more work and is left to another patch.
      Closes https://github.com/facebook/rocksdb/pull/2885
      
      Differential Revision: D5837843
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: a08960b93d727e1cf438c254d0c2636fb133cc1c
  12. 16 Sep, 2017 4 commits
  13. 15 Sep, 2017 4 commits
    • JNI support for ReadOptions::iterate_upper_bound · 382277d0
      Committed by Ben Clay
      Summary:
      Plumbed ReadOptions::iterate_upper_bound through JNI.
      
      Made the following design choices:
      * Used Slice instead of AbstractSlice due to the anticipated use case (key / key prefix). Can change this if anyone disagrees.
      * Used Slice instead of raw byte[], which seemed cleaner but necessitated the package-private handle-based Slice constructor. Followed WriteBatch as an example.
      * We need a copy constructor for ReadOptions, as we create one base ReadOptions for a particular use case and clone -> change the iterate_upper_bound on each slice operation. Shallow copy seemed cleanest.
      * Hold a reference to the upper bound slice on ReadOptions, in contrast to Snapshot.
      
      Signed a Facebook CLA this morning.
      Closes https://github.com/facebook/rocksdb/pull/2872
      
      Differential Revision: D5824446
      
      Pulled By: sagar0
      
      fbshipit-source-id: 74fc51313a10a81ecd348625e2a50ca5b7766888
    • Three code-level optimization to Iterator::Next() · edcbb369
      Committed by Siying Dong
      Summary:
      Three small optimizations:
      (1) iter_->IsKeyPinned() shouldn't be called if read_options.pin_data is not true. It may trigger a function call all the way down the iterator tree.
      (2) Reuse the iterator key object in DBIter::FindNextUserEntryInternal(). The constructor of the class has some overhead.
      (3) Move the switching direction logic in MergingIterator::Next() to a separate function.
      
      Together, these three improve readseq performance by about 3% in my benchmark setting.
      Closes https://github.com/facebook/rocksdb/pull/2880
      
      Differential Revision: D5829252
      
      Pulled By: siying
      
      fbshipit-source-id: 991aea10c6d6c3b43769cb4db168db62954ad1e3
    • Two small refactoring for better inlining · 885b1c68
      Committed by Siying Dong
      Summary:
      Move uncommon code paths in RangeDelAggregator::ShouldDelete() and IterKey::EnlargeBufferIfNeeded() to a separate function, so that the inlined structure can be better optimized.
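
The general pattern, a small inline check that delegates the rare case to an out-of-line function, might look like the sketch below. `KeyBufferSketch` is a hypothetical class loosely inspired by the idea, not the actual `IterKey`, and whether a given compiler inlines either function is up to the compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>

// Hypothetical growable key buffer (not the RocksDB class).
class KeyBufferSketch {
 public:
  KeyBufferSketch() = default;
  KeyBufferSketch(const KeyBufferSketch&) = delete;
  KeyBufferSketch& operator=(const KeyBufferSketch&) = delete;

  // Overwrites the stored key with `size` bytes from `data`.
  void Set(const char* data, std::size_t size) {
    EnlargeBufferIfNeeded(size);
    if (size > 0) {
      std::memcpy(buf_, data, size);
    }
    size_ = size;
  }

  std::size_t capacity() const { return capacity_; }
  std::string ToString() const { return std::string(buf_, size_); }

 private:
  // Hot path: a single comparison, cheap to inline into callers.
  void EnlargeBufferIfNeeded(std::size_t needed) {
    if (needed > capacity_) {
      EnlargeBuffer(needed);  // uncommon case
    }
  }

  // Cold path, defined out of line below so that the allocation code does
  // not have to be part of the inlined fast path.
  void EnlargeBuffer(std::size_t needed);

  char inline_space_[32];
  std::unique_ptr<char[]> heap_;
  char* buf_ = inline_space_;
  std::size_t capacity_ = sizeof(inline_space_);
  std::size_t size_ = 0;
};

void KeyBufferSketch::EnlargeBuffer(std::size_t needed) {
  // Old contents are dropped, since Set() rewrites the whole key anyway.
  heap_.reset(new char[needed]);
  buf_ = heap_.get();
  capacity_ = needed;
}
```

Short keys stay in the inline storage and never reach `EnlargeBuffer()`; only keys longer than the current capacity take the slower path.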
      
      These were optimized because they show up in CPU profiling, though only minimally. The performance gain is hard to measure. I ran db_bench with the readseq benchmark against an in-memory DB many times. The variation is large, but the runs seem to show an improvement of about 1%.
      Closes https://github.com/facebook/rocksdb/pull/2877
      
      Differential Revision: D5828123
      
      Pulled By: siying
      
      fbshipit-source-id: 41a49e229f91e9f8409f85cc6f0dc70e31334e4b
    • Added save points for transactions C API · ffac6836
      Committed by Oleksandr Anyshchenko
      Summary:
      Added the ability to set save points in transactions and then roll back to them.
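
As a rough illustration of what save points provide, the toy transaction below keeps a stack of positions in its write list. It is a conceptual model only and does not use the RocksDB C API; `TxnSketch` and its methods are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy transaction with nested save points (not RocksDB code).
class TxnSketch {
 public:
  void Put(const std::string& key, const std::string& value) {
    writes_.emplace_back(key, value);
  }

  // Remember the current position in the write list.
  void SetSavePoint() { save_points_.push_back(writes_.size()); }

  // Undo every write made since the most recent save point and discard
  // that save point. Returns false if no save point has been set.
  bool RollbackToSavePoint() {
    if (save_points_.empty()) {
      return false;
    }
    writes_.resize(save_points_.back());
    save_points_.pop_back();
    return true;
  }

  std::size_t NumWrites() const { return writes_.size(); }

 private:
  std::vector<std::pair<std::string, std::string>> writes_;
  std::vector<std::size_t> save_points_;
};
```

In this model, writes made before a save point survive a rollback, writes made after it are dropped, and rolling back with no save point set reports failure instead of touching the transaction.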
      Closes https://github.com/facebook/rocksdb/pull/2876
      
      Differential Revision: D5825829
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: 62168992340bbcddecdaea3baa2a678475d1429d