1. 01 Mar, 2019 2 commits
  2. 27 Feb, 2019 1 commit
    • M
      WritePrepared: optimize read path by avoiding virtual (#5018) · a661c0d2
      Maysam Yabandeh committed
      Summary:
      The read path includes a callback, ReadCallback, which eventually calls IsInSnapshot to determine whether a particular seq is in the reading snapshot. This callback is virtual, which adds the cost of multiple virtual function calls to each read. The first few checks in IsInSnapshot, however, are trivial and handle the majority of cases. The patch moves those checks to a non-virtual function in the parent class, ReadCallback, to lower the virtual callback cost.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5018
      
      Differential Revision: D14226562
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 6feed5b34f3b082e52092c5ef143e29b49c46b44
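
The idea in this commit can be sketched roughly as follows. This is a minimal illustration, not the RocksDB code: the class names `ReadCallbackSketch` and `SetBackedCallback`, the two bounds, and the `slow_path_calls` counter are all assumptions made for the example. The base class answers the cheap cases in a non-virtual function and falls through to a virtual call only when the sequence number lies between the two bounds.

```cpp
#include <cstdint>
#include <set>
#include <utility>

// Hypothetical stand-in for a read callback; names are illustrative only.
class ReadCallbackSketch {
 public:
  ReadCallbackSketch(uint64_t max_visible_seq, uint64_t min_uncommitted)
      : max_visible_seq_(max_visible_seq), min_uncommitted_(min_uncommitted) {}
  virtual ~ReadCallbackSketch() = default;

  // Non-virtual entry point: two cheap range checks settle most lookups.
  bool IsVisible(uint64_t seq) {
    if (seq < min_uncommitted_) return true;   // older than any uncommitted write
    if (seq > max_visible_seq_) return false;  // newer than the snapshot
    ++slow_path_calls_;                        // counter exists only for the demo
    return IsVisibleFullCheck(seq);            // the only virtual dispatch
  }

  int slow_path_calls() const { return slow_path_calls_; }

 protected:
  virtual bool IsVisibleFullCheck(uint64_t seq) = 0;

 private:
  uint64_t max_visible_seq_;
  uint64_t min_uncommitted_;
  int slow_path_calls_ = 0;
};

// Toy subclass: the slow path consults an explicit set of committed seqs.
class SetBackedCallback : public ReadCallbackSketch {
 public:
  SetBackedCallback(uint64_t max_visible_seq, uint64_t min_uncommitted,
                    std::set<uint64_t> committed)
      : ReadCallbackSketch(max_visible_seq, min_uncommitted),
        committed_(std::move(committed)) {}

 protected:
  bool IsVisibleFullCheck(uint64_t seq) override {
    return committed_.count(seq) > 0;
  }

 private:
  std::set<uint64_t> committed_;
};
```

With bounds of 10 and 20, a lookup for seq 5 or seq 25 never reaches the virtual function; only sequence numbers in between do.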
  3. 23 Feb, 2019 1 commit
  4. 22 Feb, 2019 1 commit
  5. 21 Feb, 2019 2 commits
    • Z
      add GetStatsHistory to retrieve stats snapshots (#4748) · c4f5d0aa
      Zhongyi Xie committed
      Summary:
      This PR adds a public `GetStatsHistory` API to retrieve stats history in the form of a `std::map`. The key of the map is the timestamp in microseconds at which the stats snapshot was taken; the value is another `std::map` from stats name to stats value (stored as a `std::string`). Two DBOptions are introduced: `stats_persist_period_sec` (default 10 minutes) controls the interval between two consecutive snapshots; `max_stats_history_count` (default 10) controls the maximum number of history snapshots to keep in memory. RocksDB stops collecting stats snapshots if `stats_persist_period_sec` is set to 0.
      
      (This PR is the in-memory part of https://github.com/facebook/rocksdb/pull/4535)
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4748
      
      Differential Revision: D13961471
      
      Pulled By: miasantreble
      
      fbshipit-source-id: ac836d401ecb84ea92216bf9966f969dedf4ad04
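
The data shape described above (timestamp to a map of stat name to stat value, capped at a maximum count) could look roughly like the sketch below. `StatsHistorySketch` and its methods are hypothetical names, and dropping the oldest snapshot first when the cap is exceeded is an assumption of this example, not something the commit message states.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical in-memory history: timestamp (microseconds) -> {name -> value}.
class StatsHistorySketch {
 public:
  using Snapshot = std::map<std::string, std::string>;

  explicit StatsHistorySketch(size_t max_count) : max_count_(max_count) {}

  // Stores one snapshot and trims the history down to max_count_ entries.
  void AddSnapshot(uint64_t timestamp_micros, Snapshot stats) {
    history_[timestamp_micros] = std::move(stats);
    while (history_.size() > max_count_) {
      history_.erase(history_.begin());  // assumption: evict the oldest first
    }
  }

  const std::map<uint64_t, Snapshot>& history() const { return history_; }

 private:
  size_t max_count_;
  std::map<uint64_t, Snapshot> history_;
};
```

Because `std::map` keeps keys ordered, the first element is always the oldest snapshot, which keeps the trimming step simple.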
    • F
      Update version and history for 6.0 · 48c8d844
      Fosco Marotto committed
  6. 20 Feb, 2019 7 commits
    • M
      Change random seed for txn stress tests on each run (#5004) · cf98df34
      Maysam Yabandeh committed
      Summary:
      Currently the transaction stress tests use the thread id as the seed. Since thread ids are likely to be the same across multiple runs, the seed is going to be the same as well. The patch includes the time in the seed calculation so that each run of the stress tests covers a very different part of the state space. To make a failure reproducible, it also prints the time that was used to calculate the seed value.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5004
      
      Differential Revision: D14144356
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 728ed522f550fc8b4f5f9f373259c05fe9a54556
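
A seed built this way might look like the following sketch. `MakeSeed` is a hypothetical helper, and the mixing constant and the fold to 32 bits are arbitrary choices for the example; the actual test code may combine the two values differently.

```cpp
#include <cstdint>
#include <cstdio>
#include <functional>
#include <thread>

// Hypothetical helper: mixes a per-run time value into a per-thread seed.
inline uint32_t MakeSeed(std::thread::id tid, uint64_t now_micros) {
  const uint64_t tid_hash = std::hash<std::thread::id>()(tid);
  // Multiply by an odd constant so nearby timestamps spread across all bits.
  const uint64_t mixed = tid_hash ^ (now_micros * 0x9E3779B97F4A7C15ULL);
  // Print the time so a failing run can be replayed with the same seed.
  std::fprintf(stderr, "seed time (us): %llu\n",
               static_cast<unsigned long long>(now_micros));
  return static_cast<uint32_t>(mixed ^ (mixed >> 32));
}
```

Passing the printed time back in on a later run yields the same seed for the same thread id, which is what makes a failure reproducible.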
    • M
      WritePrepared: Improve stress tests with slow threads (#4974) · 0f4244fe
      Maysam Yabandeh committed
      Summary:
      The transaction stress tests stress a high-concurrency scenario. In WritePrepared/WriteUnPrepared we also need to stress scenarios in which an inserting or reading transaction is very slow. This exercises the corner cases in which the caching is not sufficient and other, slower data structures are engaged. To emulate such cases we use slow inserter/verifier threads and also reduce the size of the cache data structures.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4974
      
      Differential Revision: D14143070
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 81eb674678faf9fae0f654cd60ebcc74e26aeee7
    • M
      WritePrepared: max_evicted_seq_ update during commit cache lookup (#4955) · bcdc8c8b
      Maysam Yabandeh committed
      Summary:
      max_evicted_seq_ could be updated in the middle of the read in ::IsInSnapshot. Making the code correct in the presence of such an update would be complicated. The patch simplifies it by checking the value of max_evicted_seq_ before and after looking into commit_cache_, and retrying in the unlucky case that it changed.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4955
      
      Differential Revision: D13999556
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 7a1bdfa95ea8b5d8d73ddff3263ed31d7297b39c
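
The check-before-and-after pattern described here can be sketched generically as below. `LookupWithStableMaxEvicted` is a hypothetical helper, the `lookup` callable stands in for the commit cache lookup, and the optional `retries` counter exists only to make the behavior observable.

```cpp
#include <atomic>
#include <cstdint>

// Runs `lookup` until it completes without max_evicted_seq changing underneath.
// `lookup` receives the value of max_evicted_seq observed before it started.
template <typename Lookup>
bool LookupWithStableMaxEvicted(const std::atomic<uint64_t>& max_evicted_seq,
                                Lookup lookup, int* retries = nullptr) {
  while (true) {
    const uint64_t before = max_evicted_seq.load(std::memory_order_acquire);
    const bool result = lookup(before);
    const uint64_t after = max_evicted_seq.load(std::memory_order_acquire);
    if (before == after) return result;  // no concurrent update: result is valid
    if (retries != nullptr) ++*retries;  // unlucky case: discard and try again
  }
}
```

The appeal of this shape is that the lookup itself does not need to reason about a concurrent update; a result computed across a change is simply thrown away.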
    • S
      Temporarily Disable DBTest2.PresetCompressionDict (#5003) · 93f7e7a4
      Siying Dong committed
      Summary:
      DBTest2.PresetCompressionDict is flaky. Temporarily disable it.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5003
      
      Differential Revision: D14139505
      
      Pulled By: siying
      
      fbshipit-source-id: ebf1872d364b76b2cb021b489ea2f17ee997116a
    • Y
      Separate crash test with atomic flush (#4945) · 7d232102
      Yanqin Jin committed
      Summary:
      Currently the crash test covers cases with and without atomic flush, but takes too
      long to finish. Therefore it may be a better idea to put the crash test with atomic
      flush in a separate set of tests.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4945
      
      Differential Revision: D13947548
      
      Pulled By: riversand963
      
      fbshipit-source-id: 177c6de865290fd650b0103408339eaa3f801d8c
    • M
      Apply modernize-use-override (3) · 3c5d1b16
      Michael Liu committed
      Summary:
      Use C++11’s override and remove virtual where applicable.
      Changes are automatically generated.
      
      bypass-lint
      drop-conflicts
      
      Reviewed By: igorsugak
      
      Differential Revision: D14131816
      
      fbshipit-source-id: f20e7f7cecf2e699d70f5fa036f72c0e3f59b50e
    • Z
      add whole key bloom filter support in memtables (#4985) · ed995c6a
      Zhongyi Xie committed
      Summary:
      MyRocks calls `GetForUpdate` on `INSERT` for the unique key check, and in almost all cases `GetForUpdate` returns an empty result. For such cases, a whole key bloom filter is helpful.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4985
      
      Differential Revision: D14118257
      
      Pulled By: miasantreble
      
      fbshipit-source-id: d35cb7109c62fd5ad541a26968e3a3e16d3e85ea
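
To illustrate why a whole key filter helps when most lookups miss, here is a minimal Bloom filter sketch. It is not the memtable's filter: `WholeKeyBloomSketch`, the use of `std::hash`, and the double-hashing probe scheme are assumptions chosen for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical whole-key Bloom filter: no false negatives, some false positives.
class WholeKeyBloomSketch {
 public:
  explicit WholeKeyBloomSketch(size_t num_bits, int num_probes = 4)
      : bits_(num_bits, false), num_probes_(num_probes) {
    assert(num_bits > 0 && num_probes > 0);
  }

  void Add(const std::string& key) {
    for (int i = 0; i < num_probes_; ++i) bits_[Probe(key, i)] = true;
  }

  // A false return is definitive: the key was never added, so the
  // more expensive memtable lookup can be skipped.
  bool MayContain(const std::string& key) const {
    for (int i = 0; i < num_probes_; ++i) {
      if (!bits_[Probe(key, i)]) return false;
    }
    return true;
  }

 private:
  // Double hashing: derive the i-th probe position from two hash values.
  size_t Probe(const std::string& key, int i) const {
    const size_t h1 = std::hash<std::string>()(key);
    const size_t h2 = (h1 >> 17) | (h1 << (sizeof(size_t) * 8 - 17));
    return (h1 + static_cast<size_t>(i) * h2) % bits_.size();
  }

  std::vector<bool> bits_;
  int num_probes_;
};
```

For a workload such as the unique key check on `INSERT`, where the key is usually absent, most calls to `MayContain` return false and the full lookup is avoided.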
  7. 16 Feb, 2019 4 commits
    • S
      Header logger should call LogHeader() (#4980) · c2affccc
      Siying Dong committed
      Summary:
      The info log header feature never worked well, because the Header log level was not
      translated into a Logger::LogHeader() call. Fix it.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4980
      
      Differential Revision: D14087283
      
      Pulled By: siying
      
      fbshipit-source-id: 7e7d03ce35fa8d13d4ee549f46f7326f7bc0006d
    • S
      flush_job logs data size too (#4979) · 26a33ee5
      Siying Dong committed
      Summary:
      Right now when a flush is triggered, the memory consumption is logged but the data size is not.
      It's useful to log both when debugging unexpectedly small flushed files.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4979
      
      Differential Revision: D14071979
      
      Pulled By: siying
      
      fbshipit-source-id: 0cd60449c5205eb00e0fbc299084418f609904ed
    • S
      Fix LITE Build (#4989) · 4db46aa2
      Siying Dong committed
      Summary:
      In LITE mode, EventListener is an empty class. However, db_bench uses it,
      so when "override" was added to the functions, the build broke. Fix it
      by keeping the listener empty in LITE mode.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4989
      
      Differential Revision: D14108132
      
      Pulled By: siying
      
      fbshipit-source-id: 80121aab35b1120e502b37b782301dd700692697
    • A
      Deprecate ttl option from CompactionOptionsFIFO (#4965) · 3231a2e5
      Aubin Sanyal committed
      Summary:
      We introduced ttl option in CompactionOptionsFIFO when ttl-based file
      deletion (compaction) was supported only as part of FIFO Compaction. But
      with the extension of ttl semantics even to Level compaction,
      CompactionOptionsFIFO.ttl can now be deprecated. Instead we will start
      using ColumnFamilyOptions.ttl for FIFO compaction as well.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4965
      
      Differential Revision: D14072960
      
      Pulled By: sagar0
      
      fbshipit-source-id: c98cc2ae695a28136295787cd88d36a220fc219e
  8. 15 Feb, 2019 2 commits
    • M
      Apply modernize-use-override (2nd iteration) · ca89ac2b
      Michael Liu committed
      Summary:
      Use C++11’s override and remove virtual where applicable.
      Changes are automatically generated.
      
      Reviewed By: Orvid
      
      Differential Revision: D14090024
      
      fbshipit-source-id: 1e9432e87d2657e1ff0028e15370a85d1739ba2a
    • A
      Dictionary compression for files written by SstFileWriter (#4978) · c8c8104d
      Andrew Kryczka committed
      Summary:
      If `CompressionOptions::max_dict_bytes` and/or `CompressionOptions::zstd_max_train_bytes` are set, `SstFileWriter` will now generate files respecting those options.
      
      I refactored the logic a bit for deciding when to use dictionary compression. Previously we plumbed `is_bottommost_level` down to the table builder and used that. However it was kind of confusing in `SstFileWriter`'s context since we don't know what level the file will be ingested to. Instead, now the higher-level callers (e.g., flush, compaction, file writer) are responsible for building the right `CompressionOptions` to give the table builder.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4978
      
      Differential Revision: D14060763
      
      Pulled By: ajkr
      
      fbshipit-source-id: dc802c327896df2b319dc162d6acc82b9cdb452a
  9. 14 Feb, 2019 4 commits
  10. 13 Feb, 2019 5 commits
  11. 12 Feb, 2019 8 commits
    • A
      Reduce scope of compression dictionary to single SST (#4952) · 62f70f6d
      Andrew Kryczka committed
      Summary:
      Our previous approach was to train one compression dictionary per compaction, using the first output SST to train a dictionary, and then applying it on subsequent SSTs in the same compaction. While this was great for minimizing CPU/memory/I/O overhead, it did not achieve good compression ratios in practice. In our most promising potential use case, moderate reductions in a dictionary's scope make a major difference on compression ratio.
      
      So, this PR changes compression dictionary to be scoped per-SST. It accepts the tradeoff during table building to use more memory and CPU. Important changes include:
      
      - The `BlockBasedTableBuilder` has a new state when dictionary compression is in-use: `kBuffered`. In that state it accumulates uncompressed data in-memory whenever `Add` is called.
      - After accumulating target file size bytes or calling `BlockBasedTableBuilder::Finish`, a `BlockBasedTableBuilder` moves to the `kUnbuffered` state. The transition (`EnterUnbuffered()`) involves sampling the buffered data, training a dictionary, and compressing/writing out all buffered data. In the `kUnbuffered` state, a `BlockBasedTableBuilder` behaves the same as before -- blocks are compressed/written out as soon as they fill up.
      - Samples are now whole uncompressed data blocks, except the final sample may be a partial data block so we don't breach the user's configured `max_dict_bytes` or `zstd_max_train_bytes`. The dictionary trainer is supposed to work better when we pass it real units of compression. Previously we were passing 64-byte KV samples which was not realistic.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4952
      
      Differential Revision: D13967980
      
      Pulled By: ajkr
      
      fbshipit-source-id: 82bea6f7537e1529c7a1a4cdee84585f5949300f
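
The two-state behavior listed above can be sketched as a small state machine. This is an illustration under stated assumptions, not `BlockBasedTableBuilder`: `BufferedBuilderSketch` is a hypothetical class, copying the buffer into `samples_` stands in for sampling and dictionary training, and appending to `written_` stands in for compressing and writing a block.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical builder that buffers blocks until it has enough data to train on.
class BufferedBuilderSketch {
 public:
  enum class State { kBuffered, kUnbuffered };

  explicit BufferedBuilderSketch(size_t buffer_limit_bytes)
      : buffer_limit_(buffer_limit_bytes) {}

  void Add(std::string block) {
    if (state_ == State::kBuffered) {
      buffered_bytes_ += block.size();
      buffer_.push_back(std::move(block));
      if (buffered_bytes_ >= buffer_limit_) EnterUnbuffered();
    } else {
      written_.push_back(std::move(block));  // stand-in for compress + write
    }
  }

  // Finishing while still buffering forces the transition first.
  void Finish() {
    if (state_ == State::kBuffered) EnterUnbuffered();
  }

  State state() const { return state_; }
  const std::vector<std::string>& written() const { return written_; }
  const std::vector<std::string>& samples() const { return samples_; }

 private:
  void EnterUnbuffered() {
    samples_ = buffer_;  // stand-in for sampling + dictionary training
    for (auto& block : buffer_) written_.push_back(std::move(block));
    buffer_.clear();
    state_ = State::kUnbuffered;
  }

  size_t buffer_limit_;
  size_t buffered_bytes_ = 0;
  State state_ = State::kBuffered;
  std::vector<std::string> buffer_;
  std::vector<std::string> samples_;
  std::vector<std::string> written_;
};
```

Nothing is written while the builder is buffering, which is where the extra memory cost mentioned in the summary comes from; after the transition each block is written as soon as it arrives.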
    • P
      Increment NUMBER_BLOCK_NOT_COMPRESSED when !GoodCompressionRatio (#4929) · 79496d71
      Peter (Stig) Edwards committed
      Summary:
      See https://github.com/facebook/rocksdb/issues/4884
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4929
      
      Differential Revision: D14028333
      
      Pulled By: sagar0
      
      fbshipit-source-id: eed12bceae85385a34aaa6dd303bf0f53c4c7b06
    • M
      Enhance transaction_test_util with delays (#4970) · d6b9b3b8
      Maysam Yabandeh committed
      Summary:
      Enhance the ::Insert and ::Verify test functions to add an artificial delay between prepare and commit, and between taking the snapshot and the reads, respectively. A future PR will use these to improve the stress tests so that they cover long-running transactions as well as long-running backup jobs. The patch also randomly sets set_snapshot to false for inserters, which skips setting the snapshot in the initialization phase and lets the snapshot be taken later explicitly.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4970
      
      Differential Revision: D14031342
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: b52b453751f0b25b81b23c48892bc1d152464cab
    • M
      WritePrepared: relax assert in compaction iterator (#4969) · 576d2d6c
      Maysam Yabandeh committed
      Summary:
      If IsInSnapshot(seq2, snapshot) determines that the snapshot is released, a later query IsInSnapshot(seq1, snapshot) could still return a definitive answer of true if, for example, seq1 is so old that it is determined to be visible in all snapshots. This violates an assert statement recently added to the compaction iterator. The patch relaxes the assert.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4969
      
      Differential Revision: D14030998
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 6db53db0e37d0a20e8997ef2c1004b8627614ab9
    • A
      Fix `compression_zstd_max_train_bytes` coverage in stress test (#4957) · 1218704b
      Andrew Kryczka committed
      Summary:
      Previously the `finalize_and_sanitize` function always zeroed out `compression_zstd_max_train_bytes`. It was only supposed to do that when non-ZSTD compression was used. But since `--compression_type` was an unknown argument (i.e., one that `db_crashtest.py` does not recognize and blindly forwards to `db_stress`), `finalize_and_sanitize` could not tell whether ZSTD was used. This PR fixes that by making `--compression_type` a known argument with snappy as the default (same as `db_stress`).
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4957
      
      Differential Revision: D13994302
      
      Pulled By: ajkr
      
      fbshipit-source-id: 1b0baea7331397822830970d3698642eb7a7df65
    • M
      WritePrepared: add private options to TransactionDBOptions (#4966) · 9144d1f1
      Maysam Yabandeh committed
      Summary:
      WritePreparedTransactionDB operates with additional options that should not be configurable, to avoid complicating it for users. For testing purposes, however, we need to change the default values of these parameters. This patch makes them private fields in TransactionDBOptions so that the existing ::Open API can use them seamlessly without exposing them to users.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4966
      
      Differential Revision: D14015986
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 13037efa7dfdd6f73ec7a19414b66571e044c633
    • Y
      Checksum properties block for block-based table (#4956) · 2d049ab7
      Yanqin Jin committed
      Summary:
      Always enable properties block checksum verification for block-based table. For external SST file ingested with 'write_global_seqno==true', we use 'DecodeEntrySlow' to parse its blocks' contents so that the process will not die upon failing the assertion possibly caused by corruption.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4956
      
      Differential Revision: D14012741
      
      Pulled By: riversand963
      
      fbshipit-source-id: 8b766e6f54b36f8f9e074c0e19e0926ec3cce186
    • S
      Add a unit test to Ignorable manifest record (#4964) · 5d9a623e
      Siying Dong committed
      Summary:
      https://github.com/facebook/rocksdb/pull/4960 introduced the ignorable manifest
      record. This adds a test for it.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4964
      
      Differential Revision: D14012667
      
      Pulled By: siying
      
      fbshipit-source-id: e5f10ecc68dec2716e178d44f0fe2b76c3d857ef
  12. 09 Feb, 2019 3 commits
    • T
      Implement trace sampling (#4963) · 08809f5e
      tang-jianfeng committed
      Summary:
      Implement trace sampling to allow the user to specify the sampling frequency, i.e., save one out of every N requests, so that users who are interested only in a sampled set do not need to log every request.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4963
      
      Differential Revision: D14011190
      
      Pulled By: tang-jianfeng
      
      fbshipit-source-id: 078b631d9319b67cb089dd2c30e21d0df8dc406a
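
One simple way to keep one request out of every N is a counter, as in the sketch below. `TraceSamplerSketch` and `ShouldTrace` are hypothetical names, and treating a frequency of 0 as "trace everything" is an assumption of the example; the tracer in the PR may pick its samples differently.

```cpp
#include <cstdint>

// Hypothetical sampler: keeps the first request of every group of N.
class TraceSamplerSketch {
 public:
  explicit TraceSamplerSketch(uint64_t sampling_frequency)
      : frequency_(sampling_frequency == 0 ? 1 : sampling_frequency) {}

  // Returns true when the current request should be written to the trace.
  bool ShouldTrace() { return (count_++ % frequency_) == 0; }

 private:
  uint64_t frequency_;  // N: one request is kept per N seen
  uint64_t count_ = 0;  // requests seen so far
};
```

With a frequency of 3, nine requests produce three trace records; with a frequency of 1, every request is recorded.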
    • M
      WritePrepared: fix ValidateSnapshot with long-running txn (#4961) · 10d14693
      Maysam Yabandeh committed
      Summary:
      ValidateSnapshot checks whether another txn has committed a value to the about-to-be-locked key since a particular snapshot. It applies an optimization of looking only into the memtable if the snapshot seq is larger than the earliest seq in the memtables. With a long-running txn in WritePrepared, the prepared value might be flushed to disk and yet commit after the snapshot, which breaks this optimization. The patch fixes that by disabling the optimization when the min_uncommitted seq at the time the snapshot was taken is lower than the earliest seq in the memtables.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4961
      
      Differential Revision: D14009947
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 1d11679950326f7c4094b433e6b821b729f08850
    • M
      Reset size_ to 0 in PinnableSlice::Reset (#4962) · 39fb88f1
      Maysam Yabandeh committed
      Summary:
      This avoids bugs when a reused PinnableSlice is not actually reassigned and yet the programmer draws conclusions from the size of the Slice.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4962
      
      Differential Revision: D14012710
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 23f4e173386b5461fd5650f44cde470805f4e816
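
The kind of bug this guards against can be shown with a toy type. `PinnableViewSketch` is a hypothetical stand-in, not `PinnableSlice`; it only models a reusable value holder whose `Reset` clears the reported size.

```cpp
#include <cstddef>
#include <string>

// Hypothetical reusable value holder, loosely modeled on a pinnable slice.
class PinnableViewSketch {
 public:
  PinnableViewSketch() = default;
  PinnableViewSketch(const PinnableViewSketch&) = delete;
  PinnableViewSketch& operator=(const PinnableViewSketch&) = delete;

  // Copies the value into the holder and points the view at it.
  void PinSelf(const std::string& value) {
    buf_ = value;
    data_ = buf_.data();
    size_ = buf_.size();
  }

  // Clearing size_ means a stale value can no longer look like a fresh result.
  void Reset() { size_ = 0; }

  const char* data() const { return data_; }
  size_t size() const { return size_; }
  bool empty() const { return size_ == 0; }

 private:
  std::string buf_;
  const char* data_ = "";
  size_t size_ = 0;
};
```

If a caller resets the holder, performs a lookup that finds nothing, and then checks `size()`, it now sees 0 rather than the length of the previous value.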