1. 02 Mar, 2019 4 commits
    • A
      Run db_bench on database generated externally (#5017) · 18d2e4be
      Andrew Kryczka committed
      Summary:
      Added an option, `-use_existing_keys`, which can be set to run
      benchmarks against an arbitrary existing database. Now users can
      benchmark against their actual database rather than synthetic data.
      
      Before the run begins, it loads all the keys into memory, then uses that
      set of keys rather than synthesizing new ones in `GenerateKeyFromInt`.
      This is mainly intended for small-scale DBs where the memory consumption
      is not a concern.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5017
      
      Differential Revision: D14270303
      
      Pulled By: riversand963
      
      fbshipit-source-id: 6328df9dffb5e19170270dd00a69f4bbe424e5ed
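The key-selection change described in this commit can be sketched roughly as follows. This is a simplified illustration and not the db_bench code: `KeySource` and `KeyFor` are hypothetical names, and a `std::vector` of strings stands in for the keys read from an existing database.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a benchmark's key selection logic.
class KeySource {
 public:
  // Synthetic mode: keys are derived from the integer itself.
  KeySource() = default;

  // Existing-keys mode: every key is loaded up front and kept in memory,
  // which is why this suits small databases only.
  explicit KeySource(std::vector<std::string> existing_keys)
      : use_existing_keys_(true), keys_(std::move(existing_keys)) {
    assert(!keys_.empty());
  }

  std::string KeyFor(uint64_t n) const {
    if (use_existing_keys_) {
      // Map the integer onto the preloaded key set.
      return keys_[n % keys_.size()];
    }
    // Synthesize a fixed-width key from the integer.
    char buf[32];
    std::snprintf(buf, sizeof(buf), "%016llu",
                  static_cast<unsigned long long>(n));
    return std::string(buf);
  }

 private:
  bool use_existing_keys_ = false;
  std::vector<std::string> keys_;
};
```

With the default constructor the benchmark keeps synthesizing keys; constructing `KeySource` from a preloaded vector makes every generated integer resolve to a key that already exists.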
    • S
      Make statistics's stats_level change thread-safe (#5030) · aef763b6
      Siying Dong committed
      Summary:
      Right now, users can change statistics.stats_level while the DB is running, but TSAN may report
      a data race. Make stats_level_ atomic and access it through accessors.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5030
      
      Differential Revision: D14267519
      
      Pulled By: siying
      
      fbshipit-source-id: 37d7ebeff7a43a406230143422a16af899163f73
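A minimal sketch of the pattern this commit describes, an atomic field reached only through accessors, might look like the following. The `Level` enum and `StatsSketch` class are hypothetical and do not mirror the real RocksDB declarations.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical levels; the real enum values are not reproduced here.
enum class Level : uint8_t { kBasic, kNoTimers, kAll };

class StatsSketch {
 public:
  // Readers and writers go through accessors, so no thread ever touches
  // the underlying field with a plain (racy) load or store.
  Level get_level() const { return level_.load(std::memory_order_relaxed); }
  void set_level(Level level) {
    level_.store(level, std::memory_order_relaxed);
  }

 private:
  std::atomic<Level> level_{Level::kAll};
};

// Usage: the level can be changed while other threads keep reading it.
inline void ExampleStatsLevel() {
  StatsSketch stats;
  assert(stats.get_level() == Level::kAll);
  stats.set_level(Level::kNoTimers);
  assert(stats.get_level() == Level::kNoTimers);
}
```

Relaxed ordering is an assumption of this sketch; it is enough when the level is an independent setting that does not publish other data.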
    • F
      Merge pull request #5031 from gfosco/defsbzl · 916e5241
      Fosco Marotto committed
      [sync fix] Add defs.bzl
    • M
      WritePrepared: script to analyze stress test failures (#5033) · 0b80f6b3
      Maysam Yabandeh committed
      Summary:
      This is the hackish script we used to find the root cause of failures in transaction stress tests. It is not well written and does not require rigorous review, but it is better than starting from scratch each time we observe an issue. The stress tests only report the snapshots at which the sum of all the keys in one set is inconsistent with another set. To help debugging, one needs to know exactly which key returned inconsistent results. The script looks at the transactions between two conflicting snapshots and performs the changes manually to see for which key the read value was inconsistent.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5033
      
      Differential Revision: D14280362
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: d5826055c46711460ba81480d96cb5ea082814a5
  2. 01 Mar, 2019 6 commits
    • M
      Call PreReleaseCallback between WAL and memtable write (#5015) · 77ebc82b
      Maysam Yabandeh committed
      Summary:
      PreReleaseCallback is meant to be called before the writes become visible to readers. Since the sequence number is known after the WAL write, there is no reason to delay calling PreReleaseCallback until after the memtable write, which would complicate the reader's logic in the presence of our memtable writes that are made visible by the other write thread.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5015
      
      Differential Revision: D14221670
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: a504dd665cf923226d7af09cc8e9c7739a25edc6
    • M
      WritePrepared: commit only from the 2nd queue (#5014) · 68a2f94d
      Maysam Yabandeh committed
      Summary:
      When two_write_queues is enabled, we call ::AddPrepared only from the main queue, which writes to both the WAL and the memtable, and call ::AddCommitted from the 2nd queue, which writes only to the WAL. This simplifies the logic by avoiding concurrency among AddPrepared calls and among AddCommitted calls. The patch fixes one case that did not conform to the rule above. This allows future refactoring. For example, AdvanceMaxEvictedSeq, which is invoked by AddCommitted, can be simplified by assuming there are no concurrent calls to it.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5014
      
      Differential Revision: D14210493
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 6db5ba372a294a568a14caa010576460917a4eab
    • S
      Fix DefaultEnvTest.incBackgroundThreadsIfNeeded test (#5021) · 06ea73d6
      Sagar Vemuri committed
      Summary:
      `DefaultEnvTest.incBackgroundThreadsIfNeeded` jtest should assert that the number of threads is greater than or equal to the minimum number of threads.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5021
      
      Differential Revision: D14268311
      
      Pulled By: sagar0
      
      fbshipit-source-id: 01fb32b5b3ce636451d162fa1a2bbc5bd1974682
    • L
      Introduce an enum for flag types in LRUHandle (#5024) · f83eecff
      Levi Tamasi committed
      Summary:
      Replace the integers used for setting and querying the various
      flags in LRUHandle with enum values to improve readability.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5024
      
      Differential Revision: D14263429
      
      Pulled By: ltamasi
      
      fbshipit-source-id: b1b9ba95635265f122c2b40da73850eaac18227a
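The readability gain from replacing raw integers with an enum can be illustrated with a small sketch. `HandleSketch` and its flag names are hypothetical and chosen for illustration only; they are not the actual LRUHandle definition.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical handle with bit flags named by an enum instead of
// bare integers such as 1, 2 and 4 scattered through the code.
struct HandleSketch {
  enum Flag : uint8_t {
    kInCache = 1 << 0,
    kHighPriority = 1 << 1,
    kInHighPriorityPool = 1 << 2,
  };

  uint8_t flags = 0;

  // Set or clear a single flag without disturbing the others.
  void SetFlag(Flag f, bool on) {
    if (on) {
      flags = static_cast<uint8_t>(flags | f);
    } else {
      flags = static_cast<uint8_t>(flags & ~f);
    }
  }

  bool HasFlag(Flag f) const { return (flags & f) != 0; }
};

// Usage: call sites now read as intent rather than as bit arithmetic.
inline void ExampleFlags() {
  HandleSketch h;
  h.SetFlag(HandleSketch::kInCache, true);
  assert(h.HasFlag(HandleSketch::kInCache));
  assert(!h.HasFlag(HandleSketch::kHighPriority));
}
```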
    • F
      [sync fix] Add defs.bzl · b157d3d1
      Fosco Marotto committed
    • S
      Add two more StatsLevel (#5027) · 5e298f86
      Siying Dong committed
      Summary:
      Statistics cost too much CPU for some use cases. Add two stats levels
      so that people can choose to skip two types of expensive stats, timers and
      histograms.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5027
      
      Differential Revision: D14252765
      
      Pulled By: siying
      
      fbshipit-source-id: 75ecec9eaa44c06118229df4f80c366115346592
  3. 27 Feb, 2019 1 commit
    • M
      WritePrepared: optimize read path by avoiding virtual (#5018) · a661c0d2
      Maysam Yabandeh committed
      Summary:
      The read path includes a callback function, ReadCallback, which eventually calls IsInSnapshot to figure out whether a particular seq is in the reading snapshot. This callback is virtual, which adds the cost of multiple virtual function calls to each read. The first few checks in IsInSnapshot, however, are quite trivial and take care of the majority of cases. The patch moves those to a non-virtual function in the parent class, ReadCallback, to lower the virtual callback cost.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5018
      
      Differential Revision: D14226562
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 6feed5b34f3b082e52092c5ef143e29b49c46b44
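The idea of hoisting cheap checks into a non-virtual function of the parent class can be sketched as below. The class names and the two specific fast-path checks are assumptions made for illustration; the commit message does not spell out which checks were moved.

```cpp
#include <cassert>
#include <cstdint>

class ReadCallbackSketch {
 public:
  ReadCallbackSketch(uint64_t min_uncommitted, uint64_t snapshot_seq)
      : min_uncommitted_(min_uncommitted), snapshot_seq_(snapshot_seq) {}
  virtual ~ReadCallbackSketch() = default;

  // Non-virtual entry point: the cheap checks run here, so most reads
  // never pay for a virtual call. (Checks are assumed for illustration.)
  bool IsVisible(uint64_t seq) {
    if (seq < min_uncommitted_) return true;  // older than any open txn
    if (seq > snapshot_seq_) return false;    // written after the snapshot
    return IsVisibleFullCheck(seq);           // rare, expensive path
  }

 protected:
  // Only the uncommon cases reach the virtual function.
  virtual bool IsVisibleFullCheck(uint64_t seq) = 0;

 private:
  const uint64_t min_uncommitted_;
  const uint64_t snapshot_seq_;
};

// Test double that counts how often the slow path is taken.
class CountingCallback : public ReadCallbackSketch {
 public:
  using ReadCallbackSketch::ReadCallbackSketch;
  int slow_calls = 0;

 protected:
  bool IsVisibleFullCheck(uint64_t /*seq*/) override {
    ++slow_calls;
    return true;
  }
};
```

In this sketch only sequence numbers that fall between the two bounds reach the virtual function, which is the cost profile the commit message aims for.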
  4. 23 Feb, 2019 1 commit
  5. 22 Feb, 2019 1 commit
  6. 21 Feb, 2019 2 commits
    • Z
      add GetStatsHistory to retrieve stats snapshots (#4748) · c4f5d0aa
      Zhongyi Xie committed
      Summary:
      This PR adds a public `GetStatsHistory` API to retrieve stats history in the form of a std map. The key of the map is the timestamp in microseconds when the stats snapshot was taken; the value is another std map from stats name to stats value (stored in a std string). Two DBOptions are introduced: `stats_persist_period_sec` (default 10 minutes) controls the interval between two snapshots; `max_stats_history_count` (default 10) controls the maximum number of history snapshots to keep in memory. RocksDB will stop collecting stats snapshots if `stats_persist_period_sec` is set to 0.
      
      (This PR is the in-memory part of https://github.com/facebook/rocksdb/pull/4535)
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4748
      
      Differential Revision: D13961471
      
      Pulled By: miasantreble
      
      fbshipit-source-id: ac836d401ecb84ea92216bf9966f969dedf4ad04
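The nested-map shape described in this commit, together with a bounded snapshot count, might be modeled like this. `StatsHistorySketch` is a hypothetical class, and evicting the oldest snapshot first is an assumption of the sketch rather than something the commit message states.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// stats name -> stats value (kept as a string)
using StatsSnapshot = std::map<std::string, std::string>;
// timestamp in microseconds -> snapshot taken at that time
using StatsHistory = std::map<uint64_t, StatsSnapshot>;

class StatsHistorySketch {
 public:
  explicit StatsHistorySketch(size_t max_count) : max_count_(max_count) {
    assert(max_count > 0);
  }

  void Record(uint64_t timestamp_us, StatsSnapshot snapshot) {
    history_[timestamp_us] = std::move(snapshot);
    // std::map is ordered by key, so begin() is the oldest snapshot.
    while (history_.size() > max_count_) {
      history_.erase(history_.begin());
    }
  }

  const StatsHistory& Get() const { return history_; }

 private:
  const size_t max_count_;
  StatsHistory history_;
};
```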
    • F
      Update version and history for 6.0 · 48c8d844
      Fosco Marotto committed
  7. 20 Feb, 2019 7 commits
    • M
      Change random seed for txn stress tests on each run (#5004) · cf98df34
      Maysam Yabandeh committed
      Summary:
      Currently the transaction stress tests use the thread id as the seed. Since the thread ids are likely to be the same across multiple runs, the seed is going to be the same as well. The patch includes time in calculating the seed to help cover a very different part of the state space in each run of the stress tests. To be able to reproduce the bug in case the stress tests fail, it also prints out the time that was used to calculate the seed value.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5004
      
      Differential Revision: D14144356
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 728ed522f550fc8b4f5f9f373259c05fe9a54556
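A small sketch of seeding from both the thread id and the current time, while logging the time so a failing run can be replayed, is shown below. The function names and the way the two values are combined (simple addition) are assumptions; the commit message only says that time is included in the calculation.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <cstdio>

// Combine the per-thread component with a per-run component.
// Addition is an arbitrary choice for this sketch.
inline uint64_t MakeSeed(uint64_t thread_id, uint64_t time_us) {
  return thread_id + time_us;
}

inline uint64_t NowMicros() {
  using namespace std::chrono;
  return static_cast<uint64_t>(
      duration_cast<microseconds>(system_clock::now().time_since_epoch())
          .count());
}

// Picks a fresh seed for this run and prints the time component so the
// same seed can be rebuilt later with MakeSeed(thread_id, printed_time).
inline uint64_t SeedForThread(uint64_t thread_id) {
  const uint64_t now = NowMicros();
  std::printf("seed time component: %llu\n",
              static_cast<unsigned long long>(now));
  return MakeSeed(thread_id, now);
}

// Usage: replaying the printed time reproduces the exact seed.
inline void ExampleSeed() {
  assert(MakeSeed(3, 1000) == MakeSeed(3, 1000));
  assert(MakeSeed(3, 1000) != MakeSeed(3, 2000));
}
```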
    • M
      WritePrepared: Improve stress tests with slow threads (#4974) · 0f4244fe
      Maysam Yabandeh committed
      Summary:
      The transaction stress tests stress a high-concurrency scenario. In WritePrepared/WriteUnPrepared we also need to stress the scenarios where an inserting/reading transaction is very slow. This would stress the corner cases where the caching is not sufficient and other, slower data structures are engaged. To emulate such cases we make use of slow inserter/verifier threads and also reduce the size of the cache data structures.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4974
      
      Differential Revision: D14143070
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 81eb674678faf9fae0f654cd60ebcc74e26aeee7
    • M
      WritePrepared: max_evicted_seq_ update during commit cache lookup (#4955) · bcdc8c8b
      Maysam Yabandeh committed
      Summary:
      max_evicted_seq_ could be updated in the middle of the read in ::IsInSnapshot. Code that stays correct in the presence of this update would be complicated. The patch simplifies it by checking the value of max_evicted_seq_ before and after looking into commit_cache_ and retrying in the unlucky case that it was changed.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4955
      
      Differential Revision: D13999556
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 7a1bdfa95ea8b5d8d73ddff3263ed31d7297b39c
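The check-before, check-after, retry pattern from this commit can be sketched generically. `LookupWithRetry` and its parameters are hypothetical; the lookup callable stands in for the read of `commit_cache_`, whose details are not reproduced here.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Runs `lookup` against a stable view of `max_evicted_seq`: if the value
// changed while the lookup was in progress, the result is discarded and
// the lookup is repeated.
template <typename LookupFn>
bool LookupWithRetry(const std::atomic<uint64_t>& max_evicted_seq,
                     LookupFn lookup, int* attempts) {
  for (;;) {
    const uint64_t before = max_evicted_seq.load(std::memory_order_acquire);
    const bool result = lookup(before);
    const uint64_t after = max_evicted_seq.load(std::memory_order_acquire);
    if (attempts != nullptr) ++*attempts;
    if (before == after) return result;  // no concurrent update observed
  }
}

// Usage: the first lookup races with a simulated update and is retried.
inline void ExampleRetry() {
  std::atomic<uint64_t> max_evicted{100};
  int attempts = 0;
  bool bumped = false;
  const bool visible = LookupWithRetry(
      max_evicted,
      [&](uint64_t seen) {
        if (!bumped) {  // simulate another thread advancing the value
          bumped = true;
          max_evicted.store(200);
        }
        return seen >= 200;
      },
      &attempts);
  assert(attempts == 2);
  assert(visible);
}
```

The retry keeps the common path simple at the cost of an occasional repeated lookup, which matches the trade-off the commit message describes.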
    • S
      Temporarily Disable DBTest2.PresetCompressionDict (#5003) · 93f7e7a4
      Siying Dong committed
      Summary:
      DBTest2.PresetCompressionDict is flaky. Temporarily disable it.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5003
      
      Differential Revision: D14139505
      
      Pulled By: siying
      
      fbshipit-source-id: ebf1872d364b76b2cb021b489ea2f17ee997116a
    • Y
      Separate crash test with atomic flush (#4945) · 7d232102
      Yanqin Jin committed
      Summary:
      Currently crash test covers cases with and without atomic flush, but takes too
      long to finish. Therefore it may be a better idea to put crash test with atomic
      flush in a separate set of tests.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4945
      
      Differential Revision: D13947548
      
      Pulled By: riversand963
      
      fbshipit-source-id: 177c6de865290fd650b0103408339eaa3f801d8c
    • M
      Apply modernize-use-override (3) · 3c5d1b16
      Michael Liu committed
      Summary:
      Use C++11’s override and remove virtual where applicable.
      Changes are automatically generated.
      
      bypass-lint
      drop-conflicts
      
      Reviewed By: igorsugak
      
      Differential Revision: D14131816
      
      fbshipit-source-id: f20e7f7cecf2e699d70f5fa036f72c0e3f59b50e
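For readers unfamiliar with this kind of rewrite, the before and after shapes look roughly like the following. The `Logger` classes are hypothetical and exist only to show the transformation.

```cpp
#include <cassert>
#include <string>

class Logger {
 public:
  virtual ~Logger() = default;
  virtual std::string Name() const { return "base"; }
};

// Before: repeating `virtual` compiles even if the signature drifts away
// from the base class, silently creating a new function.
class OldStyleLogger : public Logger {
 public:
  virtual std::string Name() const { return "old"; }
};

// After: `override` makes the compiler reject the declaration unless it
// really overrides a virtual function of the base class.
class NewStyleLogger : public Logger {
 public:
  std::string Name() const override { return "new"; }
};

// Usage: both styles dispatch the same way at run time.
inline void ExampleOverride() {
  NewStyleLogger logger;
  const Logger& ref = logger;
  assert(ref.Name() == "new");
}
```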
    • Z
      add whole key bloom filter support in memtables (#4985) · ed995c6a
      Zhongyi Xie committed
      Summary:
      MyRocks calls `GetForUpdate` on `INSERT` for the unique key check, and in almost all cases GetForUpdate returns an empty result. For such cases, a whole-key bloom filter is helpful.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4985
      
      Differential Revision: D14118257
      
      Pulled By: miasantreble
      
      fbshipit-source-id: d35cb7109c62fd5ad541a26968e3a3e16d3e85ea
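Why a whole-key bloom filter helps lookups that usually find nothing can be seen in a toy filter like the one below. This is a generic Bloom filter sketch with hypothetical names and an arbitrary hashing scheme, not the memtable implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Toy Bloom filter keyed on the whole key. A negative answer is
// definitive, so a lookup for a missing key can skip the real search.
class WholeKeyBloomSketch {
 public:
  explicit WholeKeyBloomSketch(size_t num_bits) : bits_(num_bits, false) {
    assert(num_bits > 0);
  }

  void Add(const std::string& key) {
    for (size_t i = 0; i < kNumProbes; ++i) {
      bits_[Probe(key, i)] = true;
    }
  }

  // false: the key was definitely never added.
  // true: the key may have been added (false positives are possible).
  bool MayContain(const std::string& key) const {
    for (size_t i = 0; i < kNumProbes; ++i) {
      if (!bits_[Probe(key, i)]) return false;
    }
    return true;
  }

 private:
  static constexpr size_t kNumProbes = 3;

  // Double hashing: derive several probe positions from one hash value.
  size_t Probe(const std::string& key, size_t i) const {
    const size_t h1 = std::hash<std::string>{}(key);
    const size_t h2 = (h1 >> 17) | (h1 << 15);
    return (h1 + i * h2) % bits_.size();
  }

  std::vector<bool> bits_;
};
```

When nearly every lookup is for a key that does not exist yet, as in the unique key check described above, most calls end at the first unset bit.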
  8. 16 Feb, 2019 4 commits
    • S
      Header logger should call LogHeader() (#4980) · c2affccc
      Siying Dong committed
      Summary:
      The info log header feature never worked well, because log level Header was not
      translated to Logger::LogHeader() call. Fix it.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4980
      
      Differential Revision: D14087283
      
      Pulled By: siying
      
      fbshipit-source-id: 7e7d03ce35fa8d13d4ee549f46f7326f7bc0006d
    • S
      flush_job logs data size too (#4979) · 26a33ee5
      Siying Dong committed
      Summary:
      Right now when a flush is triggered, the memory consumption is logged but the data size is not.
      It's useful to log both when we debug unexpectedly small flushed file sizes.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4979
      
      Differential Revision: D14071979
      
      Pulled By: siying
      
      fbshipit-source-id: 0cd60449c5205eb00e0fbc299084418f609904ed
    • S
      Fix LITE Build (#4989) · 4db46aa2
      Siying Dong committed
      Summary:
      In LITE mode, EventListener is an empty class. However, db_bench
      uses it. When "override" is added to the functions, the build breaks. Fix it
      by keeping the listener empty in LITE mode.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4989
      
      Differential Revision: D14108132
      
      Pulled By: siying
      
      fbshipit-source-id: 80121aab35b1120e502b37b782301dd700692697
    • A
      Deprecate ttl option from CompactionOptionsFIFO (#4965) · 3231a2e5
      Aubin Sanyal committed
      Summary:
      We introduced ttl option in CompactionOptionsFIFO when ttl-based file
      deletion (compaction) was supported only as part of FIFO Compaction. But
      with the extension of ttl semantics even to Level compaction,
      CompactionOptionsFIFO.ttl can now be deprecated. Instead we will start
      using ColumnFamilyOptions.ttl for FIFO compaction as well.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4965
      
      Differential Revision: D14072960
      
      Pulled By: sagar0
      
      fbshipit-source-id: c98cc2ae695a28136295787cd88d36a220fc219e
  9. 15 Feb, 2019 2 commits
    • M
      Apply modernize-use-override (2nd iteration) · ca89ac2b
      Michael Liu committed
      Summary:
      Use C++11’s override and remove virtual where applicable.
      Changes are automatically generated.
      
      Reviewed By: Orvid
      
      Differential Revision: D14090024
      
      fbshipit-source-id: 1e9432e87d2657e1ff0028e15370a85d1739ba2a
    • A
      Dictionary compression for files written by SstFileWriter (#4978) · c8c8104d
      Andrew Kryczka committed
      Summary:
      If `CompressionOptions::max_dict_bytes` and/or `CompressionOptions::zstd_max_train_bytes` are set, `SstFileWriter` will now generate files respecting those options.
      
      I refactored the logic a bit for deciding when to use dictionary compression. Previously we plumbed `is_bottommost_level` down to the table builder and used that. However it was kind of confusing in `SstFileWriter`'s context since we don't know what level the file will be ingested to. Instead, now the higher-level callers (e.g., flush, compaction, file writer) are responsible for building the right `CompressionOptions` to give the table builder.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4978
      
      Differential Revision: D14060763
      
      Pulled By: ajkr
      
      fbshipit-source-id: dc802c327896df2b319dc162d6acc82b9cdb452a
  10. 14 Feb, 2019 4 commits
  11. 13 Feb, 2019 5 commits
  12. 12 Feb, 2019 3 commits
    • A
      Reduce scope of compression dictionary to single SST (#4952) · 62f70f6d
      Andrew Kryczka committed
      Summary:
      Our previous approach was to train one compression dictionary per compaction, using the first output SST to train a dictionary, and then applying it on subsequent SSTs in the same compaction. While this was great for minimizing CPU/memory/I/O overhead, it did not achieve good compression ratios in practice. In our most promising potential use case, moderate reductions in a dictionary's scope make a major difference on compression ratio.
      
      So, this PR changes compression dictionary to be scoped per-SST. It accepts the tradeoff during table building to use more memory and CPU. Important changes include:
      
      - The `BlockBasedTableBuilder` has a new state when dictionary compression is in-use: `kBuffered`. In that state it accumulates uncompressed data in-memory whenever `Add` is called.
      - After accumulating target file size bytes or calling `BlockBasedTableBuilder::Finish`, a `BlockBasedTableBuilder` moves to the `kUnbuffered` state. The transition (`EnterUnbuffered()`) involves sampling the buffered data, training a dictionary, and compressing/writing out all buffered data. In the `kUnbuffered` state, a `BlockBasedTableBuilder` behaves the same as before -- blocks are compressed/written out as soon as they fill up.
      - Samples are now whole uncompressed data blocks, except the final sample may be a partial data block so we don't breach the user's configured `max_dict_bytes` or `zstd_max_train_bytes`. The dictionary trainer is supposed to work better when we pass it real units of compression. Previously we were passing 64-byte KV samples which was not realistic.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4952
      
      Differential Revision: D13967980
      
      Pulled By: ajkr
      
      fbshipit-source-id: 82bea6f7537e1529c7a1a4cdee84585f5949300f
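The two builder states described in this commit can be sketched as a small state machine. `BufferedBuilderSketch` is a hypothetical class: dictionary training and block writing are reduced to a flag and a counter, and the buffer limit stands in for the target file size.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

class BufferedBuilderSketch {
 public:
  enum class State { kBuffered, kUnbuffered };

  explicit BufferedBuilderSketch(size_t buffer_limit_bytes)
      : buffer_limit_bytes_(buffer_limit_bytes) {}

  void Add(const std::string& block) {
    if (state_ == State::kUnbuffered) {
      WriteOut(block);  // same as before: write as soon as a block is ready
      return;
    }
    // kBuffered: hold uncompressed data in memory until the limit is hit.
    buffered_.push_back(block);
    buffered_bytes_ += block.size();
    if (buffered_bytes_ >= buffer_limit_bytes_) EnterUnbuffered();
  }

  void Finish() {
    if (state_ == State::kBuffered) EnterUnbuffered();
  }

  State state() const { return state_; }
  size_t blocks_written() const { return blocks_written_; }
  bool dictionary_trained() const { return dictionary_trained_; }

 private:
  // Transition: sample the buffered data and train a dictionary (both
  // stubbed out here), then flush everything that was held back.
  void EnterUnbuffered() {
    dictionary_trained_ = true;
    for (const std::string& block : buffered_) WriteOut(block);
    buffered_.clear();
    buffered_bytes_ = 0;
    state_ = State::kUnbuffered;
  }

  void WriteOut(const std::string& /*block*/) { ++blocks_written_; }

  const size_t buffer_limit_bytes_;
  State state_ = State::kBuffered;
  std::vector<std::string> buffered_;
  size_t buffered_bytes_ = 0;
  size_t blocks_written_ = 0;
  bool dictionary_trained_ = false;
};
```

Nothing is written while the builder is buffering, which is where the extra memory cost mentioned in the commit message comes from.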
    • P
      Increment NUMBER_BLOCK_NOT_COMPRESSED when !GoodCompressionRatio (#4929) · 79496d71
      Peter (Stig) Edwards committed
      Summary:
      See https://github.com/facebook/rocksdb/issues/4884
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4929
      
      Differential Revision: D14028333
      
      Pulled By: sagar0
      
      fbshipit-source-id: eed12bceae85385a34aaa6dd303bf0f53c4c7b06
    • M
      Enhance transaction_test_util with delays (#4970) · d6b9b3b8
      Maysam Yabandeh committed
      Summary:
      Enhance the ::Insert and ::Verify test functions to add an artificial delay between prepare and commit, and between taking the snapshot and the reads, respectively. A future PR will make use of these to improve stress tests to test against long-running transactions as well as long-running backup jobs. Also randomly sets set_snapshot to false for inserters to skip setting the snapshot in the initialization phase and let the snapshot be taken later explicitly.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4970
      
      Differential Revision: D14031342
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: b52b453751f0b25b81b23c48892bc1d152464cab