1. 27 Jun, 2018 (5 commits)
    • Add table property tracking number of range deletions (#4016) · 17339dc2
      Nikhil Benesch committed
      Summary:
      Add a new table property, rocksdb.num.range-deletions, which tracks the
      number of range deletions in a block-based table. Range deletions are no
      longer counted in rocksdb.num.entries; as discovered in PR #3778, there
      are various code paths that implicitly assume that rocksdb.num.entries
      counts only true keys, not range deletions.
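      
      A minimal sketch (not part of the PR; the path is a placeholder) of reading the new property through the table-properties API:
      ```cpp
      #include <iostream>
      
      #include "rocksdb/db.h"
      #include "rocksdb/table_properties.h"
      
      int main() {
        rocksdb::Options options;
        options.create_if_missing = true;
        rocksdb::DB* db = nullptr;
        // "/tmp/rangedel_demo" is a placeholder path.
        rocksdb::Status s = rocksdb::DB::Open(options, "/tmp/rangedel_demo", &db);
        if (!s.ok()) return 1;
      
        rocksdb::TablePropertiesCollection props;
        s = db->GetPropertiesOfAllTables(&props);
        if (s.ok()) {
          for (const auto& entry : props) {
            // num_range_deletions backs rocksdb.num.range-deletions;
            // num_entries no longer includes range deletions.
            std::cout << entry.first
                      << " entries=" << entry.second->num_entries
                      << " range-deletions=" << entry.second->num_range_deletions
                      << std::endl;
          }
        }
        delete db;
        return 0;
      }
      ```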
      
      /cc ajkr nvanbenschoten
      Closes https://github.com/facebook/rocksdb/pull/4016
      
      Differential Revision: D8527575
      
      Pulled By: ajkr
      
      fbshipit-source-id: 92e7edbe78fda53756a558013c9fb496e7764fd7
    • use user_key and iterate_upper_bound to determine compatibility of bloom filters (#3899) · 408205a3
      Zhongyi Xie committed
      Summary:
      Previously, in https://github.com/facebook/rocksdb/pull/3601, the bloom filter would only be checked if the `prefix_extractor` in the mutable_cf_options matched the one found in the SST file.
      This PR relaxes the requirement: if all keys in the range [user_key, iterate_upper_bound) share the same prefix after being transformed by the prefix extractor recorded in the SST file, the bloom filter is considered compatible and will continue to be consulted.
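      
      A minimal usage sketch (not from the PR; the path and prefix length are placeholders) of the two pieces this compatibility check keys off, `prefix_extractor` and `iterate_upper_bound`:
      ```cpp
      #include <memory>
      
      #include "rocksdb/db.h"
      #include "rocksdb/slice_transform.h"
      
      int main() {
        rocksdb::Options options;
        options.create_if_missing = true;
        // SST files record the prefix extractor in effect when they were built.
        options.prefix_extractor.reset(rocksdb::NewFixedPrefixTransform(3));
        rocksdb::DB* db = nullptr;
        rocksdb::Status s = rocksdb::DB::Open(options, "/tmp/prefix_demo", &db);
        if (!s.ok()) return 1;
      
        // If every key in [seek target, upper bound) shares one prefix,
        // the file's bloom filter can still be consulted.
        rocksdb::Slice upper_bound("foo9");
        rocksdb::ReadOptions read_options;
        read_options.iterate_upper_bound = &upper_bound;
      
        std::unique_ptr<rocksdb::Iterator> it(db->NewIterator(read_options));
        for (it->Seek("foo0"); it->Valid(); it->Next()) {
          // All keys seen here start with the prefix "foo".
        }
        it.reset();  // release the iterator before closing the DB
        delete db;
        return 0;
      }
      ```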
      Closes https://github.com/facebook/rocksdb/pull/3899
      
      Differential Revision: D8157459
      
      Pulled By: miasantreble
      
      fbshipit-source-id: 18d17cba56a1005162f8d5db7a27aba277089c41
    • Create lgtm.yml for LGTM.com C/C++ analysis (#4058) · 967aa815
      Bas van Schaik committed
      Summary:
      As discussed with thatsafunnyname [here](https://discuss.lgtm.com/t/c-c-lang-missing-for-facebook-rocksdb/1079): this configuration enables C/C++ analysis for RocksDB on LGTM.com.
      
      The initial commit will contain a build command (simple `make`) that previously resulted in a build error. The build log will then be available on LGTM.com for you to investigate (if you like). I'll immediately add a second commit to this PR to correct the build command to `make static_lib`, which worked when I tested it earlier today.
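      
      For reference, a minimal `lgtm.yml` along these lines might look like the following; this is a sketch of the LGTM.com configuration format, not necessarily the exact file committed:
      ```yaml
      # Sketch only: enable C/C++ extraction with the corrected build command.
      extraction:
        cpp:
          index:
            build_command:
              - make static_lib
      ```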
      
      If you like, you can also enable automatic code review in pull requests. This will alert you to any new code issues before they actually get merged into `master`. Here's an example of how that works for the AMPHTML project: https://github.com/ampproject/amphtml/pull/13060. You can enable it yourself here: https://lgtm.com/projects/g/facebook/rocksdb/ci/.
      
      I'll also add a badge to your README.md in a separate commit — feel free to remove that from this PR if you don't like it.
      
      (Full disclosure: I'm part of the LGTM.com team 🙂. Ping samlanning)
      Closes https://github.com/facebook/rocksdb/pull/4058
      
      Differential Revision: D8648410
      
      Pulled By: ajkr
      
      fbshipit-source-id: 98d55fc19cff1b07268ac8425b63e764806065aa
    • Remove unused imports from Python scripts (#4057) · 2694b6dc
      Peter (Stig) Edwards committed
      Summary:
      Also remove a redefined variable.
      As reported on https://lgtm.com/projects/g/facebook/rocksdb/
      Closes https://github.com/facebook/rocksdb/pull/4057
      
      Differential Revision: D8648342
      
      Pulled By: ajkr
      
      fbshipit-source-id: afd2ba84d1364d316010179edd44777e64ca9183
    • Fix universal compaction scheduling conflict with CompactFiles (#4055) · a8e503e5
      Andrew Kryczka committed
      Summary:
      Universal size-amp-triggered compaction was pulling the final sorted run into the compaction without checking whether any of its files were already being compacted. When all compactions are automatic, this is safe: the scheduler verifies that the second-to-last sorted run is not already being compacted, which implies the last sorted run is not either (in automatic compaction, multiple sorted runs are always compacted together). But with manual compaction, files in the last sorted run can be compacted independently, so the last sorted run must also be checked.
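      
      A minimal sketch (placeholder path and output level; not the repro in this PR) of the kind of manual `CompactFiles` call that can target files in the last sorted run while automatic compactions are running:
      ```cpp
      #include <vector>
      
      #include "rocksdb/db.h"
      
      int main() {
        rocksdb::Options options;
        options.create_if_missing = true;
        rocksdb::DB* db = nullptr;
        rocksdb::Status s = rocksdb::DB::Open(options, "/tmp/universal_demo", &db);
        if (!s.ok()) return 1;
      
        // Pick a live file and compact it manually. A file chosen this way
        // may sit in the last sorted run, so automatic size-amp compactions
        // must check its being_compacted flag.
        std::vector<rocksdb::LiveFileMetaData> metadata;
        db->GetLiveFilesMetaData(&metadata);
        if (!metadata.empty()) {
          s = db->CompactFiles(rocksdb::CompactionOptions(),
                               {metadata.back().name}, /*output_level=*/1);
        }
        delete db;
        return s.ok() ? 0 : 1;
      }
      ```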
      
      We were seeing the assertion failure below in `db_stress`. The test case included in this PR also reproduces the failure.
      
      ```
      db_universal_compaction_test: db/compaction.cc:312: void rocksdb::Compaction::MarkFilesBeingCompacted(bool): Assertion `mark_as_compacted ? !inputs_[i][j]->being_compacted : inputs_[i][j]->being_compacted' failed.
      Aborted (core dumped)
      ```
      Closes https://github.com/facebook/rocksdb/pull/4055
      
      Differential Revision: D8630094
      
      Pulled By: ajkr
      
      fbshipit-source-id: ac3b30a874678b76e113d4f6c42c1260411b08f8
2. 26 Jun, 2018 (3 commits)
3. 24 Jun, 2018 (2 commits)
4. 23 Jun, 2018 (3 commits)
5. 22 Jun, 2018 (6 commits)
    • option for timing measurement of non-blocking ops during compaction (#4029) · 795e663d
      Zhongyi Xie committed
      Summary:
      For example, the CompactionFilter invocation during compaction is always timed, and the user has no way to disable that measurement.
      This PR disables the timer if `Statistics::stats_level_` (which is part of DBOptions) is `kExceptDetailedTimers`.
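      
      A minimal sketch (assuming the 2018-era API, where `stats_level_` is a public member of `Statistics`) of keeping detailed timers disabled:
      ```cpp
      #include "rocksdb/db.h"
      #include "rocksdb/statistics.h"
      
      int main() {
        rocksdb::Options options;
        options.create_if_missing = true;
        options.statistics = rocksdb::CreateDBStatistics();
        // At this level (the default), detailed per-call timers such as
        // the compaction-time CompactionFilter timing are skipped.
        options.statistics->stats_level_ = rocksdb::kExceptDetailedTimers;
      
        rocksdb::DB* db = nullptr;
        rocksdb::Status s = rocksdb::DB::Open(options, "/tmp/stats_demo", &db);
        if (s.ok()) delete db;
        return s.ok() ? 0 : 1;
      }
      ```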
      Closes https://github.com/facebook/rocksdb/pull/4029
      
      Differential Revision: D8583670
      
      Pulled By: miasantreble
      
      fbshipit-source-id: 913be9fe433ae0c06e88193b59d41920a532307f
    • Cleanup staging directory at start of checkpoint (#4035) · 0a5b16c7
      Andrew Kryczka committed
      Summary:
      - Attempt to clean the checkpoint staging directory before starting a checkpoint. It was already cleaned up at the end of checkpoint creation, but not in the edge case where the process crashed while staging checkpoint files.
      - Attempt to clean the checkpoint directory before calling `Checkpoint::Create` in `db_stress`. This handles the case where the checkpoint directory was created by a previous `db_stress` run but the process crashed before cleaning it up.
      - Use `DestroyDB` to clean the checkpoint directory, since a checkpoint is a DB (see the sketch below).
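      
      A minimal sketch (placeholder paths; not from the PR) of the `DestroyDB`-then-checkpoint pattern described above:
      ```cpp
      #include <string>
      
      #include "rocksdb/db.h"
      #include "rocksdb/utilities/checkpoint.h"
      
      int main() {
        rocksdb::Options options;
        options.create_if_missing = true;
        rocksdb::DB* db = nullptr;
        rocksdb::Status s = rocksdb::DB::Open(options, "/tmp/cp_demo", &db);
        if (!s.ok()) return 1;
      
        const std::string checkpoint_dir = "/tmp/cp_demo_checkpoint";
        // A checkpoint is itself a DB, so DestroyDB is the natural way to
        // clear out leftovers from a previous crashed run.
        rocksdb::DestroyDB(checkpoint_dir, options);
      
        rocksdb::Checkpoint* checkpoint = nullptr;
        s = rocksdb::Checkpoint::Create(db, &checkpoint);
        if (s.ok()) {
          s = checkpoint->CreateCheckpoint(checkpoint_dir);
        }
        delete checkpoint;
        delete db;
        return s.ok() ? 0 : 1;
      }
      ```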
      Closes https://github.com/facebook/rocksdb/pull/4035
      
      Reviewed By: yiwu-arbug
      
      Differential Revision: D8580223
      
      Pulled By: ajkr
      
      fbshipit-source-id: 28c667400e249fad0fdedc664b349031b7b61599
    • Assert for Direct IO at the beginning in PositionedRead (#3891) · 645e57c2
      Sagar Vemuri committed
      Summary:
      Moved the direct-IO assertion to the top of `PosixSequentialFile::PositionedRead`, as it doesn't make sense to check sector alignment before checking for direct IO.
      Closes https://github.com/facebook/rocksdb/pull/3891
      
      Differential Revision: D8267972
      
      Pulled By: sagar0
      
      fbshipit-source-id: 0ecf77c0fb5c35747a4ddbc15e278918c0849af7
    • Update TARGETS file (#4028) · 58c22144
      Yi Wu committed
      Summary:
      -Wshorten-64-to-32 is an invalid flag in fbcode. Changing it to -Wnarrowing.
      Closes https://github.com/facebook/rocksdb/pull/4028
      
      Differential Revision: D8553694
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: 1523cbcb4c76cf1d2b10a4d28b5f58c78e6cb876
    • Fix a warning (treated as error) caused by type mismatch. · 39749596
      Yanqin Jin committed
      Summary: Closes https://github.com/facebook/rocksdb/pull/4032
      
      Differential Revision: D8573061
      
      Pulled By: riversand963
      
      fbshipit-source-id: 112324dcb35956d6b3ec891073f4f21493933c8b
    • Improve direct IO range scan performance with readahead (#3884) · 7103559f
      Sagar Vemuri committed
      Summary:
      This PR extends the improvements in #3282 to also work when using Direct IO.
      We see **4.5X performance improvement** in seekrandom benchmark doing long range scans, when using direct reads, on flash.
      
      **Description:**
      This change improves the performance of iterators doing long range scans (e.g. big/full index or table scans in MyRocks) by using readahead: additional data is prefetched on each disk IO and stored in a local buffer. Prefetching is automatically enabled once more than 2 IOs are observed for the same table file during iteration. The readahead size starts at 8 KB and is exponentially increased on each additional sequential IO, up to a maximum of 256 KB. This cuts down the number of IOs needed to complete the range scan.
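      
      A sketch of the readahead-size policy just described (not the actual RocksDB code; doubling is an assumed growth factor consistent with "exponentially increased"):
      ```cpp
      #include <algorithm>
      #include <cstddef>
      
      // Readahead starts at 8 KB once prefetching kicks in, grows on each
      // additional sequential IO, and is capped at 256 KB.
      size_t NextReadaheadSize(size_t current) {
        constexpr size_t kInitialReadahead = 8 * 1024;
        constexpr size_t kMaxReadahead = 256 * 1024;
        if (current == 0) {
          return kInitialReadahead;
        }
        return std::min(current * 2, kMaxReadahead);
      }
      ```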
      
      **Implementation Details:**
      - Used `FilePrefetchBuffer` as the underlying buffer to store the readahead data. `FilePrefetchBuffer` can now take file_reader, readahead_size and max_readahead_size as input to the constructor, and automatically do readahead.
      - `FilePrefetchBuffer::TryReadFromCache` can now call `FilePrefetchBuffer::Prefetch` if readahead is enabled.
      - `AlignedBuffer` (which is the underlying store for `FilePrefetchBuffer`) now takes a few additional args in `AlignedBuffer::AllocateNewBuffer` to allow copying data from the old buffer.
      - Made sure not to re-read from the device chunks of data that were already available in the buffer.
      - Fixed a couple of cases where `AlignedBuffer::cursize_` was not being properly kept up-to-date.
      
      **Constraints:**
      - Similar to #3282, this is currently enabled only when ReadOptions.readahead_size = 0 (the default value).
      - Since the prefetched data is stored in a temporary buffer allocated on heap, this could increase the memory usage if you have many iterators doing long range scans simultaneously.
      - Enabled only for user reads, and disabled for compactions. Compaction reads are controlled by the options `use_direct_io_for_flush_and_compaction` and `compaction_readahead_size`, and the current feature takes precautions not to mess with them.
      
      **Benchmarks:**
      I used the same benchmark as used in #3282.
      Data fill:
      ```
      TEST_TMPDIR=/data/users/$USER/benchmarks/iter ./db_bench -benchmarks=fillrandom -num=1000000000 -compression_type="none" -level_compaction_dynamic_level_bytes
      ```
      
      Do a long range scan: seekrandom with a large number of nexts
      ```
      TEST_TMPDIR=/data/users/$USER/benchmarks/iter ./db_bench -benchmarks=seekrandom -use_direct_reads -duration=60 -num=1000000000 -use_existing_db -seek_nexts=10000 -statistics -histogram
      ```
      
      ```
      Before:
      seekrandom   :   37939.906 micros/op 26 ops/sec;   29.2 MB/s (1636 of 1999 found)
      With this change:
      seekrandom   :   8527.720 micros/op 117 ops/sec;  129.7 MB/s (6530 of 7999 found)
      ```
      ~4.5X performance improvement, averaged over 3 runs.
      Closes https://github.com/facebook/rocksdb/pull/3884
      
      Differential Revision: D8082143
      
      Pulled By: sagar0
      
      fbshipit-source-id: 4d7a8561cbac03478663713df4d31ad2620253bb
6. 21 Jun, 2018 (2 commits)
    • Add file name info to SequentialFileReader. (#4026) · 524c6e6b
      Yanqin Jin committed
      Summary:
      We potentially need this information for tracing, profiling and diagnosis.
      Closes https://github.com/facebook/rocksdb/pull/4026
      
      Differential Revision: D8555214
      
      Pulled By: riversand963
      
      fbshipit-source-id: 4263e06c00b6d5410b46aa46eb4e358ff2161dd2
    • Support file ingestion in stress test (#4018) · 14cee194
      Andrew Kryczka committed
      Summary:
      Once per `ingest_external_file_one_in` operations, the stress test uses `SstFileWriter` to create a file containing `ingest_external_file_width` consecutive keys. The file name contains the thread ID to avoid clashes. The file is then added to the DB using `IngestExternalFile`.
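      
      A minimal sketch (placeholder paths; not the stress-test code itself) of the `SstFileWriter` + `IngestExternalFile` flow:
      ```cpp
      #include <string>
      #include <vector>
      
      #include "rocksdb/db.h"
      #include "rocksdb/sst_file_writer.h"
      
      int main() {
        rocksdb::Options options;
        options.create_if_missing = true;
        rocksdb::DB* db = nullptr;
        rocksdb::Status s = rocksdb::DB::Open(options, "/tmp/ingest_demo", &db);
        if (!s.ok()) return 1;
      
        // Write a few consecutive keys (in sorted order) into a standalone
        // SST file.
        const std::string sst_path = "/tmp/ingest_demo_t0.sst";
        rocksdb::SstFileWriter writer(rocksdb::EnvOptions(), options);
        s = writer.Open(sst_path);
        for (int i = 0; i < 4 && s.ok(); ++i) {
          s = writer.Put("key" + std::to_string(i), "value");
        }
        if (s.ok()) s = writer.Finish();
      
        // Hand the finished file to the DB.
        if (s.ok()) {
          s = db->IngestExternalFile({sst_path},
                                     rocksdb::IngestExternalFileOptions());
        }
        delete db;
        return s.ok() ? 0 : 1;
      }
      ```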
      
      We can't enable it by default in the crash test because `nooverwritepercent` and `test_batches_snapshot` must both be zero for the DB's whole lifetime. Perhaps we should set up a separate test with that config, as range deletion also requires it.
      Closes https://github.com/facebook/rocksdb/pull/4018
      
      Differential Revision: D8507698
      
      Pulled By: ajkr
      
      fbshipit-source-id: 1437ea26fd989349a9ce8b94117241c65e40f10f
7. 20 Jun, 2018 (4 commits)
8. 19 Jun, 2018 (5 commits)
9. 18 Jun, 2018 (1 commit)
10. 16 Jun, 2018 (9 commits)