1. 01 Feb, 2018 1 commit
  2. 31 Jan, 2018 4 commits
  3. 30 Jan, 2018 10 commits
    • A
      fix for checkpoint directory with trailing slash(es) · f3fe6f88
      Andrew Kryczka committed
      Summary:
      Previously, if `checkpoint_dir` contained a trailing slash, we'd attempt to create the `.tmp` directory under `checkpoint_dir`, because we simply concatenated `checkpoint_dir + ".tmp"`. This failed because `checkpoint_dir` hadn't been created yet and our directory creation is non-recursive. This PR fixes the issue by stripping trailing slashes before concatenating, so the `.tmp` directory is always created in the same parent as `checkpoint_dir`.
      Closes https://github.com/facebook/rocksdb/pull/3275
      
      Differential Revision: D6574952
      
      Pulled By: ajkr
      
      fbshipit-source-id: a6daa6777a901eac2460cd0140c9515f7241aefc
      f3fe6f88
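The fix described in this commit can be illustrated with a minimal sketch. `CheckpointTmpDir` is a hypothetical helper name, not the function RocksDB actually uses, and keeping at least one character (so that "/" is not reduced to an empty string) is an assumption of this sketch.

```cpp
#include <string>

// Hypothetical helper (not RocksDB's actual code): derive the staging
// directory name for a checkpoint by stripping trailing slashes first,
// so that ".tmp" is appended to the directory name itself rather than
// ending up as a child of the not-yet-created checkpoint directory.
std::string CheckpointTmpDir(std::string checkpoint_dir) {
  // Assumption: keep at least one character so "/" does not become "".
  while (checkpoint_dir.size() > 1 && checkpoint_dir.back() == '/') {
    checkpoint_dir.pop_back();
  }
  return checkpoint_dir + ".tmp";
}
```

With this helper, `/data/ckpt`, `/data/ckpt/` and `/data/ckpt///` all map to the same sibling path `/data/ckpt.tmp`.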
    • Y
      Fix DBFlushTest::ManualFlushWithMinWriteBufferNumberToMerge dead lock · 4bdf06e7
      Yi Wu committed
      Summary:
      In the test, a deadlock can occur between the background flush thread and the foreground main thread, as follows:
      * background flush thread:
        - holding db mutex, while
        - waiting on "DBImpl::FlushMemTableToOutputFile:BeforeInstallSV" sync point.
      * foreground thread:
        - waiting for db mutex to write "key2"
      
      Fix by letting the background flush thread wait without holding the db mutex.
      Closes https://github.com/facebook/rocksdb/pull/3436
      
      Differential Revision: D6841334
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: b020768ac94e166e40953c5d09e505515a5f244d
      4bdf06e7
    • M
      Split SnapshotConcurrentAccessTest into 20 sub tests · 3073b1c5
      Maysam Yabandeh committed
      Summary:
      SnapshotConcurrentAccessTest sometimes times out when running on the test infra. This patch splits the test into smaller sub-tests to avoid the timeout. It also benefits from lower run-time of each sub-test and increases the coverage of the test. The overall run-time of each final sub-test is at most half of the original test so we should no longer see a timeout.
      Closes https://github.com/facebook/rocksdb/pull/3435
      
      Differential Revision: D6839427
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: d53fdb157109e2438ca7fe447d0cf4b71f304bd8
      3073b1c5
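One common way to split a long test into sub-tests is to shard its cases by index. The sketch below assumes a modulo-based split; the names `split_id` and `split_cnt` are illustrative, and the commit summary does not say how the actual test partitions its work.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sharding helper: returns the case indices that the
// sub-test number `split_id` (0-based) should run when the full set of
// `total_cases` cases is divided among `split_cnt` sub-tests.
std::vector<std::size_t> CasesForSplit(std::size_t total_cases,
                                       std::size_t split_id,
                                       std::size_t split_cnt) {
  std::vector<std::size_t> mine;
  for (std::size_t n = 0; n < total_cases; ++n) {
    // Every case belongs to exactly one split.
    if (n % split_cnt == split_id) {
      mine.push_back(n);
    }
  }
  return mine;
}
```

Because each index lands in exactly one split, running all 20 splits covers the same cases as the original test while each split stays short.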
    • S
      Tests for dynamic universal compaction options · e6605e53
      Sagar Vemuri committed
      Summary:
      Added a test for three dynamic universal compaction options, in the realm of read amplification:
      - size_ratio
      - min_merge_width
      - max_merge_width
      
      Also updated DynamicUniversalCompactionSizeAmplification by adding a check on compaction reason.
      Found a bug in compaction reason setting while working on this PR, and fixed it in #3412.
      
      TODO for later: Still to add tests for these options: compression_size_percent, stop_style and trivial_move.
      Closes https://github.com/facebook/rocksdb/pull/3419
      
      Differential Revision: D6822217
      
      Pulled By: sagar0
      
      fbshipit-source-id: 074573fca6389053cbac229891a0163f38bb56c4
      e6605e53
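To make the three options named in this commit concrete, here is a deliberately simplified model of how they could interact when choosing sorted runs to merge. This is a sketch under stated assumptions, not RocksDB's picker: it only considers runs starting from the newest one, and `UniversalKnobs` and `PickRunCount` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical container for the three options under test.
struct UniversalKnobs {
  unsigned size_ratio = 1;       // percentage slack when comparing sizes
  unsigned min_merge_width = 2;  // fewest runs worth merging
  unsigned max_merge_width = std::numeric_limits<unsigned>::max();
};

// Simplified model: starting from the newest run (index 0), keep adding
// the next run while the accumulated size, inflated by size_ratio percent,
// is at least as large as that next run. Returns the number of runs
// picked, or 0 if fewer than min_merge_width runs qualify.
std::size_t PickRunCount(const std::vector<std::uint64_t>& run_sizes,
                         const UniversalKnobs& k) {
  if (run_sizes.empty()) return 0;
  std::size_t picked = 1;
  double candidate = static_cast<double>(run_sizes[0]);
  while (picked < run_sizes.size() && picked < k.max_merge_width) {
    const double limit = candidate * (100.0 + k.size_ratio) / 100.0;
    const double next = static_cast<double>(run_sizes[picked]);
    if (limit < next) break;  // next run is too large to join
    candidate += next;
    ++picked;
  }
  return picked >= k.min_merge_width ? picked : 0;
}
```

In this model, raising `size_ratio` lets larger runs join a merge, `max_merge_width` caps how many runs are merged at once, and `min_merge_width` suppresses merges that are too small to be worthwhile.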
    • Z
      Use block cache to track memory usage when ReadOptions.fill_cache=false · 3fe09371
      Zhongyi Xie committed
      Summary:
      ReadOptions.fill_cache is set to false for compaction inputs and can also be set by users in their own queries. It tells RocksDB not to insert the data blocks it reads into the block cache.
      
      The memory used by such a data block is, however, not trackable by users.
      
      To make the system more manageable, we can charge the block to the block cache while it is in use, and release the charge afterwards.
      Closes https://github.com/facebook/rocksdb/pull/3333
      
      Differential Revision: D6670230
      
      Pulled By: miasantreble
      
      fbshipit-source-id: ab848d3ed286bd081a13ee1903de357b56cbc308
      3fe09371
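The "charge while in use, release afterwards" idea maps naturally onto a scope guard. The sketch below is an assumption about shape only: `UsageTracker` stands in for the block cache's accounting and `ScopedBlockCharge` is a hypothetical name; neither is a RocksDB class.

```cpp
#include <cstddef>

// Hypothetical stand-in for the block cache's memory accounting.
class UsageTracker {
 public:
  void Charge(std::size_t bytes) { usage_ += bytes; }
  void Release(std::size_t bytes) { usage_ -= bytes; }
  std::size_t usage() const { return usage_; }

 private:
  std::size_t usage_ = 0;
};

// Charges a block's size on construction and releases it on destruction,
// so the memory is visible in the tracker exactly while the block is used.
class ScopedBlockCharge {
 public:
  ScopedBlockCharge(UsageTracker* tracker, std::size_t bytes)
      : tracker_(tracker), bytes_(bytes) {
    tracker_->Charge(bytes_);
  }
  ~ScopedBlockCharge() { tracker_->Release(bytes_); }

  // Non-copyable: a copy would release the same charge twice.
  ScopedBlockCharge(const ScopedBlockCharge&) = delete;
  ScopedBlockCharge& operator=(const ScopedBlockCharge&) = delete;

 private:
  UsageTracker* tracker_;
  std::size_t bytes_;
};
```

A reader that bypasses the cache would hold one such guard per block for as long as it keeps the block in memory.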
    • S
      db_bench: sanity check CuckooTable with mmap_read option · e2d4b0ef
      Siying Dong committed
      Summary:
      This avoids a runtime error: db_bench now fails immediately if the cuckoo table is used but mmap_read is not specified.
      Closes https://github.com/facebook/rocksdb/pull/3420
      
      Differential Revision: D6838284
      
      Pulled By: siying
      
      fbshipit-source-id: 20893fa28d40fadc31e4ff154bed02f5a1bad341
      e2d4b0ef
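A fail-fast option check of this kind can be sketched as follows. `BenchFlags` and `ValidateFlags` are hypothetical, and the field names are assumed to mirror the db_bench options mentioned in the summary; the real tool reads its flags differently.

```cpp
#include <string>

// Hypothetical subset of benchmark flags relevant to this check.
struct BenchFlags {
  bool use_cuckoo_table = false;
  bool mmap_read = false;
};

// Returns an empty string when the combination is valid, otherwise an
// error message that the caller can print before exiting.
std::string ValidateFlags(const BenchFlags& flags) {
  if (flags.use_cuckoo_table && !flags.mmap_read) {
    return "cuckoo table format requires mmap_read to be set";
  }
  return "";
}
```

Running the check before any benchmark work starts turns a failure deep inside the run into an immediate, readable error.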
    • M
      Suppress lint in old files · b8eb32f8
      Mark Isaacson committed
      Summary: Grandfather in super old lint issues to make a clean slate for moving forward that allows us to have stronger enforcement on new issues.
      
      Reviewed By: yiwu-arbug
      
      Differential Revision: D6821806
      
      fbshipit-source-id: 22797d31ec58e9eb0255d3b66fedfcfcb0dc127c
      b8eb32f8
    • A
      fix db_bench filluniquerandom key count assertion · 9f7ccc84
      Andrew Kryczka committed
      Summary:
      It failed every time. I guess people usually ran with assertions disabled.
      Closes https://github.com/facebook/rocksdb/pull/3422
      
      Differential Revision: D6822984
      
      Pulled By: ajkr
      
      fbshipit-source-id: 2e90db75618b26ac1c46ddfa9e03c095c7bf16e3
      9f7ccc84
    • M
      Add Nim to the list of language bindings · 3f666f79
      Mamy Ratsimbazafy committed
      Summary: Closes https://github.com/facebook/rocksdb/pull/3428
      
      Differential Revision: D6834061
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: edca5b5b8330e0fee646c7434b9631da76670240
      3f666f79
    • B
      Rewrite comments on use_fsync option · 65cd6cd4
      Ben Darnell committed
      Summary:
      This replaces a vague warning about the mostly-obsolete ext3 filesystem with
      a more detailed note about a historical bug in the still-relevant ext4.
      
      Fixes #3410
      Closes https://github.com/facebook/rocksdb/pull/3421
      
      Differential Revision: D6834881
      
      Pulled By: siying
      
      fbshipit-source-id: 7771ef5c89a54c0ac17821680779c48178d0b400
      65cd6cd4
  4. 29 Jan, 2018 1 commit
  5. 27 Jan, 2018 4 commits
  6. 26 Jan, 2018 3 commits
    • S
      Improve performance of long range scans with readahead · d938226a
      Sagar Vemuri committed
      Summary:
      This change improves the performance of iterators doing long range scans (e.g. big/full table scans in MyRocks) by using readahead and prefetching additional data on each disk IO. This prefetching is automatically enabled on noticing more than 2 IOs for the same table file during iteration. The readahead size starts with 8KB and is exponentially increased on each additional sequential IO, up to a max of 256 KB. This helps in cutting down the number of IOs needed to complete the range scan.
      
      Constraints:
      - The prefetched data is stored by the OS in the page cache, so this currently works only for non-direct-read use cases, i.e. applications that use the page cache. (Direct-I/O support will be enabled in a later PR.)
      - This is currently enabled only when ReadOptions.readahead_size = 0 (which is the default value).
      
      Thanks to siying for the original idea and implementation.
      
      **Benchmarks:**
      Data fill:
      ```
      TEST_TMPDIR=/data/users/$USER/benchmarks/iter ./db_bench -benchmarks=fillrandom -num=1000000000 -compression_type="none" -level_compaction_dynamic_level_bytes
      ```
      Do a long range scan: Seekrandom with large number of nexts
      ```
      TEST_TMPDIR=/data/users/$USER/benchmarks/iter ./db_bench -benchmarks=seekrandom -duration=60 -num=1000000000 -use_existing_db -seek_nexts=10000 -statistics -histogram
      ```
      
      Page cache was cleared before each experiment with the command:
      ```
      sudo sh -c "echo 3 > /proc/sys/vm/drop_caches"
      ```
      ```
      Before:
      seekrandom   :   34020.945 micros/op 29 ops/sec;   32.5 MB/s (1636 of 1999 found)
      With this change:
      seekrandom   :    8726.912 micros/op 114 ops/sec;  126.8 MB/s (5702 of 6999 found)
      ```
      ~3.9X performance improvement.
      
      Also verified with strace and gdb that the readahead size is increasing as expected.
      ```
      strace -e readahead -f -T -t -p <db_bench process pid>
      ```
      Closes https://github.com/facebook/rocksdb/pull/3282
      
      Differential Revision: D6586477
      
      Pulled By: sagar0
      
      fbshipit-source-id: 8a118a0ed4594fbb7f5b1cafb242d7a4033cb58c
      d938226a
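The readahead policy described in this commit (off for the first two IOs on a file, then 8 KB, growing up to 256 KB) can be sketched as a small state machine. This is an illustration only: the summary says the size grows "exponentially", and doubling on each IO is an assumption here, as are the names `ReadaheadState` and `NextPrefetchSize`.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical per-file iterator state.
struct ReadaheadState {
  int num_file_reads = 0;
  std::size_t readahead_size = 8 * 1024;  // initial size from the summary
};

// Returns how many bytes to prefetch for the current IO, or 0 while
// readahead is not yet enabled (the first two IOs on the file).
std::size_t NextPrefetchSize(ReadaheadState& state) {
  const std::size_t kMaxReadaheadSize = 256 * 1024;  // cap from the summary
  ++state.num_file_reads;
  if (state.num_file_reads <= 2) {
    return 0;
  }
  const std::size_t current = state.readahead_size;
  // Assumption: "exponentially increased" means doubling per IO.
  state.readahead_size = std::min(kMaxReadaheadSize, current * 2);
  return current;
}
```

Under these assumptions the prefetch sizes for consecutive IOs are 0, 0, 8 KB, 16 KB, 32 KB, 64 KB, 128 KB, and then 256 KB for every later IO.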
    • B
      Update comments about default WALRecoveryMode · 65d43163
      Ben Darnell committed
      Summary:
      The default changed in 6a14f7a9 but this comment was not updated.
      Closes https://github.com/facebook/rocksdb/pull/3409
      
      Differential Revision: D6808264
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 0d7e2a054eb181e9a144fcb783cf0b2c77219bc0
      65d43163
    • S
      BlockBasedTable::NewDataBlockIterator to always return BlockIter · 1039133f
      Siying Dong committed
      Summary:
      This is a preliminary cleanup before a major refactoring of the block-based table iterator. BlockBasedTable::NewDataBlockIterator() will now always return a BlockIter. This simplifies the logic and code, and enables further refactoring and optimization.
      Closes https://github.com/facebook/rocksdb/pull/3398
      
      Differential Revision: D6780165
      
      Pulled By: siying
      
      fbshipit-source-id: 273f7dc896724f682c0118fb69a359d9cc4418b4
      1039133f
  7. 24 Jan, 2018 7 commits
  8. 23 Jan, 2018 3 commits
  9. 20 Jan, 2018 4 commits
  10. 19 Jan, 2018 3 commits
    • Y
      Fix Flush() keep waiting after flush finish · f1cb83fc
      Yi Wu committed
      Summary:
      A Flush() call could wait indefinitely if min_write_buffer_number_to_merge is used. Consider the sequence:
      1. The user calls Flush() with flush_options.wait = true.
      2. The manual flush starts in the background.
      3. A new memtable becomes immutable because of writes. The new memtable will not trigger a flush if min_write_buffer_number_to_merge is not reached.
      4. The manual flush finishes.
      
      Because the new memtable created in step 3 is not flushed, the previous logic of WaitForFlushMemTable() keeps waiting, even though the memtables it intended to flush have been flushed.
      
      Now, instead of only checking whether there are any more memtables to flush, WaitForFlushMemTable() also checks the id of the earliest memtable. If that id is larger than the id of the latest memtable at the time the flush was initiated, all memtables that existed when the flush started have been flushed.
      Closes https://github.com/facebook/rocksdb/pull/3378
      
      Differential Revision: D6746789
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: 35e698f71c7f90b06337a93e6825f4ea3b619bfa
      f1cb83fc
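The new wait condition from this commit reduces to a comparison of memtable ids. The sketch below assumes ids increase monotonically and that pending immutable memtables are kept oldest first; `ManualFlushDone` is a hypothetical name, not the function in DBImpl.

```cpp
#include <cstdint>
#include <deque>

// `unflushed_ids` holds the ids of immutable memtables that have not been
// flushed yet, oldest first (assumed ordering). `latest_id_at_flush_start`
// is the id of the newest memtable that existed when Flush() was called.
bool ManualFlushDone(const std::deque<std::uint64_t>& unflushed_ids,
                     std::uint64_t latest_id_at_flush_start) {
  // Done if nothing is pending, or if everything still pending was
  // created after the manual flush was initiated.
  return unflushed_ids.empty() ||
         unflushed_ids.front() > latest_id_at_flush_start;
}
```

In the scenario from the summary, the memtable created in step 3 has a larger id than anything the manual flush covered, so the waiter can return even though that memtable is still pending.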
    • T
      Fixed get version on windows, moved throwing exceptions into cc file. · b9873162
      topilski committed
      Summary:
      Fixes for msys2 and mingw; moves exception throwing into the .cc file.
      Closes https://github.com/facebook/rocksdb/pull/3377
      
      Differential Revision: D6746707
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: 456b38df80bc48b8386a2cf87f669b5a4f9999a4
      b9873162
    • J
      Add possibility to change ttl on open DB · 4decff6f
      jonasf committed
      Summary:
      We have seen cases where it would be useful to change the TTL on an already open DB.
      This change updates the TTL in TtlCompactionFilterFactory on an open DB.
      The next time a filter is created, it will filter according to the newly set TTL.
      
      Is this something that could be useful for others?
      Any downsides?
      Closes https://github.com/facebook/rocksdb/pull/3292
      
      Differential Revision: D6731993
      
      Pulled By: miasantreble
      
      fbshipit-source-id: 73b94d69237b11e8730734389052429d621a6b1e
      4decff6f
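The behavior described in this commit, where only filters created after the change see the new TTL, can be sketched with a factory that hands its current TTL to each filter at creation time. The class names below are hypothetical simplifications, and treating a non-positive TTL as "never expire" is an assumption of the sketch.

```cpp
#include <atomic>
#include <cstdint>
#include <memory>

// Hypothetical filter: captures the TTL that was current when it was made.
class TtlFilter {
 public:
  explicit TtlFilter(std::int32_t ttl) : ttl_(ttl) {}

  // True if an entry written at `write_time` is expired at time `now`.
  // Assumption: a non-positive TTL means entries never expire.
  bool Expired(std::int64_t write_time, std::int64_t now) const {
    return ttl_ > 0 && write_time + ttl_ < now;
  }

 private:
  const std::int32_t ttl_;
};

// Hypothetical factory: the TTL can be changed while the DB is open.
class TtlFilterFactory {
 public:
  explicit TtlFilterFactory(std::int32_t ttl) : ttl_(ttl) {}

  void SetTtl(std::int32_t ttl) { ttl_.store(ttl); }

  // Each new filter snapshots the TTL in effect at creation time.
  std::unique_ptr<TtlFilter> CreateFilter() const {
    return std::unique_ptr<TtlFilter>(new TtlFilter(ttl_.load()));
  }

 private:
  std::atomic<std::int32_t> ttl_;
};
```

One consequence of this design is that a compaction already in progress keeps the TTL it started with, while the next compaction picks up the new value.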