1. 27 July 2018: 1 commit
  2. 26 July 2018: 1 commit
  3. 25 July 2018: 2 commits
    • Increase version number to 5.16 (#4176) · 18f53803
      Committed by Yanqin Jin
      Summary:
      Given that we have cut 5.15, we should bump the version number to the next
      version, i.e. 5.16.
      Also update HISTORY.md
      cc sagar0
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4176
      
      Differential Revision: D8977965
      
      Pulled By: riversand963
      
      fbshipit-source-id: 481d75d2f446946f0eb2afb7e94ef894c8c87e1e
      18f53803
    • DataBlockHashIndex: Standalone Implementation with Unit Test (#4139) · 8805ec2f
      Committed by Fenggang Wu
      Summary:
      The first step of the `DataBlockHashIndex` implementation. A string-based hash table is implemented and unit-tested.
      
      `DataBlockHashIndexBuilder`: `Add()` takes pairs of `<key, restart_index>` and formats them into a string when `Finish()` is called.
      `DataBlockHashIndex`: initialized from the formatted string, it can interpret the string as a hash table. Lookup for a key is supported through iterator operations. (A toy sketch of this build/lookup flow follows this entry.)
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4139
      
      Reviewed By: sagar0
      
      Differential Revision: D8866764
      
      Pulled By: fgwu
      
      fbshipit-source-id: 7f015f0098632c65979a22898a50424384730b10
      8805ec2f
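      A minimal standalone sketch of the build/lookup idea described above, assuming a toy serialization format and illustrative class names (not the actual RocksDB block layout):

      ```cpp
      #include <cstdint>
      #include <string>
      #include <unordered_map>
      #include <utility>
      #include <vector>

      // Toy builder: collects <key, restart_index> pairs and serializes them
      // into one string in Finish(). The "key\0index;" layout is illustrative.
      class ToyHashIndexBuilder {
       public:
        void Add(const std::string& key, uint16_t restart_index) {
          entries_.emplace_back(key, restart_index);
        }
        std::string Finish() const {
          std::string out;
          for (const auto& e : entries_) {
            out += e.first;
            out.push_back('\0');
            out += std::to_string(e.second);
            out.push_back(';');
          }
          return out;
        }

       private:
        std::vector<std::pair<std::string, uint16_t>> entries_;
      };

      // Toy reader: parses the serialized string back into a lookup table.
      class ToyHashIndex {
       public:
        explicit ToyHashIndex(const std::string& data) {
          size_t pos = 0;
          while (pos < data.size()) {
            size_t key_end = data.find('\0', pos);
            size_t entry_end = data.find(';', key_end);
            table_[data.substr(pos, key_end - pos)] = static_cast<uint16_t>(
                std::stoi(data.substr(key_end + 1, entry_end - key_end - 1)));
            pos = entry_end + 1;
          }
        }
        bool Lookup(const std::string& key, uint16_t* restart_index) const {
          auto it = table_.find(key);
          if (it == table_.end()) return false;
          *restart_index = it->second;
          return true;
        }

       private:
        std::unordered_map<std::string, uint16_t> table_;
      };
      ```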
  4. 24 July 2018: 4 commits
    • WriteUnPrepared: Implement unprepared batches for transactions (#4104) · ea212e53
      Committed by Manuel Ung
      Summary:
      This adds support for writing unprepared batches based on the size defined in `TransactionOptions::max_write_batch_size`. This is done by overriding the methods that modify data (Put/Delete/SingleDelete/Merge) and first checking whether the write batch size has exceeded the threshold. If so, the write batch is written to the DB as an unprepared batch. (A simplified sketch of this flush-on-threshold idea follows this entry.)
      
      Support for Commit/Rollback of unprepared batches is added as well. This is done by extending the WritePrepared Commit/Rollback logic to take care of all unprep_seq numbers, either when updating the prepare heap or when adding to the commit map. For updating the commit map, this logic lives inside `WriteUnpreparedCommitEntryPreReleaseCallback`.
      
      A test change was also made to have transactions unregister themselves when committing without prepare. This is because, with write unprepared, there may already be unprepared entries (which act similarly to prepared entries) when a commit is done without prepare.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4104
      
      Differential Revision: D8785717
      
      Pulled By: lth
      
      fbshipit-source-id: c02006e281ec1ce00f628e2a7beec0ee73096a91
      ea212e53
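      A rough sketch of the flush-on-threshold behavior described above, with illustrative names (not the actual `WriteUnpreparedTxn` internals):

      ```cpp
      #include <cstddef>
      #include <string>
      #include <utility>
      #include <vector>

      // Illustrative only: buffer writes and hand them to the DB as an
      // unprepared batch once the buffered size crosses the threshold.
      class ToyUnpreparedTxn {
       public:
        explicit ToyUnpreparedTxn(size_t max_write_batch_size)
            : max_write_batch_size_(max_write_batch_size) {}

        void Put(const std::string& key, const std::string& value) {
          buffered_bytes_ += key.size() + value.size();
          batch_.emplace_back(key, value);
          MaybeFlushAsUnprepared();
        }

       private:
        void MaybeFlushAsUnprepared() {
          if (max_write_batch_size_ > 0 &&
              buffered_bytes_ >= max_write_batch_size_) {
            // The real implementation writes this to the DB as an unprepared
            // batch and remembers its unprep_seq for later Commit/Rollback.
            unprepared_batches_.push_back(std::move(batch_));
            batch_.clear();
            buffered_bytes_ = 0;
          }
        }

        size_t max_write_batch_size_;
        size_t buffered_bytes_ = 0;
        std::vector<std::pair<std::string, std::string>> batch_;
        std::vector<std::vector<std::pair<std::string, std::string>>> unprepared_batches_;
      };
      ```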
    • move static msgs out of Status class (#4144) · 374c37da
      Committed by Chang Su
      Summary:
      The member msgs of class Status contains all types of status messages.
      When users dump a Status object, msgs will confuse them, so move it out
      of class Status by making it a file-local static variable. (A minimal
      sketch follows this entry.)
      
      Closes #3831.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4144
      
      Differential Revision: D8941419
      
      Pulled By: sagar0
      
      fbshipit-source-id: 56b0510258465ff26db15aa6b04e01532e053e3d
      374c37da
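      A minimal sketch of the idea, assuming illustrative message strings and an illustrative accessor name (not the exact table in status.cc):

      ```cpp
      // status.cc (sketch): the message table lives in an anonymous namespace
      // instead of being a static member of Status, so it no longer appears
      // when inspecting a Status object in a debugger or a memory dump.
      #include <cstdio>

      namespace {
      const char* const kSubCodeMsgs[] = {
          "",                            // no subcode
          "Timeout Acquiring Mutex",     // illustrative subcode message
          "Timeout waiting to lock key"  // illustrative subcode message
      };
      }  // anonymous namespace

      const char* SubCodeMessage(int subcode) { return kSubCodeMsgs[subcode]; }

      int main() {
        std::printf("%s\n", SubCodeMessage(1));
        return 0;
      }
      ```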
    • Build improvements: Split docker targets and parallelize java builds · c6d2a7f8
      Committed by Adam Retter
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/4165
      
      Differential Revision: D8955531
      
      Pulled By: sagar0
      
      fbshipit-source-id: 97d5a1375e200bde3c6414f94703504a4ed7536a
      c6d2a7f8
    • db_stress to cover upper bound in iterators (#4162) · 4b0a4357
      Committed by Siying Dong
      Summary:
      db_stress doesn't cover upper or lower bounds in iterators. Cover them by randomly assigning a bound. Also, in prefix scan tests, with a 50% chance, set the next prefix as the upper bound. (A sketch of the upper-bound case, using the public API, follows this entry.)
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4162
      
      Differential Revision: D8953507
      
      Pulled By: siying
      
      fbshipit-source-id: f0f04e9cb6c07cbebbb82b892ca23e0daeea708b
      4b0a4357
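      A hedged sketch of the upper-bound idea from the public API side (the helper name and the 50% choice placement are illustrative; db_stress wires this differently internally):

      ```cpp
      #include <memory>
      #include <random>
      #include <string>

      #include "rocksdb/db.h"
      #include "rocksdb/options.h"

      // Sketch: with 50% probability, bound a prefix scan by the "next prefix"
      // via ReadOptions::iterate_upper_bound, in the spirit of the change.
      void ScanWithRandomUpperBound(rocksdb::DB* db, const std::string& prefix) {
        static std::mt19937 rng{std::random_device{}()};
        rocksdb::ReadOptions ro;
        std::string upper;
        rocksdb::Slice upper_slice;
        if (!prefix.empty() && rng() % 2 == 0) {
          upper = prefix;
          upper.back()++;  // next prefix; assumes the last byte is not 0xFF
          upper_slice = upper;
          ro.iterate_upper_bound = &upper_slice;  // must outlive the iterator
        }
        std::unique_ptr<rocksdb::Iterator> it(db->NewIterator(ro));
        for (it->Seek(prefix); it->Valid(); it->Next()) {
          // consume it->key() / it->value()
        }
      }
      ```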
  5. 21 July 2018: 5 commits
    • Avoid unnecessary big for-loop when reporting ticker stats stored in GetContext (#3490) · f95a5b24
      Committed by Zhongyi Xie
      Summary:
      Currently in `Version::Get`, when reporting ticker stats stored in `GetContext`, there is a big for-loop through all `Ticker` values, which adds unnecessary cost to overall CPU usage. We can optimize by storing only the ticker values that are used in `Get()` calls in a new struct `GetContextStats`, since only a small fraction of all tickers are used on the `Get()` path. For comparison, with the new approach we only need to visit 17 values, while the old approach required visiting 100+ `Ticker` values. (A simplified sketch of this pattern follows this entry.)
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/3490
      
      Differential Revision: D6969154
      
      Pulled By: miasantreble
      
      fbshipit-source-id: fc27072965a3a94125a3e6883d20dafcf5b84029
      f95a5b24
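      A simplified sketch of the pattern (the struct members and global tickers below are illustrative stand-ins, not the actual `GetContextStats` fields):

      ```cpp
      #include <atomic>
      #include <cstdint>

      // Only the handful of counters touched on the Get() path live here.
      struct ToyGetContextStats {
        uint64_t num_cache_hit = 0;
        uint64_t num_cache_miss = 0;
        uint64_t num_bloom_useful = 0;
      };

      // Stand-ins for the global Statistics tickers.
      std::atomic<uint64_t> g_cache_hit{0}, g_cache_miss{0}, g_bloom_useful{0};

      // Called once at the end of a Get(), instead of looping over every Ticker.
      void ReportGetStats(const ToyGetContextStats& s) {
        g_cache_hit += s.num_cache_hit;
        g_cache_miss += s.num_cache_miss;
        g_bloom_useful += s.num_bloom_useful;
      }
      ```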
    • Fixed the db_bench MergeRandom only access CF_default (#4155) · 6811fb06
      Committed by Zhichao Cao
      Summary:
      When running the tracing and analysis, I found that the MergeRandom benchmark in db_bench only accesses the default column family, even when -num_column_families is specified > 1.
      
      Changes: use db_with_cfh as the DB and randomly select the column family to execute the Merge operation on when -num_column_families is specified > 1. (A sketch of this selection follows this entry.)
      
      Tested with make asan_check and verified via tracing.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4155
      
      Differential Revision: D8907888
      
      Pulled By: zhichao-cao
      
      fbshipit-source-id: 2b4bc8fe0e99c8f262f5be6b986c7025d62cf850
      6811fb06
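      A hedged sketch of the random column-family selection using the public API (the helper name is illustrative; db_bench implements this inside its benchmark loop):

      ```cpp
      #include <random>
      #include <vector>

      #include "rocksdb/db.h"

      // Sketch: route each Merge to a randomly chosen column family when more
      // than one exists, instead of always hitting the default column family.
      rocksdb::Status MergeToRandomCF(
          rocksdb::DB* db, const std::vector<rocksdb::ColumnFamilyHandle*>& cfs,
          const rocksdb::Slice& key, const rocksdb::Slice& value) {
        static std::mt19937 rng{std::random_device{}()};
        rocksdb::ColumnFamilyHandle* cf =
            cfs.size() > 1 ? cfs[rng() % cfs.size()] : db->DefaultColumnFamily();
        return db->Merge(rocksdb::WriteOptions(), cf, key, value);
      }
      ```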
    • Reformatting some recent changes (#4161) · a5e851e1
      Committed by Siying Dong
      Summary:
      Lint is not happy with some recently committed code. Format it.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4161
      
      Differential Revision: D8940582
      
      Pulled By: siying
      
      fbshipit-source-id: c9b43b1ef8c88b5e923911058b44eb77234b36b7
      a5e851e1
    • BlockBasedTableReader: automatically adjust tail prefetch size (#4156) · 8425c8bd
      Committed by Siying Dong
      Summary:
      Right now we use one hard-coded prefetch size to prefetch data from the tail of SST files. However, this may waste reads for some use cases while being inefficient for others.
      Introduce a way to adjust this prefetch size by tracking the 32 most recent tail sizes, and pick a value for which the wasted read is less than 10%. (A simplified sketch of this heuristic follows this entry.)
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4156
      
      Differential Revision: D8916847
      
      Pulled By: siying
      
      fbshipit-source-id: 8413f9eb3987e0033ed0bd910f83fc2eeaaf5758
      8425c8bd
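      A simplified sketch of the tracking heuristic, assuming an illustrative class name and a cruder selection rule than the real implementation:

      ```cpp
      #include <algorithm>
      #include <cstddef>
      #include <deque>
      #include <functional>

      // Remember the last 32 tail sizes actually needed; suggest the largest
      // candidate whose wasted bytes stay below roughly 10% of the reads.
      class ToyTailPrefetchStats {
       public:
        void Record(size_t tail_size) {
          history_.push_back(tail_size);
          if (history_.size() > 32) history_.pop_front();
        }

        size_t SuggestPrefetchSize() const {
          if (history_.empty()) return 4 * 1024;  // small default until we learn
          std::deque<size_t> sorted(history_);
          std::sort(sorted.begin(), sorted.end(), std::greater<size_t>());
          for (size_t candidate : sorted) {
            size_t covered = 0, wasted = 0;
            for (size_t t : history_) {
              if (t <= candidate) {
                covered += t;
                wasted += candidate - t;
              }
            }
            if (wasted * 10 <= covered + wasted) return candidate;  // < ~10%
          }
          return sorted.back();  // smallest observed tail size
        }

       private:
        std::deque<size_t> history_;
      };
      ```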
    • Write properties metablock last in block-based tables (#4158) · ab35505e
      Committed by Andrew Kryczka
      Summary:
      The properties meta-block should come at the end since we always need to
      read it when opening a file, unlike index/filter/other meta-blocks, which
      are sometimes read depending on the user's configuration. This ordering
      will allow us to (in a future PR) do a small readahead on the end of the file
      to read properties and meta-index blocks with one I/O.
      
      The bulk of this PR is a refactoring of the `BlockBasedTableBuilder::Finish`
      function. It was previously too large with inconsistent error handling, which
      made it difficult to change. So I broke it up into one function per meta-block
      write, and tried to make error handling consistent within those functions.
      Then reordering the metablocks was trivial -- just reorder the calls to these
      helper functions.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4158
      
      Differential Revision: D8921705
      
      Pulled By: ajkr
      
      fbshipit-source-id: 96c9cc3182eb1adf11af46adab79dbeba7b12fcc
      ab35505e
  6. 20 July 2018: 3 commits
    • Fix a bug in MANIFEST group commit (#4157) · 2736752b
      Committed by Yanqin Jin
      Summary:
      PR #3944 introduces group commit of `VersionEdit`s in the MANIFEST. The
      implementation has a bug. When updating the log file number of each column
      family, we must consider only `VersionEdit`s that operate on the same column
      family. Otherwise, a column family may accidentally set its log file number
      higher than its actual value, indicating that log files with smaller numbers
      should be ignored, thus causing some updates to be lost. (A sketch of the
      per-column-family rule follows this entry.)
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4157
      
      Differential Revision: D8916650
      
      Pulled By: riversand963
      
      fbshipit-source-id: 8f456cf688f17bf35ad87b38e30e899aa162f201
      2736752b
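      A small sketch of the per-column-family rule, with illustrative types (not the actual `VersionEdit`/`VersionSet` code):

      ```cpp
      #include <cstdint>
      #include <vector>

      // Illustrative subset of a VersionEdit.
      struct ToyVersionEdit {
        uint32_t column_family = 0;
        bool has_log_number = false;
        uint64_t log_number = 0;
      };

      // When a group of edits is committed together, a column family's log
      // number may only be advanced by edits that belong to that same family.
      uint64_t MaxLogNumberForCF(const std::vector<ToyVersionEdit>& group,
                                 uint32_t cf_id, uint64_t current) {
        uint64_t result = current;
        for (const auto& e : group) {
          if (e.column_family == cf_id && e.has_log_number &&
              e.log_number > result) {
            result = e.log_number;
          }
        }
        return result;
      }
      ```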
    • Smaller tail readahead when not reading index/filters (#4159) · b5613227
      Committed by Andrew Kryczka
      Summary:
      In all cases during `BlockBasedTable::Open`, we issue at least three read requests to the file's tail: (1) footer, (2) metaindex block, and (3) properties block. Depending on the config, we may also read other metablocks like the filter and index.
      
      This PR issues a smaller readahead when we expect to do only the three necessary reads mentioned above. In that case, 4KB should be enough (ignoring the case where there are lots of user-defined properties). We keep doing 512KB readahead when additional reads are expected. (A one-function sketch of the decision follows this entry.)
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4159
      
      Differential Revision: D8924002
      
      Pulled By: ajkr
      
      fbshipit-source-id: cfc713275de4d05ce11f18571f1d72e27ccd3356
      b5613227
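      A one-function sketch of the decision described above (the function and constant names are illustrative; the sizes come from the summary):

      ```cpp
      #include <cstddef>

      // A small tail prefetch is enough when only the footer, metaindex and
      // properties blocks will be read at open time; keep the large prefetch
      // when index/filter blocks are also read.
      size_t TailReadaheadSize(bool will_read_index_and_filter) {
        constexpr size_t kSmallTailReadahead = 4 * 1024;    // footer + meta blocks
        constexpr size_t kLargeTailReadahead = 512 * 1024;  // index/filter as well
        return will_read_index_and_filter ? kLargeTailReadahead
                                          : kSmallTailReadahead;
      }
      ```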
    • Return new operator for Status allocations for Windows (#4128) · 78ab11cd
      Committed by Dmitri Smirnov
      Summary: Windows requires new/delete for memory allocations to be overridden. Refactor to be less intrusive.
      
      Differential Revision: D8878047
      
      Pulled By: siying
      
      fbshipit-source-id: 35f2b5fec2f88ea48c9be926539c6469060aab36
      78ab11cd
  7. 19 July 2018: 4 commits
  8. 18 July 2018: 5 commits
    • Release 5.15. (#4148) · 79f009f2
      Committed by Yanqin Jin
      Summary:
      Cut 5.15.fb
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4148
      
      Differential Revision: D8886802
      
      Pulled By: riversand963
      
      fbshipit-source-id: 6b6427ce97f5b323a7eebf92458fda8b24b0cece
      79f009f2
    • DBSSTTest.DeleteSchedulerMultipleDBPaths data race (#4146) · 37e0fdc8
      Committed by Siying Dong
      Summary:
      Fix a minor data race in DBSSTTest.DeleteSchedulerMultipleDBPaths reported by TSAN
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4146
      
      Differential Revision: D8880945
      
      Pulled By: siying
      
      fbshipit-source-id: 25c632f685757735c59ad4ff26b2f346a443a446
      37e0fdc8
    • Fix write get stuck when pipelined write is enabled (#4143) · d538ebdf
      Committed by Yi Wu
      Summary:
      Fix an issue where, when pipelined write is enabled, writers can get stuck indefinitely and never finish the write. It can be shown with the following example. Assume there are 4 writers W1, W2, W3, W4 (W1 is the first, W4 is the last).
      
      T1. All writers are pending in the WAL writer queue:
      WAL writer queue: W1, W2, W3, W4
      memtable writer queue: empty
      
      T2. W1 finishes the WAL write and moves to the memtable writer queue:
      WAL writer queue: W2, W3, W4
      memtable writer queue: W1
      
      T3. W2 and W3 finish the WAL write as a batch group. W2 enters ExitAsBatchGroupLeader and moves the group to the memtable writer queue, but has not yet woken up the next leader.
      WAL writer queue: W4
      memtable writer queue: W1, W2, W3
      
      T4. W1, W2, W3 finish the memtable write as a batch group. Note that W2 is still inside the earlier ExitAsBatchGroupLeader call, although W1 has done the memtable write on W2's behalf.
      WAL writer queue: W4
      memtable writer queue: empty
      
      T5. The thread corresponding to W3 creates another writer W3' with the same address as W3.
      WAL writer queue: W4, W3'
      memtable writer queue: empty
      
      T6. W2 continues in ExitAsBatchGroupLeader. Because the address of W3' is the same as that of W3, the last writer in its group, it thinks there are no pending writers, so it resets newest_writer_ to null, emptying the queue. W4 and W3' are deleted from the queue and will never be woken up.
      
      The issue exists since pipelined write was introduced in 5.5.0.
      
      Closes #3704
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4143
      
      Differential Revision: D8871599
      
      Pulled By: yiwu-arbug
      
      fbshipit-source-id: 3502674e51066a954a0660257e24ac588f815e2a
      d538ebdf
    • Remove managed iterator · ddc07b40
      Committed by Siying Dong
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/4124
      
      Differential Revision: D8829910
      
      Pulled By: siying
      
      fbshipit-source-id: f3e952ccf3a631071a5d77c48e327046f8abb560
      ddc07b40
    • Pending output file number should be released after bulkload failure (#4145) · 995fcf75
      Committed by Siying Dong
      Summary:
      If bulkload fails due to an input error, the pending output file number isn't released. This bug can cause all future files with a number larger than the current one to never be deleted, even after they are compacted away. This commit fixes the bug.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4145
      
      Differential Revision: D8877900
      
      Pulled By: siying
      
      fbshipit-source-id: 080be92a23d43305ca1e13fe1c06eb4cd0b01466
      995fcf75
  9. 17 July 2018: 5 commits
  10. 14 July 2018: 10 commits
    • Support range deletion tombstones in IngestExternalFile SSTs (#3778) · ef7815b8
      Committed by Nathan VanBenschoten
      Summary:
      Fixes #3391.
      
      This change adds a `DeleteRange` method to `SstFileWriter` and adds
      support for ingesting SSTs with range deletion tombstones. This is
      important for applications that need to atomically ingest SSTs while
      clearing out any existing keys in a given key range. (A usage sketch follows this entry.)
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/3778
      
      Differential Revision: D8821836
      
      Pulled By: anand1976
      
      fbshipit-source-id: ca7786c1947ff129afa703dab011d524c7883844
      ef7815b8
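      A hedged usage sketch with the public API (the paths, keys, and helper name are made up; error handling is kept minimal):

      ```cpp
      #include <string>

      #include "rocksdb/db.h"
      #include "rocksdb/env.h"
      #include "rocksdb/options.h"
      #include "rocksdb/sst_file_writer.h"

      // Sketch: build an SST carrying a range tombstone over ["k0000", "k9999")
      // plus some fresh keys, then ingest it atomically into a running DB.
      rocksdb::Status BuildAndIngest(rocksdb::DB* db,
                                     const rocksdb::Options& options,
                                     const std::string& sst_path) {
        rocksdb::SstFileWriter writer(rocksdb::EnvOptions(), options);
        rocksdb::Status s = writer.Open(sst_path);
        if (s.ok()) s = writer.DeleteRange("k0000", "k9999");  // clear old keys
        if (s.ok()) s = writer.Put("m0001", "new-value");      // new data
        if (s.ok()) s = writer.Finish();
        if (s.ok()) {
          s = db->IngestExternalFile({sst_path},
                                     rocksdb::IngestExternalFileOptions());
        }
        return s;
      }
      ```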
    • Exclude time waiting for rate limiter from rocksdb.sst.read.micros (#4102) · 91d7c03c
      Committed by Zhongyi Xie
      Summary:
      Our "rocksdb.sst.read.micros" stat includes time spent waiting for rate limiter. It probably only affects people rate limiting compaction reads, which is fairly rare.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4102
      
      Differential Revision: D8848506
      
      Pulled By: miasantreble
      
      fbshipit-source-id: 01258ac5ae56e4eee372978cfc9143a6869f8bfc
      91d7c03c
    • Relax VersionStorageInfo::GetOverlappingInputs check (#4050) · 90fc4069
      Committed by Peter Mattis
      Summary:
      Do not consider the range tombstone sentinel key as causing 2 adjacent
      sstables in a level to overlap. When a range tombstone's end key is the
      largest key in an sstable, the sstable's end key is set to a "sentinel"
      value that is the smallest key in the next sstable with a sequence
      number of kMaxSequenceNumber. This "sentinel" is guaranteed not to
      overlap in internal-key space with the next sstable. Unfortunately,
      GetOverlappingFiles uses user keys to determine overlap and was thus
      considering 2 adjacent sstables in a level to overlap if they were
      separated by this sentinel key. This in turn would cause compactions to
      be larger than necessary. (A simplified sketch of the relaxed boundary
      check follows this entry.)
      
      Note that this conflicts with
      https://github.com/facebook/rocksdb/pull/2769 and causes
      `DBRangeDelTest.CompactionTreatsSplitInputLevelDeletionAtomically` to
      fail.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4050
      
      Differential Revision: D8844423
      
      Pulled By: ajkr
      
      fbshipit-source-id: df3f9f1db8f4cff2bff77376b98b83c2ae1d155b
      90fc4069
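      A simplified sketch of the relaxed boundary check, with an illustrative internal-key struct (not the actual `InternalKey`/`VersionStorageInfo` code):

      ```cpp
      #include <cstdint>
      #include <string>

      // Illustrative internal key: user key plus sequence number.
      struct ToyInternalKey {
        std::string user_key;
        uint64_t seqno;
      };

      // RocksDB's largest valid sequence number ((1 << 56) - 1).
      constexpr uint64_t kMaxSequenceNumber = (uint64_t{1} << 56) - 1;

      // Files in a level are sorted, so two adjacent files can only touch at
      // the boundary where the left file's largest user key equals the right
      // file's smallest user key. Treat that boundary as non-overlapping when
      // the left file's largest key is the range-tombstone sentinel.
      bool AdjacentFilesOverlap(const ToyInternalKey& largest_of_left,
                                const ToyInternalKey& smallest_of_right) {
        if (largest_of_left.user_key != smallest_of_right.user_key) return false;
        return largest_of_left.seqno != kMaxSequenceNumber;
      }
      ```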
    • Reduce execution time of IngestFileWithGlobalSeqnoRandomized (#4131) · 21171615
      Committed by Yanqin Jin
      Summary:
      Make `ExternalSSTFileTest.IngestFileWithGlobalSeqnoRandomized` run faster.
      
      `make format`
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4131
      
      Differential Revision: D8839952
      
      Pulled By: riversand963
      
      fbshipit-source-id: 4a7e842fde1cde4dc902e928a1cf511322578521
      21171615
    • Per-thread unique test db names (#4135) · 8581a93a
      Committed by Maysam Yabandeh
      Summary:
      The patch makes sure that two parallel test threads will operate on different db paths. This enables using open-source tools such as gtest-parallel to run the tests of a file in parallel. (A sketch of the per-thread path idea follows this entry.)
      Example: `~/gtest-parallel/gtest-parallel ./table_test`
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4135
      
      Differential Revision: D8846653
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 799bad1abb260e3d346bcb680d2ae207a852ba84
      8581a93a
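      A small sketch of the per-thread path idea, assuming an illustrative helper name (the actual patch adds a comparable helper to the test utilities):

      ```cpp
      #include <functional>
      #include <sstream>
      #include <string>
      #include <thread>

      // Make each test's db path unique to the executing thread so parallel
      // test runners (e.g. gtest-parallel) do not collide on the same directory.
      std::string PerThreadDbPath(const std::string& base_dir,
                                  const std::string& test_name) {
        std::ostringstream oss;
        oss << base_dir << "/" << test_name << "_"
            << std::hash<std::thread::id>{}(std::this_thread::get_id());
        return oss.str();
      }
      ```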
    • db_bench: enable setting cache_size when loading options file · 23b76252
      Committed by Zhongyi Xie
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/4118
      
      Differential Revision: D8845554
      
      Pulled By: miasantreble
      
      fbshipit-source-id: 13bd3c1259a7c30bad762a413fe3bb24eea650ba
      23b76252
    • Converted db/merge_test.cc to use gtest (#4114) · 8527012b
      Committed by Fosco Marotto
      Summary:
      Picked up a task to convert this to use the gtest framework.  It can't be this simple, can it?
      
      It works, but should all the std::cout be removed?
      
      ```
      [$] ~/git/rocksdb [gft !]: ./merge_test
      [==========] Running 2 tests from 1 test case.
      [----------] Global test environment set-up.
      [----------] 2 tests from MergeTest
      [ RUN      ] MergeTest.MergeDbTest
      Test read-modify-write counters...
      a: 3
      1
      2
      a: 3
      b: 1225
      3
      Compaction started ...
      Compaction ended
      a: 3
      b: 1225
      Test merge-based counters...
      a: 3
      1
      2
      a: 3
      b: 1225
      3
      Test merge in memtable...
      a: 3
      1
      2
      a: 3
      b: 1225
      3
      Test Partial-Merge
      Test merge-operator not set after reopen
      [       OK ] MergeTest.MergeDbTest (93 ms)
      [ RUN      ] MergeTest.MergeDbTtlTest
      Opening database with TTL
      Test read-modify-write counters...
      a: 3
      1
      2
      a: 3
      b: 1225
      3
      Compaction started ...
      Compaction ended
      a: 3
      b: 1225
      Test merge-based counters...
      a: 3
      1
      2
      a: 3
      b: 1225
      3
      Test merge in memtable...
      Opening database with TTL
      a: 3
      1
      2
      a: 3
      b: 1225
      3
      Test Partial-Merge
      Opening database with TTL
      Opening database with TTL
      Opening database with TTL
      Opening database with TTL
      Test merge-operator not set after reopen
      [       OK ] MergeTest.MergeDbTtlTest (97 ms)
      [----------] 2 tests from MergeTest (190 ms total)
      
      [----------] Global test environment tear-down
      [==========] 2 tests from 1 test case ran. (190 ms total)
      [  PASSED  ] 2 tests.
      ```
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4114
      
      Differential Revision: D8822886
      
      Pulled By: gfosco
      
      fbshipit-source-id: c299d008e883c3bb911d2b357a2e9e4423f8e91a
      8527012b
    • Exclude StackableDB from transaction stress tests (#4132) · 537a2339
      Committed by Maysam Yabandeh
      Summary:
      The transactions are currently tested with and without StackableDB. This is mostly to check that the code path is also consistent with StackableDB. Slow stress tests, however, do not benefit from being run again with StackableDB, so the patch excludes StackableDB from such tests.
      On a single core it reduced the runtime of transaction_test from 199s to 135s.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4132
      
      Differential Revision: D8841655
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 7b9aaba2673b542b195439dfb306cef26bd63b19
      537a2339
    • Re-enable kUniversalSubcompactions option_config (#4125) · e3eba52a
      Committed by Anand Ananthabhotla
      Summary:
      1. Move kUniversalSubcompactions up before kEnd in db_test_util.h, so
      tests that cycle through all the option_configs include this
      2. Skip kUniversalSubcompactions wherever kUniversalCompaction and
      kUniversalCompactionMultilevel are skipped
      
      Related to #3935
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4125
      
      Differential Revision: D8828637
      
      Pulled By: anand1976
      
      fbshipit-source-id: 650dee15fd27d85281cf9bb4ca8ab460e04cac6f
      e3eba52a
    • Add GCC 8 to Travis (#3433) · 7bee48bd
      Committed by Tamir Duberstein
      Summary:
      - Avoid `strdup` to use jemalloc on Windows
      - Use `size_t` for consistency
      - Add GCC 8 to Travis
      - Add CMAKE_BUILD_TYPE=Release to Travis
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/3433
      
      Differential Revision: D6837948
      
      Pulled By: sagar0
      
      fbshipit-source-id: b8543c3a4da9cd07ee9a33f9f4623188e233261f
      7bee48bd