1. 03 Apr, 2019 1 commit
  2. 02 Apr, 2019 2 commits
    • X
      Add LevelDB repository link in the Readme · fa1b5582
      xinbenlv committed
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/5134
      
      Differential Revision: D14719068
      
      Pulled By: siying
      
      fbshipit-source-id: c09a544f06ff414dbe2f90792aaf2bb5b8550bee
    • M
      Add DBOptions::avoid_unnecessary_blocking_io to defer file deletions (#5043) · 120bc471
      Mike Kolupaev committed
      Summary:
      Just like ReadOptions::background_purge_on_iterator_cleanup but for ColumnFamilyHandle instead of Iterator.
      
      In our use case we sometimes call ColumnFamilyHandle's destructor from low-latency threads, and sometimes it blocks the thread for a few seconds deleting the files. To avoid that, we can either offload ColumnFamilyHandle's destruction to a background thread on our side, or add this option on rocksdb side. This PR does the latter, to be consistent with how we solve exactly the same problem for iterators using background_purge_on_iterator_cleanup option.
      
      (EDIT: It's avoid_unnecessary_blocking_io now, and affects both CF drops and iterator destructors.)
      I'm not quite comfortable with having two separate options (background_purge_on_iterator_cleanup and background_purge_on_cf_cleanup) for such a rarely used thing. Maybe we should merge them? Rename background_purge_on_cf_cleanup to something like delete_files_on_background_threads_only or avoid_blocking_io_in_unexpected_places, and make iterators use it instead of the one in ReadOptions? I can do that here if you guys think it's better.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5043
      
      Differential Revision: D14339233
      
      Pulled By: al13n321
      
      fbshipit-source-id: ccf7efa11c85c9a5b91d969bb55627d0fb01e7b8
  3. 30 Mar, 2019 4 commits
    • R
      Fix arena allocation size in NewEmptyInternalIterator (#4905) · 127a850b
      Remington Brasga committed
      Summary:
      NewEmptyInternalIterator with arena mistakenly used the size of EmptyIterator when allocating from the arena, but then constructed a totally different object in that memory: EmptyInternalIterator. The patch fixes that.
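This class of bug is easy to reproduce outside RocksDB. The following is a minimal sketch, not the actual patch: `ToyArena`, `SmallIter`, `LargeIter`, and `NewInArena` are hypothetical names invented for illustration. It shows one way to keep the allocation size and the constructed type from diverging, by deriving both from the same template parameter.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical fixed-size bump allocator; a stand-in, not RocksDB's Arena.
class ToyArena {
 public:
  // Returns `bytes` of storage aligned for any fundamental type.
  void* AllocateAligned(std::size_t bytes) {
    const std::size_t align = alignof(std::max_align_t);
    const std::size_t start = (used_ + align - 1) / align * align;
    assert(start + bytes <= sizeof(buf_));
    used_ = start + bytes;
    return buf_ + start;
  }
  std::size_t used() const { return used_; }

 private:
  alignas(std::max_align_t) unsigned char buf_[1024];
  std::size_t used_ = 0;
};

// Stand-ins for two unrelated types of different sizes.
struct SmallIter { int state = 0; };
struct LargeIter { long long fields[4] = {1, 2, 3, 4}; };

// The size passed to the arena and the type constructed both come from T,
// so "allocate sizeof(SmallIter), construct LargeIter" cannot be written.
template <typename T>
T* NewInArena(ToyArena& arena) {
  void* mem = arena.AllocateAligned(sizeof(T));
  return new (mem) T();
}
```

Calling `NewInArena<LargeIter>(arena)` reserves at least `sizeof(LargeIter)` bytes before the placement new runs. Both stand-in types are trivially destructible, so the sketch skips explicit destructor calls.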
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4905
      
      Differential Revision: D14689840
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: af64fd8ee93d5a4ad54691c792e5ecc5efabc887
    • M
      WriteUnPrepared: Enable auto-compaction after max_evicted_seq_ init (#5128) · a703f16d
      Maysam Yabandeh committed
      Summary:
      Compaction depends on the value of max_evicted_seq_, so the ::Initialize method should enable auto-compaction only after max_evicted_seq_ is properly initialized. The patch also backports #4853 from WritePrepared txn to WriteUnPrepared.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5128
      
      Differential Revision: D14686562
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: b2355025712a72676ac3b20a95258adcf4774490
    • Y
      Avoid per-key upper bound check in BlockBasedTableIterator (#5101) · f29dc1b9
      Yi Wu committed
      Summary:
      `BlockBasedTableIterator` avoids reading the next block on `Next()` if it detects that the iterator will be out of bound, by checking against the index key. The optimization was added in #2239, and at that time it only checked the bound once per block. It seems a later change made it a per-key check, which introduces unnecessary key comparisons.
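The difference between a per-block and a per-key bound check can be illustrated with a toy scan. This is a minimal sketch under stated assumptions, not RocksDB's iterator: keys are plain integers, each block is a non-empty sorted vector, and the index key of a block is taken to be its last key. `ScanResult` and `ScanWithPerBlockCheck` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct ScanResult {
  std::vector<int> keys;         // keys returned, all strictly below the bound
  std::size_t bound_checks = 0;  // number of comparisons against the bound
};

// Scans sorted blocks in order and stops at the first key >= upper_bound.
// One comparison against the block's index key decides whether the whole
// block is in bound; per-key comparisons happen only in the boundary block.
ScanResult ScanWithPerBlockCheck(const std::vector<std::vector<int>>& blocks,
                                 int upper_bound) {
  ScanResult result;
  for (const auto& block : blocks) {
    assert(!block.empty());
    const int index_key = block.back();  // assumption: index key == last key
    ++result.bound_checks;
    if (index_key < upper_bound) {
      // Every key in this block is below the bound: no per-key check needed.
      result.keys.insert(result.keys.end(), block.begin(), block.end());
      continue;
    }
    // Boundary block: fall back to checking each key.
    for (int key : block) {
      ++result.bound_checks;
      if (key >= upper_bound) return result;
      result.keys.push_back(key);
    }
    return result;
  }
  return result;
}
```

For blocks `{1,2,3}`, `{4,5,6}`, `{7,8,9}` and an upper bound of 8, the sketch performs three per-block checks plus two per-key checks in the last block, where a purely per-key scheme would compare all eight keys it visits.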
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5101
      
      Differential Revision: D14678707
      
      Pulled By: siying
      
      fbshipit-source-id: 2372446116753c7892ea4cec7b4b49ef87ba463e
    • Y
      Update RepeatableThreadTest with MockTimeEnv (#5107) · 09957ded
      Yanqin Jin committed
      Summary:
      **This PR updates RepeatableThread::wait, breaking some tests on OS X. The rest of the PR fixes the tests on OS X.**
      `RepeatableThreadTest.MockEnvTest` uses `MockTimeEnv` and `RepeatableThread`. If `RepeatableThread::wait` calls `TimedWait` with a time smaller than or equal to the current (real) time, `TimedWait` returns immediately on certain platforms, e.g. OS X. #4560 addresses this issue by replacing `TimedWait` with `Wait` in the test. This fixes the test but makes test and production code diverge, which is not optimal for test coverage. This PR proposes an alternative fix which unifies the test and production code paths for `RepeatableThread::wait`. We obtain the current (real) time in seconds and add 10 extra seconds to ensure that `RepeatableThread::wait` invokes `TimedWait` with a time greater than the (real) current time. This is to prevent the `TimedWait` function from returning immediately without sleeping and releasing the mutex. If `TimedWait` returns immediately, the mutex will not be released, and `RepeatableThread::TEST_WaitForRun` never has a chance to execute the callback which, in this case, updates the result returned by `mock_env->NowMicros()`. Consequently, `RepeatableThread::wait` cannot break out of the loop, causing the test to hang. The extra 10 seconds is a best-effort approach because there seems to be no reliable and deterministic way to provide the aforementioned guarantee. By the time `RepeatableThread::wait` is called, there is no guarantee that `delay + mock_env->NowMicros()` will be greater than the current real time. However, 10 seconds should be sufficient in most cases. We will keep an eye out for possible flakiness of this test.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5107
      
      Differential Revision: D14680885
      
      Pulled By: riversand963
      
      fbshipit-source-id: d1ecbe10e1dacd110bd464cd01e188bfee72b89e
  4. 29 Mar, 2019 4 commits
    • Y
      Fix db_stress for custom env (#5122) · d77476ef
      Yanqin Jin committed
      Summary:
      Fix some hdfs-related code so that it can compile and run 'db_stress'
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5122
      
      Differential Revision: D14675495
      
      Pulled By: riversand963
      
      fbshipit-source-id: cac280479efcf5451982558947eac1732e8bc45a
    • A
      Smooth the deletion of WAL files (#5116) · dae3b554
      anand76 committed
      Summary:
      WAL files are currently not subject to deletion rate limiting by DeleteScheduler. If the size of the WAL files is significant, this can cause a high delete rate on SSDs that may affect other operations. To fix it, force WAL file deletions to go through the SstFileManager. Original PR for this is #2768
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5116
      
      Differential Revision: D14669437
      
      Pulled By: anand1976
      
      fbshipit-source-id: c5f62d0640cebaa1574de841a1d01e4ce2faadf0
    • S
      Option string/map can set merge operator from object registry (#5123) · a98317f5
      Siying Dong committed
      Summary:
      Allow a customized merge operator to be loaded from an option file/map/string
      by allowing users to pre-register merge operators with the object registry.
      
      Also update HISTORY.md and header files for the same feature for comparator.
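The general idea of resolving a name in an option string through a registry can be sketched as follows. This is an illustration only and does not use RocksDB's object registry API: `ToyMergeOperator`, `ToyAddOperator`, `ToyRegistry`, and `MergeOperatorFromOptionString` are hypothetical names, and the `merge_operator=<name>` string format is assumed for the example.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical merge operator interface, reduced to integer operands.
struct ToyMergeOperator {
  virtual ~ToyMergeOperator() = default;
  virtual long Merge(long existing, long operand) const = 0;
};

struct ToyAddOperator : ToyMergeOperator {
  long Merge(long existing, long operand) const override {
    return existing + operand;
  }
};

// Hypothetical registry: maps a name to a factory for the operator.
class ToyRegistry {
 public:
  using Factory = std::function<std::shared_ptr<ToyMergeOperator>()>;

  void Register(const std::string& name, Factory factory) {
    assert(factory != nullptr);
    factories_[name] = std::move(factory);
  }

  // Returns nullptr when nothing was registered under `name`.
  std::shared_ptr<ToyMergeOperator> Create(const std::string& name) const {
    auto it = factories_.find(name);
    if (it == factories_.end()) return nullptr;
    return it->second();
  }

 private:
  std::map<std::string, Factory> factories_;
};

// Parses "merge_operator=<name>" and looks <name> up in the registry.
std::shared_ptr<ToyMergeOperator> MergeOperatorFromOptionString(
    const ToyRegistry& registry, const std::string& option) {
  const std::string prefix = "merge_operator=";
  if (option.compare(0, prefix.size(), prefix) != 0) return nullptr;
  return registry.Create(option.substr(prefix.size()));
}
```

A user would register a factory once at startup, after which any option string naming that operator resolves to a live object; an unregistered name yields `nullptr` in this sketch.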
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5123
      
      Differential Revision: D14658488
      
      Pulled By: siying
      
      fbshipit-source-id: 86ea2fbd2a0a04632d8ea9fceaffefd041f6ae61
    • S
      Improve obsolete_files_test (#5125) · 106a94af
      Siying Dong committed
      Summary:
      We see a failure of obsolete_files_test but aren't able to identify
      the issue. Improve the test in the following ways and hope we can debug
      better next time:
      1. Place a sync point before automatic compaction runs so the race
         condition always triggers.
      2. Disable the sync point before the test finishes.
      3. Use ASSERT_OK() instead of ASSERT_TRUE(status.ok()).
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5125
      
      Differential Revision: D14669456
      
      Pulled By: siying
      
      fbshipit-source-id: dccb7648e334501ad651eb212880096eef1f4ab2
  5. 28 Mar, 2019 7 commits
  6. 27 Mar, 2019 7 commits
    • Y
      Support for single-primary, multi-secondary instances (#4899) · 9358178e
      Yanqin Jin committed
      Summary:
      This PR allows RocksDB to run in single-primary, multi-secondary process mode.
      The writer is a regular RocksDB (e.g. a `DBImpl`) instance playing the role of a primary.
      Multiple `DBImplSecondary` processes (secondaries) share the same set of SST files, MANIFEST, WAL files with the primary. Secondaries tail the MANIFEST of the primary and apply updates to their own in-memory state of the file system, e.g. `VersionStorageInfo`.
      
      This PR has several components:
      1. (Originally in #4745). Add a `PathNotFound` subcode to `IOError` to denote the failure when a secondary tries to open a file which has been deleted by the primary.
      
      2. (Similar to #4602). Add `FragmentBufferedReader` to handle a partially read trailing record at the end of a log, so that a future read can continue from that point.
      
      3. (Originally in #4710 and #4820). Add implementation of the secondary, i.e. `DBImplSecondary`.
      3.1 Tail the primary's MANIFEST during recovery.
      3.2 Tail the primary's MANIFEST during normal processing by calling `ReadAndApply`.
      3.3 Tailing WAL will be in a future PR.
      
      4. Add an example in 'examples/multi_processes_example.cc' to demonstrate the usage of secondary RocksDB instance in a multi-process setting. Instructions to run the example can be found at the beginning of the source code.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4899
      
      Differential Revision: D14510945
      
      Pulled By: riversand963
      
      fbshipit-source-id: 4ac1c5693e6012ad23f7b4b42d3c374fecbe8886
    • J
      remove bundled but unused fbson library (#5108) · 2a5463ae
      jsteemann committed
      Summary:
      fbson library is still included in `third-party` directory, but is not needed by RocksDB anymore.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5108
      
      Differential Revision: D14622272
      
      Pulled By: siying
      
      fbshipit-source-id: 52b24ed17d8d870a71364f85e5bac4eafb192df5
    • S
      Introduce CPU timers for iterator seek and next (#5076) · 01e6badb
      Shi Feng committed
      Summary:
      Introduce CPU timers for iterator seek and next operations. Seek
      counter includes SeekToFirst, SeekToLast and SeekForPrev, w/ the
      caveat that SeekToLast timer doesn't include some post processing
      time if upper bound is defined.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5076
      
      Differential Revision: D14525218
      
      Pulled By: fredfsh
      
      fbshipit-source-id: 03ba25df3b22b06c072621e4de0eacfa1445f0d9
    • S
      Allow option string to get comparator from object registry (#5106) · 4774a940
      Siying Dong committed
      Summary:
      Even customized ldb may not be able to read data from some databases if
      comparator is not standard. We modify option helper to get comparator from
      object registry so that we can use customized ldb to read non-standard
      comparator.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5106
      
      Differential Revision: D14622107
      
      Pulled By: siying
      
      fbshipit-source-id: 151dcb295a35a4c7d54f919cd4e322a89dc601c9
    • S
      BlobDB::Open() should put all existing trash files to delete scheduler (#5103) · fe2bd190
      Siying Dong committed
      Summary:
      Right now, BlobDB::Open() fails to put all trash files to the delete scheduler,
      which leaves some trash files permanently untracked.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5103
      
      Differential Revision: D14606095
      
      Pulled By: siying
      
      fbshipit-source-id: 41a9437a2948abb235c0ed85f9a04612d0e50183
    • Y
      Fix SstFileReader not able to open ingested file (#5097) · 75133b1b
      Yi Wu committed
      Summary:
      Since `SstFileReader` doesn't know the largest seqno of a file, it fails this check when it opens a file with a global seqno: https://github.com/facebook/rocksdb/blob/ca89ac2ba997dfa0e135bd75d4ccf6f5774a7eff/table/block_based_table_reader.cc#L730
      Changes:
      * Pass largest_seqno=kMaxSequenceNumber from `SstFileReader` and allow it to bypass the above check.
      * `BlockBasedTable::VerifyChecksum` also double-checks whether the checksum matches when the global seqno is excluded (this is to make the new test in sst_table_reader_test pass).
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5097
      
      Differential Revision: D14607434
      
      Pulled By: riversand963
      
      fbshipit-source-id: 9008599227c5fccbf9b73fee46b3bf4a1523f023
    • Y
      Fix BlockBasedTableIterator construction missing index_key_is_full parameter · 7ca9eb75
      Yi Wu committed
      Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/5104
      
      Differential Revision: D14619000
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: c2895794a3f31b826c149dcb698c1952dacc2332
  7. 26 Mar, 2019 3 commits
  8. 22 Mar, 2019 3 commits
    • R
      Make it easier for users to load options from option file and set shared block cache. (#5063) · a4396f92
      Rashmi Sharma committed
      Summary:
      [RocksDB] Make it easier for users to load options from option file and set shared block cache.
      Right now, users need several dynamic casts to set the shared block cache on the option structs loaded from the option file.
      If people don't do that, every CF of every DB will generate its own 8MB block cache. It's not a usable setting. So we are dragging every user who loads options from the file into such a mess.
      Instead, we should allow them to pass their cache object to LoadLatestOptions() and LoadOptionsFromFile(), so that those loaded option structs will have the shared block cache.
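The memory effect described above can be sketched with toy types. This is not the RocksDB API: `ToyCache`, `ToyCFOptions`, `LoadToyOptions`, and `TotalCacheBytes` are hypothetical names, and the loader below only mimics the idea of passing an optional shared cache into the function that builds the per-CF option structs.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <set>
#include <string>
#include <vector>

struct ToyCache {
  std::size_t capacity_bytes;
};

struct ToyCFOptions {
  std::string name;
  std::shared_ptr<ToyCache> block_cache;
};

// Builds one option struct per column family. When `shared_cache` is null,
// each column family gets its own 8 MiB cache, mirroring the default
// described above; otherwise all of them point at the caller's cache.
std::vector<ToyCFOptions> LoadToyOptions(
    const std::vector<std::string>& cf_names,
    const std::shared_ptr<ToyCache>& shared_cache = nullptr) {
  std::vector<ToyCFOptions> loaded;
  for (const auto& name : cf_names) {
    std::shared_ptr<ToyCache> cache = shared_cache;
    if (!cache) cache = std::make_shared<ToyCache>(ToyCache{8u << 20});
    loaded.push_back({name, cache});
  }
  return loaded;
}

// Sums the capacity of the distinct caches referenced by the option structs.
std::size_t TotalCacheBytes(const std::vector<ToyCFOptions>& cfs) {
  std::set<const ToyCache*> seen;
  std::size_t total = 0;
  for (const auto& cf : cfs) {
    assert(cf.block_cache != nullptr);
    if (seen.insert(cf.block_cache.get()).second) {
      total += cf.block_cache->capacity_bytes;
    }
  }
  return total;
}
```

With three column families and no shared cache, the total grows with the number of column families; with a shared cache it stays at the single capacity the caller chose.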
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5063
      
      Differential Revision: D14518584
      
      Pulled By: rashmishrm
      
      fbshipit-source-id: c91430ff9425a0e67d76fc67931d755f491ca5aa
    • B
      fix NowNanos overflow (#5062) · 88d85b68
      Burton Li committed
      Summary:
      The original implementation of WinEnvIO::NowNanos() consistently overflows in this multiplication:
      li.QuadPart *= std::nano::den;
      As a result, the API returns an incorrect result.
      e.g.:
      li.QuadPart=13477844301545
      std::nano::den=1e9
      
      The fix uses a pre-computed nano_seconds_per_period_ to represent the nanoseconds per performance counter period when nano::den is divisible by perf_counter_frequency_. Otherwise it falls back to high_resolution_clock.
      siying ajkr
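The arithmetic can be checked directly: a signed 64-bit integer holds at most 9223372036854775807, so any tick count above 9223372036 overflows when multiplied by 1e9, and the example value 13477844301545 is far beyond that. The sketch below is not the patched Windows code; `TicksTimesNanoOverflows` and `TicksToNanos` are hypothetical helpers, and the 10 MHz counter frequency used in the usage note is an assumption, since the commit message does not state one.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// True if ticks * 1e9 cannot be represented in a signed 64-bit integer.
bool TicksTimesNanoOverflows(std::int64_t ticks) {
  assert(ticks >= 0);
  const std::int64_t kNanosPerSecond = 1000000000;
  return ticks > std::numeric_limits<std::int64_t>::max() / kNanosPerSecond;
}

// Converts counter ticks to nanoseconds by multiplying with a pre-computed
// "nanoseconds per period" factor. Only valid when the frequency divides
// 1e9 exactly; other frequencies need a different clock source.
std::int64_t TicksToNanos(std::int64_t ticks, std::int64_t frequency_hz) {
  const std::int64_t kNanosPerSecond = 1000000000;
  assert(frequency_hz > 0 && kNanosPerSecond % frequency_hz == 0);
  const std::int64_t nanos_per_period = kNanosPerSecond / frequency_hz;
  return ticks * nanos_per_period;
}
```

With the assumed 10 MHz frequency, each period is 100 ns, so `TicksToNanos(13477844301545, 10000000)` yields 1347784430154500, which fits comfortably in 64 bits.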
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5062
      
      Differential Revision: D14426842
      
      Pulled By: anand1976
      
      fbshipit-source-id: 127f1daf423dd4b30edd0dcf8ea0466f468bec12
    • M
      Reorder DBIter fields to reduce memory usage (#5078) · c84fad7a
      Maysam Yabandeh committed
      Summary:
      The patch reorders DBIter fields to put 1-byte fields together and let the compiler optimize the memory usage by using fewer 64-bit allocations for bools and enums.
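The effect of field ordering on struct size can be seen in a small sketch. The two structs below are hypothetical and do not reproduce DBIter's actual fields; they hold the same members in different orders. Exact sizes depend on the platform ABI, so the code only compares the two layouts rather than asserting specific byte counts.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// 1-byte fields interleaved with 8-byte fields: on typical ABIs each small
// field is followed by padding so the next 8-byte field stays aligned.
struct Interleaved {
  bool valid;
  std::uint64_t sequence;
  bool is_blob;
  std::uint64_t num_skipped;
  std::uint8_t direction;
};

// Same fields, with the 1-byte members grouped at the end so they can share
// a single aligned slot.
struct Grouped {
  std::uint64_t sequence;
  std::uint64_t num_skipped;
  bool valid;
  bool is_blob;
  std::uint8_t direction;
};

// Bytes saved per object by grouping the small fields.
std::size_t BytesSaved() {
  assert(sizeof(Grouped) <= sizeof(Interleaved));
  return sizeof(Interleaved) - sizeof(Grouped);
}
```

On a common 64-bit ABI the interleaved layout takes 40 bytes and the grouped one 24, though the standard does not guarantee those numbers.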
      
      This might have a negative side effect of putting the variables that are accessed together into different cache lines and hence increasing the cache misses. Not sure what benchmark would verify that thought. I ran simple, single-threaded seekrandom benchmarks but the variance in the results is too much to be conclusive.
      
      ./db_bench --benchmarks=fillrandom --use_existing_db=0 --num=1000000 --db=/dev/shm/dbbench
      ./db_bench --benchmarks=seekrandom[X10] --use_existing_db=1 --db=/dev/shm/dbbench --num=1000000 --duration=60 --seek_nexts=100
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5078
      
      Differential Revision: D14562676
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: 2284655d46e079b6e9a860e94be5defb6f482167
  9. 21 Mar, 2019 3 commits
  10. 20 Mar, 2019 4 commits
  11. 19 Mar, 2019 2 commits
    • S
      Feature for sampling and reporting compressibility (#4842) · b45b1cde
      Shobhit Dayal committed
      Summary:
      This is a feature to sample data-block compressibility and report it as stats. 1 in N (tunable) blocks is sampled for compressibility using two algorithms:
      1. lz4 or snappy for fast compression
      2. zstd or zlib for slow but higher compression.
      
      The stats are reported to the caller as raw-bytes and compressed-bytes. The block continues to be compressed for storage using the specified CompressionType.
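The sampling scheme can be sketched in a few lines. This is an illustration under several assumptions and not the RocksDB implementation: it samples every Nth block deterministically instead of randomly, uses a single toy run-length encoder in place of the fast and slow compressors named above, and all names (`SampleStats`, `ToyCompressedSize`, `SampleCompressibility`) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct SampleStats {
  std::size_t raw_bytes = 0;         // total size of the sampled blocks
  std::size_t compressed_bytes = 0;  // their total size after compression
};

// Toy stand-in for a real compressor: size of a run-length encoding that
// spends 2 bytes (count, value) per run, with runs capped at 255 bytes.
std::size_t ToyCompressedSize(const std::string& block) {
  std::size_t runs = 0;
  for (std::size_t i = 0; i < block.size();) {
    std::size_t j = i;
    while (j < block.size() && block[j] == block[i] && j - i < 255) ++j;
    ++runs;
    i = j;
  }
  return 2 * runs;
}

// Samples one block out of every `sample_one_in` blocks; 0 disables sampling.
// Blocks that are not sampled contribute nothing to the stats.
SampleStats SampleCompressibility(const std::vector<std::string>& blocks,
                                  std::size_t sample_one_in) {
  SampleStats stats;
  if (sample_one_in == 0) return stats;
  for (std::size_t i = 0; i < blocks.size(); ++i) {
    if (i % sample_one_in != 0) continue;
    stats.raw_bytes += blocks[i].size();
    stats.compressed_bytes += ToyCompressedSize(blocks[i]);
  }
  // Each run covers at least one byte, so the toy encoding is at most 2x raw.
  assert(stats.compressed_bytes <= 2 * stats.raw_bytes);
  return stats;
}
```

The ratio `compressed_bytes / raw_bytes` over the sampled blocks then serves as the compressibility estimate, independent of which compression type is actually used for storage.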
      
      The db_bench_tool now has a command line option for specifying the sampling rate. Its default value is 0 (no sampling). To test the overhead for a certain value, users can compare the performance of db_bench_tool, varying the sampling rate. It is unlikely to have a noticeable impact for high values like 20.
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4842
      
      Differential Revision: D13629011
      
      Pulled By: shobhitdayal
      
      fbshipit-source-id: 14ca668bcab6499b2a1734edf848eb62a4f4fafa
    • H
      utilities: Fix build failure with -Werror=maybe-uninitialized (#5074) · 20d49da9
      He Zhe committed
      Summary:
      Initialize magic_number to zero to avoid such failure.
      utilities/blob_db/blob_log_format.cc:91:3: error: 'magic_number' may be used
      uninitialized in this function [-Werror=maybe-uninitialized]
         if (magic_number != kMagicNumber) {
         ^~
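The pattern behind this warning is a variable that is only assigned on some paths of a helper and then read unconditionally. The following minimal sketch reproduces the shape of the fix with hypothetical names (`ToyGetFixed32`, `HeaderHasValidMagic`) and a made-up magic value; it is not the code from blob_log_format.cc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical magic value for this sketch only.
const std::uint32_t kToyMagicNumber = 0xB10BB10Bu;

// Decodes a little-endian 32-bit value. On a short buffer it returns false
// and leaves *out untouched, which is what makes an uninitialized caller
// variable dangerous.
bool ToyGetFixed32(const unsigned char* data, std::size_t size,
                   std::uint32_t* out) {
  assert(out != nullptr);
  if (size < 4) return false;
  *out = static_cast<std::uint32_t>(data[0]) |
         (static_cast<std::uint32_t>(data[1]) << 8) |
         (static_cast<std::uint32_t>(data[2]) << 16) |
         (static_cast<std::uint32_t>(data[3]) << 24);
  return true;
}

bool HeaderHasValidMagic(const unsigned char* data, std::size_t size) {
  // Initialized to zero so the comparison below reads a defined value even
  // when decoding fails; without the initializer the compiler may warn.
  std::uint32_t magic_number = 0;
  ToyGetFixed32(data, size, &magic_number);
  return magic_number == kToyMagicNumber;
}
```

With the initializer in place, a truncated header simply fails the magic check instead of comparing an indeterminate value.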
      Signed-off-by: He Zhe <zhe.he@windriver.com>
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/5074
      
      Differential Revision: D14505514
      
      Pulled By: miasantreble
      
      fbshipit-source-id: 4334462958c2b9c5a7c68c6ab24dadf94ad70902