1. 18 7月, 2015 1 次提交
    • I
      Don't let flushes preempt compactions · 35ca5936
      Igor Canadi 提交于
      Summary:
      When we first started, max_background_flushes was 0 by default and compaction thread was executing flushes (since there was no flush thread). Then, we switched the default max_background_flushes to 1. However, we still support the case where there is no flush thread and flushes are done in compaction. This is making our code a bit more complicated. By not supporting this use-case we can make our code simpler.
      
      We have a special case that when you set max_background_flushes to 0, we
      schedule the flush to execute on the compaction thread.
      
      Test Plan: make check (there might be some unit tests that depend on this behavior)
      
      Reviewers: IslamAbdelRahman, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D41931
      35ca5936
  2. 16 7月, 2015 2 次提交
    • A
      move convenience.h out of utilities · 81d07262
      agiardullo 提交于
      Summary: Moved convenience.h out of utilities to remove a dependency on utilities in db.
      
      Test Plan: unit tests.  Also compiled a link to the old location to verify the _Pragma works.
      
      Reviewers: sdong, yhchiang, igor
      
      Reviewed By: igor
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D42201
      81d07262
    • P
      Fixing delete files in Trivial move of universal compaction · beb19ad0
      Poornima Chozhiyath Raman 提交于
      Summary:
      Trvial move in universal compaction was failing when trying to move files from levels other than 0.
      This was because the DeleteFile while trivially moving, was only deleting files of level 0 which caused duplication of same file in different levels.
      This is fixed by passing the right level as argument in the call of DeleteFile while doing trivial move.
      
      Test Plan: ./db_test ran successfully with the new test cases.
      
      Reviewers: sdong
      
      Reviewed By: sdong
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D42135
      beb19ad0
  3. 15 7月, 2015 4 次提交
  4. 14 7月, 2015 6 次提交
    • I
      Deprecate purge_redundant_kvs_while_flush · a9c51095
      Igor Canadi 提交于
      Summary: This option is guarding the feature implemented 2 and a half years ago: D8991. The feature was enabled by default back then and has been running without issues. There is no reason why any client would turn this feature off. I found no reference in fbcode.
      
      Test Plan: none
      
      Reviewers: sdong, yhchiang, anthony, dhruba
      
      Reviewed By: dhruba
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D42063
      a9c51095
    • I
      Deprecate WriteOptions::timeout_hint_us · 5aea98dd
      Igor Canadi 提交于
      Summary:
      In one of our recent meetings, we discussed deprecating features that are not being actively used. One of those features, at least within Facebook, is timeout_hint. The feature is really nicely implemented, but if nobody needs it, we should remove it from our code-base (until we get a valid use-case). Some arguments:
      * Less code == better icache hit rate, smaller builds, simpler code
      * The motivation for adding timeout_hint_us was to work-around RocksDB's stall issue. However, we're currently addressing the stall issue itself (see @sdong's recent work on stall write_rate), so we should never see sharp lock-ups in the future.
      * Nobody is using the feature within Facebook's code-base. Googling for `timeout_hint_us` also doesn't yield any users.
      
      Test Plan: make check
      
      Reviewers: anthony, kradhakrishnan, sdong, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: sdong, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D41937
      5aea98dd
    • Y
      Move DynamicLevel related db-tests to db_dynamic_level_test.cc · b10cf4e2
      Yueh-Hsuan Chiang 提交于
      Summary: Move DynamicLevel related db-tests to db_dynamic_level_test.cc
      
      Test Plan:
      db_dynamic_level_test
      db_test
      
      Reviewers: igor, anthony, IslamAbdelRahman, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D42039
      b10cf4e2
    • Y
      Move global static functions in db_test_util to DBTestBase · 49f42ad0
      Yueh-Hsuan Chiang 提交于
      Summary:
      Move global static functions in db_test_util to DBTestBase.
      This is to prevent unused function warning when decoupling
      db_test.cc into multiple files.
      
      Test Plan: db_test
      
      Reviewers: igor, sdong, anthony, IslamAbdelRahman, kradhakrishnan
      
      Reviewed By: kradhakrishnan
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D42009
      49f42ad0
    • Y
      Move reusable part of db_test.cc to util/db_test_util.h · 625467a0
      Yueh-Hsuan Chiang 提交于
      Summary:
      Move reusable part of db_test.cc to util/db_test_util.h.
      This makes it more possible to partition db_test.cc into
      multiple smaller test files.
      
      Also, fixed many old lint errors in db_test.
      
      Test Plan: db_test
      
      Reviewers: igor, anthony, IslamAbdelRahman, sdong, kradhakrishnan
      
      Reviewed By: sdong, kradhakrishnan
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D41973
      625467a0
    • S
      "make format" against last 10 commits · f9728640
      sdong 提交于
      Summary: This helps Windows port to format their changes, as discussed. Might have formatted some other codes too becasue last 10 commits include more.
      
      Test Plan: Build it.
      
      Reviewers: anthony, IslamAbdelRahman, kradhakrishnan, yhchiang, igor
      
      Reviewed By: igor
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D41961
      f9728640
  5. 10 7月, 2015 1 次提交
    • K
      Fix a noisy unit test. · 7189e90c
      krad 提交于
      Summary: The t/DBTest.DropWrites test still fails under certain gcc version in release unit test.
      
      I unfortunately cannot repro the failure (since the compilers have mapped library which I am not able to map to correctly). I am suspecting the clock skew.
      
      Test Plan: Run make check
      
      Reviewers:
      
      CC: sdong igore
      
      Task ID: #7312624
      
      Blame Rev:
      7189e90c
  6. 09 7月, 2015 3 次提交
    • D
      All of these are in the new code added past 3.10 · d8586ab2
      Dmitri Smirnov 提交于
           1) Crash in env_win.cc that prevented db_test run to completion and some new tests
           2) Fix new corruption tests in DBTest by allowing a shared trunction of files. Note that this is generally needed ONLY for tests.
           3) Close database so WAL is closed prior to inducing corruption similar to what we did within Corruption tests.
      d8586ab2
    • P
      Fix function name format according to google style · 4bed00a4
      Poornima Chozhiyath Raman 提交于
      Summary: Change the naming style of getter and setters according to Google C++ style in compaction.h file
      
      Test Plan: Compilation success
      
      Reviewers: sdong
      
      Reviewed By: sdong
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D41265
      4bed00a4
    • K
      Added multi WAL log testing to recovery tests. · e2e3d84b
      krad 提交于
      Summary: Currently there is no test in the suite to test the case where
      there are multiple WAL files and there is a corruption in one of them. We have
      tests for single WAL file corruption scenarios. Added tests to mock
      the scenarios for all combinations of recovery modes and corruption in
      specified file locations.
      
      Test Plan: Run make check
      
      Reviewers: sdong igor
      
      CC: leveldb@
      
      Task ID: #7501229
      
      Blame Rev:
      e2e3d84b
  7. 08 7月, 2015 4 次提交
  8. 03 7月, 2015 1 次提交
    • M
      [wal changes 1/3] fixed unbounded wal growth in some workloads · 218487d8
      Mike Kolupaev 提交于
      Summary:
      This fixes the following scenario we've hit:
       - we reached max_total_wal_size, created a new wal and scheduled flushing all memtables corresponding to the old one,
       - before the last of these flushes started its column family was dropped; the last background flush call was a no-op; no one removed the old wal from alive_logs_,
       - hours have passed and no flushes happened even though lots of data was written; data is written to different column families, compactions are disabled; old column families are dropped before memtable grows big enough to trigger a flush; the old wal still sits in alive_logs_ preventing max_total_wal_size limit from kicking in,
       - a few more hours pass and we run out disk space because of one huge .log file.
      
      Test Plan: `make check`; backported the new test, checked that it fails without this diff
      
      Reviewers: igor
      
      Reviewed By: igor
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D40893
      218487d8
  9. 02 7月, 2015 1 次提交
    • D
      Windows Port from Microsoft · 18285c1e
      Dmitri Smirnov 提交于
       Summary: Make RocksDb build and run on Windows to be functionally
       complete and performant. All existing test cases run with no
       regressions. Performance numbers are in the pull-request.
      
       Test plan: make all of the existing unit tests pass, obtain perf numbers.
      
       Co-authored-by: Praveen Rao praveensinghrao@outlook.com
       Co-authored-by: Sherlock Huang baihan.huang@gmail.com
       Co-authored-by: Alex Zinoviev alexander.zinoviev@me.com
       Co-authored-by: Dmitri Smirnov dmitrism@microsoft.com
      18285c1e
  10. 01 7月, 2015 1 次提交
    • K
      Increasing timeout for drop writes. · b0f1927d
      krad 提交于
      Summary: We have a race in the way test works. We avoided the race by adding the
      wait to the counter. I thought 1s was eternity, but that is not true in some
      scenarios. Increasing the timeout to 10s and adding warnings.
      
      Also, adding nosleep to avoid the case where the wakeup thread is waiting behind
      the sleeping thread for scheduling.
      
      Test Plan: Run make check
      
      Reviewers: siying igorcanadi
      
      CC: leveldb@
      
      Task ID: #7312624
      
      Blame Rev:
      b0f1927d
  11. 30 6月, 2015 2 次提交
    • T
      Fix a comparison in DBIter::FindPrevUserKey() · ec70fea4
      Tomislav Novak 提交于
      Summary:
      When seek target is a merge key (`kTypeMerge`), `DBIter::FindNextUserEntry()`
      advances the underlying iterator _past_ the current key (`saved_key_`); see
      `MergeValuesNewToOld()`. However, `FindPrevUserKey()` assumes that `iter_`
      points to an entry with the same user key as `saved_key_`. As a result,
      `it->Seek(key) && it->Prev()` can cause the iterator to be positioned at the
      _next_, instead of the previous, entry (new test, written by @lovro, reproduces
      the bug).
      
      This diff changes `FindPrevUserKey()` to also skip keys that are _greater_ than
      `saved_key_`.
      
      Test Plan: db_test
      
      Reviewers: igor, sdong
      
      Reviewed By: sdong
      
      Subscribers: leveldb, dhruba, lovro
      
      Differential Revision: https://reviews.facebook.net/D40791
      ec70fea4
    • K
      Fix race in unit test. · 6199cba9
      krad 提交于
      Summary: Avoid falling victim to race condition.
      
      Test Plan: Run the unit test
      
      Reviewers: sdong igor
      
      CC: leveldb@
      
      Task ID: #7312624
      
      Blame Rev:
      6199cba9
  12. 26 6月, 2015 2 次提交
  13. 24 6月, 2015 2 次提交
    • I
      Bottommost level compaction option · 674b1181
      Islam AbdelRahman 提交于
      Summary: Replace force_bottommost_level_compaction in CompactRangeOption with an option that allow the user to (always skip, always compact, compact if compaction filter is present) the bottommost level for level based compaction.
      
      Test Plan: make check
      
      Reviewers: sdong, yhchiang, igor
      
      Reviewed By: igor
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D40527
      674b1181
    • G
      Implement a table-level row cache · 782a1590
      Giuseppe Ottaviano 提交于
      Summary:
      Implementation of a table-level row cache.
      It only caches point queries done through the `DB::Get` interface, queries done through the `Iterator` interface will completely skip the cache.
      
      Supports snapshots and merge operations.
      
      Test Plan: Ran `make valgrind_check commit-prereq`
      
      Reviewers: igor, philipp, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D39849
      782a1590
  14. 23 6月, 2015 1 次提交
    • K
      Introduce WAL recovery consistency levels · de85e4ca
      krad 提交于
      Summary:
      The "one size fits all" approach with WAL recovery will only introduce inconvenience for our varied clients as we go forward. The current recovery is a bit heuristic. We introduce the following levels of consistency while replaying the WAL.
      
      1. RecoverAfterRestart (kTolerateCorruptedTailRecords)
      
      This mocks the current recovery mode.
      
      2. RecoverAfterCleanShutdown (kAbsoluteConsistency)
      
      This is ideal for unit test and cases where the store is shutdown cleanly. We tolerate no corruption or incomplete writes.
      
      3. RecoverPointInTime (kPointInTimeRecovery)
      
      This is ideal when using devices with controller cache or file systems which can loose data on restart. We recover upto the point were is no corruption or incomplete write.
      
      4. RecoverAfterDisaster (kSkipAnyCorruptRecord)
      
      This is ideal mode to recover data. We tolerate corruption and incomplete writes, and we hop over those sections that we cannot make sense of salvaging as many records as possible.
      
      Test Plan:
      (1) Run added unit test to cover all levels.
      (2) Run make check.
      
      Reviewers: leveldb, sdong, igor
      
      Subscribers: yoshinorim, dhruba
      
      Differential Revision: https://reviews.facebook.net/D38487
      de85e4ca
  15. 20 6月, 2015 1 次提交
    • V
      Add wal files to Checkpoint for multiple column families. · 04251e1e
      Venkatesh Radhakrishnan 提交于
      Summary:
      When there are multiple column families, the flush in
      GetLiveFiles is not atomic, so that there are entries in the wal files
      which are needed to get a consisten RocksDB. We now add the log files to
      the checkpoint.
      
      Test Plan:
      CheckpointCF - This test forces more data to be written to
      the other column families after the flush of the first column family but
      before the second.
      
      Reviewers: igor, yhchiang, IslamAbdelRahman, anthony, kradhakrishnan, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D40323
      04251e1e
  16. 19 6月, 2015 3 次提交
    • I
      Disable CompressLevelCompaction() if Zlib is not supported · bf03f59c
      Igor Canadi 提交于
      Summary: CompressLevelCompaction() depends on Zlib. We should skip it when zlib is not present.
      
      Test Plan: `make check` without zlib
      
      Reviewers: yhchiang
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D40401
      bf03f59c
    • I
      Fail DB::Open() when the requested compression is not available · 760e9a94
      Igor Canadi 提交于
      Summary:
      Currently RocksDB silently ignores this issue and doesn't compress the data. Based on discussion, we agree that this is pretty bad because it can cause confusion for our users.
      
      This patch fails DB::Open() if we don't support the compression that is specified in the options.
      
      Test Plan: make check with LZ4 not present. If Snappy is not present all tests will just fail because Snappy is our default library. We should make Snappy the requirement, since without it our default DB::Open() fails.
      
      Reviewers: sdong, MarkCallaghan, rven, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D39687
      760e9a94
    • I
      Skip bottommost level compaction if possible · 4eabbdb7
      Islam AbdelRahman 提交于
      Summary:
      This is https://reviews.facebook.net/D39999 but after introducing an option to force compaction the bottom most level
      
      Changes in this patch
      - Introduce force_bottommost_level_compaction to CompactRangeOptions that force compacting bottommost level during compaction
      - Skip bottommost level compaction if we dont have a compaction filter and force_bottommost_level_compaction options is not set
      
      Although tests pass on my machine but I suspect that there maybe some tests that I am not aware of that  should use force_bottommost_level_compaction to pass in a deterministic way
      
      Test Plan:
      make check
      adding new tests
      
      Reviewers: igor, sdong, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D40059
      4eabbdb7
  17. 18 6月, 2015 1 次提交
    • I
      Use CompactRangeOptions for CompactRange · 12e030a9
      Islam AbdelRahman 提交于
      Summary:
      This diff update DB::CompactRange to use RangeCompactionOptions instead of using multiple parameters
      Old CompactRange is still available but deprecated
      
      Test Plan:
      make all check
      make rocksdbjava
      USE_CLANG=1 make all
      OPT=-DROCKSDB_LITE make release
      
      Reviewers: sdong, yhchiang, igor
      
      Reviewed By: igor
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D40209
      12e030a9
  18. 17 6月, 2015 1 次提交
  19. 13 6月, 2015 2 次提交
  20. 12 6月, 2015 1 次提交
    • S
      Slow down writes by bytes written · 7842920b
      sdong 提交于
      Summary:
      We slow down data into the database to the rate of options.delayed_write_rate (a new option) with this patch.
      
      The thread synchronization approach I take is to still synchronize write controller by DB mutex and GetDelay() is inside DB mutex. Try to minimize the frequency of getting time in GetDelay(). I verified it through db_bench and it seems to work
      
      hard_rate_limit is deprecated.
      
      options.delayed_write_rate is still not dynamically changeable. Need to work on it as a follow-up.
      
      Test Plan: Add new unit tests in db_test
      
      Reviewers: yhchiang, rven, kradhakrishnan, anthony, MarkCallaghan, igor
      
      Reviewed By: igor
      
      Subscribers: ikabiljo, leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D36351
      7842920b