1. 22 7月, 2015 3 次提交
    • M
      [wal changes 2/3] write with sync=true syncs previous unsynced wals to prevent illegal data loss · fe09a6da
      Mike Kolupaev 提交于
      Summary:
      I'll just copy internal task summary here:
      
      "
      This sequence will cause data loss in the middle after an sync write:
      
      non-sync write key 1
      flush triggered, not yet scheduled
      sync write key 2
      system crash
      
      After rebooting, users might see key 2 but not key 1, which violates the API of sync write.
      
      This can be reproduced using unit test FaultInjectionTest::DISABLED_WriteOptionSyncTest.
      
      One way to fix it is for a sync write, if there is outstanding unsynced log files, we need to syc them too.
      "
      
      This diff should be considered together with the next diff D40905; in isolation this fix probably could be a little simpler.
      
      Test Plan: `make check`; added a test for that (DBTest.SyncingPreviousLogs) before noticing FaultInjectionTest.WriteOptionSyncTest (keeping both since mine asserts a bit more); both tests fail without this diff; for D40905 stacked on top of this diff, ran tests with ASAN, TSAN and valgrind
      
      Reviewers: rven, yhchiang, IslamAbdelRahman, anthony, kradhakrishnan, igor, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D40899
      fe09a6da
    • A
      Report live data size estimate · 06aebca5
      Andres Notzli 提交于
      Summary:
      Fixes T6548822. Added a new function for estimating the size of the live data
      as proposed in the task. The value can be accessed through the property
      rocksdb.estimate-live-data-size.
      
      Test Plan:
      There are two unit tests in version_set_test and a simple test in db_test.
      make version_set_test && ./version_set_test;
      make db_test && ./db_test gtest_filter=GetProperty
      
      Reviewers: rven, igor, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D41493
      06aebca5
    • S
      Fix undeterministic failure of DBTest.GetPropertiesOfAllTablesTest · 02b635fa
      sdong 提交于
      Summary: DBTest.GetPropertiesOfAllTablesTest generates four files and expects four files there, but a L0->L1 comapction can trigger to compact to one single file. Fix it by raising level 0 number of file compaction trigger
      
      Test Plan: Run it many times and see it never fails.
      
      Reviewers: kradhakrishnan, IslamAbdelRahman, yhchiang, anthony
      
      Reviewed By: anthony
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D42789
      02b635fa
  2. 21 7月, 2015 14 次提交
  3. 18 7月, 2015 5 次提交
    • S
      Move rate_limiter, write buffering, most perf context instrumentation and most... · 6e9fbeb2
      sdong 提交于
      Move rate_limiter, write buffering, most perf context instrumentation and most random kill out of Env
      
      Summary: We want to keep Env a think layer for better portability. Less platform dependent codes should be moved out of Env. In this patch, I create a wrapper of file readers and writers, and put rate limiting, write buffering, as well as most perf context instrumentation and random kill out of Env. It will make it easier to maintain multiple Env in the future.
      
      Test Plan: Run all existing unit tests.
      
      Reviewers: anthony, kradhakrishnan, IslamAbdelRahman, yhchiang, igor
      
      Reviewed By: igor
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D42321
      6e9fbeb2
    • I
      Don't let flushes preempt compactions · 35ca5936
      Igor Canadi 提交于
      Summary:
      When we first started, max_background_flushes was 0 by default and compaction thread was executing flushes (since there was no flush thread). Then, we switched the default max_background_flushes to 1. However, we still support the case where there is no flush thread and flushes are done in compaction. This is making our code a bit more complicated. By not supporting this use-case we can make our code simpler.
      
      We have a special case that when you set max_background_flushes to 0, we
      schedule the flush to execute on the compaction thread.
      
      Test Plan: make check (there might be some unit tests that depend on this behavior)
      
      Reviewers: IslamAbdelRahman, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D41931
      35ca5936
    • A
      Added JSON manifest dump option to ldb command · 74c755c5
      Ari Ekmekji 提交于
      Summary:
      Added a new flag --json to the ldb manifest_dump command
      that prints out the version edits as JSON objects for easier
      reading and parsing of information.
      
      Test Plan:
      **Sample usage: **
      ```
      ./ldb manifest_dump --json --path=path/to/manifest/file
      ```
      
      **Sample output:**
      ```
      {"EditNumber": 0, "Comparator": "leveldb.BytewiseComparator", "ColumnFamily": 0}
      {"EditNumber": 1, "LogNumber": 0, "ColumnFamily": 0}
      {"EditNumber": 2, "LogNumber": 4, "PrevLogNumber": 0, "NextFileNumber": 7, "LastSeq": 35356, "AddedFiles": [{"Level": 0, "FileNumber": 5, "FileSize": 1949284, "SmallestIKey": "'", "LargestIKey": "'"}], "ColumnFamily": 0}
      ...
      {"EditNumber": 13, "PrevLogNumber": 0, "NextFileNumber": 36, "LastSeq": 290994, "DeletedFiles": [{"Level": 0, "FileNumber": 17}, {"Level": 0, "FileNumber": 20}, {"Level": 0, "FileNumber": 22}, {"Level": 0, "FileNumber": 24}, {"Level": 1, "FileNumber": 13}, {"Level": 1, "FileNumber": 14}, {"Level": 1, "FileNumber": 15}, {"Level": 1, "FileNumber": 18}], "AddedFiles": [{"Level": 1, "FileNumber": 25, "FileSize": 2114340, "SmallestIKey": "'", "LargestIKey": "'"}, {"Level": 1, "FileNumber": 26, "FileSize": 2115213, "SmallestIKey": "'", "LargestIKey": "'"}, {"Level": 1, "FileNumber": 27, "FileSize": 2114807, "SmallestIKey": "'", "LargestIKey": "'"}, {"Level": 1, "FileNumber": 30, "FileSize": 2115271, "SmallestIKey": "'", "LargestIKey": "'"}, {"Level": 1, "FileNumber": 31, "FileSize": 2115165, "SmallestIKey": "'", "LargestIKey": "'"}, {"Level": 1, "FileNumber": 32, "FileSize": 2114683, "SmallestIKey": "'", "LargestIKey": "'"}, {"Level": 1, "FileNumber": 35, "FileSize": 1757512, "SmallestIKey": "'", "LargestIKey": "'"}], "ColumnFamily": 0}
      ...
      ```
      
      Reviewers: sdong, anthony, yhchiang, igor
      
      Reviewed By: igor
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D41727
      74c755c5
    • I
      Deprecate CompactionFilterV2 · a96fcd09
      Igor Canadi 提交于
      Summary: It has been around for a while and it looks like it never found any uses in the wild. It's also complicating our compaction_job code quite a bit. We're deprecating it in 3.13, but will put it back in 3.14 if we actually find users that need this feature.
      
      Test Plan: make check
      
      Reviewers: noetzli, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D42405
      a96fcd09
    • A
      Fixed and simplified merge_helper · 1d20fa9d
      Andres Notzli 提交于
      Summary:
      MergeUntil was not reporting a success when merging an operand with
      a Value/Deletion despite the comments in MergeHelper and CompactionJob
      indicating otherwise. This lead to operands being written to the compaction
      output unnecessarily:
      
      M1 M2 M3 P M4 M5 --> (P+M1+M2+M3) M2 M3 M4 M5 (before the diff)
      M1 M2 M3 P M4 M5 --> (P+M1+M2+M3) M4 M5 (after the diff)
      
      In addition, the code handling Values/Deletion was basically identical.
      This patch unifies the code. Finally, this patch also adds testing for
      merge_helper.
      
      Test Plan: make && make check
      
      Reviewers: sdong, rven, yhchiang, tnovak, igor
      
      Reviewed By: igor
      
      Subscribers: tnovak, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D42351
      1d20fa9d
  4. 17 7月, 2015 3 次提交
    • S
      Fix a typo in variable · ac2b9367
      sdong 提交于
      Summary: Typo seqeuntial -> sequential in db/fault_injection_test.cc
      
      Test Plan: Build it
      
      Reviewers: ott, igor, anthony, kradhakrishnan, IslamAbdelRahman
      
      Reviewed By: IslamAbdelRahman
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D42417
      ac2b9367
    • S
      Fix data loss after DB recovery by not allowing flush/compaction to be scheduled until DB opened · 6c0c8dee
      sdong 提交于
      Summary:
      Previous run may leave some SST files with higher file numbers than manifest indicates.
      Compaction or flush may start to run while DB::Open() is still going on. SST file garbage collection may happen interleaving with compaction or flush, and overwrite files generated by compaction of flushes after they are generated. This might cause data loss. This possibility of interleaving is recently introduced.
      Fix it by not allowing compaction or flush to be scheduled before DB::Open() finishes.
      
      Test Plan: Add a unit test. This verification will have a chance to fail without the fix but doesn't fix without the fix.
      
      Reviewers: kradhakrishnan, anthony, yhchiang, IslamAbdelRahman, igor
      
      Reviewed By: igor
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D42399
      6c0c8dee
    • A
      Test for compaction of corrupted keys · e4af3bfb
      Andres Notzli 提交于
      Summary:
      Fixes T7697334. Adds a simple test to check whether CompactionJob deals
      with corrupted keys correctly. Right now, we preserve corrupted keys.
      Note: depending on the type of corruption and options like comparators,
      CompactionJob fails. This test just checks whether corrupted keys that
      do not fail CompactionJob are preserved.
      
      Test Plan:
      `make compaction_job_test && ./compaction_job_test` -> Tests pass.
      Add `input->Next(); continue;` in CompactionJob::ProcessKeyValueCompaction()
      inside then-branch of `!ParseInternalKey(key, &ikey)` -> Tests fail.
      
      Reviewers: sdong, rven, yhchiang, igor
      
      Reviewed By: igor
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D42237
      e4af3bfb
  5. 16 7月, 2015 5 次提交
    • A
      move convenience.h out of utilities · 81d07262
      agiardullo 提交于
      Summary: Moved convenience.h out of utilities to remove a dependency on utilities in db.
      
      Test Plan: unit tests.  Also compiled a link to the old location to verify the _Pragma works.
      
      Reviewers: sdong, yhchiang, igor
      
      Reviewed By: igor
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D42201
      81d07262
    • P
      Fixing delete files in Trivial move of universal compaction · beb19ad0
      Poornima Chozhiyath Raman 提交于
      Summary:
      Trvial move in universal compaction was failing when trying to move files from levels other than 0.
      This was because the DeleteFile while trivially moving, was only deleting files of level 0 which caused duplication of same file in different levels.
      This is fixed by passing the right level as argument in the call of DeleteFile while doing trivial move.
      
      Test Plan: ./db_test ran successfully with the new test cases.
      
      Reviewers: sdong
      
      Reviewed By: sdong
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D42135
      beb19ad0
    • K
      Build fix. · c6139606
      krad 提交于
      Summary: gcc-4.9-glibc-2.20 complains about uninitialized variable.
      
      db/compaction_picker.cc: In member function 'bool
      rocksdb::CompactionPicker::IsInputNonOverlapping(rocksdb::Compaction*)':
      db/compaction_picker.cc:1174:17: error:
      'prev.rocksdb::{anonymous}::InputFileInfo::f' may be used uninitialized in this
      function [-Werror=maybe-uninitialized]
         InputFileInfo prev, curr, next;
      
      Test Plan: pmake on local environment
      
      Reviewers: sdong igor
      
      CC: leveldb@
      
      Task ID: #
      
      Blame Rev:
      c6139606
    • A
      Update --help message in db_bench. · 2c8de0ec
      Aaron Feldman 提交于
      Summary:
      Remove --help entry for readhot.
      
      Update read_random_exp_range flag description: The distribution is num *
      exp(-r), not num * exp(r).
      
      Test Plan: Run ./db_bench --help
      
      Reviewers: sdong, igor
      
      Reviewed By: igor
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D42303
      2c8de0ec
    • A
      Refactoring of writing key/value pairs · 6b2d44b2
      Andres Notzli 提交于
      Summary:
      Before, writing key/value pairs out to files was done inside
      ProcessKeyValueCompaction(). To make ProcessKeyValueCompaction()
      more understandable, this patch moves the writing part to a separate
      function. This is intended to be a stepping stone for additional
      changes.
      
      Test Plan: make && make check
      
      Reviewers: sdong, rven, yhchiang, igor
      
      Reviewed By: igor
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D42243
      6b2d44b2
  6. 15 7月, 2015 8 次提交
  7. 14 7月, 2015 2 次提交
    • I
      Deprecate purge_redundant_kvs_while_flush · a9c51095
      Igor Canadi 提交于
      Summary: This option is guarding the feature implemented 2 and a half years ago: D8991. The feature was enabled by default back then and has been running without issues. There is no reason why any client would turn this feature off. I found no reference in fbcode.
      
      Test Plan: none
      
      Reviewers: sdong, yhchiang, anthony, dhruba
      
      Reviewed By: dhruba
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D42063
      a9c51095
    • I
      Deprecate WriteOptions::timeout_hint_us · 5aea98dd
      Igor Canadi 提交于
      Summary:
      In one of our recent meetings, we discussed deprecating features that are not being actively used. One of those features, at least within Facebook, is timeout_hint. The feature is really nicely implemented, but if nobody needs it, we should remove it from our code-base (until we get a valid use-case). Some arguments:
      * Less code == better icache hit rate, smaller builds, simpler code
      * The motivation for adding timeout_hint_us was to work-around RocksDB's stall issue. However, we're currently addressing the stall issue itself (see @sdong's recent work on stall write_rate), so we should never see sharp lock-ups in the future.
      * Nobody is using the feature within Facebook's code-base. Googling for `timeout_hint_us` also doesn't yield any users.
      
      Test Plan: make check
      
      Reviewers: anthony, kradhakrishnan, sdong, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: sdong, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D41937
      5aea98dd