1. 15 Aug, 2013 1 commit
    • Add options to dump. · a8f47a40
      Tyler Harter committed
      Summary: Added options to Dump() that I missed in D12027. I also ran a script to look for other missing options and found a couple, which I added. Should we also print anything for "PrepareForBulkLoad", "memtable_factory", and "statistics"? Or should we leave those alone, since it's not easy to print useful info for them?
      
      Test Plan: run anything and look at LOG file to make sure these are printed now.
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D12219
  2. 14 Aug, 2013 3 commits
    • Counter for merge failure · f1bf1694
      Mayank Agarwal committed
      Summary:
      With Merge returning bool, it can keep failing silently (e.g., while failing to fetch the timestamp in TTL). We need to detect this through a rocksdb counter that gets bumped whenever Merge returns false. This will also be very useful for the mcrocksdb-counter service, where Merge may fail.
      Added a counter NUMBER_MERGE_FAILURES and appropriately updated db/merge_helper.cc
      
      I felt that it would be better to directly add counter-bumping in Merge as a default function of MergeOperator class but user should not be aware of this, so this approach seems better to me.
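
The counting approach described in this summary can be sketched roughly as follows. This is a minimal illustration, not the code in db/merge_helper.cc: `Statistics`, `MergeFn`, and `MergeAndCount` are hypothetical names, and only the counter name NUMBER_MERGE_FAILURES comes from the summary.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>

// Hypothetical stand-in for a statistics object; only the counter name
// NUMBER_MERGE_FAILURES comes from the commit summary.
struct Statistics {
  uint64_t number_merge_failures = 0;
};

// Hypothetical merge signature: returns false when the merge fails.
using MergeFn = std::function<bool(const std::string& existing,
                                   const std::string& operand,
                                   std::string* result)>;

// Calls the user's merge and bumps the counter on failure, so the user's
// operator never has to know that failures are being counted.
inline bool MergeAndCount(const MergeFn& merge, const std::string& existing,
                          const std::string& operand, std::string* result,
                          Statistics* stats) {
  const bool ok = merge(existing, operand, result);
  if (!ok && stats != nullptr) {
    ++stats->number_merge_failures;
  }
  return ok;
}
```

Counting at the call site, rather than inside the operator, matches the stated goal that the user should not be aware of the counter.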
      
      Test Plan: make all check
      
      Reviewers: dnicholas, haobo, dhruba, vamsi
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D12129
    • Prefix filters for scans (v4) · f5f18422
      Tyler Harter committed
      Summary: Similar to v2 (db and table code understands prefixes), but use ReadOptions as in v3.  Also, make the CreateFilter code faster and cleaner.
      
      Test Plan: make db_test; export LEVELDB_TESTS=PrefixScan; ./db_test
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      CC: haobo, emayanke
      
      Differential Revision: https://reviews.facebook.net/D12027
    • Separate compaction filter for each compaction · 3b81df34
      sumeet committed
      Summary:
      If we have the same compaction filter for every compaction, the
      application cannot tell the different compaction processes apart.
      Later on, we can put more details in the compaction filter for the
      application to consume and use according to its needs. For example, in
      universal compaction, some compaction processes involve all the
      files while others don't. Applications may want to
      collect some stats only during a full compaction.
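
One way to picture a separate filter per compaction is a factory that builds a fresh filter from a description of the run. The sketch below is self-contained and does not use the RocksDB API; `CompactionContext`, `Filter`, `CountingFilter`, and `CreateFilterFor` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical description of one compaction run.
struct CompactionContext {
  bool is_full_compaction;  // true when the run covers all files
};

// Hypothetical filter interface; ShouldDrop returns true to drop the entry.
class Filter {
 public:
  virtual ~Filter() = default;
  virtual bool ShouldDrop(const std::string& key) = 0;
  virtual int entries_seen() const = 0;
};

// Counts entries, but only when attached to a full compaction.
class CountingFilter : public Filter {
 public:
  explicit CountingFilter(const CompactionContext& ctx) : ctx_(ctx) {}
  bool ShouldDrop(const std::string& /*key*/) override {
    if (ctx_.is_full_compaction) ++seen_;
    return false;  // this sketch never drops anything
  }
  int entries_seen() const override { return seen_; }

 private:
  CompactionContext ctx_;
  int seen_ = 0;
};

// One new filter per compaction, so per-run state is never shared.
inline std::unique_ptr<Filter> CreateFilterFor(const CompactionContext& ctx) {
  return std::unique_ptr<Filter>(new CountingFilter(ctx));
}
```

Because each run gets its own object, statistics gathered during a full compaction cannot be mixed with those from a partial one.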
      
      Test Plan: run existing unit tests
      
      Reviewers: haobo, dhruba
      
      Reviewed By: dhruba
      
      CC: xinyaohu, leveldb
      
      Differential Revision: https://reviews.facebook.net/D12057
  3. 13 Aug, 2013 3 commits
  4. 10 Aug, 2013 2 commits
    • Universal Compaction should keep DeleteMarkers unless it is the earliest file. · 93d77a27
      Dhruba Borthakur committed
      Summary:
      The pre-existing code was purging a DeleteMarker if that key did not
      exist in deeper levels.  But in the Universal Compaction Style, all
      files are in Level0. For compaction runs that did not include the
      earliest file, we were erroneously purging the DeleteMarkers.
      
      The fix is to purge DeleteMarkers only if the compaction includes
      the earliest file.
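
The rule in this fix can be expressed as a small predicate. The sketch below assumes, purely for illustration, that files are identified by their position in age order (index 0 newest, last index earliest); the function names are hypothetical and are not taken from the RocksDB source.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// In this sketch, files are identified by their position in age order:
// index 0 is the newest file, index (total_files - 1) is the earliest file.
inline bool CompactionIncludesEarliestFile(
    const std::vector<std::size_t>& input_file_indexes,
    std::size_t total_files) {
  for (std::size_t idx : input_file_indexes) {
    if (idx + 1 == total_files) return true;
  }
  return false;
}

// A delete marker may be purged only when no older file outside the
// compaction could still hold a value for the same key.
inline bool CanPurgeDeleteMarker(
    const std::vector<std::size_t>& input_file_indexes,
    std::size_t total_files) {
  return CompactionIncludesEarliestFile(input_file_indexes, total_files);
}
```

With four files, compacting the two newest keeps the marker, while compacting the two oldest allows it to be purged.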
      
      Test Plan: DBTest.Randomized triggers this code path.
      
      Differential Revision: https://reviews.facebook.net/D12081
    • Fix unit tests for universal compaction (step 2) · 8ae905ed
      Xing Jin committed
      Summary:
      Continue fixing existing unit tests for universal compaction. I have
      tried to apply universal compaction to all unit tests that haven't
      called ChangeOptions(). I left a few which are either apparently not
      applicable to universal compaction (because they check files/keys/values
      at level 1 or above levels), or apparently not related to compaction
      (e.g., open a file, open a db).
      
      I also added a new unit test for universal compaction.
      
      Good news is I didn't see any bugs during this round.
      
      Test Plan: Ran "make all check" yesterday. I have rebased and am rerunning.
      
      Reviewers: haobo, dhruba
      
      Differential Revision: https://reviews.facebook.net/D12135
  5. 09 Aug, 2013 2 commits
  6. 08 Aug, 2013 2 commits
    • Fix unit tests/bugs for universal compaction (first step) · 17b8f786
      Xing Jin committed
      Summary:
      This is the first step to fix unit tests and bugs for universal
      compaction. I added universal compaction option to ChangeOptions(), and
      fixed all unit tests calling ChangeOptions(). Some of these tests
      obviously assume more than 1 level and check file number/values in level
      1 or above levels. I set kSkipUniversalCompaction for these tests.
      
      The major bug I found is that manual compaction with universal compaction never stops. I have put in a fix for it.
      
      I have also set universal compaction as the default compaction and found
      at least 20+ unit tests failing. I haven't looked into the details. The
      next step is to check all unit tests without calling ChangeOptions().
      
      Test Plan: make all check
      
      Reviewers: dhruba, haobo
      
      Differential Revision: https://reviews.facebook.net/D12051
    • Merge branch 'performance' of github.com:facebook/rocksdb into performance · f5fa26b6
      Dhruba Borthakur committed
      Conflicts:
      	db/builder.cc
      	db/db_impl.cc
      	db/version_set.cc
      	include/leveldb/statistics.h
  7. 07 Aug, 2013 3 commits
  8. 06 Aug, 2013 7 commits
    • [RocksDB] [MergeOperator] The new Merge Interface! Uses merge sequences. · c2d7826c
      Deon Nicholas committed
      Summary:
      Here are the major changes to the Merge Interface. It has been expanded
      to handle cases where the MergeOperator is not associative. It does so by stacking
      up merge operations while scanning through the key history (i.e.: during Get() or
      Compaction), until a valid Put/Delete/end-of-history is encountered; it then
      applies all of the merge operations in the correct sequence starting with the
      base/sentinel value.
      
      I have also introduced an "AssociativeMerge" function which allows the user to
      take advantage of associative merge operations (such as in the case of counters).
      The implementation will always attempt to merge the operations/operands themselves
      together when they are encountered, and will resort to the "stacking" method if
      and only if the "associative-merge" fails.
      
      This implementation is conjectured to allow MergeOperator to handle the general
      case, while still providing the user with the ability to take advantage of certain
      efficiencies in their own merge-operator / data-structure.
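
The "stacking" behaviour described above can be sketched in a few lines. This is an illustrative model only, not the MergeOperator interface itself: `Entry`, `ApplyOperand`, and `ResolveKey` are hypothetical names, and the comma-append merge is just an example of a merge function applied in order.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class EntryType { kPut, kDelete, kMerge };

struct Entry {
  EntryType type;
  std::string value;  // operand for kMerge, value for kPut, unused for kDelete
};

// Hypothetical example merge: append the operand after a comma.
inline std::string ApplyOperand(const std::string& base, bool has_base,
                                const std::string& operand) {
  return has_base ? base + "," + operand : operand;
}

// `history` is ordered newest first, as seen while scanning a key's history.
// Returns false if the key resolves to "not found".
inline bool ResolveKey(const std::vector<Entry>& history, std::string* result) {
  std::vector<std::string> stacked;  // merge operands, newest first
  std::string base;
  bool has_base = false;
  for (const Entry& e : history) {
    if (e.type == EntryType::kMerge) {
      stacked.push_back(e.value);
      continue;
    }
    if (e.type == EntryType::kPut) {
      base = e.value;
      has_base = true;
    }
    break;  // a Put or Delete ends the scan
  }
  if (stacked.empty() && !has_base) return false;
  // Apply operands oldest first, starting from the base value (if any).
  for (auto it = stacked.rbegin(); it != stacked.rend(); ++it) {
    base = ApplyOperand(base, has_base, *it);
    has_base = true;
  }
  *result = base;
  return true;
}
```

A history of Merge(b), Merge(a), Put(x), read newest first, resolves to "x,a,b"; a Delete below the merges acts as an empty base.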
      
      NOTE: This is a preliminary diff. This must still go through a lot of review,
      revision, and testing. Feedback welcome!
      
      Test Plan:
        -This is a preliminary diff. I have only just begun testing/debugging it.
        -I will be testing this with the existing MergeOperator use-cases and unit-tests
      (counters, string-append, and redis-lists)
        -I will be "desk-checking" and walking through the code with the help of gdb.
        -I will find a way of stress-testing the new interface / implementation using
      db_bench, db_test, merge_test, and/or db_stress.
        -I will ensure that my tests cover all cases: Get-Memtable,
      Get-Immutable-Memtable, Get-from-Disk, Iterator-Range-Scan, Flush-Memtable-to-L0,
      Compaction-L0-L1, Compaction-Ln-L(n+1), Put/Delete found, Put/Delete not-found,
      end-of-history, end-of-file, etc.
        -A lot of feedback from the reviewers.
      
      Reviewers: haobo, dhruba, zshao, emayanke
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D11499
    • Fix build · 73f9518b
      Mayank Agarwal committed
      Summary: remove reference
      
      Test Plan: make OPT=-g
      
      Reviewers:
      
      CC:
      
      Task ID: #
      
      Blame Rev:
    • Add soft_rate_limit stats · 8e792e58
      Jim Paton committed
      Summary: This diff adds histogram stats for soft_rate_limit stalls. It also renames the old rate_limit stats to hard_rate_limit.
      
      Test Plan: make -j32 check
      
      Reviewers: dhruba, haobo, MarkCallaghan
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D12021
    • Expose base db object from ttl wrapper · 1d7b4765
      Mayank Agarwal committed
      Summary: rocksdb replication will need this when writing value+TS from master to slave 'as is'
      
      Test Plan: make
      
      Reviewers: dhruba, vamsi, haobo
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D11919
    • Add soft and hard rate limit support · 1036537c
      Jim Paton committed
      Summary:
      This diff adds support for both soft and hard rate limiting. The following changes are included:
      
      1) Options.rate_limit is renamed to Options.hard_rate_limit.
      2) Options.rate_limit_delay_milliseconds is renamed to Options.rate_limit_delay_max_milliseconds.
      3) Options.soft_rate_limit is added.
      4) If the maximum compaction score is > hard_rate_limit and rate_limit_delay_max_milliseconds == 0, then writes are delayed by 1 ms at a time until the max compaction score falls below hard_rate_limit.
      5) If the max compaction score is > soft_rate_limit but <= hard_rate_limit, then writes are delayed by 0-1 ms depending on how close we are to hard_rate_limit.
      6) Users can disable 4 by setting hard_rate_limit = 0. They can add a limit to the maximum amount of time waited by setting rate_limit_delay_max_milliseconds > 0. Thus, the old behavior can be preserved by setting soft_rate_limit = 0, which is the default.
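
The interaction between the two limits can be sketched as a single delay function. This is a rough model under stated assumptions: the linear ramp between the soft and hard limits is a guess (the summary only says the delay is 0-1 ms depending on closeness to hard_rate_limit), the handling of a disabled hard limit is an arbitrary choice, and `WriteDelayMicros` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Returns the delay, in microseconds, applied to one write for a given
// compaction score. A limit of 0 means "disabled". The linear ramp between
// the soft and hard limits is an assumption of this sketch.
inline uint64_t WriteDelayMicros(double score, double soft_rate_limit,
                                 double hard_rate_limit) {
  const uint64_t kMaxStepMicros = 1000;  // 1 ms
  if (hard_rate_limit > 0.0 && score > hard_rate_limit) {
    return kMaxStepMicros;  // above the hard limit: full 1 ms step
  }
  if (soft_rate_limit > 0.0 && score > soft_rate_limit) {
    // Assumption: with no usable hard limit above the soft one, use 1 ms.
    if (hard_rate_limit <= soft_rate_limit) return kMaxStepMicros;
    const double fraction =
        (score - soft_rate_limit) / (hard_rate_limit - soft_rate_limit);
    return static_cast<uint64_t>(fraction * kMaxStepMicros);
  }
  return 0;  // below both limits, or soft limit disabled
}
```

With soft_rate_limit = 0 the function never returns a partial delay, which corresponds to the old behavior mentioned in item 6.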
      
      Test Plan:
      make -j32 check
      ./db_stress
      
      Reviewers: dhruba, haobo, MarkCallaghan
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D12003
    • Support user's compaction filter in TTL logic · cacd812f
      Mayank Agarwal committed
      Summary: TTL uses a compaction filter to purge key-values and required the user not to pass one. This diff makes it accommodate the user's compaction filter. Added a test to ttl_test.
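
A combined filter of this kind might look like the sketch below. It is not the code from this diff: the 4-byte trailing timestamp, the order of the two checks (TTL first, then the user's filter), and the names `UserFilter` and `TtlAwareFilter` are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <string>

// Hypothetical user filter: returns true when the entry should be dropped.
using UserFilter =
    std::function<bool(const std::string& key, const std::string& value)>;

// Assumes the stored value ends with a 4-byte creation timestamp (seconds).
// Returns true when the entry should be dropped during compaction.
inline bool TtlAwareFilter(const std::string& key,
                           const std::string& value_with_ts, int32_t now,
                           int32_t ttl, const UserFilter& user_filter) {
  const std::size_t kTsLen = sizeof(int32_t);
  if (value_with_ts.size() < kTsLen) return false;  // malformed: keep it
  int32_t created = 0;
  std::memcpy(&created, value_with_ts.data() + value_with_ts.size() - kTsLen,
              kTsLen);
  // Expired entries are dropped without consulting the user's filter.
  if (ttl > 0 && static_cast<int64_t>(created) + ttl <= now) return true;
  if (!user_filter) return false;  // no user filter supplied
  // The user's filter sees the value without the internal timestamp.
  const std::string bare(value_with_ts, 0, value_with_ts.size() - kTsLen);
  return user_filter(key, bare);
}
```

Stripping the timestamp before delegating keeps the TTL bookkeeping invisible to the user's filter.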
      
      Test Plan: make; ./ttl_test
      
      Reviewers: dhruba, haobo, vamsi
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D11973
    • Changing Makefile to have rocksdb instead of leveldb in binary-names · 7c9093ab
      Mayank Agarwal committed
      Summary: did a find-replace
      
      Test Plan: make
      
      Reviewers: dhruba, haobo
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D11979
  9. 03 Aug, 2013 1 commit
  10. 02 Aug, 2013 2 commits
    • Merge operator for ttl · c42485f6
      Mayank Agarwal committed
      Summary: Implemented a TtlMergeOperator class which inherits from MergeOperator and is TTL aware. It strips out timestamp from existing_value and attaches timestamp to new_value, calling user-provided-Merge in between.
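
The strip, merge, and re-attach sequence can be sketched as a free function. This is a simplified model, not the TtlMergeOperator class: the 4-byte timestamp width and the names `kTsLength`, `UserMerge`, `AppendTimestamp`, and `TtlMerge` are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <string>

const std::size_t kTsLength = sizeof(int32_t);  // assumed timestamp width

// Hypothetical user merge: returns false when the merge fails.
using UserMerge = std::function<bool(const std::string& existing,
                                     const std::string& operand,
                                     std::string* result)>;

// Appends the raw bytes of `ts` to `value`.
inline void AppendTimestamp(std::string* value, int32_t ts) {
  char buf[kTsLength];
  std::memcpy(buf, &ts, kTsLength);
  value->append(buf, kTsLength);
}

// Strips the trailing timestamp from `existing_with_ts`, runs the user's
// merge on the bare value, then attaches `now` to the merged result.
inline bool TtlMerge(const UserMerge& user_merge,
                     const std::string& existing_with_ts,
                     const std::string& operand, int32_t now,
                     std::string* result_with_ts) {
  if (existing_with_ts.size() < kTsLength) return false;  // malformed value
  const std::string bare(existing_with_ts, 0,
                         existing_with_ts.size() - kTsLength);
  std::string merged;
  if (!user_merge(bare, operand, &merged)) return false;
  AppendTimestamp(&merged, now);
  *result_with_ts = merged;
  return true;
}
```

The user's merge only ever sees values without timestamps, so an existing operator can be reused unchanged.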
      
      Test Plan: make all check
      
      Reviewers: haobo, dhruba
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D11775
    • Expand KeyMayExist to return the proper value if it can be found in memory and... · 59d0b02f
      Mayank Agarwal committed
      Expand KeyMayExist to return the proper value if it can be found in memory and also check block_cache
      
      Summary: Removed KeyMayExistImpl because KeyMayExist now demands Get-like semantics. Removed no_io from memtable and imm because we need the proper value now and shouldn't just stop when we see a Merge in the memtable. Added checks to block_cache. Updated documentation and unit tests.
      
      Test Plan: make all check;db_stress for 1 hour
      
      Reviewers: dhruba, haobo
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D11853
  11. 01 Aug, 2013 2 commits
    • Slow down writes gradually rather than suddenly · 9700677a
      Jim Paton committed
      Summary:
      Currently, when a certain number of level0 files (level0_slowdown_writes_trigger) are present, RocksDB will slow down each write by 1ms. There is a second limit of level0 files at which RocksDB will stop writes altogether (level0_stop_writes_trigger).
      
      This patch enables the user to supply a third parameter specifying the number of files at which Rocks will start slowing down writes (level0_start_slowdown_writes). When this number is reached, Rocks will slow down writes as a quadratic function of level0_slowdown_writes_trigger - num_level0_files.
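
A gradual, quadratic slowdown of this kind could be modelled as below. The exact formula is not given in the summary, so this sketch assumes the delay grows from 0 at the start threshold to 1 ms at level0_slowdown_writes_trigger, as the square of how much of that gap has been consumed; `SlowdownDelayMicros` and its parameter names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Returns a per-write delay in microseconds. The delay is 0 below
// `start_slowdown`, 1 ms at or above `slowdown_trigger`, and in between it
// grows quadratically as the gap (slowdown_trigger - num_level0_files)
// shrinks. The exact curve is an assumption of this sketch.
inline uint64_t SlowdownDelayMicros(int num_level0_files, int start_slowdown,
                                    int slowdown_trigger) {
  const uint64_t kMaxDelayMicros = 1000;  // 1 ms
  if (num_level0_files < start_slowdown) return 0;
  if (num_level0_files >= slowdown_trigger ||
      slowdown_trigger <= start_slowdown) {
    return kMaxDelayMicros;
  }
  const double range = slowdown_trigger - start_slowdown;
  const double gap = slowdown_trigger - num_level0_files;
  const double closeness = 1.0 - gap / range;  // 0 at start, 1 at trigger
  return static_cast<uint64_t>(closeness * closeness * kMaxDelayMicros);
}
```

With a start threshold of 4 and a trigger of 8, six level0 files give a quarter of the maximum delay, so early pressure costs little and the penalty rises sharply near the trigger.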
      
      For some workloads, this improves latency and throughput. I will post some stats momentarily in https://our.intern.facebook.com/intern/tasks/?t=2613384.
      
      Test Plan:
      make -j32 check
      ./db_stress
      ./db_bench
      
      Reviewers: dhruba, haobo, MarkCallaghan, xjin
      
      Reviewed By: xjin
      
      CC: leveldb, xjin, zshao
      
      Differential Revision: https://reviews.facebook.net/D11859
    • Make arena block size configurable · 0f0a24e2
      Xing Jin committed
      Summary:
      Add an option for arena block size, default value 4096 bytes. Arena will allocate blocks with such size.
      
      I am not sure about passing parameter to skiplist in the new virtualized framework, though I talked to Jim a bit. So add Jim as reviewer.
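
To show what a configurable block size means for an arena, here is a minimal stand-alone arena. It is a simplified sketch and not the RocksDB Arena class: it ignores alignment, and when a request does not fit it abandons the rest of the current block. Only the 4096-byte default comes from the summary.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

class Arena {
 public:
  // 4096 bytes is the default mentioned in the summary.
  explicit Arena(std::size_t block_size = 4096) : block_size_(block_size) {}

  // Returns `bytes` of memory owned by the arena. Requests larger than a
  // block get a dedicated block of exactly the requested size.
  char* Allocate(std::size_t bytes) {
    assert(bytes > 0);
    if (bytes > remaining_) {
      const std::size_t size = bytes > block_size_ ? bytes : block_size_;
      blocks_.emplace_back(new char[size]);
      next_ = blocks_.back().get();
      remaining_ = size;
    }
    char* result = next_;
    next_ += bytes;
    remaining_ -= bytes;
    return result;
  }

  std::size_t blocks_allocated() const { return blocks_.size(); }

 private:
  std::size_t block_size_;
  char* next_ = nullptr;       // next free byte in the current block
  std::size_t remaining_ = 0;  // free bytes left in the current block
  std::vector<std::unique_ptr<char[]>> blocks_;
};
```

A larger block size means fewer, bigger allocations from the system; a smaller one wastes less memory when the arena holds little data.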
      
      Test Plan:
      new unit test, I am running db_test.
      
      For passing the parameter from the configured option to Arena, I tried tests like:
      
        TEST(DBTest, Arena_Option) {
          std::string dbname = test::TmpDir() + "/db_arena_option_test";
          DestroyDB(dbname, Options());
      
          DB* db = nullptr;
          Options opts;
          opts.create_if_missing = true;
          opts.arena_block_size = 1000000;  // also tested 99 and 999999
          Status s = DB::Open(opts, dbname, &db);
          ASSERT_OK(s);  // stop here if the DB could not be opened
          ASSERT_OK(db->Put(WriteOptions(), "a", "123"));
          delete db;  // close the DB so the test does not leak it
        }
      
      and printed some debug info. The results look good. Any suggestion for such a unit-test?
      
      Reviewers: haobo, dhruba, emayanke, jpaton
      
      Reviewed By: dhruba
      
      CC: leveldb, zshao
      
      Differential Revision: https://reviews.facebook.net/D11799
  12. 30 Jul, 2013 4 commits
    • Fix README contents. · 542cc10b
      Dhruba Borthakur committed
      Summary:
      Fix README contents.
      
      Test Plan:
      
      Reviewers:
      
      CC:
      
      Task ID: #
      
      Blame Rev:
    • Don't use redundant Env::NowMicros() calls · 6db52b52
      Jim Paton committed
      Summary: After my patch for stall histograms, there are redundant calls to NowMicros() by both the stopwatches and DBImpl::MakeRoomForWrites. So I removed the redundant calls; the information is now obtained from the stopwatch.
      
      Test Plan:
      make clean
      make -j32 check
      
      Reviewers: dhruba, haobo, MarkCallaghan
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D11883
    • Use specific DB name in merge_test · abc90b06
      Jim Paton committed
      Summary: Currently, merge_test uses /tmp/testdb for the test database. It should really use something more specific to merge_test. Most of the other tests use test::TmpDir() + "/<test name>db". This patch implements such behavior for merge_test; it makes merge_test use test::TmpDir() + "/merge_testdb"
      
      Test Plan:
      make clean
      make -j32 merge_test
      ./merge_test
      
      Reviewers: dhruba, haobo
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D11877
    • Add stall counts to statistics · 18afff2e
      Jim Paton committed
      Summary: Previously, statistics were kept on how much time is spent on stalls of different types. This patch adds support for keeping the number of stalls of each type. For example, instead of just reporting how many microseconds are spent waiting for memtables to be compacted, it will also report how many times a write stalled for that to occur.
      
      Test Plan:
      make -j32 check
      ./db_stress
      
      # Not really sure what else should be done...
      
      Reviewers: dhruba, MarkCallaghan, haobo
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D11841
  13. 25 Jul, 2013 2 commits
  14. 24 Jul, 2013 4 commits
    • Virtualize SkipList Interface · 52d7ecfc
      Jim Paton committed
      Summary: This diff virtualizes the skiplist interface so that users can provide their own implementation of a backing store for MemTables. Eventually, the backing store will be responsible for its own synchronization, allowing users (and us) to experiment with different lockless implementations.
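
The idea of a pluggable backing store can be illustrated with a small abstract interface. The sketch below is not the interface introduced by this diff: `MemTableBackingStore`, `SetBackingStore`, and this simplified `MemTable` are hypothetical, and std::set stands in for a user-supplied alternative to the skip list.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <set>
#include <string>
#include <utility>

// Hypothetical interface for a memtable's backing store.
class MemTableBackingStore {
 public:
  virtual ~MemTableBackingStore() = default;
  virtual void Insert(const std::string& key) = 0;
  virtual bool Contains(const std::string& key) const = 0;
  virtual std::size_t Size() const = 0;
};

// One possible implementation, backed by std::set instead of a skip list.
class SetBackingStore : public MemTableBackingStore {
 public:
  void Insert(const std::string& key) override { keys_.insert(key); }
  bool Contains(const std::string& key) const override {
    return keys_.count(key) > 0;
  }
  std::size_t Size() const override { return keys_.size(); }

 private:
  std::set<std::string> keys_;
};

// The memtable depends only on the interface, so stores can be swapped.
class MemTable {
 public:
  explicit MemTable(std::unique_ptr<MemTableBackingStore> store)
      : store_(std::move(store)) {}
  void Add(const std::string& key) { store_->Insert(key); }
  bool Has(const std::string& key) const { return store_->Contains(key); }
  std::size_t Count() const { return store_->Size(); }

 private:
  std::unique_ptr<MemTableBackingStore> store_;
};
```

Keeping synchronization behind the interface, as the summary anticipates, would let a lock-free store replace this one without changes to the memtable.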
      
      Test Plan:
      make clean
      make -j32 check
      ./db_stress
      
      Reviewers: dhruba, emayanke, haobo
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D11739
    • If disable wal is set, then batch commits are avoided. · 6fbe4e98
      Dhruba Borthakur committed
      Summary:
      rocksdb uses batch commit to write to the transaction log. But if
      disable wal is set, then writes to the transaction log are avoided
      anyway. In this case, there is not much value in batching, and
      batching can cause unnecessary delays to Puts().
      This patch avoids batching when disableWal is set.
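
The batching decision can be sketched as a function over a queue of pending writes. This is an illustration of the rule only, not the logic in DBImpl: `WriteRequest` and `BatchGroupSize` are hypothetical names, and stopping the group at the first WAL-disabled write is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical pending write; only the WAL flag matters for this sketch.
struct WriteRequest {
  bool disable_wal;
};

// Returns how many requests at the front of `queue` are committed together.
// A write with the WAL disabled is committed on its own, since there is no
// log write to amortize by batching.
inline std::size_t BatchGroupSize(const std::vector<WriteRequest>& queue) {
  if (queue.empty()) return 0;
  if (queue.front().disable_wal) return 1;  // commit alone, no batching delay
  std::size_t n = 1;
  while (n < queue.size() && !queue[n].disable_wal) ++n;
  return n;
}
```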
      
      Test Plan:
      make check.
      
      I am running db_stress now.
      
      Reviewers: haobo
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D11763
    • Adding filter_deletes to crash_tests run in jenkins · f3baeecd
      Mayank Agarwal committed
      Summary: The filter_deletes option introduced in db_stress makes it drop a Delete on a key if KeyMayExist(key) returns false for that key. The code change was simple and tested, so I am not spending reviewers' time on it.
      
      Test Plan: make crash_test; python tools/db_crashtest[1|2].py
      
      CC: dhruba, vamsi
      
      Differential Revision: https://reviews.facebook.net/D11769
    • Use KeyMayExist for WriteBatch-Deletes · bf66c10b
      Mayank Agarwal committed
      Summary:
      Introduced KeyMayExist checking during writebatch-delete and removed from Outer Delete API because it uses writebatch-delete.
      Added code to skip getting Table from disk if not already present in table_cache.
      Some renaming of variables.
      Introduced KeyMayExistImpl, which allows checking from a specified sequence number in GetImpl; this is useful for checking a partially written writebatch.
      Changed KeyMayExist to not be pure virtual and provided a default implementation.
      Expanded unit-tests in db_test to check appropriately.
      Ran db_stress for 1 hour with ./db_stress --max_key=100000 --ops_per_thread=10000000 --delpercent=50 --filter_deletes=1 --statistics=1.
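
The delete-filtering idea can be sketched as follows. This is a stand-alone illustration, not the WriteBatch code: `KeyMayExistFn`, `WriteBatchSketch`, and `MaybeAddDelete` are hypothetical names, and the membership check is assumed to be conservative (it may return true for absent keys but never false for present ones).

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical conservative membership check: may return true for keys that
// are absent, but must never return false for a key that is present.
using KeyMayExistFn = std::function<bool(const std::string& key)>;

// Hypothetical stand-in for a write batch; it only records deletes.
struct WriteBatchSketch {
  std::vector<std::string> deletes;
};

// Adds a Delete for `key` unless filtering is on and the key definitely
// does not exist. Returns true if the Delete was recorded.
inline bool MaybeAddDelete(WriteBatchSketch* batch, const std::string& key,
                           bool filter_deletes,
                           const KeyMayExistFn& key_may_exist) {
  if (filter_deletes && !key_may_exist(key)) {
    return false;  // nothing to delete, so skip writing a marker
  }
  batch->deletes.push_back(key);
  return true;
}
```

Skipping such deletes avoids writing markers that compaction would later have to carry and purge.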
      
      Test Plan: db_stress;make check
      
      Reviewers: dhruba, haobo
      
      Reviewed By: dhruba
      
      CC: leveldb, xjin
      
      Differential Revision: https://reviews.facebook.net/D11745
  15. 23 Jul, 2013 1 commit
  16. 20 Jul, 2013 1 commit