1. 17 Sep, 2016 1 commit
    • Y
      Remove ColumnFamilyData::options() · 0a88f38b
      Yi Wu committed
      Summary: One more small refactor before I split DBOptions into mutable and immutable parts.
      
      Test Plan: existing unit tests.
      
      Reviewers: yhchiang, IslamAbdelRahman, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D64047
  2. 15 Sep, 2016 1 commit
  3. 14 Sep, 2016 1 commit
    • Y
      Refactor MutableCFOptions · 81747f1b
      Yi Wu committed
      Summary:
      * Change constructor of MutableCFOptions to depend only on ColumnFamilyOptions.
      * Move `max_subcompactions`, `compaction_options_fifo` and `compaction_pri` to ImmutableCFOptions to make it clear that they are immutable.
      
      Test Plan: existing unit tests.
      
      Reviewers: yhchiang, IslamAbdelRahman, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D63945
  4. 02 Sep, 2016 1 commit
    • S
      Merge options source_compaction_factor, max_grandparent_overlap_bytes and... · 32149059
      sdong committed
      Merge options source_compaction_factor, max_grandparent_overlap_bytes and expanded_compaction_factor into max_compaction_bytes
      
      Summary: To reduce the number of options, merge source_compaction_factor, max_grandparent_overlap_bytes and expanded_compaction_factor into max_compaction_bytes.
      
      Test Plan: Add two new unit tests. Run all existing tests, including jtest.
      
      Reviewers: yhchiang, igor, IslamAbdelRahman
      
      Reviewed By: IslamAbdelRahman
      
      Subscribers: leveldb, andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D59829
  5. 03 Aug, 2016 2 commits
    • Y
      Ignore write stall triggers when auto-compaction is disabled · ee027fc1
      Yi Wu committed
      Summary:
      My understanding is that the purpose of write stall triggers is to wait for auto-compaction to catch up. Without auto-compaction, we don't need to stall writes.
      
      Also with this diff, flush/compaction conditions are recalculated on dynamic option change. Previously the conditions were recalculated only when write stall options were changed.
      
      Test Plan: See the new test. Removed two tests that are no longer valid.
      
      Reviewers: IslamAbdelRahman, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D61437
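
The rule described in this summary can be sketched as follows. This is a minimal illustration, not RocksDB's implementation: the `StallInputs` struct, the `WriteStall` enum and `ComputeL0WriteStall` are hypothetical names, and only the L0 file-count triggers are modeled.

```cpp
// Hypothetical, simplified inputs to the stall decision. The field names
// mirror RocksDB option names, but this struct is not a RocksDB type.
struct StallInputs {
  bool disable_auto_compactions;
  int l0_file_count;
  int level0_slowdown_writes_trigger;
  int level0_stop_writes_trigger;
};

enum class WriteStall { kNone, kDelayed, kStopped };

// Stall triggers exist to let auto-compaction catch up, so they are
// ignored entirely when auto-compaction is disabled.
WriteStall ComputeL0WriteStall(const StallInputs& in) {
  if (in.disable_auto_compactions) {
    return WriteStall::kNone;
  }
  if (in.l0_file_count >= in.level0_stop_writes_trigger) {
    return WriteStall::kStopped;
  }
  if (in.l0_file_count >= in.level0_slowdown_writes_trigger) {
    return WriteStall::kDelayed;
  }
  return WriteStall::kNone;
}
```

Because the decision depends on `disable_auto_compactions` as well as the triggers, it has to be re-evaluated whenever any dynamic option changes, which matches the second paragraph of the summary.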
    • J
      Add a GetComparator() function to the ColumnFamilyHandle base class so that... · cdc4eb68
      Jay Edgar committed
      Add a GetComparator() function to the ColumnFamilyHandle base class so that the user's comparator can be retrieved.
      
      Summary: MyRocks is adding support for the use of the SstFileWriter, which needs a comparator. It would be more convenient to get the comparator from the column family (which already has to have it) than to have the caller keep track of it.
      
      Test Plan: Standard tests (adding one for the new method)
      
      Reviewers: IslamAbdelRahman, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D61155
  6. 06 Jul, 2016 1 commit
  7. 18 Jun, 2016 1 commit
    • S
      Deprecate filter_deletes · 7b79238b
      sdong committed
      Summary: filter_deletes is not a frequently used feature. Remove it.
      
      Test Plan: Run all test suites.
      
      Reviewers: igor, yhchiang, IslamAbdelRahman
      
      Reviewed By: IslamAbdelRahman
      
      Subscribers: leveldb, andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D59427
  8. 11 Jun, 2016 1 commit
    • S
      memtable_prefix_bloom_bits -> memtable_prefix_bloom_bits_ratio and deprecate... · 20699df8
      sdong committed
      memtable_prefix_bloom_bits -> memtable_prefix_bloom_bits_ratio and deprecate memtable_prefix_bloom_probes
      
      Summary:
      memtable_prefix_bloom_probes is not a critical option. Remove it to reduce the number of options.
      It's easy for users to make mistakes with memtable_prefix_bloom_bits, so turn it into memtable_prefix_bloom_bits_ratio.
      
      Test Plan: Run all existing tests
      
      Reviewers: yhchiang, igor, IslamAbdelRahman
      
      Reviewed By: IslamAbdelRahman
      
      Subscribers: gunnarku, yoshinorim, MarkCallaghan, leveldb, andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D59199
  9. 03 May, 2016 1 commit
    • W
      Fix #1110, 32-bit build failure on Mac OSX (#1112) · b8cf9130
      Warren Falk committed
      Using explicit 64-bit type in conditional in platforms above 32-bits
      This appears to be necessary on Mac OSX as std::conditional does not appear to short circuit and evaluates the third template arg
      Making the third template arg be 64 bits explicitly works around this problem and will work on both 32 bit and 64+ bit platforms.
  10. 15 Mar, 2016 1 commit
  11. 10 Feb, 2016 1 commit
  12. 06 Feb, 2016 1 commit
    • S
      Explicitly fail when memtable doesn't support concurrent insert · b1887c5d
      sdong committed
      Summary: If users turn on concurrent insert but the memtable doesn't support it, they might see an unexpected crash. Fix it by failing explicitly.
      
      Test Plan:
      Run different setting of stress_test and make sure it fails correctly.
      Will add a unit test too.
      
      Reviewers: anthony, kradhakrishnan, IslamAbdelRahman, yhchiang, andrewkr, ngbronson
      
      Reviewed By: ngbronson
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D53895
  13. 30 Jan, 2016 1 commit
    • V
      Add options.base_background_compactions as a number of compaction threads for low compaction debt · 3b2a1ddd
      Venkatesh Radhakrishnan committed
      Summary:
      If options.base_background_compactions is given, we try to keep the number of scheduled compactions from exceeding this number. Only when the number of L0 files increases to a certain number, or pending compaction bytes exceed a certain threshold, do we schedule compactions based on options.max_background_compactions.
      
      The watermarks are calculated based on slowdown thresholds.
      
      Test Plan:
      Add new test cases in column_family_test.
      Adding more unit tests.
      
      Reviewers: IslamAbdelRahman, yhchiang, kradhakrishnan, rven, anthony
      
      Reviewed By: anthony
      
      Subscribers: leveldb, dhruba, yoshinorim
      
      Differential Revision: https://reviews.facebook.net/D53409
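
A rough sketch of the scheduling rule from this summary is shown below. The types `CompactionDebt` and `Watermarks` and the function `AllowedBackgroundCompactions` are hypothetical. The summary only says the watermarks are derived from the slowdown thresholds, so they are taken here as plain inputs, and treating a non-positive or out-of-range base value as "not given" is an assumption.

```cpp
#include <cstdint>

// Hypothetical snapshot of compaction debt for one column family.
struct CompactionDebt {
  int l0_file_count;
  uint64_t pending_compaction_bytes;
};

// Hypothetical watermarks; how they are derived from the slowdown
// thresholds is not modeled here.
struct Watermarks {
  int l0_files;
  uint64_t pending_bytes;
};

// Returns how many background compactions may be scheduled.
int AllowedBackgroundCompactions(int base_background_compactions,
                                 int max_background_compactions,
                                 const CompactionDebt& debt,
                                 const Watermarks& wm) {
  // Assumption: a non-positive or too-large base means "option not given".
  if (base_background_compactions <= 0 ||
      base_background_compactions > max_background_compactions) {
    return max_background_compactions;
  }
  const bool high_debt = debt.l0_file_count >= wm.l0_files ||
                         debt.pending_compaction_bytes >= wm.pending_bytes;
  // Low debt: stay at the base level. High debt: use the full budget.
  return high_debt ? max_background_compactions
                   : base_background_compactions;
}
```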
  14. 29 Jan, 2016 1 commit
  15. 07 Jan, 2016 1 commit
    • Y
      Add ColumnFamilyHandle::GetDescriptor() · 6935eb24
      Yueh-Hsuan Chiang committed
      Summary:
      This patch adds ColumnFamilyHandle::GetDescriptor(), which allows
      developers to obtain the CF options and names of the associated column
      family given its handle.
      
        // Returns the up-to-date descriptor used by the current handle.  Since it
        // returns the up-to-date information, this call might internally locks
        // and releases DB mutex to access the up-to-date CF options.
        virtual ColumnFamilyDescriptor GetDescriptor() = 0;
      
      Test Plan: augment column_family_test
      
      Reviewers: sdong, yoshinorim, IslamAbdelRahman, rven, kradhakrishnan, anthony
      
      Reviewed By: anthony
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D51543
  16. 29 Dec, 2015 1 commit
  17. 26 Dec, 2015 1 commit
    • N
      support for concurrent adds to memtable · 7d87f027
      Nathan Bronson committed
      Summary:
      This diff adds support for concurrent adds to the skiplist memtable
      implementations.  Memory allocation is made thread-safe by the addition of
      a spinlock, with small per-core buffers to avoid contention.  Concurrent
      memtable writes are made via an additional method and don't impose a
      performance overhead on the non-concurrent case, so parallelism can be
      selected on a per-batch basis.
      
      Write thread synchronization is an increasing bottleneck for higher levels
      of concurrency, so this diff adds --enable_write_thread_adaptive_yield
      (default off).  This feature causes threads joining a write batch
      group to spin for a short time (default 100 usec) using sched_yield,
      rather than going to sleep on a mutex.  If the timing of the yield calls
      indicates that another thread has actually run during the yield then
      spinning is avoided.  This option improves performance for concurrent
      situations even without parallel adds, although it has the potential to
      increase CPU usage (and the heuristic adaptation is not yet mature).
      
      Parallel writes are not currently compatible with
      inplace updates, update callbacks, or delete filtering.
      Enable it with --allow_concurrent_memtable_write (and
      --enable_write_thread_adaptive_yield).  Parallel memtable writes
      are performance neutral when there is no actual parallelism, and in
      my experiments (SSD server-class Linux and varying contention and key
      sizes for fillrandom) they are always a performance win when there is
      more than one thread.
      
      Statistics are updated earlier in the write path, dropping the number
      of DB mutex acquisitions from 2 to 1 for almost all cases.
      
      This diff was motivated and inspired by Yahoo's cLSM work.  It is more
      conservative than cLSM: RocksDB's write batch group leader role is
      preserved (along with all of the existing flush and write throttling
      logic) and concurrent writers are blocked until all memtable insertions
      have completed and the sequence number has been advanced, to preserve
      linearizability.
      
      My test config is "db_bench -benchmarks=fillrandom -threads=$T
      -batch_size=1 -memtablerep=skip_list -value_size=100 --num=1000000/$T
      -level0_slowdown_writes_trigger=9999 -level0_stop_writes_trigger=9999
      -disable_auto_compactions --max_write_buffer_number=8
      -max_background_flushes=8 --disable_wal --write_buffer_size=160000000
      --block_size=16384 --allow_concurrent_memtable_write" on a two-socket
      Xeon E5-2660 @ 2.2Ghz with lots of memory and an SSD hard drive.  With 1
      thread I get ~440Kops/sec.  Peak performance for 1 socket (numactl
      -N1) is slightly more than 1Mops/sec, at 16 threads.  Peak performance
      across both sockets happens at 30 threads, and is ~900Kops/sec, although
      with fewer threads there is less performance loss when the system has
      background work.
      
      Test Plan:
      1. concurrent stress tests for InlineSkipList and DynamicBloom
      2. make clean; make check
      3. make clean; DISABLE_JEMALLOC=1 make valgrind_check; valgrind db_bench
      4. make clean; COMPILE_WITH_TSAN=1 make all check; db_bench
      5. make clean; COMPILE_WITH_ASAN=1 make all check; db_bench
      6. make clean; OPT=-DROCKSDB_LITE make check
      7. verify no perf regressions when disabled
      
      Reviewers: igor, sdong
      
      Reviewed By: sdong
      
      Subscribers: MarkCallaghan, IslamAbdelRahman, anthony, yhchiang, rven, sdong, guyg8, kradhakrishnan, dhruba
      
      Differential Revision: https://reviews.facebook.net/D50589
  18. 24 Dec, 2015 1 commit
    • S
      When slowdown is triggered, reduce the write rate · b9f77ba1
      sdong committed
      Summary: It's usually hard for users to set a value of options.delayed_write_rate. With this diff, after slowdown condition triggers, we greedily reduce write rate if estimated pending compaction bytes increase. If estimated compaction pending bytes drop, we increase the write rate.
      
      Test Plan:
      Add a unit test
      Test with db_bench setting:
      TEST_TMPDIR=/dev/shm/ ./db_bench --benchmarks=fillrandom -num=10000000 --soft_pending_compaction_bytes_limit=1000000000 --hard_pending_compaction_bytes_limit=3000000000 --delayed_write_rate=100000000
      
      and make sure without the commit, write stop will happen, but with the commit, it will not happen.
      
      Reviewers: igor, anthony, rven, yhchiang, kradhakrishnan, IslamAbdelRahman
      
      Reviewed By: IslamAbdelRahman
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D52131
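
The greedy adjustment described in this summary might look roughly like the sketch below. The function name `AdjustDelayedWriteRate`, the 4/5 and 5/4 step factors and the clamping bounds are all assumptions made for illustration. The summary does not state the actual factors.

```cpp
#include <algorithm>
#include <cstdint>

// Returns a new write rate (bytes per second) given how the estimated
// pending compaction bytes changed since the previous check.
// The step factors below are illustrative only.
uint64_t AdjustDelayedWriteRate(uint64_t current_rate,
                                uint64_t prev_pending_bytes,
                                uint64_t cur_pending_bytes,
                                uint64_t min_rate, uint64_t max_rate) {
  uint64_t rate = current_rate;
  if (cur_pending_bytes > prev_pending_bytes) {
    rate = rate * 4 / 5;  // compaction debt is growing: slow writes down
  } else if (cur_pending_bytes < prev_pending_bytes) {
    rate = rate * 5 / 4;  // compaction debt is shrinking: speed writes up
  }
  // Keep the rate within the caller-supplied bounds.
  return std::max(min_rate, std::min(max_rate, rate));
}
```

The point of the design is that users no longer need to pick an exact options.delayed_write_rate up front. The configured value acts as a starting point that the feedback loop then corrects.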
  19. 18 Dec, 2015 1 commit
    • S
      Slowdown when writing to the last write buffer · d72b3177
      sdong committed
      Summary: Now if inserting to mem table is much faster than writing to files, there is no mechanism users can rely on to avoid stopping for reaching options.max_write_buffer_number. With the commit, if there are more than four maximum write buffers configured, we slow down to the rate of options.delayed_write_rate when we reach the last one.
      
      Test Plan:
      1. Add a new unit test.
      2. Run db_bench with
      
      ./db_bench --benchmarks=fillrandom --num=10000000 --max_background_flushes=6 --batch_size=32 -max_write_buffer_number=4 --delayed_write_rate=500000 --statistics
      
      based on hard drive and see stopping is avoided with the commit.
      
      Reviewers: yhchiang, IslamAbdelRahman, anthony, rven, kradhakrishnan, igor
      
      Reviewed By: igor
      
      Subscribers: MarkCallaghan, leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D52047
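
The condition in this summary can be sketched as below. `MemtableStall` and `ComputeMemtableStall` are hypothetical names. The boundary constant follows the summary's wording ("more than four"), and both that exact comparison and the way unflushed memtables are counted are assumptions that may differ from the real code.

```cpp
enum class MemtableStall { kNone, kDelayed, kStopped };

// Follows the wording of the summary; the real boundary may differ.
constexpr int kMinBuffersForSlowdown = 4;

// unflushed_memtables: write buffers currently filled and not yet flushed
// (counting convention is an assumption of this sketch).
MemtableStall ComputeMemtableStall(int unflushed_memtables,
                                   int max_write_buffer_number) {
  if (unflushed_memtables >= max_write_buffer_number) {
    return MemtableStall::kStopped;  // no write buffer left
  }
  if (max_write_buffer_number > kMinBuffersForSlowdown &&
      unflushed_memtables >= max_write_buffer_number - 1) {
    return MemtableStall::kDelayed;  // writing into the last buffer
  }
  return MemtableStall::kNone;
}
```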
  20. 15 Dec, 2015 1 commit
    • V
      Running manual compactions in parallel with other automatic or manual... · 030215bf
      Venkatesh Radhakrishnan committed
      Running manual compactions in parallel with other automatic or manual compactions in restricted cases
      
      Summary:
      This diff provides a framework for doing manual
      compactions in parallel with other compactions. We now have a deque of manual compactions. We also pass manual compactions as an argument from RunManualCompactions down to
      BackgroundCompactions, so that RunManualCompactions can be reentrant.
      Parallelism is controlled by the two routines
      ConflictingManualCompaction to allow/disallow new parallel/manual
      compactions based on already existing ManualCompactions. In this diff, by default manual compactions still have to run exclusive of other compactions. However, by setting the compaction option exclusive_manual_compaction to false, it is possible to run other compactions in parallel with a manual compaction. We are still restricted to one manual compaction per column family at a time. All of these restrictions will be relaxed in future diffs.
      I will be adding more tests later.
      
      Test Plan: Rocksdb regression + new tests + valgrind
      
      Reviewers: igor, anthony, IslamAbdelRahman, kradhakrishnan, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: yoshinorim, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D47973
  21. 10 Dec, 2015 1 commit
    • S
      Deprecate options.soft_rate_limit and add options.soft_pending_compaction_bytes_limit · 56e77f09
      sdong committed
      Summary: Deprecate options.soft_rate_limit, which is hard to tune, in favor of options.soft_pending_compaction_bytes_limit, which triggers the slowdown if estimated pending compaction bytes exceed the threshold. The hope is to make it more straightforward to tune.
      
      Test Plan: Modify DBTest.SoftLimit to cover options.soft_pending_compaction_bytes_limit instead; run all unit tests.
      
      Reviewers: IslamAbdelRahman, yhchiang, rven, kradhakrishnan, igor, anthony
      
      Reviewed By: anthony
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D51117
  22. 17 Nov, 2015 1 commit
    • S
      UniversalCompactionPicker::PickCompaction(): avoid forming compactions if there is no file · dac5b248
      sdong committed
      Summary:
      Currently RocksDB may break in lines like this:
      
      for (size_t i = sorted_runs.size() - 1; i >= first_index_after; i--) {
      
      if options.level0_file_num_compaction_trigger=0.
      
      Fix it by not executing the logic of picking compactions if there is no file (sorted_runs.size() = 0). Also internally set options.level0_file_num_compaction_trigger=1 if users give a 0. 0 is a value that makes no sense in RocksDB.
      
      Test Plan: Run all tests. Will add a unit test too.
      
      Reviewers: yhchiang, IslamAbdelRahman, anthony, kradhakrishnan, rven
      
      Reviewed By: rven
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D50727
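
The hazard in the quoted loop is unsigned wraparound: with an empty container, `size() - 1` on a `size_t` becomes the largest representable value. The sketch below shows the wrap and one common way to write a reverse loop that avoids it. `WrappedLastIndex` and `CountBackward` are hypothetical helpers, and this is a different remedy from the commit's own fix, which skips the picking logic when there are no files.

```cpp
#include <cstddef>
#include <vector>

// What the expression `size - 1` evaluates to in size_t arithmetic.
// For size == 0 it wraps around to the maximum size_t value.
std::size_t WrappedLastIndex(std::size_t size) { return size - 1; }

// Visits indices in [first_index_after, size) from high to low and
// returns how many were visited. The loop variable stays one past the
// index being visited, so it never has to go below zero.
std::size_t CountBackward(const std::vector<int>& sorted_runs,
                          std::size_t first_index_after) {
  std::size_t visited = 0;
  for (std::size_t i = sorted_runs.size(); i > first_index_after; --i) {
    const std::size_t idx = i - 1;  // safe: i > first_index_after >= 0
    (void)sorted_runs[idx];         // stand-in for real per-run work
    ++visited;
  }
  return visited;
}
```

Note that a condition of the form `i >= first_index_after` is always true for an unsigned `i` when `first_index_after` is 0, so guarding only the empty case is not enough in general.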
  23. 17 Oct, 2015 1 commit
  24. 15 Sep, 2015 2 commits
    • S
      Add options.hard_pending_compaction_bytes_limit to stop writes if compaction is lagging behind · 5de807ac
      sdong committed
      Summary: Add an option to stop writes if compaction falls behind. If estimated pending compaction bytes exceed the threshold specified by options.hard_pending_compaction_bytes_limit, writes will stop until compactions bring them back under the threshold.
      
      Test Plan: Add unit test DBTest.HardLimit
      
      Reviewers: rven, kradhakrishnan, anthony, IslamAbdelRahman, yhchiang, igor
      
      Reviewed By: igor
      
      Subscribers: MarkCallaghan, leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D45999
    • A
      Add counters for L0 stall while L0-L1 compaction is taking place · 03ddce9a
      Ari Ekmekji committed
      Summary:
      Although there are currently counters to keep track of the
      stall caused by having too many L0 files, they do not distinguish
      whether, when that stall occurs, (A) L0-L1 compaction is taking
      place to try to mitigate it, or (B) no L0-L1 compaction has been scheduled
      at the moment. This diff adds a counter for (A) so that the nature of L0
      stalls can be better understood.
      
      Test Plan: make all && make check
      
      Reviewers: sdong, igor, anthony, noetzli, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: MarkCallaghan, dhruba
      
      Differential Revision: https://reviews.facebook.net/D46749
  25. 09 Sep, 2015 1 commit
    • A
      better tuning of arena block size · b5b2b75e
      agiardullo committed
      Summary: Currently, if users didn't set options.arena_block_size, we set "result.arena_block_size = result.write_buffer_size / 10". This makes result.arena_block_size not a multiple of 4KB, even if options.write_buffer_size is a multiple of MBs. When calling malloc for arena_block_size, we may waste a small amount of memory. We now make the default /8 or /16 and align it to 4KB.
      
      Test Plan: unit tests
      
      Reviewers: sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D46467
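
The arithmetic behind this change can be sketched as follows. `DefaultArenaBlockSize` is a hypothetical helper, the divisor is a parameter because the summary mentions both /8 and /16, and rounding up (rather than down) to the 4KB boundary is an assumption.

```cpp
#include <cstddef>

// Derives a default arena block size from the write buffer size and
// rounds it up to the next multiple of 4KB.
std::size_t DefaultArenaBlockSize(std::size_t write_buffer_size,
                                  std::size_t divisor = 8) {
  const std::size_t kAlign = 4 * 1024;
  const std::size_t block = write_buffer_size / divisor;
  return ((block + kAlign - 1) / kAlign) * kAlign;
}
```

For a 4MB write buffer (4194304 bytes), the old /10 rule gives 419430 bytes, which is not a multiple of 4096, while /8 gives 524288 bytes, exactly 128 pages.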
  26. 21 Aug, 2015 1 commit
  27. 20 Aug, 2015 1 commit
    • Y
      Introduce GetIntProperty("rocksdb.size-all-mem-tables") · df79eafc
      Yueh-Hsuan Chiang committed
      Summary:
      Currently, GetIntProperty("rocksdb.cur-size-all-mem-tables") only returns
      the memory usage by those memtables which have not yet been flushed.
      
      This patch introduces GetIntProperty("rocksdb.size-all-mem-tables"),
      which includes the memory usage by all the memtables, including those
      that have been flushed but are pinned by iterators.
      
      Test Plan: Added a test in db_test
      
      Reviewers: igor, anthony, IslamAbdelRahman, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D44229
  28. 18 Jul, 2015 1 commit
    • I
      Don't let flushes preempt compactions · 35ca5936
      Igor Canadi committed
      Summary:
      When we first started, max_background_flushes was 0 by default and compaction thread was executing flushes (since there was no flush thread). Then, we switched the default max_background_flushes to 1. However, we still support the case where there is no flush thread and flushes are done in compaction. This is making our code a bit more complicated. By not supporting this use-case we can make our code simpler.
      
      We have a special case that when you set max_background_flushes to 0, we
      schedule the flush to execute on the compaction thread.
      
      Test Plan: make check (there might be some unit tests that depend on this behavior)
      
      Reviewers: IslamAbdelRahman, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D41931
  29. 08 Jul, 2015 1 commit
  30. 19 Jun, 2015 2 commits
    • I
      Fail DB::Open() when the requested compression is not available · 760e9a94
      Igor Canadi committed
      Summary:
      Currently RocksDB silently ignores this issue and doesn't compress the data. Based on discussion, we agree that this is pretty bad because it can cause confusion for our users.
      
      This patch fails DB::Open() if we don't support the compression that is specified in the options.
      
      Test Plan: make check with LZ4 not present. If Snappy is not present all tests will just fail because Snappy is our default library. We should make Snappy the requirement, since without it our default DB::Open() fails.
      
      Reviewers: sdong, MarkCallaghan, rven, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D39687
    • I
      Don't dump DBOptions for each column family · 4b8bb62f
      Igor Canadi committed
      Summary: Currently we dump DBOptions for each column family options we dump. This leads to duplicate lines in our LOG file. This diff fixes that.
      
      Test Plan: Check out the LOG
      
      Reviewers: sdong, rven, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: IslamAbdelRahman, yoshinorim, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D39729
  31. 12 Jun, 2015 1 commit
    • S
      Slow down writes by bytes written · 7842920b
      sdong committed
      Summary:
      We slow down data into the database to the rate of options.delayed_write_rate (a new option) with this patch.
      
      The thread synchronization approach I take is to still synchronize the write controller by the DB mutex, with GetDelay() called inside the DB mutex. I try to minimize the frequency of getting the time in GetDelay(). I verified it through db_bench and it seems to work.
      
      hard_rate_limit is deprecated.
      
      options.delayed_write_rate is still not dynamically changeable. Need to work on it as a follow-up.
      
      Test Plan: Add new unit tests in db_test
      
      Reviewers: yhchiang, rven, kradhakrishnan, anthony, MarkCallaghan, igor
      
      Reviewed By: igor
      
      Subscribers: ikabiljo, leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D36351
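
The basic idea of delaying by bytes written can be sketched as a single conversion from a write's size to a sleep time at the target rate. `DelayMicrosForBytes` is a hypothetical helper and is not the write controller's GetDelay(), which also has to account for elapsed time and mutex usage. Treating a zero rate as "no delay" is an assumption.

```cpp
#include <cstdint>

// Microseconds a writer should wait so that num_bytes are admitted at
// delayed_write_rate bytes per second.
// Note: the multiplication can overflow for extremely large num_bytes.
uint64_t DelayMicrosForBytes(uint64_t num_bytes,
                             uint64_t delayed_write_rate) {
  if (delayed_write_rate == 0) {
    return 0;  // assumption: a zero rate disables delaying
  }
  return num_bytes * 1000000 / delayed_write_rate;
}
```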
  32. 03 Jun, 2015 1 commit
    • Y
      Allow EventListener::OnCompactionCompleted to return CompactionJobStats. · fe5c6321
      Yueh-Hsuan Chiang committed
      Summary:
      Allow EventListener::OnCompactionCompleted to return CompactionJobStats,
      which contains useful information about a compaction.
      
      Example CompactionJobStats returned by OnCompactionCompleted():
          smallest_output_key_prefix 05000000
          largest_output_key_prefix 06990000
          elapsed_time 42419
          num_input_records 300
          num_input_files 3
          num_input_files_at_output_level 2
          num_output_records 200
          num_output_files 1
          actual_bytes_input 167200
          actual_bytes_output 110688
          total_input_raw_key_bytes 5400
          total_input_raw_value_bytes 300000
          num_records_replaced 100
          is_manual_compaction 1
      
      Test Plan: Developed a mega test in db_test which covers 20 variables in CompactionJobStats.
      
      Reviewers: rven, igor, anthony, sdong
      
      Reviewed By: sdong
      
      Subscribers: tnovak, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D38463
  33. 30 May, 2015 1 commit
    • A
      Optimistic Transactions · dc9d70de
      agiardullo committed
      Summary: Optimistic transactions supporting begin/commit/rollback semantics.  Currently relies on checking the memtable to determine if there are any collisions at commit time.  Not yet implemented would be a way of ensuring the memtable has some minimum amount of history so that we won't fail to commit when the memtable is empty.  You should probably start with transaction.h to get an overview of what is currently supported.
      
      Test Plan: Added a new test, but still need to look into stress testing.
      
      Reviewers: yhchiang, igor, rven, sdong
      
      Reviewed By: sdong
      
      Subscribers: adamretter, MarkCallaghan, leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D33435
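
To illustrate commit-time conflict checking in general, here is a toy model. It is not the API in transaction.h: `ToyStore` and its nested `Txn` are hypothetical, and an in-memory map of per-key sequence numbers stands in for the memtable lookup mentioned in the summary.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Toy key-value store with optimistic transactions.
class ToyStore {
 public:
  class Txn {
   public:
    // Buffers the write locally; nothing is visible until Commit().
    void Put(const std::string& k, const std::string& v) { writes_[k] = v; }

   private:
    friend class ToyStore;
    explicit Txn(uint64_t snapshot_seq) : snapshot_seq_(snapshot_seq) {}
    uint64_t snapshot_seq_;
    std::map<std::string, std::string> writes_;
  };

  // Begin: remember the sequence number at transaction start.
  Txn Begin() const { return Txn(seq_); }

  // Non-transactional write; bumps the global sequence number.
  void Put(const std::string& k, const std::string& v) {
    data_[k] = std::make_pair(v, ++seq_);
  }

  // Commit succeeds only if no key in the write set was modified after
  // the transaction began. Rollback is simply discarding the Txn.
  bool Commit(const Txn& txn) {
    for (const auto& kv : txn.writes_) {
      auto it = data_.find(kv.first);
      if (it != data_.end() && it->second.second > txn.snapshot_seq_) {
        return false;  // write conflict detected at commit time
      }
    }
    for (const auto& kv : txn.writes_) {
      Put(kv.first, kv.second);
    }
    return true;
  }

  bool Get(const std::string& k, std::string* v) const {
    auto it = data_.find(k);
    if (it == data_.end()) return false;
    *v = it->second.first;
    return true;
  }

 private:
  uint64_t seq_ = 0;
  std::map<std::string, std::pair<std::string, uint64_t>> data_;
};
```

The toy keeps every key's latest sequence number forever, so it can always decide whether a conflict happened. The summary's concern about memtable history is exactly the case where that information has been discarded by a flush.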
  34. 29 May, 2015 2 commits
    • A
      Support saving history in memtable_list · c8153510
      agiardullo committed
      Summary:
      For transactions, we are using the memtables to validate that there are no write conflicts.  But after flushing, we don't have any memtables, and transactions could fail to commit.  So we want to somehow keep around some extra history to use for conflict checking.  In addition, we want to provide a way to increase the size of this history if too many transactions fail to commit.
      
      After chatting with people, it seems like everyone prefers just using Memtables to store this history (instead of a separate history structure).  It seems like the best place for this is abstracted inside the memtable_list.  I decided to create a separate list in MemtableListVersion as using the same list complicated the flush/installalflushresults logic too much.
      
      This diff adds a new parameter to control how much memtable history to keep around after flushing.  However, it sounds like people aren't too fond of adding new parameters.  So I am making the default size of flushed+not-flushed memtables be set to max_write_buffers.  This should not change the maximum amount of memory used, but makes it more likely that we use closer to the limit.  (We are now postponing deleting flushed memtables until the max_write_buffer limit is reached).  So while we might use more memory on average, we are still obeying the limit set (and you could argue it's better to go ahead and use up memory now instead of waiting for a write stall to happen to test this limit).
      
      However, if people are opposed to this default behavior, we can easily set it to 0 and require this parameter be set in order to use transactions.
      
      Test Plan: Added a xfunc test to play around with setting different values of this parameter in all tests.  Added testing in memtablelist_test and planning on adding more testing here.
      
      Reviewers: sdong, rven, igor
      
      Reviewed By: igor
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D37443
    • Y
      [API Change] Move listeners from ColumnFamilyOptions to DBOptions · 672dda9b
      Yueh-Hsuan Chiang committed
      Summary: Move listeners from ColumnFamilyOptions to DBOptions
      
      Test Plan:
      listener_test
      compact_files_test
      
      Reviewers: rven, anthony, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D39087
  35. 23 May, 2015 1 commit
  36. 19 May, 2015 1 commit