1. 26 Aug, 2015 5 commits
    • D
      Address noexcept and const integer lambda capture · 6924d758
      Dmitri Smirnov committed
        VS 2013 does not support noexcept.
        It also complains that an integer constant used within a lambda requires an explicit capture.
      6924d758
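The two workarounds named in this commit can be sketched as follows. This is a minimal illustration, not the code from the commit: the macro name `SKETCH_NOEXCEPT`, the `BufferSketch` type, and `MakeAdder()` are all hypothetical.

```cpp
#include <cstddef>
#include <functional>
#include <utility>

// Hypothetical macro: expands to nothing on compilers older than
// Visual Studio 2015 (_MSC_VER 1900), which lack the noexcept keyword.
#if defined(_MSC_VER) && _MSC_VER < 1900
#define SKETCH_NOEXCEPT
#else
#define SKETCH_NOEXCEPT noexcept
#endif

// Illustrative type whose move constructor is noexcept where supported.
struct BufferSketch {
  std::size_t size;
  BufferSketch() : size(0) {}
  BufferSketch(BufferSketch&& other) SKETCH_NOEXCEPT : size(other.size) {
    other.size = 0;
  }
};

// Returns a callable that adds a constant to its argument.
std::function<int(int)> MakeAdder() {
  const int kDelta = 8;
  // Standard C++ lets a lambda read kDelta without capturing it, because
  // it is a const integer initialized with a constant expression.
  // VS 2013 rejects that, so the capture is spelled out explicitly.
  return [kDelta](int x) { return x + kDelta; };
}
```

The explicit capture is valid on every conforming compiler, so it costs nothing on the other platforms.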
    • I
      Add throttling to multi-threaded backups · 53b88784
      Igor Canadi committed
      Summary: See internal task t8056182
      
      Test Plan: Added multi-threading in RateLimiter test
      
      Reviewers: benj, AaronFeldman
      
      Reviewed By: AaronFeldman
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D45459
      53b88784
    • A
      Fix compact_files_example · 09d982f9
      Andres Notzli committed
      Summary:
      See task #7983654. The example was triggering an assert in compaction job
      because the compaction was not marked as manual. With this patch,
      CompactionPicker::FormCompaction() marks compactions as manual. This patch
      also fixes a couple of typos, adds optimistic_transaction_example to
      .gitignore and librocksdb as a dependency for examples. Adding librocksdb as
      a dependency makes sure that the examples are built with the latest changes
      in librocksdb.
      
      Test Plan: make clean && cd examples && make all && ./compact_files_example
      
      Reviewers: rven, sdong, anthony, igor, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D45117
      09d982f9
    • Y
      Expose per-level aggregated table properties via GetProperty() · 6996de87
      Yueh-Hsuan Chiang committed
      Summary:
      This patch adds "rocksdb.aggregated-table-properties"
      and "rocksdb.aggregated-table-properties-at-levelN". The former
      returns the aggregated table properties of a column family,
      while the latter returns the aggregated table properties
      of the specified level N.
      
      Test Plan: Added tests in db_test
      
      Reviewers: igor, sdong, IslamAbdelRahman, anthony
      
      Reviewed By: anthony
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D45087
      6996de87
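A small sketch of how a caller might build the per-level property name described in this commit. The two property strings come from the commit message; the constant names and the helper `AggregatedTablePropertiesAtLevel()` are hypothetical.

```cpp
#include <string>

// Property names as given in the commit message above.
const std::string kAggregatedTableProperties =
    "rocksdb.aggregated-table-properties";
const std::string kAggregatedTablePropertiesAtLevel =
    "rocksdb.aggregated-table-properties-at-level";

// Hypothetical helper: appends the level number N to the per-level prefix.
std::string AggregatedTablePropertiesAtLevel(int level) {
  return kAggregatedTablePropertiesAtLevel + std::to_string(level);
}
```

The resulting string would then be passed to `DB::GetProperty()` together with a `std::string*` that receives the value.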
    • A
      Fix Windows build · 86d6c3cd
      agiardullo committed
      Summary: wrong filename
      
      Test Plan: none
      
      Reviewers: sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D45531
      86d6c3cd
  2. 25 Aug, 2015 4 commits
    • A
      Common base class for transactions · 20d1e547
      agiardullo committed
      Summary:
      As I keep adding new features to transactions, I keep creating more duplicate code.  This diff cleans this up by creating a base implementation class for Transaction and OptimisticTransaction to inherit from.
      
      The code in TransactionBase.h/.cc is all just copied from elsewhere.  The only entertaining part of this class worth looking at is the virtual TryLock method which allows OptimisticTransactions and Transactions to share the same common code for Put/Get/etc.
      
      The rest of this diff is mostly red and easy on the eyes.
      
      Test Plan: No functionality change.  existing tests pass.
      
      Reviewers: sdong, jkedgar, rven, igor
      
      Reviewed By: igor
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D45135
      20d1e547
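The shape described in this commit, a shared base class whose Put/Get logic calls a virtual TryLock, might look roughly like the sketch below. All class names, members, and the in-memory lock table are simplified assumptions for illustration, not the actual RocksDB declarations.

```cpp
#include <map>
#include <set>
#include <string>

// Simplified stand-in for the shared base class.
class TransactionBaseSketch {
 public:
  virtual ~TransactionBaseSketch() {}

  // Shared Put(): the only step that differs between subclasses is how
  // the key is protected, which is delegated to TryLock().
  bool Put(const std::string& key, const std::string& value) {
    if (!TryLock(key)) return false;
    writes_[key] = value;
    return true;
  }

  const std::map<std::string, std::string>& writes() const { return writes_; }

 protected:
  virtual bool TryLock(const std::string& key) = 0;

 private:
  std::map<std::string, std::string> writes_;
};

// Pessimistic flavor: takes the lock now and fails if another
// transaction already holds it.
class PessimisticSketch : public TransactionBaseSketch {
 public:
  explicit PessimisticSketch(std::set<std::string>* lock_table)
      : lock_table_(lock_table) {}

 protected:
  bool TryLock(const std::string& key) override {
    if (owned_.count(key)) return true;  // already locked by us
    if (!lock_table_->insert(key).second) return false;
    owned_.insert(key);
    return true;
  }

 private:
  std::set<std::string>* lock_table_;  // shared, hypothetical lock table
  std::set<std::string> owned_;
};

// Optimistic flavor: never fails here, only records the key so that it
// could be validated at commit time.
class OptimisticSketch : public TransactionBaseSketch {
 public:
  const std::set<std::string>& tracked() const { return tracked_; }

 protected:
  bool TryLock(const std::string& key) override {
    tracked_.insert(key);
    return true;
  }

 private:
  std::set<std::string> tracked_;
};
```

With this split, the write path lives in one place and each transaction type overrides only the locking policy.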
    • A
      Fixing race condition in DBTest.DynamicMemtableOptions · 20508329
      Andres Noetzli committed
      Summary:
      This patch fixes a race condition in DBTest.DynamicMemtableOptions. In rare cases,
      it was possible that the main thread would fill up both memtables before the flush
      job acquired its work. Then, the flush job was flushing both memtables together,
      producing only one L0 file while the test expected two. Now, the test waits for
      flushes to finish earlier, to make sure that the memtables are flushed in separate
      flush jobs.
      
      Test Plan:
      Insert "usleep(10000);" after "IOSTATS_SET_THREAD_POOL_ID(Env::Priority::HIGH);" in BGWorkFlush()
      to make the issue more likely. Then test with:
      make db_test && time while ./db_test --gtest_filter=*DynamicMemtableOptions; do true; done
      
      Reviewers: rven, sdong, yhchiang, anthony, igor
      
      Reviewed By: igor
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D45429
      20508329
    • I
      Remove an extra 's' from cur-size-all-mem-tabless · e46bcc08
      Igor Canadi committed
      Summary: As title
      
      Test Plan: make check
      
      Reviewers: yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D45447
      e46bcc08
    • I
      Smarter purging during flush · 4ab26c5a
      Igor Canadi committed
      Summary:
      Currently, we only purge duplicate keys and deletions during flush if `earliest_seqno_in_memtable <= newest_snapshot`. This means that the newest snapshot happened before we first created the memtable. This is almost never true for MyRocks and MongoRocks.
      
      This patch makes purging during flush able to understand snapshots. The main logic is copied from compaction_job.cc, although the logic over there is much more complicated and extensive. However, we should try to merge the common functionality at some point.
      
      I need this patch to implement no_overwrite_i_promise functionality for flush. We'll also need this to support SingleDelete() during Flush(). @yoshinorim requested the feature.
      
      Test Plan:
      make check
      I had to adjust some unit tests to understand this new behavior
      
      Reviewers: yhchiang, yoshinorim, anthony, sdong, noetzli
      
      Reviewed By: noetzli
      
      Subscribers: yoshinorim, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D42087
      4ab26c5a
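The idea of snapshot-aware purging can be illustrated with a small sketch. It covers only the dropping of older duplicates of a key, not deletions, and it is not the code from flush or compaction_job.cc: the `EntrySketch` type and both function names are hypothetical. An older version of a key is dropped only when a newer version of the same key is visible to exactly the same set of snapshots.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>
#include <string>
#include <vector>

struct EntrySketch {
  std::string key;
  std::uint64_t seqno;
};

// Smallest snapshot that can see `seqno` (a snapshot s sees every entry
// with seqno <= s), or max() if only the latest view can see it.
// `snapshots` must be sorted in ascending order.
std::uint64_t EarliestVisibleSnapshot(
    std::uint64_t seqno, const std::vector<std::uint64_t>& snapshots) {
  auto it = std::lower_bound(snapshots.begin(), snapshots.end(), seqno);
  return it == snapshots.end() ? std::numeric_limits<std::uint64_t>::max()
                               : *it;
}

// `entries` must be sorted by key ascending, then by seqno descending.
// Returns the entries that some snapshot (or the latest view) still needs.
std::vector<EntrySketch> PurgeHidden(
    const std::vector<EntrySketch>& entries,
    const std::vector<std::uint64_t>& snapshots) {
  std::vector<EntrySketch> kept;
  bool have_prev = false;
  std::string prev_key;
  std::uint64_t prev_visible = 0;
  for (const EntrySketch& e : entries) {
    const std::uint64_t visible = EarliestVisibleSnapshot(e.seqno, snapshots);
    // A newer entry for the same key in the same snapshot stripe hides
    // this one from every reader, so it can be dropped.
    if (have_prev && e.key == prev_key && visible == prev_visible) continue;
    kept.push_back(e);
    have_prev = true;
    prev_key = e.key;
    prev_visible = visible;
  }
  return kept;
}
```

With no snapshots at all, this reduces to keeping only the newest version of each key.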
  3. 23 Aug, 2015 1 commit
    • M
      Fix benchmark report script · 4c81ac0c
      Mark Callaghan committed
      Summary:
      db_bench output now displays Percentile many times with --statistics after
      read IO latency histograms were added. So I only need the last one in the report output.
      
      Task ID: #
      
      Blame Rev:
      
      Test Plan:
      run run_flash_bench.sh
      
      Revert Plan:
      
      Database Impact:
      
      Memcache Impact:
      
      Other Notes:
      
      EImportant:
      
      - begin *PUBLIC* platform impact section -
      Bugzilla: #
      - end platform impact -
      
      Reviewers: igor
      
      Reviewed By: igor
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D45093
      4c81ac0c
  4. 22 Aug, 2015 2 commits
  5. 21 Aug, 2015 11 commits
    • S
      Add options.new_table_reader_for_compaction_inputs · 9130873a
      sdong committed
      Summary: Currently, compaction inputs share the same file descriptor and table reader as other foreground threads, which makes fadvise behave less predictably. Add options.new_table_reader_for_compaction_inputs to force the creation of a new file descriptor and a new table reader for compaction inputs.
      
      Test Plan: Add the option.
      
      Reviewers: rven, anthony, kradhakrishnan, IslamAbdelRahman, igor, yhchiang
      
      Reviewed By: igor
      
      Subscribers: igor, MarkCallaghan, leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D43311
      9130873a
    • S
      Add a counter about estimated pending compaction bytes · 07d2d341
      sdong committed
      Summary:
      Add a counter of the estimated bytes the DB needs to compact for all the compactions to finish. Expose it as a DB Property.
      In the future, we can use a threshold on this counter to replace the soft rate limit and hard rate limit. A single threshold of estimated compaction debt in bytes will be easier for users to reason about, when deciding when to slow down and when to stop, than the more abstract soft and hard rate limits.
      
      Test Plan: Add unit tests
      
      Reviewers: IslamAbdelRahman, yhchiang, rven, kradhakrishnan, anthony, igor
      
      Reviewed By: igor
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D44205
      07d2d341
    • M
      Improve defaults for benchmarks · 41a0e281
      Mark Callaghan committed
      Summary:
      Changes include:
      * don't sync-on-commit for single writer thread in readwhile... tests
      * make default block size 8kb rather than 4kb to avoid too small blocks after compression
      * use snappy instead of zlib to avoid stalls from compression latency
      * disable statistics
      * use bytes_per_sync=8M to reduce throughput loss on disk
      * use open_files=-1 to reduce mutex contention
      
      Task ID: #
      
      Blame Rev:
      
      Test Plan:
      run benchmark
      
      Revert Plan:
      
      Database Impact:
      
      Memcache Impact:
      
      Other Notes:
      
      EImportant:
      
      - begin *PUBLIC* platform impact section -
      Bugzilla: #
      - end platform impact -
      
      Reviewers: igor
      
      Reviewed By: igor
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D44961
      41a0e281
    • Y
      Fixed a rare deadlock in DBTest.ThreadStatusFlush · a203b913
      Yueh-Hsuan Chiang committed
      Summary:
      Currently, ThreadStatusFlush uses two sync-points to ensure
      there's a flush currently running when calling GetThreadList().
      However, one of the sync-points is inside the db-mutex, which could
      cause a deadlock in case there's a DB::Get() call.
      
      This patch fixes this issue by moving the sync-point to a better
      place where the flush job does not hold the mutex.
      
      Test Plan: db_test
      
      Reviewers: igor, sdong, anthony, IslamAbdelRahman
      
      Reviewed By: IslamAbdelRahman
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D45045
      a203b913
    • S
      Merge pull request #695 from yuslepukhin/address_windows_build · 962aa642
      Siying Dong committed
      Address windows build issues caused by introducing Subcompaction
      962aa642
    • D
      More indent adjustment. · 5bf89076
      Dmitri Smirnov committed
      5bf89076
    • D
      Adjust indent · e2a9f43d
      Dmitri Smirnov committed
      e2a9f43d
    • D
      Merge branch 'address_windows_build' of https://github.com/yuslepukhin/rocksdb... · 6e9a260b
      Dmitri Smirnov committed
      Merge branch 'address_windows_build' of https://github.com/yuslepukhin/rocksdb into address_windows_build
      6e9a260b
    • D
      Address windows build issues · 1cac89c9
      Dmitri Smirnov committed
       Intro SubCompactionState move functionality
       =delete copy functionality
       #ifdef SyncPoint in tests for Windows Release builds
      1cac89c9
    • D
      Address windows build issues · f25f06dd
      Dmitri Smirnov committed
        Intro SubCompactionState move functionality
        =delete copy functionality
        #ifdef SyncPoint in tests for Windows Release builds
      f25f06dd
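The "move functionality" and "=delete copy functionality" items can be pictured with the sketch below. `SubStateSketch` is a hypothetical stand-in, not the real SubCompactionState. The move operations are written out by hand on the assumption that the target is VS 2013, which neither generates them implicitly nor accepts `= default` for them.

```cpp
#include <string>
#include <utility>
#include <vector>

// Illustrative stand-in for a per-subcompaction state object.
struct SubStateSketch {
  std::vector<std::string> outputs;

  SubStateSketch() {}

  // Hand-written move constructor and move assignment.
  SubStateSketch(SubStateSketch&& other)
      : outputs(std::move(other.outputs)) {}
  SubStateSketch& operator=(SubStateSketch&& other) {
    outputs = std::move(other.outputs);
    return *this;
  }

  // Copying is disabled so that state is never duplicated by accident.
  SubStateSketch(const SubStateSketch&) = delete;
  SubStateSketch& operator=(const SubStateSketch&) = delete;
};
```

A type like this can still be stored in a `std::vector`, as long as elements are added with `std::move` or constructed in place.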
    • I
      Total SST files size DB Property · 027ca5b2
      Islam AbdelRahman committed
      Summary: Add a new DB property that calculates the total size of files used by all RocksDB Versions
      
      Test Plan: Unittests for the new property
      
      Reviewers: igor, yhchiang, anthony, rven, kradhakrishnan, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D44799
      027ca5b2
  6. 20 Aug, 2015 5 commits
  7. 19 Aug, 2015 7 commits
    • A
      Removing variables used only in assertions to prevent build error · 137c3766
      Ari Ekmekji committed
      Summary:
      A couple of variables were declared but only used in assertions,
      which causes issues when building in fbcode.
      
      Test Plan: make dbg  and   make release
      
      Reviewers: yhchiang, sdong, igor, anthony, MarkCallaghan
      
      Reviewed By: MarkCallaghan
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D44937
      137c3766
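The general problem behind this kind of fix can be sketched as follows, assuming the build treats unused-variable warnings as errors (the commit does not say which flags fbcode uses). The function `SumOutputSizes()` is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper that sums output file sizes.
long SumOutputSizes(const std::vector<long>& sizes) {
  long total = 0;
  for (long s : sizes) total += s;

  // Pattern that can break a release build:
  //   const bool non_negative = total >= 0;
  //   assert(non_negative);
  // With NDEBUG defined the assert expands to nothing, so `non_negative`
  // is never read and a build using -Werror=unused-variable rejects it.

  // Fix used here: test the condition inside the assert itself, so no
  // assertion-only variable is left behind.
  assert(total >= 0);
  return total;
}
```

Another common option is to keep the variable and mark it as used with `(void)non_negative;`.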
    • A
      Bounding Number of Subcompactions · b47cc585
      Ari Ekmekji committed
      Summary:
      In D43239 (https://reviews.facebook.net/D43239) the number
      of subcompactions is set based on the number of L1 files with
      unique starting keys. In certain cases when this number is very large
      this causes issues, particularly with the overlap between files since
      very small output files can be generated. This diff bounds the number
      of subcompactions to the user option DBOption.num_subcompactions.
      
      Test Plan: ./db_test ./db_compaction_test
      
      Reviewers: sdong, igor, anthony, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D44883
      b47cc585
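The bound described in this commit amounts to taking a minimum. A minimal sketch, with a hypothetical function name and with the floor of one subcompaction added as an assumption:

```cpp
#include <algorithm>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Number of subcompactions: one per unique L1 starting key, capped by the
// user-supplied limit. The lower bound of 1 is an assumption of this sketch.
std::size_t NumSubcompactionsSketch(
    const std::vector<std::string>& l1_start_keys,
    std::size_t max_subcompactions) {
  const std::set<std::string> unique_keys(l1_start_keys.begin(),
                                          l1_start_keys.end());
  const std::size_t wanted = std::max<std::size_t>(1, unique_keys.size());
  const std::size_t limit = std::max<std::size_t>(1, max_subcompactions);
  return std::min(wanted, limit);
}
```

For five files with four distinct starting keys and a limit of 2, this yields 2 rather than 4.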
    • V
      Make tailing iterator show new entries in memtable. · e58e1b18
      Venkatesh Radhakrishnan committed
      Summary:
      Reseek mutable_iter if it is invalid in Next and immutable_iter
      is invalid.
      
      Test Plan: DBTestTailingIterator.TailingIteratorSeekToNext
      
      Reviewers: tnovak, march, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D44865
      e58e1b18
    • Y
      DBOptions serialization and deserialization · 9ec95715
      Yueh-Hsuan Chiang committed
      Summary:
      This patch implements DBOptions deserialization and improves
      the current implementation of DBOptions serialization by
      using a static structure that stores the offset of each
      DBOptions member variable to perform serialization and
      deserialization, instead of using tons of if-then branches
      to determine the mapping between strings and variables.
      
      Test Plan: Added test in options_test.cc
      
      Reviewers: igor, anthony, sdong, IslamAbdelRahman
      
      Reviewed By: sdong
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D44097
      9ec95715
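The offset-table technique described in this commit can be sketched on a toy options struct. Everything here is an illustrative assumption: `OptionsSketch`, `OptInfo`, `kOptTable`, and the two functions are not the names used in RocksDB, and only `bool` and `int` members are handled.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Toy options struct standing in for DBOptions.
struct OptionsSketch {
  bool create_if_missing = false;
  int max_open_files = 5000;
};

enum class OptType { kBool, kInt };

struct OptInfo {
  std::size_t offset;  // byte offset of the member inside OptionsSketch
  OptType type;        // how to parse and print the member
};

// Static table mapping option names to member offsets and types.
static const std::unordered_map<std::string, OptInfo> kOptTable = {
    {"create_if_missing",
     {offsetof(OptionsSketch, create_if_missing), OptType::kBool}},
    {"max_open_files",
     {offsetof(OptionsSketch, max_open_files), OptType::kInt}},
};

// Deserialization: parse `value` into the member named `name`.
// std::stoi throws on malformed integers; a real parser would handle that.
bool SetOption(OptionsSketch* opts, const std::string& name,
               const std::string& value) {
  auto it = kOptTable.find(name);
  if (it == kOptTable.end()) return false;
  char* addr = reinterpret_cast<char*>(opts) + it->second.offset;
  switch (it->second.type) {
    case OptType::kBool:
      *reinterpret_cast<bool*>(addr) = (value == "true");
      return true;
    case OptType::kInt:
      *reinterpret_cast<int*>(addr) = std::stoi(value);
      return true;
  }
  return false;
}

// Serialization: print the member named `name` into `value`.
bool GetOption(const OptionsSketch& opts, const std::string& name,
               std::string* value) {
  auto it = kOptTable.find(name);
  if (it == kOptTable.end()) return false;
  const char* addr =
      reinterpret_cast<const char*>(&opts) + it->second.offset;
  switch (it->second.type) {
    case OptType::kBool:
      *value = *reinterpret_cast<const bool*>(addr) ? "true" : "false";
      return true;
    case OptType::kInt:
      *value = std::to_string(*reinterpret_cast<const int*>(addr));
      return true;
  }
  return false;
}
```

Adding an option then means adding one table row instead of another branch in both the serializer and the deserializer.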
    • Y
      Make HashCuckooRep::ApproximateMemoryUsage() return reasonable estimation. · b2df20a8
      Yueh-Hsuan Chiang committed
      Summary:
      HashCuckooRep::ApproximateMemoryUsage() previously returned
      std::numeric_limits<size_t>::max() when it could not accept more
      entries.  This patch makes it return a more reasonable estimation.
      
      This change is necessary in order to make GetIntProperty("rocksdb.cur-size-all-mem-tables")
      handle HashCuckooRep properly in diff https://reviews.facebook.net/D44229.
      
      Test Plan: db_test
      
      Reviewers: sdong, anthony, IslamAbdelRahman, igor
      
      Reviewed By: igor
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D44241
      b2df20a8
    • A
      Fixing Failed Assertion in Subcompaction State Diff · 601b1aac
      Ari Ekmekji committed
      Summary:
      In D43239 (https://reviews.facebook.net/D43239) there is an
      assertion to make sure a subcompaction's output is never empty at the
      end of execution. This assertion however breaks the build because some
      tests lead to exactly that scenario. So instead I have altered the logic
      to handle this case instead of just failing the assertion.
      
      The reason that it is possible for a subcompaction's output to be empty is
      that during a sequential execution of subcompactions, if a user aborts the
      compaction job then some of the later subcompactions to be executed may
      have yet to process any keys and therefore have yet to generate output files.
      This becomes very rare once the subcompactions are executed in parallel,
      but for now they are still sequential so the case is possible when there is an
      early termination, as in some of the tests.
      
      Test Plan: ./db_test  ./db_compaction_test
      
      Reviewers: sdong, igor, anthony, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D44877
      601b1aac
    • A
      [Parallel L0-L1 Compaction Prep]: Giving Subcompactions Their Own State · f0da6977
      Ari Ekmekji committed
      Summary:
      In preparation for running multiple threads at the same time during
      a compaction job, this patch assigns each subcompaction its own state
      (instead of sharing the one global CompactionState). Each subcompaction then
      uses this state to update its statistics, keep track of its snapshots, etc.
      during the course of execution. Then at the end of all the executions the
      statistics are aggregated across the subcompactions so that the final result
      is the same as if only one larger compaction had run.
      
      Test Plan: ./db_test  ./db_compaction_test  ./compaction_job_test
      
      Reviewers: sdong, anthony, igor, noetzli, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: MarkCallaghan, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D43239
      f0da6977
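The per-subcompaction state and the final aggregation step can be pictured with a short sketch. The struct and its three counters are hypothetical; the commit does not list which statistics are tracked.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical per-subcompaction statistics.
struct SubcompactionStatsSketch {
  std::uint64_t num_input_records = 0;
  std::uint64_t num_output_records = 0;
  std::uint64_t bytes_written = 0;
};

// Sums the statistics of all subcompactions so that the result matches
// what a single larger compaction would have reported.
SubcompactionStatsSketch AggregateStats(
    const std::vector<SubcompactionStatsSketch>& parts) {
  SubcompactionStatsSketch total;
  for (const SubcompactionStatsSketch& p : parts) {
    total.num_input_records += p.num_input_records;
    total.num_output_records += p.num_output_records;
    total.bytes_written += p.bytes_written;
  }
  return total;
}
```

Because each subcompaction writes only to its own struct, no locking is needed until the aggregation, which runs once after all of them have finished.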
  8. 18 Aug, 2015 1 commit
    • A
      Simplify querying of merge results · f32a5720
      Andres Notzli committed
      Summary:
      While working on supporting mixing merge operators with
      single deletes ( https://reviews.facebook.net/D43179 ),
      I realized that returning and dealing with merge results
      can be made simpler. Submitting this as a separate diff
      because it is not directly related to single deletes.
      
      Before, callers of merge helper had to retrieve the merge
      result in one of two ways depending on whether the merge
      was successful or not (success = result of merge was single
      kTypeValue). For successful merges, the caller could query
      the resulting key/value pair and for unsuccessful merges,
      the result could be retrieved in the form of two deques of
      keys and values. However, with single deletes, a successful merge
      does not return a single key/value pair (if merge
      operands are merged with a single delete, we have to generate
      a value and keep the original single delete around to make
      sure that we are not accidentally producing a key overwrite).
      In addition, the two existing call sites of the merge
      helper were taking the same actions independently from whether
      the merge was successful or not, so this patch simplifies that.
      
      Test Plan: make clean all check
      
      Reviewers: rven, sdong, yhchiang, anthony, igor
      
      Reviewed By: igor
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D43353
      f32a5720
  9. 15 Aug, 2015 2 commits
    • S
      Measure file read latency histogram per level · 72613657
      sdong committed
      Summary: In internal stats, record a read latency histogram per level when statistics are enabled. It can be retrieved through DB::GetProperty() with the "rocksdb.dbstats" property.
      
      Test Plan: Manually run db_bench and prints out "rocksdb.dbstats" by hand and make sure it prints out as expected
      
      Reviewers: igor, IslamAbdelRahman, rven, kradhakrishnan, anthony, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: MarkCallaghan, leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D44193
      72613657
    • N
      reduce db mutex contention for write batch groups · b7198c3a
      Nathan Bronson committed
      Summary:
      This diff allows a Writer to join the next write batch group
      without acquiring any locks. Waiting is performed via a per-Writer mutex,
      so all of the non-leader writers never need to acquire the db mutex.
      It is now possible to join a write batch group after the leader has been
      chosen but before the batch has been constructed. This diff doesn't
      increase parallelism, but reduces synchronization overheads.
      
      For some CPU-bound workloads (no WAL, RAM-sized working set) this can
      substantially reduce contention on the db mutex in a multi-threaded
      environment.  With T=8 N=500000 in a CPU-bound scenario (see the test
      plan) this is good for a 33% perf win.  Not all scenarios see such a
      win, but none show a loss.  This code is slightly faster even for the
      single-threaded case (about 2% for the CPU-bound scenario below).
      
      Test Plan:
      1. unit tests
      2. COMPILE_WITH_TSAN=1 make check
      3. stress high-contention scenarios with db_bench -benchmarks=fillrandom -threads=$T -batch_size=1 -memtablerep=skip_list -value_size=0 --num=$N -level0_slowdown_writes_trigger=9999 -level0_stop_writes_trigger=9999 -disable_auto_compactions --max_write_buffer_number=8 -max_background_flushes=8 --disable_wal --write_buffer_size=160000000
      
      Reviewers: sdong, igor, rven, ljin, yhchiang
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D43887
      b7198c3a
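Joining a group without taking a lock is typically done with a compare-and-swap on an atomic list head. The sketch below shows that general technique only. `WriterSketch` and `JoinGroup()` are hypothetical names, and the per-Writer mutex used for waiting is left out.

```cpp
#include <atomic>

// Hypothetical writer node; `older` links to the writer that joined before.
struct WriterSketch {
  int id;
  WriterSketch* older;
  explicit WriterSketch(int i) : id(i), older(nullptr) {}
};

// Pushes `w` onto the list headed by `newest` using compare-and-swap.
// Returns true if the list was empty, meaning `w` would act as the leader
// of the next group; otherwise `w` is a follower.
bool JoinGroup(std::atomic<WriterSketch*>* newest, WriterSketch* w) {
  WriterSketch* head = newest->load(std::memory_order_relaxed);
  do {
    // On a failed exchange, `head` is refreshed with the current value,
    // so the link is rebuilt before the next attempt.
    w->older = head;
  } while (!newest->compare_exchange_weak(head, w,
                                          std::memory_order_release,
                                          std::memory_order_relaxed));
  return head == nullptr;
}
```

A thread that gets `false` back never touches a shared mutex in this step, which is the property the commit message attributes to non-leader writers.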
  10. 14 Aug, 2015 2 commits
    • S
      Add options.compaction_measure_io_stats to print write I/O stats in compactions · 603b6da8
      sdong committed
      Summary:
      Add options.compaction_measure_io_stats to print out / pass to listener accumulated time spent on write calls. Example outputs in info logs:
      
      2015/08/12-16:27:59.463944 7fd428bff700 (Original Log Time 2015/08/12-16:27:59.463922) EVENT_LOG_v1 {"time_micros": 1439422079463897, "job": 6, "event": "compaction_finished", "output_level": 1, "num_output_files": 4, "total_output_size": 6900525, "num_input_records": 111483, "num_output_records": 106877, "file_write_nanos": 15663206, "file_range_sync_nanos": 649588, "file_fsync_nanos": 349614797, "file_prepare_write_nanos": 1505812, "lsm_state": [2, 4, 0, 0, 0, 0, 0]}
      
      Add two more counters in iostats_context.
      
      Also add a parameter of db_bench.
      
      Test Plan: Add a unit test. Also manually verify LOG outputs in db_bench
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D44115
      603b6da8
    • I
      Change master to 3.14 · dc9d5634
      Islam AbdelRahman committed
      Summary: Change master version to 3.14
      
      Test Plan: simple change
      
      Reviewers: sdong, yhchiang, kradhakrishnan, rven, anthony, igor
      
      Reviewed By: igor
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D44187
      dc9d5634