1. 01 July 2013 (2 commits)
    • Reduce write amplification by merging files in L0 back into L0 · 47c4191f
      Committed by Dhruba Borthakur
      Summary:
      There is a new option called hybrid_mode which, when switched on,
      causes HBase-style compactions: files from L0 are compacted back
      into L0. The meat of this compaction algorithm is in
      PickCompactionHybrid().

      All files reside in L0, which means all files have overlapping
      keys. Each file is time-bounded, i.e. it contains a range of keys
      that were inserted around the same time; the start-seqno and
      end-seqno mark the timeframe in which those keys were inserted.
      Files with contiguous seqno ranges are compacted together into a
      larger file. All files are ordered from most recent to oldest.

      The current compaction algorithm starts looking for candidate
      files from the most recent file. It keeps adding files to the
      same compaction run as long as the total size of the files chosen
      so far is smaller than the size of the next candidate file. This
      logic still needs to be debated and validated.

      The above logic should reduce write amplification to a large
      extent; numbers will be published shortly.
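
      A minimal sketch of that file-picking rule, assuming a hypothetical
      FileMetaData struct with a file_size field and a vector of L0 files
      already ordered from most recent to oldest (the real logic lives in
      PickCompactionHybrid() and differs in detail):

      #include <cstddef>
      #include <cstdint>
      #include <vector>

      struct FileMetaData {     // hypothetical stand-in for the real metadata
        uint64_t file_size;
      };

      // 'files' is ordered from most recent to oldest.
      std::vector<size_t> PickHybridCandidates(const std::vector<FileMetaData>& files) {
        std::vector<size_t> picked;
        uint64_t total = 0;
        for (size_t i = 0; i < files.size(); ++i) {
          // Stop once the files accumulated so far are no longer smaller
          // than the size of the next candidate file.
          if (!picked.empty() && total >= files[i].file_size) break;
          picked.push_back(i);
          total += files[i].file_size;
        }
        return picked;
      }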
      
      Test Plan: db_stress runs for 6 hours with no data corruption (tested so far).
      
      Differential Revision: https://reviews.facebook.net/D11289
    • Reduce write amplification by merging files in L0 back into L0 · 554c06dd
      Committed by Dhruba Borthakur
      Summary:
      There is a new option called hybrid_mode which, when switched on,
      causes HBase-style compactions: files from L0 are compacted back
      into L0. The meat of this compaction algorithm is in
      PickCompactionHybrid().

      All files reside in L0, which means all files have overlapping
      keys. Each file is time-bounded, i.e. it contains a range of keys
      that were inserted around the same time; the start-seqno and
      end-seqno mark the timeframe in which those keys were inserted.
      Files with contiguous seqno ranges are compacted together into a
      larger file. All files are ordered from most recent to oldest.

      The current compaction algorithm starts looking for candidate
      files from the most recent file. It keeps adding files to the
      same compaction run as long as the total size of the files chosen
      so far is smaller than the size of the next candidate file. This
      logic still needs to be debated and validated.

      The above logic should reduce write amplification to a large
      extent; numbers will be published shortly.
      
      Test Plan: db_stress runs for 6 hours with no data corruption (tested so far).
      
      Differential Revision: https://reviews.facebook.net/D11289
  2. 20 June 2013 (1 commit)
  3. 18 June 2013 (1 commit)
  4. 13 June 2013 (1 commit)
    • [RocksDB] cleanup EnvOptions · bdf10859
      Committed by Haobo Xu
      Summary:
      This diff simplifies EnvOptions by treating it as POD, similar to Options.
      - virtual functions are removed and member fields are accessed directly.
      - StorageOptions is removed.
      - Options.allow_readahead and Options.allow_readahead_compactions are deprecated.
      - Unused global variables are removed: useOsBuffer, useFsReadAhead, useMmapRead, useMmapWrite
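
      For illustration, a POD-style options struct looks like the sketch below:
      public data members with plain defaults and no virtual accessors (the field
      names here echo the removed globals and are not the exact EnvOptions fields):

      struct EnvOptions {
        bool use_os_buffer = true;
        bool use_readahead_for_compactions = false;
        bool use_mmap_reads = false;
        bool use_mmap_writes = true;
      };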
      
      Test Plan: make check; db_stress
      
      Reviewers: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D11175
  5. 24 May 2013 (1 commit)
  6. 22 May 2013 (2 commits)
    • [Kill randomly at various points in source code for testing] · 760dd475
      Committed by Vamsi Ponnekanti
      Summary:
      This is the initial version. A few ways in which this could
      be extended in the future are:
      (a) Killing from more places in the source code.
      (b) Hashing the stack and using that hash in determining whether to crash,
          to avoid crashing more often at source lines that are executed
          more often.
      (c) Raising exceptions or returning errors instead of killing.
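
      A minimal sketch of the kind of kill point this adds, assuming a global
      odds value set by the crash-test harness (the names below are illustrative,
      not the actual macro or flag):

      #include <cstdlib>

      extern int test_kill_odds;   // illustrative; 0 disables killing

      // Called at interesting points in the write path: with probability
      // 1/test_kill_odds the process dies on the spot, with no flush and no
      // clean shutdown, so the recovery path has to cope.
      inline void MaybeKill() {
        if (test_kill_odds > 0 && std::rand() % test_kill_odds == 0) {
          std::abort();
        }
      }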
      
      Test Plan:
      This whole thing is for testing.
      
      Here is part of output:
      
      python2.7 tools/db_crashtest2.py -d 600
      Running db_stress
      
      db_stress retncode -15 output LevelDB version     : 1.5
      Number of threads   : 32
      Ops per thread      : 10000000
      Read percentage     : 50
      Write-buffer-size   : 4194304
      Delete percentage   : 30
      Max key             : 1000
      Ratio #ops/#keys    : 320000
      Num times DB reopens: 0
      Batches/snapshots   : 1
      Purge redundant %   : 50
      Num keys per lock   : 4
      Compression         : snappy
      ------------------------------------------------
      No lock creation because test_batches_snapshots set
      2013/04/26-17:55:17  Starting database operations
      Created bg thread 0x7fc1f07ff700
      ... finished 60000 ops
      Running db_stress
      
      db_stress retncode -15 output LevelDB version     : 1.5
      Number of threads   : 32
      Ops per thread      : 10000000
      Read percentage     : 50
      Write-buffer-size   : 4194304
      Delete percentage   : 30
      Max key             : 1000
      Ratio #ops/#keys    : 320000
      Num times DB reopens: 0
      Batches/snapshots   : 1
      Purge redundant %   : 50
      Num keys per lock   : 4
      Compression         : snappy
      ------------------------------------------------
      Created bg thread 0x7ff0137ff700
      No lock creation because test_batches_snapshots set
      2013/04/26-17:56:15  Starting database operations
      ... finished 90000 ops
      
      Revert Plan: OK
      
      Task ID: #2252691
      
      Reviewers: dhruba, emayanke
      
      Reviewed By: emayanke
      
      CC: leveldb, haobo
      
      Differential Revision: https://reviews.facebook.net/D10581
    • Check in db_stress to not allow disable_wal and reopens set together · 3827403c
      Committed by Mayank Agarwal
      Summary: db can't reopen safely with disable_wal set!
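
      A sketch of the kind of flag check this adds at db_stress startup
      (flag spellings and the error message are illustrative):

      #include <cstdio>
      #include <cstdlib>

      extern bool FLAGS_disable_wal;   // illustrative stand-ins for the gflags
      extern int FLAGS_reopen;

      static void ValidateFlags() {
        // Reopening relies on WAL recovery; with the WAL disabled the reopened
        // DB would be missing recent writes, so reject the combination up front.
        if (FLAGS_disable_wal && FLAGS_reopen > 0) {
          std::fprintf(stderr, "Cannot use disable_wal and reopens together\n");
          std::exit(1);
        }
      }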
      
      Test Plan: make db_stress; run db_stress with disable_wal and reopens set and see error
      
      Reviewers: dhruba, vamsi
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D10857
  7. 21 May 2013 (1 commit)
  8. 03 May 2013 (1 commit)
    • Timestamp and TTL Wrapper for rocksdb · d786b25e
      Committed by Mayank Agarwal
      Summary:
      When the database is opened with the DBTimestamp::Open call, timestamps are prepended to values during subsequent Put calls and stripped from them during Get calls. The timestamp is used by Get, and by a custom compaction filter, to discard values that have exceeded the TTL specified at Open.
      A temporary change has been made to the Makefile to let us test with the temporary file TestTime.cc. The private members of db_impl.h have also been changed to protected so that they can be inherited by the new class DBTimestamp.
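
      A rough sketch of the value wrapping described above, assuming a 4-byte
      timestamp prepended to each value (the actual encoding and API of the TTL
      layer may differ):

      #include <cstdint>
      #include <cstring>
      #include <ctime>
      #include <string>

      // Prepend the current UNIX time to the user value before Put.
      std::string WrapWithTimestamp(const std::string& value) {
        uint32_t now = static_cast<uint32_t>(std::time(nullptr));
        std::string wrapped(sizeof(now), '\0');
        std::memcpy(&wrapped[0], &now, sizeof(now));
        return wrapped + value;
      }

      // On Get: strip the timestamp and report whether the value is still live.
      bool UnwrapAndCheckTtl(const std::string& wrapped, int32_t ttl_seconds,
                             std::string* value) {
        if (wrapped.size() < sizeof(uint32_t)) return false;   // corrupt entry
        uint32_t written;
        std::memcpy(&written, wrapped.data(), sizeof(written));
        *value = wrapped.substr(sizeof(written));
        uint32_t now = static_cast<uint32_t>(std::time(nullptr));
        return now >= written && now - written <= static_cast<uint32_t>(ttl_seconds);
      }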
      
      Test Plan: make db_timestamp; TestTime.cc (will not check it in) shows how to use the APIs currently, but I will write unit tests shortly
      
      Reviewers: dhruba, vamsi, haobo, sheki, heyongqiang, vkrest
      
      Reviewed By: vamsi
      
      CC: zshao, xjin, vkrest, MarkCallaghan
      
      Differential Revision: https://reviews.facebook.net/D10311
  9. 09 April 2013 (2 commits)
  10. 02 April 2013 (1 commit)
    • Python script to periodically run and kill the db_stress test · e937d471
      Committed by Mayank Agarwal
      Summary: The script runs and kills the stress test periodically. Default values are used in the script for now. Should I make this a part of the Makefile or the automated rocksdb build? The values can be easily changed in the script right now, but should I add support for variable values or input to the script? I believe the script achieves its objective of crashing the database unsafely and then reopening it and expecting the database to be sane.
      
      Test Plan: python tools/db_crashtest.py
      
      Reviewers: dhruba, vamsi, MarkCallaghan
      
      Reviewed By: vamsi
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D9369
  11. 28 March 2013 (1 commit)
    • Memory-manage statistics · 63f216ee
      Committed by Abhishek Kona
      Summary:
      Earlier the Statistics object was a raw pointer. This meant the user had to clean up
      the Statistics object after creating the database. In most use cases the database is created in a function and the statistics pointer goes out of scope, so the Statistics object would never be deleted.
      Now a shared_ptr is used to manage this.
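
      A sketch of the ownership change (stand-in types; the real Statistics class
      and Options field live in the public headers):

      #include <memory>

      struct Statistics { /* counters and histograms */ };

      // The caller and the DB now share ownership; the object is destroyed
      // automatically when the last holder goes away, even if the caller's
      // pointer goes out of scope right after opening the database.
      std::shared_ptr<Statistics> CreateDBStatistics() {
        return std::make_shared<Statistics>();
      }

      struct Options {
        std::shared_ptr<Statistics> statistics;   // was: Statistics* (never deleted)
      };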
      
      Want this in before the next release.
      
      Test Plan: make all check.
      
      Reviewers: dhruba, emayanke
      
      Reviewed By: emayanke
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D9735
  12. 11 March 2013 (1 commit)
    • [Report the #gets and #founds in db_stress] · 8ade9359
      Committed by Vamsi Ponnekanti
      Summary:
      Also added some comments and fixed some bugs in
      stats reporting. Now the stats seem to match what is expected.
      
      Test Plan:
      [nponnekanti@dev902 /data/users/nponnekanti/rocksdb] ./db_stress --test_batches_snapshots=1 --ops_per_thread=1000 --threads=1 --max_key=320
      LevelDB version     : 1.5
      Number of threads   : 1
      Ops per thread      : 1000
      Read percentage     : 10
      Delete percentage   : 30
      Max key             : 320
      Ratio #ops/#keys    : 3
      Num times DB reopens: 10
      Batches/snapshots   : 1
      Num keys per lock   : 4
      Compression         : snappy
      ------------------------------------------------
      No lock creation because test_batches_snapshots set
      2013/03/04-15:58:56  Starting database operations
      2013/03/04-15:58:56  Reopening database for the 1th time
      2013/03/04-15:58:56  Reopening database for the 2th time
      2013/03/04-15:58:56  Reopening database for the 3th time
      2013/03/04-15:58:56  Reopening database for the 4th time
      Created bg thread 0x7f4542bff700
      2013/03/04-15:58:56  Reopening database for the 5th time
      2013/03/04-15:58:56  Reopening database for the 6th time
      2013/03/04-15:58:56  Reopening database for the 7th time
      2013/03/04-15:58:57  Reopening database for the 8th time
      2013/03/04-15:58:57  Reopening database for the 9th time
      2013/03/04-15:58:57  Reopening database for the 10th time
      2013/03/04-15:58:57  Reopening database for the 11th time
      2013/03/04-15:58:57  Limited verification already done during gets
      Stress Test : 1811.551 micros/op 552 ops/sec
                  : Wrote 0.10 MB (0.05 MB/sec) (598% of 1011 ops)
                  : Wrote 6050 times
                  : Deleted 3050 times
                  : 500/900 gets found the key
                  : Got errors 0 times
      
      [nponnekanti@dev902 /data/users/nponnekanti/rocksdb] ./db_stress --ops_per_thread=1000 --threads=1 --max_key=320
      LevelDB version     : 1.5
      Number of threads   : 1
      Ops per thread      : 1000
      Read percentage     : 10
      Delete percentage   : 30
      Max key             : 320
      Ratio #ops/#keys    : 3
      Num times DB reopens: 10
      Batches/snapshots   : 0
      Num keys per lock   : 4
      Compression         : snappy
      ------------------------------------------------
      Creating 80 locks
      2013/03/04-15:58:17  Starting database operations
      2013/03/04-15:58:17  Reopening database for the 1th time
      2013/03/04-15:58:17  Reopening database for the 2th time
      2013/03/04-15:58:17  Reopening database for the 3th time
      2013/03/04-15:58:17  Reopening database for the 4th time
      Created bg thread 0x7fc0f5bff700
      2013/03/04-15:58:17  Reopening database for the 5th time
      2013/03/04-15:58:17  Reopening database for the 6th time
      2013/03/04-15:58:18  Reopening database for the 7th time
      2013/03/04-15:58:18  Reopening database for the 8th time
      2013/03/04-15:58:18  Reopening database for the 9th time
      2013/03/04-15:58:18  Reopening database for the 10th time
      2013/03/04-15:58:18  Reopening database for the 11th time
      2013/03/04-15:58:18  Starting verification
      Stress Test : 1836.258 micros/op 544 ops/sec
                  : Wrote 0.01 MB (0.01 MB/sec) (59% of 1011 ops)
                  : Wrote 605 times
                  : Deleted 305 times
                  : 50/90 gets found the key
                  : Got errors 0 times
      2013/03/04-15:58:18  Verification successful
      
      Revert Plan: OK
      
      Task ID: #
      
      Reviewers: emayanke, dhruba
      
      Reviewed By: emayanke
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D9081
  13. 08 March 2013 (1 commit)
    • Make db_stress not purge redundant keys on some opens · 3b6653b1
      Committed by amayank
      Summary: In light of the new option introduced by commit 806e2643, where the database has an option to compact before flushing to disk, we want the stress test to exercise both sides of the option. The test has been made to change that option 'deterministically' and configurably across reopens.
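
      One deterministic scheme for flipping that compact-before-flush option
      across reopens might look like the sketch below (the real db_stress logic
      and flag names may differ):

      // Alternate the option on every reopen so both code paths are
      // exercised within a single run.
      bool PurgeRedundantKvsOnReopen(int reopen_number) {
        return reopen_number % 2 == 0;   // even reopens purge, odd reopens don't
      }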
      
      Test Plan: make db_stress; ./db_stress with some different options
      
      Reviewers: dhruba, vamsi
      
      Reviewed By: dhruba
      
      CC: leveldb, sheki
      
      Differential Revision: https://reviews.facebook.net/D9165
  14. 26 February 2013 (1 commit)
  15. 23 February 2013 (1 commit)
    • [Add a second kind of verification to db_stress] · 465b9103
      Committed by Vamsi Ponnekanti
      Summary:
      Currently the test tracks all writes in memory and
      uses them for verification at the end. This has 4 problems:
      (a) It needs a mutex for each write to ensure that the in-memory update
      and the leveldb update are done atomically. This slows down the
      benchmark.
      (b) The verification phase at the end is time consuming as well.
      (c) It does not test batch writes or snapshots.
      (d) We cannot kill the test and restart it multiple times in a
      loop because the in-memory state will be lost.
      
      I am adding a FLAGS_multi mode that does MultiGet/MultiPut/MultiDelete
      instead of get/put/delete, in order to get/put/delete a group of related
      keys with the same values atomically. Every get retrieves the group
      of keys and checks that their values are the same. This does not have
      the above problems, but the downside is that it does less
      validation than the other approach.
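
      A sketch of the grouped write/verify idea: a WriteBatch writes the whole
      key group with the same value atomically, and every read checks that all
      members of the group agree (key layout and group size are illustrative):

      #include <string>
      #include "leveldb/db.h"
      #include "leveldb/write_batch.h"

      const int kGroupSize = 10;   // illustrative

      // Write the same value under key.0 .. key.9 in one atomic batch.
      leveldb::Status MultiPut(leveldb::DB* db, const std::string& key,
                               const std::string& value) {
        leveldb::WriteBatch batch;
        for (int i = 0; i < kGroupSize; ++i) {
          batch.Put(key + "." + std::to_string(i), value);
        }
        return db->Write(leveldb::WriteOptions(), &batch);
      }

      // Read the group back and verify that every member has the same value.
      bool MultiGetAndVerify(leveldb::DB* db, const std::string& key) {
        std::string first, v;
        for (int i = 0; i < kGroupSize; ++i) {
          leveldb::Status s = db->Get(leveldb::ReadOptions(),
                                      key + "." + std::to_string(i), &v);
          if (!s.ok()) return false;
          if (i == 0) first = v;
          else if (v != first) return false;   // group members disagree
        }
        return true;
      }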
      
      Test Plan:
      This whole thing is a test! Here is a small run. I am doing a larger run now.
      
      [nponnekanti@dev902 /data/users/nponnekanti/rocksdb] ./db_stress --ops_per_thread=10000 --multi=1 --ops_per_key=25
      LevelDB version     : 1.5
      Number of threads   : 32
      Ops per thread      : 10000
      Read percentage     : 10
      Delete percentage   : 30
      Max key             : 2147483648
      Num times DB reopens: 10
      Num keys per lock   : 4
      Compression         : snappy
      ------------------------------------------------
      Creating 536870912 locks
      2013/02/20-16:59:32  Starting database operations
      Created bg thread 0x7f9ebcfff700
      2013/02/20-16:59:37  Reopening database for the 1th time
      2013/02/20-16:59:46  Reopening database for the 2th time
      2013/02/20-16:59:57  Reopening database for the 3th time
      2013/02/20-17:00:11  Reopening database for the 4th time
      2013/02/20-17:00:25  Reopening database for the 5th time
      2013/02/20-17:00:36  Reopening database for the 6th time
      2013/02/20-17:00:47  Reopening database for the 7th time
      2013/02/20-17:00:59  Reopening database for the 8th time
      2013/02/20-17:01:10  Reopening database for the 9th time
      2013/02/20-17:01:20  Reopening database for the 10th time
      2013/02/20-17:01:31  Reopening database for the 11th time
      2013/02/20-17:01:31  Starting verification
      Stress Test : 109.125 micros/op 22191 ops/sec
                  : Wrote 0.00 MB (0.23 MB/sec) (59% of 32 ops)
                  : Deleted 10 times
      2013/02/20-17:01:31  Verification successful
      
      Revert Plan: OK
      
      Task ID: #
      
      Reviewers: dhruba, emayanke
      
      Reviewed By: emayanke
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D8733
  16. 22 February 2013 (1 commit)
    • Exploring the rocksdb stress test · 1052ea23
      Committed by amayank
      Summary:
      Fixed a bug in the stress test where the correct size was not being
      passed to GenerateValue. This bug had been there since the beginning, but assertions
      were switched on in our code base only recently.
      Added comments at the top detailing how the stress test works and how to
      speed it up or slow it down after investigation.
      
      Test Plan: make all check. ./db_stress
      
      Reviewers: dhruba, asad
      
      Reviewed By: dhruba
      
      CC: vamsi, sheki, heyongqiang, zshao
      
      Differential Revision: https://reviews.facebook.net/D8727
  17. 21 February 2013 (1 commit)
    • Introduce histogram in statistics.h · fe10200d
      Committed by Abhishek Kona
      Summary:
      * Introduce a histogram in statistics.h
      * A stopwatch to measure time.
      * Two timers introduced as a proof of concept.
      Replaced NULL with nullptr to fix some lint errors.
      Should be useful for google.
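
      A minimal sketch of the stopwatch-feeds-histogram pattern this introduces
      (the real Histogram and StopWatch classes have richer interfaces):

      #include <chrono>
      #include <cstdint>
      #include <map>

      // Toy histogram: bucket microsecond latencies by powers of two.
      class Histogram {
       public:
        void Add(uint64_t micros) { ++buckets_[BucketFor(micros)]; }
       private:
        static uint64_t BucketFor(uint64_t v) {
          uint64_t b = 1;
          while (b < v) b <<= 1;
          return b;
        }
        std::map<uint64_t, uint64_t> buckets_;
      };

      // Scoped stopwatch: records the elapsed time into a histogram when it
      // goes out of scope, so timing a code path is a one-line declaration.
      class StopWatch {
       public:
        explicit StopWatch(Histogram* h)
            : hist_(h), start_(std::chrono::steady_clock::now()) {}
        ~StopWatch() {
          auto elapsed = std::chrono::steady_clock::now() - start_;
          hist_->Add(std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count());
        }
       private:
        Histogram* hist_;
        std::chrono::steady_clock::time_point start_;
      };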
      
      Test Plan:
      Ran db_bench and checked stats.
      make all check
      
      Reviewers: dhruba, heyongqiang
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D8637
  18. 24 January 2013 (1 commit)
    • Fix a number of object lifetime/ownership issues · 2fdf91a4
      Committed by Chip Turner
      Summary:
      Replace manual memory management with std::unique_ptr in a
      number of places; not exhaustive, but this fixes a few leaks with file
      handles as well as clarifying the ownership semantics of file handles
      in the log classes.
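
      A small illustration of the before/after ownership pattern (the log-writer
      constructor shown here is a sketch, not the exact signature):

      #include <memory>

      struct WritableFile { /* env-provided file handle */ };

      // Before: the writer held a raw pointer and every error path had to
      // remember to delete it.  After: the writer owns the handle via
      // unique_ptr, so it is released exactly once, however the writer dies.
      class LogWriter {
       public:
        explicit LogWriter(std::unique_ptr<WritableFile> file)
            : file_(std::move(file)) {}
       private:
        std::unique_ptr<WritableFile> file_;   // sole owner of the handle
      };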
      
      Test Plan: db_stress, make check
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      CC: zshao, leveldb, heyongqiang
      
      Differential Revision: https://reviews.facebook.net/D8043
  19. 18 January 2013 (1 commit)
    • Fixed issues Valgrind found. · 3c3df740
      Committed by Kosie van der Merwe
      Summary:
      Found issues with `db_test` and `db_stress` when running valgrind.
      
      `DBImpl` had an issue where, if a compaction failed, the uninitialised file size of an output file was used. This manifested as the final call to output to the log in `DoCompactionWork()` branching on uninitialized memory (all the way down in printf's innards).
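
      The fix implied here is to make sure the output-file bookkeeping is
      initialized before it can be logged; a sketch with illustrative names:

      #include <cstdint>

      struct CompactionOutput {
        uint64_t number = 0;
        uint64_t file_size = 0;   // zero-initialized, so a failed compaction
                                  // logs 0 bytes instead of uninitialized memory
      };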
      
      Test Plan:
      Ran `valgrind --track-origins=yes ./db_test` and `valgrind ./db_stress` to see if the issues disappeared.
      
      Ran `make check` to see if there were no regressions.
      
      Reviewers: vamsi, dhruba
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D8001
  20. 17 January 2013 (1 commit)
    • Roll over the manifest file · 7d5a4383
      Committed by Abhishek Kona
      Summary:
      Check in LogAndApply whether the manifest file size is more than the limit set in
      Options.
      Things to consider: will this be expensive?
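
      A sketch of the size check described above, assuming an Options limit on
      the manifest size (names are illustrative):

      #include <cstdint>

      // Performed in LogAndApply: once the current MANIFEST has grown past the
      // configured limit, stop appending to it and write a fresh MANIFEST
      // (a full snapshot of the version state) instead.
      bool ShouldRollManifest(uint64_t manifest_size, uint64_t max_manifest_size) {
        return max_manifest_size > 0 && manifest_size >= max_manifest_size;
      }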
      
      Test Plan: make all check. Inputs on a new unit test?
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D7701
  21. 19 November 2012 (1 commit)
    • Enhance db_stress to simulate a hard crash · 62e7583f
      Committed by Dhruba Borthakur
      Summary:
      db_stress has an option to reopen the database. Make it such that the
      previous handle is not closed before we reopen; this simulates a
      situation similar to a process crash.

      Added a new API to DBImpl to remove the lock file.
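
      A sketch of the reopen-without-close idea; ReleaseLockFileForTest is a
      hypothetical stand-in for the new DBImpl hook:

      #include <string>
      #include "leveldb/db.h"

      // Hypothetical test-only hook: releases only the LOCK file, nothing else.
      void ReleaseLockFileForTest(leveldb::DB* db);

      // Simulated hard crash: the old handle is intentionally *not* deleted, so
      // no destructor runs (no flush, no clean shutdown) before the database is
      // opened again on the same directory.
      leveldb::DB* ReopenWithoutClosing(leveldb::DB* old_db,
                                        const leveldb::Options& options,
                                        const std::string& dbname) {
        ReleaseLockFileForTest(old_db);
        leveldb::DB* fresh = nullptr;
        leveldb::Status s = leveldb::DB::Open(options, dbname, &fresh);
        return s.ok() ? fresh : nullptr;
      }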
      
      Test Plan: run db_stress
      
      Reviewers: emayanke
      
      Reviewed By: emayanke
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D6777
  22. 15 November 2012 (1 commit)
  23. 13 November 2012 (1 commit)
    • Introducing "database reopens" into the stress test. Database will reopen... · e6262617
      Committed by amayank
      Introducing "database reopens" into the stress test. Database will reopen after a specified number of iterations (configurable) of each thread when they will wait for the databse to reopen.
      
      Summary: FLAGS_reopen (configurable) specifies the number of times the database is to be reopened. FLAGS_ops_per_thread is divided into points based on that reopen field. At these points all threads come together and wait for the database to reopen. Each thread "votes" for the database to reopen and, when all have voted, the database reopens.
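
      A sketch of the voting idea using standard threading primitives (the actual
      db_stress implementation keeps its own shared counters, so treat this as
      illustrative; the barrier below is single-use for brevity):

      #include <condition_variable>
      #include <mutex>

      // Every thread calls VoteForReopen() at a reopen point; the last voter
      // performs the reopen and releases everyone else.
      class ReopenBarrier {
       public:
        explicit ReopenBarrier(int num_threads) : waiting_for_(num_threads) {}

        template <typename ReopenFn>
        void VoteForReopen(ReopenFn reopen) {
          std::unique_lock<std::mutex> lock(mu_);
          if (--waiting_for_ == 0) {
            reopen();                 // last voter reopens the database
            cv_.notify_all();
          } else {
            cv_.wait(lock, [this] { return waiting_for_ == 0; });
          }
        }

       private:
        std::mutex mu_;
        std::condition_variable cv_;
        int waiting_for_;
      };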
      
      Test Plan: make all; ./db_stress
      
      Reviewers: dhruba, MarkCallaghan, sheki, asad, heyongqiang
      
      Reviewed By: dhruba
      
      Differential Revision: https://reviews.facebook.net/D6627
  24. 10 November 2012 (1 commit)
  25. 09 November 2012 (1 commit)
  26. 07 November 2012 (1 commit)
  27. 20 October 2012 (1 commit)
    • This is the mega-patch multi-threaded compaction · 1ca05843
      Committed by Dhruba Borthakur
      published in https://reviews.facebook.net/D5997.
      
      Summary:
      This patch allows compaction to occur in multiple background threads
      concurrently.
      
      If a manual compaction is issued, the system falls back to a
      single-compaction-thread model. This is done to ensure correctness
      and simplicity of the code. When the manual compaction is finished,
      the system resumes its concurrent-compaction mode automatically.

      The updates to the manifest are done via a group-commit approach.
      
      Test Plan: run db_bench
  28. 13 October 2012 (1 commit)
  29. 11 October 2012 (1 commit)
    • [tools] Add a tool to stress test concurrent writing to LevelDB · 24f7983b
      Committed by Asad K Awan
      Summary:
      Created a tool that runs multiple threads that concurrently read and write to levelDB.
      All writes to the DB are stored in an in-memory hashtable and verified at the end of the
      test. All writes for a given key are serialzied.
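
      A sketch of the shadow-state idea: every write also updates an in-memory
      map under a lock, and a final pass checks the DB against the map (the real
      tool shards the locks per key range and generates values from a seed):

      #include <map>
      #include <mutex>
      #include <string>
      #include "leveldb/db.h"

      std::mutex shadow_mu;                        // real tool: one lock per key range
      std::map<std::string, std::string> shadow;   // expected contents of the DB

      // Write path: the DB write and the shadow update happen under the same
      // lock, so writes to a given key are serialized and the shadow stays in sync.
      leveldb::Status PutTracked(leveldb::DB* db, const std::string& k,
                                 const std::string& v) {
        std::lock_guard<std::mutex> guard(shadow_mu);
        leveldb::Status s = db->Put(leveldb::WriteOptions(), k, v);
        if (s.ok()) shadow[k] = v;
        return s;
      }

      // End of test: every tracked key must read back with the tracked value.
      bool VerifyAll(leveldb::DB* db) {
        for (const auto& kv : shadow) {
          std::string got;
          if (!db->Get(leveldb::ReadOptions(), kv.first, &got).ok() || got != kv.second) {
            return false;
          }
        }
        return true;
      }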
      
      Test Plan:
       - Verified by writing only a few keys and logging all writes and verifying that values read and written are correct.
       - Verified correctness of value generator.
       - Ran with various parameters of number of keys, locks, and threads.
      
      Reviewers: dhruba, MarkCallaghan, heyongqiang
      
      Reviewed By: dhruba
      
      Differential Revision: https://reviews.facebook.net/D5829