1. 02 Jun, 2013 1 commit
    • H
      [RocksDB] Introduce Fast Mutex option · d897d33b
      Committed by Haobo Xu
      Summary:
      This diff adds an option that controls whether PTHREAD_MUTEX_ADAPTIVE_NP is enabled for the rocksdb single big kernel lock. db_bench also has this option now.
      Quick test with 8 threads, CPU-bound, 100-byte random reads:
      No fast mutex: ~750k ops/s
      With fast mutex: ~880k ops/s
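
A minimal sketch of how an adaptive mutex could be selected at construction time. `SketchMutex` is a hypothetical wrapper written for illustration, not RocksDB's actual mutex class; `PTHREAD_MUTEX_ADAPTIVE_NP` is a glibc extension, so the sketch falls back to the default mutex type elsewhere.

```cpp
#include <pthread.h>
#include <cassert>

// Hypothetical wrapper (not RocksDB's real class): picks the pthread mutex
// type when the object is constructed.
class SketchMutex {
 public:
  explicit SketchMutex(bool adaptive = false) {
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
#if defined(__linux__) && defined(__GLIBC__)
    if (adaptive) {
      // glibc extension: spin briefly in user space before blocking.
      rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ADAPTIVE_NP);
      assert(rc == 0);
      adaptive_ = true;
    }
#else
    (void)adaptive;  // Assumption: other platforms keep the default type.
#endif
    rc = pthread_mutex_init(&mu_, &attr);
    assert(rc == 0);
    (void)rc;
    pthread_mutexattr_destroy(&attr);
  }
  ~SketchMutex() { pthread_mutex_destroy(&mu_); }
  SketchMutex(const SketchMutex&) = delete;
  SketchMutex& operator=(const SketchMutex&) = delete;

  void Lock() { pthread_mutex_lock(&mu_); }
  void Unlock() { pthread_mutex_unlock(&mu_); }
  bool adaptive() const { return adaptive_; }

 private:
  pthread_mutex_t mu_;
  bool adaptive_ = false;
};
```

An adaptive mutex tends to help when the critical section is short and contention is high, which matches the CPU-bound read test above; whether it helps a given workload is something to measure.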
      
      Test Plan: make check; db_bench; db_stress
      
      Reviewers: dhruba
      
      CC: MarkCallaghan, leveldb
      
      Differential Revision: https://reviews.facebook.net/D11031
  2. 31 May, 2013 1 commit
    • H
      [RocksDB] [Performance] Allow different posix advice to be applied to the same table file · ab8d2f6a
      Committed by Haobo Xu
      Summary:
      The current posix advice implementation ties the access pattern hint to the creation of a file.
      It is not possible to apply different advice for different kinds of access (random get vs. compaction read)
      without keeping two open files for the same table. This patch extends the RandomAccessFile interface
      to accept a new access hint at any time. In particular, we can now set different access hints on the same
      table file based on when and how the file is used.
      Two options are added to set the access hint: one applied after the file is first opened, and one applied
      when the file is being compacted.
      
      Test Plan: make check; db_stress; db_bench
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      CC: MarkCallaghan, leveldb
      
      Differential Revision: https://reviews.facebook.net/D10905
  3. 30 May, 2013 1 commit
  4. 29 May, 2013 2 commits
  5. 25 May, 2013 3 commits
  6. 24 May, 2013 4 commits
  7. 22 May, 2013 6 commits
    • V
      [Kill randomly at various points in source code for testing] · 760dd475
      Committed by Vamsi Ponnekanti
      Summary:
      This is the initial version. A few ways in which it could
      be extended in the future are:
      (a) Killing from more places in the source code
      (b) Hashing the stack and using that hash to decide whether to crash.
          This is to avoid crashing more often at source lines that are executed
          more often.
      (c) Raising exceptions or returning errors instead of killing
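
A rough sketch of what a random kill point might look like. `ShouldKill` and `MaybeKill` are hypothetical names, and the use of `std::abort()` is an assumption made for illustration; the commit does not say how the process is terminated.

```cpp
#include <cassert>
#include <cstdlib>
#include <random>

// Hypothetical helper: returns true roughly once every `odds` calls.
// The generator is passed in so the decision can be tested deterministically.
inline bool ShouldKill(int odds, std::mt19937& rng) {
  assert(odds >= 0);
  if (odds <= 0) return false;  // 0 disables crash injection
  std::uniform_int_distribution<int> dist(0, odds - 1);
  return dist(rng) == 0;
}

// Hypothetical kill point to drop at interesting source locations.
inline void MaybeKill(int odds, std::mt19937& rng) {
  if (ShouldKill(odds, rng)) {
    std::abort();  // Assumption: die abruptly so that no cleanup code runs.
  }
}
```

Extension (b) above would replace the plain random draw with a decision keyed on a hash of the call stack, so that hot code paths do not dominate the crashes.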
      
      Test Plan:
      This whole thing is for testing.
      
      Here is part of output:
      
      python2.7 tools/db_crashtest2.py -d 600
      Running db_stress
      
      db_stress retncode -15 output LevelDB version     : 1.5
      Number of threads   : 32
      Ops per thread      : 10000000
      Read percentage     : 50
      Write-buffer-size   : 4194304
      Delete percentage   : 30
      Max key             : 1000
      Ratio #ops/#keys    : 320000
      Num times DB reopens: 0
      Batches/snapshots   : 1
      Purge redundant %   : 50
      Num keys per lock   : 4
      Compression         : snappy
      ------------------------------------------------
      No lock creation because test_batches_snapshots set
      2013/04/26-17:55:17  Starting database operations
      Created bg thread 0x7fc1f07ff700
      ... finished 60000 ops
      Running db_stress
      
      db_stress retncode -15 output LevelDB version     : 1.5
      Number of threads   : 32
      Ops per thread      : 10000000
      Read percentage     : 50
      Write-buffer-size   : 4194304
      Delete percentage   : 30
      Max key             : 1000
      Ratio #ops/#keys    : 320000
      Num times DB reopens: 0
      Batches/snapshots   : 1
      Purge redundant %   : 50
      Num keys per lock   : 4
      Compression         : snappy
      ------------------------------------------------
      Created bg thread 0x7ff0137ff700
      No lock creation because test_batches_snapshots set
      2013/04/26-17:56:15  Starting database operations
      ... finished 90000 ops
      
      Revert Plan: OK
      
      Task ID: #2252691
      
      Reviewers: dhruba, emayanke
      
      Reviewed By: emayanke
      
      CC: leveldb, haobo
      
      Differential Revision: https://reviews.facebook.net/D10581
    • H
      [RocksDB] Introduce an option to skip log error on recovery · 87d0af15
      Committed by Haobo Xu
      Summary:
      Currently, with paranoid_check on, DB::Open will fail on any log read error during recovery.
      If the client is OK with losing the most recent updates, we could simply skip those errors.
      However, it's important to introduce an additional flag, so that paranoid_check can
      still guard against more serious problems.
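
The interaction between the two flags can be sketched as below. All types and names here (`LogRecord`, `RecoverOptions`, `skip_log_error`, `SketchRecover`) are hypothetical stand-ins, not the real recovery code.

```cpp
#include <string>
#include <vector>

// Hypothetical log record: `corrupt` marks a record that failed to read.
struct LogRecord {
  bool corrupt;
  std::string payload;
};

// Hypothetical option set; field names are assumptions for this sketch.
struct RecoverOptions {
  bool paranoid_checks = true;
  bool skip_log_error = false;
};

// Replays the log. A read error aborts recovery only when paranoid checks
// are on and the caller has not opted in to skipping log errors.
inline bool SketchRecover(const std::vector<LogRecord>& log,
                          const RecoverOptions& opt,
                          std::vector<std::string>* applied) {
  applied->clear();
  for (const LogRecord& rec : log) {
    if (rec.corrupt) {
      if (opt.paranoid_checks && !opt.skip_log_error) return false;
      continue;  // Drop the unreadable record and keep going.
    }
    applied->push_back(rec.payload);
  }
  return true;
}
```

Keeping the skip behavior behind its own flag means paranoid checks can stay enabled for everything other than log replay.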
      
      Test Plan: make check; db_stress
      
      Reviewers: dhruba, emayanke
      
      Reviewed By: emayanke
      
      CC: leveldb, emayanke
      
      Differential Revision: https://reviews.facebook.net/D10869
    • D
      Ability to set different size fanout multipliers for every level. · d1aaaf71
      Committed by Dhruba Borthakur
      Summary:
      There is an existing field Options.max_bytes_for_level_multiplier that
      sets the multiplier for the size of each level in the database.
      
      This patch introduces the ability to set different multipliers
      for every level in the database. The size of a level is determined
      by using both max_bytes_for_level_multiplier as well as the
      per-level fanout.
      
      size of level[i] = size of level[i-1] * max_bytes_for_level_multiplier
                         * fanout[i-1]
      
      The default value of fanout is 1, so that it is backward compatible.
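
The formula above can be worked through with a small helper. `LevelSizes`, `base_bytes` and the container types are hypothetical; `multiplier` stands in for max_bytes_for_level_multiplier, and a missing fanout entry is treated as 1.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: sizes[0] is the base level size, and
// sizes[i] = sizes[i-1] * multiplier * fanout[i-1] for i >= 1.
inline std::vector<uint64_t> LevelSizes(uint64_t base_bytes,
                                        uint64_t multiplier,
                                        const std::vector<uint64_t>& fanout,
                                        size_t num_levels) {
  std::vector<uint64_t> sizes;
  for (size_t i = 0; i < num_levels; ++i) {
    if (i == 0) {
      sizes.push_back(base_bytes);
      continue;
    }
    // Missing entries default to 1, which keeps the old behavior.
    uint64_t f = (i - 1 < fanout.size()) ? fanout[i - 1] : 1;
    sizes.push_back(sizes[i - 1] * multiplier * f);
  }
  return sizes;
}
```

For example, with a base of 10, a multiplier of 10 and fanout {1, 2}, four levels come out as 10, 100, 2000 and 20000.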
      
      Test Plan: make check
      
      Reviewers: haobo, emayanke
      
      Reviewed By: emayanke
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D10863
    • H
      [RocksDB] [Performance Bug] MemTable::Get Slow · c3c13db3
      Committed by Haobo Xu
      Summary:
      The merge operator diff introduced a performance problem in MemTable::Get.
      An exit condition was missing for the case where the current key does not match the user key.
      This could lead to a full memtable scan if the user key is not found.
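
A simplified model of the lookup loop, to show where the exit condition belongs. The memtable is modeled as a `std::map` ordered by user key and then by descending sequence number; `SketchMemTable`, `Entry`, `Add`, `SketchGet` and the comma-joined merge result are all assumptions of this sketch, not the real MemTable code.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>

struct Entry {
  bool is_merge;     // true for a merge operand, false for a plain value
  std::string data;
};

// Orders by user key ascending, then by sequence number descending.
struct EntryOrder {
  bool operator()(const std::pair<std::string, uint64_t>& a,
                  const std::pair<std::string, uint64_t>& b) const {
    if (a.first != b.first) return a.first < b.first;
    return a.second > b.second;
  }
};
using SketchMemTable =
    std::map<std::pair<std::string, uint64_t>, Entry, EntryOrder>;

inline void Add(SketchMemTable* t, const std::string& key, uint64_t seq,
                bool is_merge, const std::string& data) {
  (*t)[SketchMemTable::key_type(key, seq)] = Entry{is_merge, data};
}

// Looks up user_key, folding newer merge operands onto the base value.
// *visited counts how many entries the loop examined.
inline bool SketchGet(const SketchMemTable& t, const std::string& user_key,
                      std::string* value, int* visited) {
  *visited = 0;
  std::string pending;
  auto it = t.lower_bound(SketchMemTable::key_type(user_key, UINT64_MAX));
  for (; it != t.end(); ++it) {
    ++*visited;
    if (it->first.first != user_key) {
      break;  // The exit condition: we are past this user key, stop here.
    }
    if (!it->second.is_merge) {
      *value = it->second.data + pending;
      return true;
    }
    pending = "," + it->second.data + pending;  // older operands go first
  }
  return false;
}
```

Without the `break`, a lookup for a missing key would keep walking over every later entry in the table.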
      
      Test Plan: make check; db_bench
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D10851
    • M
      Check to db_stress to not allow disable_wal and reopens set together · 3827403c
      Committed by Mayank Agarwal
      Summary: The db can't reopen safely with disable_wal set!
      
      Test Plan: make db_stress; run db_stress with disable_wal and reopens set and see error
      
      Reviewers: dhruba, vamsi
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D10857
    • H
      [RocksDB] Fix PosixLogger and AutoRollLogger thread safety · 839f6db7
      Committed by Haobo Xu
      Summary:
      PosixLogger and AutoRollLogger do not seem to be thread safe.
      For PosixLogger, log_size_ is not atomically updated.
      For AutoRollLogger, the underlying logger_ might be deleted by
      one thread while still being accessed by another.
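
One way to address both problems, sketched with standard library primitives. `SketchLogger` and `SketchAutoRollLogger` are hypothetical stand-ins; the commit does not state which primitives the actual fix uses.

```cpp
#include <atomic>
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical logger: the size counter is atomic, so concurrent Log()
// calls cannot lose updates.
class SketchLogger {
 public:
  void Log(const std::string& msg) {
    log_size_.fetch_add(msg.size(), std::memory_order_relaxed);
  }
  size_t GetLogFileSize() const {
    return log_size_.load(std::memory_order_relaxed);
  }

 private:
  std::atomic<size_t> log_size_{0};
};

// Hypothetical rolling logger: each caller pins the current logger with a
// shared_ptr under the mutex, so a roll cannot free it while it is in use.
class SketchAutoRollLogger {
 public:
  explicit SketchAutoRollLogger(size_t max_size)
      : max_size_(max_size), logger_(std::make_shared<SketchLogger>()) {}

  void Log(const std::string& msg) {
    std::shared_ptr<SketchLogger> pinned;
    {
      std::lock_guard<std::mutex> guard(mutex_);
      if (logger_->GetLogFileSize() >= max_size_) {
        logger_ = std::make_shared<SketchLogger>();  // roll to a new logger
        ++rolls_;
      }
      pinned = logger_;
    }
    pinned->Log(msg);  // safe even if another thread rolls right now
  }

  int rolls() const {
    std::lock_guard<std::mutex> guard(mutex_);
    return rolls_;
  }

 private:
  const size_t max_size_;
  mutable std::mutex mutex_;
  std::shared_ptr<SketchLogger> logger_;
  int rolls_ = 0;
};
```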
      
      Test Plan: make check
      
      Reviewers: kailiu, dhruba, heyongqiang
      
      Reviewed By: kailiu
      
      CC: leveldb, zshao, sheki
      
      Differential Revision: https://reviews.facebook.net/D9699
  8. 21 May, 2013 1 commit
  9. 18 May, 2013 2 commits
  10. 17 May, 2013 1 commit
  11. 16 May, 2013 1 commit
    • M
      Enhance the ldb tool to support ttl databases · 8a48410f
      Committed by Mayank Agarwal
      Summary: ldb works with raw data from the database and needs to know that a database is a ttl-database in order to work with it meaningfully. The '-ttl' option now tells it that. A test was also added to ldb_test.py. The option may be specified along with put, get, scan or dump. There is no support for providing a ttl-value; the default (forever) is used because there is currently no use case for anything else.
      
      Test Plan: make ldb_test; python tools/ldb_test.py
      
      Reviewers: dhruba, sheki, haobo, vamsi
      
      Reviewed By: sheki
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D10797
  12. 15 May, 2013 1 commit
  13. 14 May, 2013 3 commits
    • D
      Implemented StringAppendOperator and unit tests. · accd3deb
      Committed by Deon Nicholas
      Summary:
      Implemented the StringAppendOperator class (subclass of MergeOperator).
      Found in utilities/merge_operators/string_append/stringappend.{h,cc}
      
      It is a rocksdb Merge Operator that supports string/list concatenation
       with a configurable delimiter.
      
      The tests are found in .../stringappend_test.cc. It implements a
       map : key -> (list of strings), with core operations Append(list_key,val)
       and Get(list_key).
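
The key -> (list of strings) behavior described above can be modeled in memory as follows. `SketchListStore` is a hypothetical illustration backed by `std::map`; it does not use the real MergeOperator interface.

```cpp
#include <map>
#include <string>

// Hypothetical in-memory model of the append semantics: the first value for
// a key is stored as-is, later values are joined with the delimiter.
class SketchListStore {
 public:
  explicit SketchListStore(char delim) : delim_(delim) {}

  void Append(const std::string& list_key, const std::string& val) {
    auto it = data_.find(list_key);
    if (it == data_.end()) {
      data_[list_key] = val;
    } else {
      it->second += delim_;
      it->second += val;
    }
  }

  // Returns false if the key has never been appended to.
  bool Get(const std::string& list_key, std::string* out) const {
    auto it = data_.find(list_key);
    if (it == data_.end()) return false;
    *out = it->second;
    return true;
  }

 private:
  char delim_;
  std::map<std::string, std::string> data_;
};
```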
      
      Test Plan:
      1. Navigate to your rocksdb repository
      2. Execute: make stringappend_test  (to compile)
      3. Execute: ./stringappend_test (to run the tests)
      4. Execute: make all check (to test the ENTIRE rocksdb codebase / regression)
      
      Reviewers: haobo, dhruba, zshao
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D10737
    • H
      [RocksDB] Cleanup compaction filter to use a class interface, instead of... · 4ca3c67b
      Committed by Haobo Xu
      [RocksDB] Cleanup compaction filter to use a class interface, instead of function pointer and additional context pointer.
      
      Summary:
      This diff replaces compaction_filter_args and CompactionFilter with a single compaction_filter parameter. It gives CompactionFilter better encapsulation and a similar look to Comparator and MergeOperator, which improves consistency of the overall interface.
      The change is not backward compatible. Nevertheless, the two references in fbcode are not in production yet.
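
To illustrate the shape of such a change, here is a sketch of a class-based filter replacing a function pointer plus context pointer. The method signatures, `SketchCompactionFilter`, `DropEmptyValues` and `SketchOptions` are assumptions made for this example and may differ from the real interface.

```cpp
#include <string>

// Hypothetical class interface: state lives in the object instead of a
// separate context pointer.
class SketchCompactionFilter {
 public:
  virtual ~SketchCompactionFilter() = default;
  // Returns true if the key/value pair should be dropped during compaction.
  virtual bool Filter(int level, const std::string& key,
                      const std::string& value) const = 0;
  virtual const char* Name() const = 0;
};

// Example filter: drops entries whose value is empty.
class DropEmptyValues : public SketchCompactionFilter {
 public:
  bool Filter(int /*level*/, const std::string& /*key*/,
              const std::string& value) const override {
    return value.empty();
  }
  const char* Name() const override { return "DropEmptyValues"; }
};

// Hypothetical options struct with a single filter parameter.
struct SketchOptions {
  const SketchCompactionFilter* compaction_filter = nullptr;
};
```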
      
      Test Plan: make check
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      CC: leveldb, zshao
      
      Differential Revision: https://reviews.facebook.net/D10773
    • H
      [RocksDB] fix compaction filter trigger condition · 73c0a333
      Committed by Haobo Xu
      Summary:
      Currently, the compaction filter is run on internal keys older than the oldest snapshot, which is incorrect.
      The compaction filter should really be run on the most recent internal key when there is no external snapshot.
      
      Test Plan: make check; db_stress
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      Differential Revision: https://reviews.facebook.net/D10641
  14. 11 May, 2013 3 commits
  15. 10 May, 2013 3 commits
  16. 09 May, 2013 1 commit
    • D
      Assertion failure for L0-L1 compactions. · a8d3aa2c
      Committed by Dhruba Borthakur
      Summary:
      For level-0 compactions, we check whether more L0 files can be included
      in the same compaction run. This extends the 'smallest' and 'largest'
      keys to cover a larger range. But the succeeding call to
      ParentRangeInCompaction() was still using the earlier
      values of 'smallest' and 'largest'.
      
      Because of this bug, a file in L1 can be part of two concurrent
      compactions: one L0-L1 compaction and the other L1-L2 compaction.
      
      This should not cause any data loss, but will cause an assertion
      failure with debug builds.
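
The idea behind the fix can be sketched as follows: after the set of L0 inputs grows, the key range is recomputed, and only the final range is used for later checks against the parent level. `FileRange`, `Overlaps` and `ExpandL0Inputs` are hypothetical names, and keys are plain strings here.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

struct FileRange {
  std::string smallest;
  std::string largest;
};

// True if the two closed key ranges intersect.
inline bool Overlaps(const FileRange& a, const FileRange& b) {
  return !(a.largest < b.smallest || b.largest < a.smallest);
}

// Starting from l0[seed], pulls in every L0 file that overlaps the current
// range and widens the range each time. Returns the final range; a stale
// copy of the initial range would be too narrow.
inline FileRange ExpandL0Inputs(const std::vector<FileRange>& l0, size_t seed,
                                std::vector<size_t>* picked) {
  FileRange range = l0[seed];
  std::vector<bool> in(l0.size(), false);
  in[seed] = true;
  bool grew = true;
  while (grew) {
    grew = false;
    for (size_t i = 0; i < l0.size(); ++i) {
      if (in[i] || !Overlaps(range, l0[i])) continue;
      in[i] = true;
      range.smallest = std::min(range.smallest, l0[i].smallest);
      range.largest = std::max(range.largest, l0[i].largest);
      grew = true;
    }
  }
  picked->clear();
  for (size_t i = 0; i < l0.size(); ++i) {
    if (in[i]) picked->push_back(i);
  }
  return range;
}
```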
      
      Test Plan: make check
      
      Differential Revision: https://reviews.facebook.net/D10677
  17. 07 May, 2013 2 commits
    • A
      [RocksDB] Clear Archive WAL files · 988c20b9
      Committed by Abhishek Kona
      Summary:
      WAL files are moved to the archive directory and cleared only at DB::Open.
      This can lead to a lot of space consumption in a database. Added logic to periodically clear the archive directory too.
      
      Test Plan: make all check + add unit test
      
      Reviewers: dhruba, heyongqiang
      
      Reviewed By: heyongqiang
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D10617
    • H
      [RocksDB] fix build · 3c4efc44
      Committed by Haobo Xu
      Summary: makefile change: LIBRARY => LIBOBJECTS
      Thanks to Abhishek for reproducing this locally.
      
      Test Plan: make release
      
      Reviewers: sheki
      
      CC: leveldb
      
      Task ID: #
      
      Blame Rev:
  18. 04 May, 2013 2 commits
    • H
      [Rocksdb] Support Merge operation in rocksdb · 05e88540
      Committed by Haobo Xu
      Summary:
      This diff introduces a new Merge operation into rocksdb.
      The purpose of this review is mostly getting feedback from the team (everyone please) on the design.
      
      Please focus on the four files under include/leveldb/, as they spell the client visible interface change.
      include/leveldb/db.h
      include/leveldb/merge_operator.h
      include/leveldb/options.h
      include/leveldb/write_batch.h
      
      Please go over local/my_test.cc carefully, as it is a concrete use case.
      
      Please also review the implementation files to see if the straw man implementation makes sense.
      
      Note that the diff does pass all of make check and truly supports a forward iterator over the db and a version
      of Get that's based on the iterator.
      
      Future work:
      - Integration with compaction
      - A raw Get implementation
      
      I am working on a wiki that explains the design and implementation choices, but coding comes
      just naturally and I think it might be a good idea to share the code earlier. The code is
      heavily commented.
      
      Test Plan: run all local tests
      
      Reviewers: dhruba, heyongqiang
      
      Reviewed By: dhruba
      
      CC: leveldb, zshao, sheki, emayanke, MarkCallaghan
      
      Differential Revision: https://reviews.facebook.net/D9651
    • M
      Fix invalid-read to freed memory in ttl-iterator · 37e97b12
      Committed by Mayank Agarwal
      Summary: The value function in ttl-iterator was returning a string that would have been freed before its use as a slice. Thanks valgrind!
      
      Test Plan: valgrind ./ttl_test
      
      Reviewers: dhruba, haobo, sheki, vamsi
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D10635
  19. 03 May, 2013 1 commit
    • M
      Timestamp and TTL Wrapper for rocksdb · d786b25e
      Committed by Mayank Agarwal
      Summary:
      When the database is opened with the DBTimestamp::Open call, timestamps are prepended to the value during Put and stripped from it during Get. The timestamp is used, in Get and in a custom compaction filter, to discard values that have exceeded their TTL, which is specified during Open.
      A temporary change was made to the Makefile to let us test with the temporary file TestTime.cc. The private members of db_impl.h were also changed to protected so that they can be inherited by the new class DBTimestamp.
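
A minimal sketch of the prepend-and-strip idea. The 8-byte host-order timestamp, the function names and the expiry rule are all assumptions for illustration; the commit does not describe the actual encoding.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Assumption: the timestamp is stored as 8 raw bytes in host byte order.
constexpr size_t kTimestampLen = sizeof(int64_t);

// What a Put wrapper might store: timestamp bytes followed by the value.
inline std::string PrependTimestamp(const std::string& value, int64_t now) {
  std::string stored(kTimestampLen, '\0');
  std::memcpy(&stored[0], &now, kTimestampLen);
  stored += value;
  return stored;
}

// What a Get wrapper might do: strip the timestamp and reject the entry if
// it is malformed or older than `ttl`. A ttl <= 0 means "never expires".
inline bool StripTimestamp(const std::string& stored, int64_t now,
                           int64_t ttl, std::string* value) {
  if (stored.size() < kTimestampLen) return false;
  int64_t written_at = 0;
  std::memcpy(&written_at, stored.data(), kTimestampLen);
  if (ttl > 0 && now - written_at > ttl) return false;  // expired
  value->assign(stored, kTimestampLen, std::string::npos);
  return true;
}
```

A compaction filter could apply the same expiry check to drop stale entries in the background.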
      
      Test Plan: make db_timestamp; TestTime.cc (which will not be checked in) shows how to use the APIs currently, but I will write unit tests shortly
      
      Reviewers: dhruba, vamsi, haobo, sheki, heyongqiang, vkrest
      
      Reviewed By: vamsi
      
      CC: zshao, xjin, vkrest, MarkCallaghan
      
      Differential Revision: https://reviews.facebook.net/D10311
  20. 30 Apr, 2013 1 commit