1. 21 Mar 2013, 3 commits
    • Ability to configure buffered-io reads, filesystem readaheads and mmap read/write per database. · ad96563b
      Dhruba Borthakur committed
      Summary:
      This patch allows an application to specify, per database, whether to use
      buffered I/O, reads via mmap, and writes via mmap. Earlier, a global
      static variable was used to configure this functionality.
      
      The default setting remains the same (and is backward compatible):
       1. use bufferedio
       2. do not use mmaps for reads
       3. use mmap for writes
       4. use readaheads for reads needed for compaction
      
      I also added a parameter to db_bench to be able to explicitly specify
      whether to do readaheads for compactions or not.
      
      Test Plan: make check
      
      Reviewers: sheki, heyongqiang, MarkCallaghan
      
      Reviewed By: sheki
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D9429
      ad96563b
    • 1.5.8.1.fb release. · 2adddeef
      Dhruba Borthakur committed
      2adddeef
    • Removing boost from ldb_cmd.cc · a6f42754
      Mayank Agarwal committed
      Summary: Getting rid of boost in our GitHub codebase because it caused problems for third-party deployments.
      
      Test Plan: make ldb; python tools/ldb_test.py
      
      Reviewers: sheki, dhruba
      
      Reviewed By: sheki
      
      Differential Revision: https://reviews.facebook.net/D9543
      a6f42754
  2. 20 Mar 2013, 5 commits
  3. 19 Mar 2013, 1 commit
  4. 16 Mar 2013, 1 commit
  5. 15 Mar 2013, 2 commits
    • Doing away with boost in ldb_cmd.h · a78fb5e8
      Mayank Agarwal committed
      Summary: Boost functions cause complications when deploying to third-party environments.
      
      Test Plan: make
      
      Reviewers: sheki, dhruba
      
      Reviewed By: sheki
      
      Differential Revision: https://reviews.facebook.net/D9441
      a78fb5e8
    • Enhance db_bench · 5a8c8845
      Mark Callaghan committed
      Summary:
      Add --benchmarks=updaterandom for read-modify-write workloads. This is different
      from --benchmarks=readrandomwriterandom in a few ways. First, an "operation" is the
      combined time to do the read & write rather than treating them as two ops. Second,
      the same key is used for the read & write.
      
      Change RandomGenerator to support rows larger than 1M. It previously used
      "assert" to fail, but assert is compiled away when -DNDEBUG is used.
      
      Add more options to db_bench
      --duration - sets the number of seconds for tests to run. When not set the
      operation count continues to be the limit. This is used by random operation
      tests.
      
      --use_snapshot - when set GetSnapshot() is called prior to each random read.
      This is to measure the overhead from using snapshots.
      
      --get_approx - when set GetApproximateSizes() is called prior to each random
      read. This is to measure the overhead for a query optimizer.
      
      Test Plan:
      run db_bench
      
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      Differential Revision: https://reviews.facebook.net/D9267
      5a8c8845
  6. 14 Mar 2013, 2 commits
    • Updating fbcode.gcc471.sh to use jemalloc 3.3.1 · e93dc3c0
      Mayank Agarwal committed
      Summary: Updated TOOL_CHAIN_LIB_BASE to use the third-party version of jemalloc-3.3.1, which contains a bug fix in quarantine.cc. This was detected while debugging valgrind issues with the rocksdb table_test.
      
      Test Plan: make table_test;valgrind --leak-check=full ./table_test
      
      Reviewers: dhruba, sheki, vamsi
      
      Reviewed By: sheki
      
      Differential Revision: https://reviews.facebook.net/D9387
      e93dc3c0
    • Use posix_fallocate as default. · 1ba5abca
      Abhishek Kona committed
      Summary:
      ftruncate does not return an error on disk-full. This caused a SIGBUS
      when the database tried to issue a Put call on a full disk.
      
      Use posix_fallocate for allocation instead of ftruncate.
      Add a check to use mmapped files only on ext4, xfs and tmpfs, as
      posix_fallocate is very slow on ext3 and older filesystems.
      
      Test Plan: make all check
      
      Reviewers: dhruba, chip
      
      Reviewed By: dhruba
      
      CC: adsharma, leveldb
      
      Differential Revision: https://reviews.facebook.net/D9291
      1ba5abca
  7. 13 Mar 2013, 2 commits
  8. 12 Mar 2013, 2 commits
    • Prevent segfault because SizeUnderCompaction was called without any locks. · ebf16f57
      Dhruba Borthakur committed
      Summary:
      SizeBeingCompacted was called without any lock protection. This caused
      crashes, especially when running db_bench with value_size=128K.
      The fix is to compute SizeBeingCompacted while holding the mutex and
      to pass these values into the call to Finalize.
      
      (gdb) where
      #4  leveldb::VersionSet::SizeBeingCompacted (this=this@entry=0x7f0b490931c0, level=level@entry=4) at db/version_set.cc:1827
      #5  0x000000000043a3c8 in leveldb::VersionSet::Finalize (this=this@entry=0x7f0b490931c0, v=v@entry=0x7f0b3b86b480) at db/version_set.cc:1420
      #6  0x00000000004418d1 in leveldb::VersionSet::LogAndApply (this=0x7f0b490931c0, edit=0x7f0b3dc8c200, mu=0x7f0b490835b0, new_descriptor_log=<optimized out>) at db/version_set.cc:1016
      #7  0x00000000004222b2 in leveldb::DBImpl::InstallCompactionResults (this=this@entry=0x7f0b49083400, compact=compact@entry=0x7f0b2b8330f0) at db/db_impl.cc:1473
      #8  0x0000000000426027 in leveldb::DBImpl::DoCompactionWork (this=this@entry=0x7f0b49083400, compact=compact@entry=0x7f0b2b8330f0) at db/db_impl.cc:1757
      #9  0x0000000000426690 in leveldb::DBImpl::BackgroundCompaction (this=this@entry=0x7f0b49083400, madeProgress=madeProgress@entry=0x7f0b41bf2d1e, deletion_state=...) at db/db_impl.cc:1268
      #10 0x0000000000428f42 in leveldb::DBImpl::BackgroundCall (this=0x7f0b49083400) at db/db_impl.cc:1170
      #11 0x000000000045348e in BGThread (this=0x7f0b49023100) at util/env_posix.cc:941
      #12 leveldb::(anonymous namespace)::PosixEnv::BGThreadWrapper (arg=0x7f0b49023100) at util/env_posix.cc:874
      #13 0x00007f0b4a7cf10d in start_thread (arg=0x7f0b41bf3700) at pthread_create.c:301
      #14 0x00007f0b49b4b11d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
      
      Test Plan:
      make check
      
      I am running db_bench with a value size of 128K to see if the segfault is fixed.
      
      Reviewers: MarkCallaghan, sheki, emayanke
      
      Reviewed By: sheki
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D9279
      ebf16f57
    • Make the build-time show up in the leveldb library. · c04c956b
      Dhruba Borthakur committed
      Summary:
      This is a regression caused by
      https://github.com/facebook/rocksdb/commit/772f75b3fbc5cfcf4d519114751efeae04411fa1
      
      If you do "strings libleveldb.a | grep leveldb_build_git_datetime" it will
      show you the time when the binary was built.
      
      Test Plan: make check
      
      Reviewers: emayanke
      
      Reviewed By: emayanke
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D9273
      c04c956b
  9. 11 Mar 2013, 1 commit
    • [Report the #gets and #founds in db_stress] · 8ade9359
      Vamsi Ponnekanti committed
      Summary:
      Also added some comments and fixed some bugs in
      stats reporting. Now the stats seem to match what is expected.
      
      Test Plan:
      [nponnekanti@dev902 /data/users/nponnekanti/rocksdb] ./db_stress --test_batches_snapshots=1 --ops_per_thread=1000 --threads=1 --max_key=320
      LevelDB version     : 1.5
      Number of threads   : 1
      Ops per thread      : 1000
      Read percentage     : 10
      Delete percentage   : 30
      Max key             : 320
      Ratio #ops/#keys    : 3
      Num times DB reopens: 10
      Batches/snapshots   : 1
      Num keys per lock   : 4
      Compression         : snappy
      ------------------------------------------------
      No lock creation because test_batches_snapshots set
      2013/03/04-15:58:56  Starting database operations
      2013/03/04-15:58:56  Reopening database for the 1th time
      2013/03/04-15:58:56  Reopening database for the 2th time
      2013/03/04-15:58:56  Reopening database for the 3th time
      2013/03/04-15:58:56  Reopening database for the 4th time
      Created bg thread 0x7f4542bff700
      2013/03/04-15:58:56  Reopening database for the 5th time
      2013/03/04-15:58:56  Reopening database for the 6th time
      2013/03/04-15:58:56  Reopening database for the 7th time
      2013/03/04-15:58:57  Reopening database for the 8th time
      2013/03/04-15:58:57  Reopening database for the 9th time
      2013/03/04-15:58:57  Reopening database for the 10th time
      2013/03/04-15:58:57  Reopening database for the 11th time
      2013/03/04-15:58:57  Limited verification already done during gets
      Stress Test : 1811.551 micros/op 552 ops/sec
                  : Wrote 0.10 MB (0.05 MB/sec) (598% of 1011 ops)
                  : Wrote 6050 times
                  : Deleted 3050 times
                  : 500/900 gets found the key
                  : Got errors 0 times
      
      [nponnekanti@dev902 /data/users/nponnekanti/rocksdb] ./db_stress --ops_per_thread=1000 --threads=1 --max_key=320
      LevelDB version     : 1.5
      Number of threads   : 1
      Ops per thread      : 1000
      Read percentage     : 10
      Delete percentage   : 30
      Max key             : 320
      Ratio #ops/#keys    : 3
      Num times DB reopens: 10
      Batches/snapshots   : 0
      Num keys per lock   : 4
      Compression         : snappy
      ------------------------------------------------
      Creating 80 locks
      2013/03/04-15:58:17  Starting database operations
      2013/03/04-15:58:17  Reopening database for the 1th time
      2013/03/04-15:58:17  Reopening database for the 2th time
      2013/03/04-15:58:17  Reopening database for the 3th time
      2013/03/04-15:58:17  Reopening database for the 4th time
      Created bg thread 0x7fc0f5bff700
      2013/03/04-15:58:17  Reopening database for the 5th time
      2013/03/04-15:58:17  Reopening database for the 6th time
      2013/03/04-15:58:18  Reopening database for the 7th time
      2013/03/04-15:58:18  Reopening database for the 8th time
      2013/03/04-15:58:18  Reopening database for the 9th time
      2013/03/04-15:58:18  Reopening database for the 10th time
      2013/03/04-15:58:18  Reopening database for the 11th time
      2013/03/04-15:58:18  Starting verification
      Stress Test : 1836.258 micros/op 544 ops/sec
                  : Wrote 0.01 MB (0.01 MB/sec) (59% of 1011 ops)
                  : Wrote 605 times
                  : Deleted 305 times
                  : 50/90 gets found the key
                  : Got errors 0 times
      2013/03/04-15:58:18  Verification successful
      
      Revert Plan: OK
      
      Reviewers: emayanke, dhruba
      
      Reviewed By: emayanke
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D9081
      8ade9359
  10. 09 Mar 2013, 3 commits
  11. 08 Mar 2013, 2 commits
    • Make db_stress not purge redundant keys on some opens · 3b6653b1
      amayank committed
      Summary: In light of the new option introduced by commit 806e2643, where the database can compact before flushing to disk, we want the stress test to exercise both sides of the option. The test now deterministically and configurably changes that option across reopens.
      
      Test Plan: make db_stress; ./db_stress with some different options
      
      Reviewers: dhruba, vamsi
      
      Reviewed By: dhruba
      
      CC: leveldb, sheki
      
      Differential Revision: https://reviews.facebook.net/D9165
      3b6653b1
    • A mechanism to detect manifest file write errors and put db in readonly mode. · 6d812b6a
      Dhruba Borthakur committed
      Summary:
      If there is an error while writing an edit to the manifest file, the manifest
      file is closed and reopened to check if the edit made it in. However, if the
      re-opening of the manifest is unsuccessful and options.paranoid_checks is set
      to true, then the db refuses to accept new puts, effectively putting the db
      in readonly mode.
      
      In a future diff, I would like to make the default value of paranoid_checks
      true.
      
      Test Plan: make check
      
      Reviewers: sheki
      
      Reviewed By: sheki
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D9201
      6d812b6a
  12. 07 Mar 2013, 3 commits
    • Use version 3.8.1 for valgrind in third_party and do away with log files · 3b87e2bd
      amayank committed
      Summary:
      The currently used valgrind 3.7.0 has a bug that requires LD_PRELOAD to be set as a workaround. This caused problems when run on Jenkins. 3.8.1 has fixed this issue, so we should use it from third_party.
      Also, log files are gone. The whole output appears on the terminal and the failed tests are listed at the end. This is done because Jenkins only lets us download the different files rather than view them in the browser, which is undesirable.
      
      Test Plan: make valgrind_check
      
      Reviewers: akushner, dhruba, vamsi, sheki, heyongqiang
      
      Reviewed By: sheki
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D9171
      3b87e2bd
    • Do not allow Transaction Log Iterator to fall ahead when writer is writing the same file · d68880a1
      Abhishek Kona committed
      Summary:
      Store the last flushed sequence number in db_impl and check against it in
      the transaction log iterator. Do not attempt to read ahead if we do not
      know whether the data is flushed completely.
      Does not work if flush is disabled. Any ideas on fixing that?
      Minor change: iter->Next() is now called automatically the first time.
      
      Test Plan:
      existing test pass.
      More ideas on testing this?
      Planning to run some stress test.
      
      Reviewers: dhruba, heyongqiang
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D9087
      d68880a1
    • Fix db_stress crash by copying keys before changing the sequence number to zero. · afed6093
      Dhruba Borthakur committed
      Summary:
      The compaction process zeros out sequence numbers if the output is
      part of the bottommost level.
      A Slice is supposed to refer to an immutable data buffer. The
      merger that implements the priority queue while reading kvs as
      the input of a compaction run relies on this fact. The bug was that
      we were updating the sequence number of a record in-place, which
      caused succeeding invocations of the merger to return kvs in
      arbitrary order of sequence numbers.
      The fix is to copy the key to a local memory buffer before setting
      its seqno to 0.
      
      Test Plan:
      Set Options.purge_redundant_kvs_while_flush = false and then run
      db_stress --ops_per_thread=1000 --max_key=320
      
      Reviewers: emayanke, sheki
      
      Reviewed By: emayanke
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D9147
      afed6093
  13. 06 Mar 2013, 3 commits
  14. 05 Mar 2013, 2 commits
  15. 04 Mar 2013, 2 commits
    • Add rate_delay_limit_milliseconds · 993543d1
      Mark Callaghan committed
      Summary:
      This adds the rate_delay_limit_milliseconds option to make the delay
      configurable in MakeRoomForWrite when the max compaction score is too high.
      This delay is called the Ln slowdown. This change also counts the Ln slowdown
      per level to make it possible to see where the stalls occur.
      
      From IO-bound performance testing, the Level N stalls occur:
      * with compression -> at the largest uncompressed level. This makes sense
                            because compaction for compressed levels is much
                            slower. When Lx is uncompressed and Lx+1 is compressed
                            then files pile up at Lx because the (Lx,Lx+1)->Lx+1
                            compaction process is the first to be slowed by
                            compression.
      * without compression -> at level 1
      
      Task ID: #1832108
      
      Test Plan:
      run with real data, added test
      
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      Differential Revision: https://reviews.facebook.net/D9045
      993543d1
    • Ability for rocksdb to compact when flushing the in-memory memtable to a file in L0. · 806e2643
      Dhruba Borthakur committed
      Summary:
      Rocks accumulates recent writes and deletes in the in-memory memtable.
      When the memtable is full, it writes the contents on the memtable to
      a file in L0.
      
      This patch removes redundant records at the time of the flush. If there
      are multiple versions of the same key in the memtable, then only the
      most recent one is dumped into the output file. The purging of
      redundant records occurs only if the most recent snapshot is earlier
      than the earliest record in the memtable.
      
      Should we switch on this feature by default or should we keep this feature
      turned off in the default settings?
      
      Test Plan: Added test case to db_test.cc
      
      Reviewers: sheki, vamsi, emayanke, heyongqiang
      
      Reviewed By: sheki
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D8991
      806e2643
  16. 02 Mar 2013, 2 commits
    • Enable the ability to set key size in db_bench in rocksdb · 49926337
      bil committed
      Summary:
      1. The default value for key size is still 16.
      2. Enable the ability to set the key size via the command line: --key_size=
      
      Test Plan:
      Build and run db_bench, passing some value via the command line.
      Verify it works correctly.
      
      Reviewers: sheki
      
      Reviewed By: sheki
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D8943
      49926337
    • Automating valgrind to run with jenkins · ec96ad54
      amayank committed
      Summary:
      The script valgrind_test.sh runs Valgrind for all tests in the makefile
      including leak-checks and outputs the logs for every test in a separate file
      with the name "valgrind_log_<testname>". It prints the failed tests in the file
      "valgrind_failed_tests". All these files are created in the directory
      "VALGRIND_LOGS" which can be changed in the Makefile.
      Finally it checks the line-count for the file "valgrind_failed_tests"
      and returns 0 if no tests failed and 1 otherwise.
      
      Test Plan: ./valgrind_test.sh; changed the tests to incorporate leaks and verified correctness
      
      Reviewers: dhruba, sheki, MarkCallaghan
      
      Reviewed By: sheki
      
      CC: zshao
      
      Differential Revision: https://reviews.facebook.net/D8877
      ec96ad54
  17. 01 Mar 2013, 1 commit
  18. 27 Feb 2013, 1 commit
  19. 26 Feb 2013, 1 commit
  20. 23 Feb 2013, 1 commit
    • Add a second kind of verification to db_stress · 465b9103
      Vamsi Ponnekanti committed
      Summary:
      Currently the test tracks all writes in memory and
      uses them for verification at the end. This has 4 problems:
      (a) It needs a mutex for each write to ensure the in-memory update
      and the leveldb update are done atomically. This slows down the
      benchmark.
      (b) The verification phase at the end is time consuming as well.
      (c) It does not test batch writes or snapshots.
      (d) We cannot kill the test and restart it multiple times in a
      loop because the in-memory state will be lost.
      
      I am adding a FLAGS_multi that does MultiGet/MultiPut/MultiDelete
      instead of get/put/delete to get/put/delete a group of related
      keys with the same values atomically. Every get retrieves the group
      of keys and checks that their values are the same. This does not
      have the above problems, but the downside is that it does less
      validation than the other approach.
      
      Test Plan:
      This whole thing is a test! Here is a small run. I am doing a larger run now.
      
      [nponnekanti@dev902 /data/users/nponnekanti/rocksdb] ./db_stress --ops_per_thread=10000 --multi=1 --ops_per_key=25
      LevelDB version     : 1.5
      Number of threads   : 32
      Ops per thread      : 10000
      Read percentage     : 10
      Delete percentage   : 30
      Max key             : 2147483648
      Num times DB reopens: 10
      Num keys per lock   : 4
      Compression         : snappy
      ------------------------------------------------
      Creating 536870912 locks
      2013/02/20-16:59:32  Starting database operations
      Created bg thread 0x7f9ebcfff700
      2013/02/20-16:59:37  Reopening database for the 1th time
      2013/02/20-16:59:46  Reopening database for the 2th time
      2013/02/20-16:59:57  Reopening database for the 3th time
      2013/02/20-17:00:11  Reopening database for the 4th time
      2013/02/20-17:00:25  Reopening database for the 5th time
      2013/02/20-17:00:36  Reopening database for the 6th time
      2013/02/20-17:00:47  Reopening database for the 7th time
      2013/02/20-17:00:59  Reopening database for the 8th time
      2013/02/20-17:01:10  Reopening database for the 9th time
      2013/02/20-17:01:20  Reopening database for the 10th time
      2013/02/20-17:01:31  Reopening database for the 11th time
      2013/02/20-17:01:31  Starting verification
      Stress Test : 109.125 micros/op 22191 ops/sec
                  : Wrote 0.00 MB (0.23 MB/sec) (59% of 32 ops)
                  : Deleted 10 times
      2013/02/20-17:01:31  Verification successful
      
      Revert Plan: OK
      
      Reviewers: dhruba, emayanke
      
      Reviewed By: emayanke
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D8733
      465b9103