1. 09 Jan 2013, 1 commit
  2. 08 Jan 2013, 1 commit
    • Add --seed, --read_range to db_bench · 4069f66c
      Committed by Mark Callaghan
      Summary:
      Adds the option --seed to db_bench to specify the base for the per-thread
      RNG. When not set, each thread uses the same value across runs of
      db_bench, which defeats IO stress testing.
      
      Adds the option --read_range. When set to a value > 1, an iterator is
      created and each query done for the readrandom benchmark does a range
      scan of that many rows. When not set or set to 1, the existing behavior
      (a point lookup) is used.
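
      A minimal sketch of what these two options imply, assuming illustrative
      names (FLAGS_seed, FLAGS_read_range, a per-thread leveldb::Random) rather
      than the actual db_bench code:

        // Seed each thread's RNG from a user-supplied base so that repeated
        // runs touch different keys (hypothetical names).
        uint32_t base = (FLAGS_seed != 0) ? FLAGS_seed : 1000;  // fixed default when unset
        leveldb::Random rand(base + thread_id);                 // per-thread RNG

        // For --read_range=N with N > 1, replace the point Get() with a scan.
        leveldb::Iterator* iter = db->NewIterator(leveldb::ReadOptions());
        iter->Seek(key);
        for (int i = 0; i < FLAGS_read_range && iter->Valid(); ++i) {
          iter->Next();  // visit N consecutive rows starting at key
        }
        delete iter;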
      
      Fixes a bug where a printf format string was missing.
      
      Test Plan: run db_bench
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D7749
  3. 29 Nov 2012, 2 commits
    • Move WAL files to an archive directory instead of deleting them. · d4627e6d
      Committed by sheki
      Summary:
      Create a directory "archive" in the DB directory.
      During DeleteObsoleteFiles, move the WAL files (*.log) to the archive
      directory instead of deleting them.
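
      A minimal sketch of the archiving step, assuming illustrative names for
      the surrounding DeleteObsoleteFiles logic:

        // Move obsolete WALs aside rather than removing them (hypothetical
        // variable names; error handling elided).
        std::string archive_dir = dbname_ + "/archive";
        env_->CreateDir(archive_dir);  // ignore "already exists" failures
        if (type == kLogFile) {
          env_->RenameFile(dbname_ + "/" + filename, archive_dir + "/" + filename);
        } else {
          env_->DeleteFile(dbname_ + "/" + filename);
        }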
      
      Test Plan: Created a DB using db_bench. Reopened it. Checked that the files moved.
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      Differential Revision: https://reviews.facebook.net/D6975
    • Fix all the lint errors. · d29f1819
      Committed by Abhishek Kona
      Summary:
      Scripted removal of all trailing spaces and conversion of all tabs to
      spaces.

      Also fixed other lint errors.
      All lint errors from this point on should be taken seriously.
      
      Test Plan: make all check
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D7059
  4. 22 Nov 2012, 1 commit
    • Support taking a configurable number of files from the same level to compact... · 7632fdb5
      Committed by Dhruba Borthakur
      Support taking a configurable number of files from the same level to compact in a single compaction run.
      
      Summary:
      The compaction process takes some files from LevelK and merges them into
      LevelK+1. The number of files it picks from LevelK was capped such that
      the total amount of data picked does not exceed the maxfilesize of that
      level. This essentially meant that only one file from LevelK is picked
      for a single compaction.
      
      For bulk loads, we would like to take many files from LevelK and compact
      them in a single compaction run.
      
      This patch introduces an option called 'source_compaction_factor'
      (similar to expanded_compaction_factor). It is a multiplier that is
      applied to the maxfilesize of that level to arrive at the limit used to
      throttle the number of source files from LevelK. For bulk loads, set
      source_compaction_factor to a very high number so that multiple files
      from the same level are picked for compaction in a single run.

      The default value of source_compaction_factor is 1, which keeps backward
      compatibility with the existing compaction semantics.
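
      A minimal sketch of the throttle this implies (helper and member names
      mirror the description, not necessarily the actual code):

        // Allow the LevelK input set to grow until it reaches
        // maxfilesize * source_compaction_factor.
        uint64_t limit = MaxFileSizeForLevel(level) * options_.source_compaction_factor;
        uint64_t total = 0;
        for (size_t i = 0; i < candidates.size(); i++) {
          total += candidates[i]->file_size;
          inputs.push_back(candidates[i]);
          if (total >= limit) break;  // stop expanding the source set here
        }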
      
      Test Plan: make clean check
      
      Reviewers: emayanke, sheki
      
      Reviewed By: emayanke
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D6867
  5. 21 Nov 2012, 1 commit
  6. 15 Nov 2012, 1 commit
  7. 13 Nov 2012, 1 commit
    • The db_bench utility was broken in 1.5.4.fb because of a signed-unsigned comparison. · a785e029
      Committed by Dhruba Borthakur
      Summary:
      The db_bench utility was broken in 1.5.4.fb because of a signed-unsigned
      comparison.

      The static variable FLAGS_min_level_to_compress was recently changed from
      int to 'unsigned int', but it is initialized to the negative value -1.
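
      A hypothetical reduction of the bug (not the actual db_bench source):

        static unsigned int FLAGS_min_level_to_compress = -1;  // wraps to UINT_MAX

        // Any "is it unset?" sentinel test is then broken; this is always true:
        if (FLAGS_min_level_to_compress >= 0) {
          // runs even when the flag was meant to be "unset"
        }

        // Keeping the variable signed makes the -1 sentinel compare as expected:
        static int FLAGS_min_level_to_compress_fixed = -1;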
      
      The segfault is of this type:
      Program received signal SIGSEGV, Segmentation fault.
      Open (this=0x7fffffffdee0) at db/db_bench.cc:939
      939	db/db_bench.cc: No such file or directory.
      (gdb) where
      
      Test Plan: run db_bench with no options.
      
      Reviewers: heyongqiang
      
      Reviewed By: heyongqiang
      
      CC: MarkCallaghan, emayanke, sheki
      
      Differential Revision: https://reviews.facebook.net/D6663
  8. 10 Nov 2012, 2 commits
  9. 09 Nov 2012, 1 commit
    • Stats collection in leveldb · 391885c4
      Committed by Abhishek Kona
      Summary:
      Prototype stats collection. This diff is a good estimate of what the
      final code will look like.
      A few assumptions:
        * Uses a global static instance of the statistics object. The plan is
          to pass it to each internal function; a static allows metrics only at
          the app level.
        * The Tickers do not do any locking; they depend on the mutex taken in
          each LevelDB function. If we ever remove that mutex, this should
          change too. The other option is to use atomic objects anyway, since
          there will not be any contention: they are only ever acquired by one
          thread.
        * The counters are dumb: they just increment over the process lifetime.
          The plan is to use ods etc. to derive stats such as last-5-minute
          rates (see the sketch after this list).
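
      A minimal sketch of the unlocked ticker idea, with illustrative names
      rather than the final API:

        enum Tickers { BLOCK_CACHE_HIT, BLOCK_CACHE_MISS, TICKER_MAX };

        class Statistics {
         public:
          void RecordTick(Tickers t, long count = 1) { tickers_[t] += count; }
          long GetTickerCount(Tickers t) const { return tickers_[t]; }
         private:
          long tickers_[TICKER_MAX] = {};  // no locking: relies on the DB mutex
        };

        static Statistics stats;  // global static instance, app-level metrics only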
      
      Test Plan:
      made changes in db_bench
      Ran ./db_bench --statistics=1 --num=10000 --cache_size=5000
      This will print the cache hit/miss stats.
      
      Reviewers: dhruba, heyongqiang
      
      Differential Revision: https://reviews.facebook.net/D6441
  10. 08 Nov 2012, 1 commit
  11. 07 Nov 2012, 1 commit
  12. 03 Nov 2012, 1 commit
  13. 02 Nov 2012, 1 commit
  14. 30 Oct 2012, 3 commits
    • Allow having different compression algorithms on different levels. · 321dfdc3
      Committed by Dhruba Borthakur
      Summary:
      The leveldb API is enhanced to support different compression algorithms
      at different levels.

      This adds the option min_level_to_compress to db_bench, which specifies
      the minimum level for which compression should be done when compression
      is enabled. It can be used to disable compression for levels 0 and 1,
      which are likely to suffer from stalls because of the CPU load from
      memtable flushes and (L0,L1) compaction. Level 0 is special because it
      gets frequent memtable flushes. Level 1 is special because it frequently
      gets all:all file compactions between it and level 0. All other levels
      behave alike: for any level N where N > 1, the rate of sequential IO for
      that level should be the same. The last level is the exception because
      it might not be full and because its files are not read to compact with
      the next larger level.
      
      The same amount of time will be spent doing compaction at any level N
      excluding N=0, N=1, and the last level, so by this standard all of those
      levels should use the same compression. The difference is that the loss
      (extra disk space) from a faster compression algorithm is less
      significant for N=2 than for N=3. So we might be willing to trade disk
      space for faster write rates: no compression for L0 and L1, snappy for
      L2, zlib for L3. Using a faster compression algorithm for the mid levels
      also lets us reclaim some CPU without giving up much disk space overhead.

      Also note that little is gained by compressing levels 0 and 1. For a
      4-level tree they account for 10% of the data; for a 5-level tree, 1%.
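
      A minimal sketch of the per-level selection this enables, assuming a
      hook of this shape (the function name is illustrative):

        leveldb::CompressionType CompressionForLevel(int level) {
          if (level < FLAGS_min_level_to_compress) {
            return leveldb::kNoCompression;    // keep L0/L1 cheap during flushes
          }
          return leveldb::kSnappyCompression;  // compress the colder, larger levels
        }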
      
      With compression enabled:
      * memtable flush rate is ~18MB/second
      * (L0,L1) compaction rate is ~30MB/second
      
      With compression enabled but min_level_to_compress=2
      * memtable flush rate is ~320MB/second
      * (L0,L1) compaction rate is ~560MB/second
      
      This takes practically the same code from https://reviews.facebook.net/D6225
      but makes the leveldb API more general purpose with a few additional
      lines of code.
      
      Test Plan: make check
      
      Differential Revision: https://reviews.facebook.net/D6261
    • Add more rates to db_bench output · acc8567b
      Committed by Mark Callaghan
      Summary:
      Adds "MB/sec in" and "MB/sec out" to this line:
      Amplification: 1.7 rate, 0.01 GB in, 0.02 GB out, 8.24 MB/sec in, 13.75 MB/sec out
      
      Changes all values on this line to be reported both per interval and since the test start:
      ... thread 0: (10000,60000) ops and (19155.6,27307.5) ops/second in (0.522041,2.197198) seconds
      
      Test Plan:
      run db_bench
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      Differential Revision: https://reviews.facebook.net/D6291
    • Adds DB::GetNextCompaction and then uses that for rate limiting db_bench · 70c42bf0
      Committed by Mark Callaghan
      Summary:
      Adds a method that returns the compaction score for the level that most
      needs compaction. That method is then used by db_bench to rate-limit
      threads: they are put to sleep at the end of each stats interval until
      the score is less than the limit. The limit is set via the
      --rate_limit=$double option and the specified value must be > 1.0. Also
      adds the option --stats_per_interval to enable additional metrics
      reported every stats interval.
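
      A minimal sketch of the throttle in db_bench (the sleep quantum is an
      assumption; the method name follows the title):

        // Back off at the end of a stats interval until the compaction score
        // drops below the configured limit.
        while (FLAGS_rate_limit > 1.0 &&
               db->GetNextCompaction() >= FLAGS_rate_limit) {
          env->SleepForMicroseconds(100 * 1000);
        }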
      
      Test Plan:
      run db_bench
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      Differential Revision: https://reviews.facebook.net/D6243
  15. 25 Oct 2012, 1 commit
    • Improve statistics · e7206f43
      Committed by Mark Callaghan
      Summary:
      This adds more statistics to be reported by GetProperty("leveldb.stats").
      The new stats include time spent waiting on stalls in MakeRoomForWrite.
      They also include the total amplification rate, where that is:
          (#bytes of sequential IO during compaction) / (#bytes from Put)
      They also add a lot more data to the per-level compaction report:
      * Rn(MB) - MB read from level N during compaction between levels N and N+1
      * Rnp1(MB) - MB read from level N+1 during compaction between levels N and N+1
      * Wnew(MB) - new data written to the level during compaction
      * Amplify - ( Write(MB) + Rnp1(MB) ) / Rn(MB)
      * Rn - files read from level N during compaction between levels N and N+1
      * Rnp1 - files read from level N+1 during compaction between levels N and N+1
      * Wnp1 - files written to level N+1 during compaction between levels N and N+1
      * NewW - new files written to level N+1 during compaction
      * Count - number of compactions done for this level
      
      This is the new output from DB::GetProperty("leveldb.stats"). The old output stopped at Write(MB).
      
                                     Compactions
      Level  Files Size(MB) Time(sec) Read(MB) Write(MB)  Rn(MB) Rnp1(MB) Wnew(MB) Amplify Read(MB/s) Write(MB/s)   Rn Rnp1 Wnp1 NewW Count
      -------------------------------------------------------------------------------------------------------------------------------------
        0        3        6        33        0       576       0        0      576    -1.0       0.0         1.3     0    0    0    0   290
        1      127      242       351     5316      5314     570     4747      567    17.0      12.1        12.1   287 2399 2685  286    32
        2      161      328        54      822       824     326      496      328     4.0       1.9         1.9   160  251  411  160   161
      Amplification: 22.3 rate, 0.56 GB in, 12.55 GB out
      Uptime(secs): 439.8
      Stalls(secs): 206.938 level0_slowdown, 0.000 level0_numfiles, 24.129 memtable_compaction
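
      As a worked check of the Amplify formula against the level-2 row above:
      (Write(MB) + Rnp1(MB)) / Rn(MB) = (824 + 496) / 326 ≈ 4.0, which matches
      the Amplify column.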
      
      Test Plan:
      run db_bench
      
      (cherry picked from commit ecdeead38f86cc02e754d0032600742c4f02fec8)
      
      Reviewers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D6153
  16. 20 Oct 2012, 2 commits
    • db_bench was not correctly initializing the value of the delete_obsolete_files_period_micros option. · cf5adc80
      Committed by Dhruba Borthakur
      Summary:
      The parameter delete_obsolete_files_period_micros controls the
      periodicity of deleting obsolete files. db_bench was reading this
      parameter into a local variable called 'l' but was incorrectly using
      another local variable called 'n' while setting it in the db.options
      data structure.

      This patch also logs the value of delete_obsolete_files_period_micros in
      the LOG file at db startup time.
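
      A hypothetical reduction of the bug (the parsing shape is illustrative;
      db_bench's real option loop differs):

        long l = 0, n = 0;  // 'n' holds some other, unrelated option value
        if (sscanf(argv[i], "--delete_obsolete_files_period_micros=%ld", &l) == 1) {
          options.delete_obsolete_files_period_micros = n;  // bug: should be 'l'
        }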
      
      I am hoping that this will improve the overall write throughput drastically.
      
      Test Plan: run db_bench
      
      Reviewers: MarkCallaghan, heyongqiang
      
      Reviewed By: MarkCallaghan
      
      Differential Revision: https://reviews.facebook.net/D6099
    • This is the mega-patch multi-threaded compaction · 1ca05843
      Committed by Dhruba Borthakur
      (Published in https://reviews.facebook.net/D5997.)
      
      Summary:
      This patch allows compaction to occur concurrently in multiple
      background threads.

      If a manual compaction is issued, the system falls back to a
      single-compaction-thread model. This is done to ensure correctness and
      simplicity of the code. When the manual compaction finishes, the system
      automatically resumes its concurrent-compaction mode.

      The updates to the manifest are done via a group-commit approach.
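
      A minimal sketch of the scheduling shape this implies (member and option
      names are assumptions):

        void DBImpl::MaybeScheduleCompaction() {
          if (bg_compactions_scheduled_ < options_.max_background_compactions &&
              !manual_compaction_in_progress_) {
            bg_compactions_scheduled_++;
            env_->Schedule(&DBImpl::BGWork, this);  // run on a pool thread
          }
        }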
      
      Test Plan: run db_bench
  17. 17 Oct 2012, 1 commit
    • The deletion of obsolete files should not occur very frequently. · aa73538f
      Committed by Dhruba Borthakur
      Summary:
      The method DeleteObsoleteFiles is very costly, especially when the
      number of files in a system is large. It makes a list of all live files
      and then scans the directory to compute the difference.
      By default, this method is executed after every compaction run.

      This patch makes it so that DeleteObsoleteFiles is never invoked twice
      within a configured period.
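
      A minimal sketch of the throttle, reusing the
      delete_obsolete_files_period_micros option name from the related
      db_bench change (the member names are assumptions):

        void DBImpl::DeleteObsoleteFiles() {
          const uint64_t now = env_->NowMicros();
          if (now < last_delete_obsolete_micros_ +
                        options_.delete_obsolete_files_period_micros) {
            return;  // ran too recently; skip this pass
          }
          last_delete_obsolete_micros_ = now;
          // ... list live files, scan the directory, delete the difference ...
        }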
      
      Test Plan: run all unit tests
      
      Reviewers: heyongqiang, MarkCallaghan
      
      Reviewed By: MarkCallaghan
      
      Differential Revision: https://reviews.facebook.net/D6045
  18. 16 Oct 2012, 1 commit
  19. 04 Oct 2012, 3 commits
    • A configurable option to write data using write() instead of mmap. · c1006d42
      Committed by Dhruba Borthakur
      Summary:
      We have seen that reading data via the pread call (instead of mmap) is
      much faster on Linux 2.6.x kernels. This patch adds an equivalent option
      to switch off mmap for the write path as well.

      db_bench --mmap_write=0 will use write() instead of mmap() to write data
      to a file.

      This change is backward compatible: the default is to continue using
      mmap for writing to a file.
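
      A minimal sketch of the selection in the POSIX Env (the factory names
      are assumptions):

        leveldb::Status NewWritableFile(const std::string& fname,
                                        leveldb::WritableFile** result) {
          if (FLAGS_mmap_write) {
            return NewMmapWritableFile(fname, result);  // default: mmap-backed
          }
          return NewPosixWritableFile(fname, result);   // plain write() path
        }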
      
      Test Plan: "make check all"
      
      Differential Revision: https://reviews.facebook.net/D5781
    • Add --stats_interval option to db_bench · e678a594
      Committed by Mark Callaghan
      Summary:
      The option is zero by default, and in that case reporting is unchanged:
      the interval at which stats are reported is scaled up after each report,
      and no newline is issued after each report, so one line is rewritten in
      place. When non-zero, it specifies a constant interval (in operations)
      at which statistics are reported, and the stats include the rate per
      interval. This makes it easier to determine whether QPS changes over the
      duration of the test.
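
      A minimal sketch of the reporting check in the per-op loop (the helper
      names are illustrative):

        done_++;
        if (done_ >= next_report_) {
          if (FLAGS_stats_interval == 0) {
            PrintProgressInPlace();                         // rewrite one line
            next_report_ = NextScaledTarget(next_report_);  // 1000, 5000, 10000, ...
          } else {
            PrintIntervalStats();                  // newline + per-interval rate
            next_report_ += FLAGS_stats_interval;  // constant interval in ops
          }
        }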
      
      Test Plan:
      run db_bench
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      CC: heyongqiang
      
      Differential Revision: https://reviews.facebook.net/D5817
    • Fix the bounds check for the --readwritepercent option · d8763abe
      Committed by Mark Callaghan
      Summary:
      see above
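
      A hypothetical shape of the corrected check (the accepted range is an
      assumption based on the option being a percentage):

        if (FLAGS_readwritepercent < 0 || FLAGS_readwritepercent > 100) {
          fprintf(stderr, "error: --readwritepercent must be between 0 and 100\n");
          exit(1);
        }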
      
      Test Plan:
      run db_bench with invalid value for option
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      CC: heyongqiang
      
      Differential Revision: https://reviews.facebook.net/D5823
  20. 03 Oct 2012, 1 commit
    • Fix compiler warnings and errors in ldb.c · 98804f91
      Committed by Mark Callaghan
      Summary:
      stdlib.h is needed for exit().
      Renamed the misspelled --readhead option to --readahead.
      
      Test Plan:
      compile
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      CC: heyongqiang
      
      Differential Revision: https://reviews.facebook.net/D5805
  21. 02 Oct 2012, 1 commit
    • Command-line tool to compact LevelDB databases. · fec81318
      Committed by Abhishek Kona
      Summary:
      A simple CLI that calls DB->CompactRange() and can take string keys as
      the range.
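
      A minimal sketch of the core of such a tool (argument handling elided;
      the real CLI's flags may differ):

        leveldb::DB* db;
        leveldb::Status s = leveldb::DB::Open(leveldb::Options(), argv[1], &db);
        if (s.ok()) {
          leveldb::Slice begin(argv[2]), end(argv[3]);
          db->CompactRange(&begin, &end);  // compact just this key range
          delete db;
        }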
      
      Test Plan:
      Inserted data into a table. Waited a minute, then used the compact tool
      on it. File modification times changed, so Compact did something to the
      files.
      
      Existing unit tests work.
      
      Reviewers: heyongqiang, dhruba
      
      Reviewed By: dhruba
      
      Differential Revision: https://reviews.facebook.net/D5697
  22. 18 Sep 2012, 1 commit
    • Add an option to disable seek compaction · a8464ed8
      Committed by heyongqiang
      Summary:
      As subject. This diff should be good for benchmarking.

      A follow-up diff will improve the case where seek compaction is enabled:
      in that coming diff, a seek will not be counted if the bloom filter
      filters it out.
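
      A minimal sketch of how the option might be used (the field name is an
      assumption):

        leveldb::Options options;
        options.disable_seek_compaction = true;  // ignore seek-triggered compactions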
      
      Test Plan: build
      
      Reviewers: dhruba, MarkCallaghan
      
      Reviewed By: MarkCallaghan
      
      Differential Revision: https://reviews.facebook.net/D5481
  23. 17 Sep 2012, 1 commit
  24. 15 Sep 2012, 2 commits
  25. 14 Sep 2012, 2 commits
  26. 13 Sep 2012, 1 commit
  27. 07 Sep 2012, 1 commit
  28. 05 Sep 2012, 1 commit
    • Benchmark with both reads and writes at the same time. · 94208a78
      Committed by Dhruba Borthakur
      Summary:
      This patch enables the db_bench benchmark to issue both random reads and
      random writes at the same time. The option can be triggered via:
      ./db_bench --benchmarks=readrandomwriterandom

      The default percentage of reads is 90.

      One can change the percentage of reads by specifying --readwritepercent:
      ./db_bench --benchmarks=readrandomwriterandom --readwritepercent=50
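
      A minimal sketch of the per-operation mix (the 0-99 draw and helper
      names are assumptions):

        if (thread->rand.Uniform(100) < FLAGS_readwritepercent) {
          db->Get(read_options, key, &value);  // read path, 90% by default
        } else {
          db->Put(write_options, key, gen.Generate(value_size));  // write path
        }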
      
      This is a feature request from Jeffro asking for leveldb performance with a 90:10 read:write ratio.
      
      Test Plan: run on test machine.
      
      Reviewers: heyongqiang
      
      Reviewed By: heyongqiang
      
      Differential Revision: https://reviews.facebook.net/D5067
  29. 30 Aug 2012, 1 commit
    • The sharding of the block cache is limited to 2**20 pieces. · e5fe80e4
      Committed by Dhruba Borthakur
      Summary:
      The number of shards that the block cache is divided into is
      configurable. However, if the user asks for the block cache to be
      divided into more than 2**20 pieces, the system will try to allocate a
      huge array of that size, which could fail.

      It is better to limit the sharding of the block cache to an upper bound.
      The default sharding is 16 shards (i.e. 2**4) and the maximum is now
      2**20 (about one million) shards.

      Also fixed a bug in the LRUCache where numShardBits should be a private
      member of the LRUCache object rather than a static variable.
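
      A minimal sketch of the clamp (the constants follow the text: default
      2**4 shards, at most 2**20):

        static const int kDefaultNumShardBits = 4;
        static const int kMaxNumShardBits = 20;

        int num_shard_bits = (FLAGS_cache_numshardbits > 0)
                                 ? FLAGS_cache_numshardbits
                                 : kDefaultNumShardBits;
        if (num_shard_bits > kMaxNumShardBits) {
          num_shard_bits = kMaxNumShardBits;  // avoid allocating a huge shard array
        }
        size_t num_shards = static_cast<size_t>(1) << num_shard_bits;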
      
      Test Plan:
      run db_bench with --cache_numshardbits=64.
      
      Reviewers: heyongqiang
      
      Reviewed By: heyongqiang
      
      Differential Revision: https://reviews.facebook.net/D5013
  30. 29 Aug 2012, 1 commit
    • Merge 1.5 · a4f9b8b4
      Committed by heyongqiang
      Summary: as subject

      Test Plan: db_test, table_test
      
      Reviewers: dhruba
  31. 28 Aug 2012, 1 commit
    • Introduce a new method Env->Fsync() that issues fsync (instead of fdatasync). · fc20273e
      Committed by Dhruba Borthakur
      Summary:
      Introduce a new method Env->Fsync() that issues fsync (instead of
      fdatasync). This is needed for data durability when running on ext3
      filesystems. Added options to the db_bench benchmark to generate
      performance numbers with either fsync or fdatasync enabled.
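
      A minimal sketch of the new method on the POSIX writable file (IOError
      is assumed to be the usual env_posix helper):

        virtual leveldb::Status Fsync() {
          // fsync() flushes file data *and* metadata; fdatasync() may skip
          // metadata updates, which is not durable enough on ext3.
          if (fsync(fd_) < 0) {
            return IOError(filename_, errno);
          }
          return leveldb::Status::OK();
        }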
      
      Cleaned up the Makefile to build leveldb_shell only when building the
      thrift leveldb server.
      
      Test Plan: build and run benchmark
      
      Reviewers: heyongqiang
      
      Reviewed By: heyongqiang
      
      Differential Revision: https://reviews.facebook.net/D4911