1. 21 Nov, 2012 1 commit
    • A major bug that was not considering the compaction score of the n-1 level. · 3754f2f4
      Dhruba Borthakur committed
      Summary:
      The method Finalize() recomputes the compaction score of each
      level and then sorts these scores from largest to smallest. The
      idea is that the level with the largest compaction score will
      be a better candidate for compaction.  There are usually very
      few levels, and a bubble sort code was used to sort these
      compaction scores. There existed a bug in the sorting code that
      skipped looking at the score for the n-1 level. This meant that
      even if the compaction score of the n-1 level is large, it will
      not be picked for compaction.
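
      As an illustration of the sorting step described above, here is a
      minimal sketch of a descending bubble sort over per-level scores. The
      LevelScore struct and the function name are hypothetical, and the
      faulty loop itself is not shown in this commit message; the sketch only
      shows loop bounds that do reach the last element, followed by the kind
      of consistency assert the patch mentions.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the per-level bookkeeping; not the real
// VersionSet data structures.
struct LevelScore {
  int level;
  double score;
};

// Sorts in place so that the largest compaction score comes first.
// Bubble sort is adequate because there are only a handful of levels.
void SortByScoreDescending(std::vector<LevelScore>& scores) {
  const std::size_t n = scores.size();
  for (std::size_t i = 0; i + 1 < n; ++i) {
    // The inner bound lets j + 1 reach index n - 1 on the first pass.
    // Stopping one element short would leave the last score unexamined.
    for (std::size_t j = 0; j + 1 < n - i; ++j) {
      if (scores[j].score < scores[j + 1].score) {
        std::swap(scores[j], scores[j + 1]);
      }
    }
  }
  // Consistency check: every score must be >= the one after it.
  for (std::size_t i = 0; i + 1 < n; ++i) {
    assert(scores[i].score >= scores[i + 1].score);
  }
}
```

      With three levels scored 0.5, 0.2 and 3.0, the level holding 3.0 ends
      up first even though it was the last entry in the input.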
      
      This patch fixes the bug and also introduces "asserts" in the
      code to detect any possible inconsistencies caused by future bugs.
      
      This bug existed in the very first code change that introduced
      multi-threaded compaction to the leveldb code. That version of
      code was committed on Oct 19th via
      https://github.com/facebook/leveldb/commit/1ca0584345af85d2dccc434f451218119626d36e
      
      Test Plan: make clean check OPT=-g
      
      Reviewers: emayanke, sheki, MarkCallaghan
      
      Reviewed By: sheki
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D6837
  2. 20 Nov, 2012 10 commits
    • Fix asserts · dde70898
      Dhruba Borthakur committed
      Summary:
      make check OPT=-g fails with the following assert.
      ==== Test DBTest.ApproximateSizes
      db_test: db/version_set.cc:765: void leveldb::VersionSet::Builder::CheckConsistencyForDeletes(leveldb::VersionEdit*, int, int): Assertion `found' failed.
      
      The assertion claimed that file #7, which was being deleted, did not
      pre-exist, but it actually did pre-exist, as the manifest
      dump below shows. The bug was that we did not check for file
      existence at the same level.
      
      *************************Edit[0] = VersionEdit {
        Comparator: leveldb.BytewiseComparator
      }
      
      *************************Edit[1] = VersionEdit {
        LogNumber: 8
        PrevLogNumber: 0
        NextFile: 9
        LastSeq: 80
        AddFile: 0 7 8005319 'key000000' @ 1 : 1 .. 'key000079' @ 80 : 1
      }
      
      *************************Edit[2] = VersionEdit {
        LogNumber: 8
        PrevLogNumber: 0
        NextFile: 13
        LastSeq: 80
        CompactPointer: 0 'key000079' @ 80 : 1
        DeleteFile: 0 7
        AddFile: 1 9 2101425 'key000000' @ 1 : 1 .. 'key000020' @ 21 : 1
        AddFile: 1 10 2101425 'key000021' @ 22 : 1 .. 'key000041' @ 42 : 1
        AddFile: 1 11 2101425 'key000042' @ 43 : 1 .. 'key000062' @ 63 : 1
        AddFile: 1 12 1701165 'key000063' @ 64 : 1 .. 'key000079' @ 80 : 1
      }
      
      Test Plan:
      
      Reviewers:
      
      CC:
      
      Task ID: #
      
      Blame Rev:
    • Merge branch 'master' into performance · a4b79b6e
      Dhruba Borthakur committed
    • Fix compilation error while compiling unit tests with OPT=-g · 74054fa9
      Dhruba Borthakur committed
      Summary:
      Fix compilation error while compiling with OPT=-g
      
      Test Plan:
      make clean check OPT=-g
      
      Reviewers:
      
      CC:
      
      Task ID: #
      
      Blame Rev:
    • Fix compilation error introduced by previous commit · 48dafb2c
      Dhruba Borthakur committed
      7889e094
      
      Summary:
      Fix compilation error introduced by previous commit
      7889e094
      
      Test Plan:
      make clean check
    • Enhance manifest_dump to print each individual edit. · 7889e094
      Dhruba Borthakur committed
      Summary:
      The manifest file contains a series of edits. If the verbose
      option is switched on, then print each individual edit in the
      manifest file. This helps in debugging.
      
      Test Plan: make clean manifest_dump
      
      Reviewers: emayanke, sheki
      
      Reviewed By: sheki
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D6807
    • Fix LDB dumpwal to print the messages as in the file. · 661dc157
      Abhishek Kona committed
      Summary:
      stringstream::clear() does not empty the stream. It only resets the
      stream's state flags. Who knew? With that fixed, dumpwal no longer
      prints the same records again and again.
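
      The standard-library behaviour behind this fix can be sketched as
      follows. FormatRecord and its parameters are hypothetical names chosen
      for illustration, not code from ldb; the point is only that str("")
      empties the buffer while clear() resets the error flags.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Formats one record while reusing the caller's stream between calls.
// Hypothetical helper; the names are not taken from ldb.
std::string FormatRecord(std::ostringstream& out, int sequence, int count) {
  out.str("");  // discards the text buffered by the previous call
  out.clear();  // resets eofbit/failbit/badbit; does NOT touch the buffer
  out << sequence << "," << count;
  return out.str();
}
```

      Calling clear() alone would leave the earlier text in the buffer, so
      each call would return the previous records followed by the new one.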
      
      Test Plan: ran it on a local db
      
      Reviewers: dhruba, emayanke
      
      Reviewed By: dhruba
      
      Differential Revision: https://reviews.facebook.net/D6795
    • Fix a coding error in db_test.cc · 65b035a4
      amayank committed
      Summary: The new function MinLevelToCompress in db_test.cc was incomplete. It needs to tell the calling TEST function whether the test has to be skipped or not.
      
      Test Plan: make all;./db_test
      
      Reviewers: dhruba, heyongqiang
      
      Reviewed By: dhruba
      
      CC: sheki
      
      Differential Revision: https://reviews.facebook.net/D6771
    • LDB can read WAL. · 30742e16
      Abhishek Kona committed
      Summary:
      Add option to read WAL and print a summary for each record.
      facebook task => #1885013
      
      Example output:
      ./ldb dump_wal --walfile=/tmp/leveldbtest-5907/dbbench/026122.log --header
      Sequence,Count,ByteSize
      49981,1,100033
      49981,1,100033
      49982,1,100033
      49981,1,100033
      49982,1,100033
      49983,1,100033
      49981,1,100033
      49982,1,100033
      49983,1,100033
      49984,1,100033
      49981,1,100033
      49982,1,100033
      
      Test Plan:
      Works. Ran:
      ./ldb read_wal --wal-file=/tmp/leveldbtest-5907/dbbench/000078.log --header
      
      Reviewers: dhruba, heyongqiang
      
      Reviewed By: dhruba
      
      CC: emayanke, leveldb, zshao
      
      Differential Revision: https://reviews.facebook.net/D6675
    • Enhance manifest_dump to print each individual edit. · 4b622ab0
      Dhruba Borthakur committed
      Summary:
      The manifest file contains a series of edits. If the verbose
      option is switched on, then print each individual edit in the
      manifest file. This helps in debugging.
      
      Test Plan: make clean manifest_dump
      
      Reviewers: emayanke, sheki
      
      Reviewed By: sheki
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D6807
    • Fix LDB dumpwal to print the messages as in the file. · b648401a
      Abhishek Kona committed
      Summary:
      stringstream::clear() does not empty the stream. It only resets the
      stream's state flags. Who knew? With that fixed, dumpwal no longer
      prints the same records again and again.
      
      Test Plan: ran it on a local db
      
      Reviewers: dhruba, emayanke
      
      Reviewed By: dhruba
      
      Differential Revision: https://reviews.facebook.net/D6795
  3. 19 Nov, 2012 1 commit
    • enhance dbstress to simulate hard crash · 62e7583f
      Dhruba Borthakur committed
      Summary:
      dbstress has an option to reopen the database. Make it such that the
      previous handle is not closed before we reopen; this simulates a
      situation similar to a process crash.

      Added a new API to DBImpl to remove the lock file.
      
      Test Plan: run db_stress
      
      Reviewers: emayanke
      
      Reviewed By: emayanke
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D6777
  4. 17 Nov, 2012 2 commits
    • Fix a coding error in db_test.cc · de278a6d
      amayank committed
      Summary: The new function MinLevelToCompress in db_test.cc was incomplete. It needs to tell the calling TEST function whether the test has to be skipped or not.
      
      Test Plan: make all;./db_test
      
      Reviewers: dhruba, heyongqiang
      
      Reviewed By: dhruba
      
      CC: sheki
      
      Differential Revision: https://reviews.facebook.net/D6771
    • LDB can read WAL. · f5cdf931
      Abhishek Kona committed
      Summary:
      Add option to read WAL and print a summary for each record.
      facebook task => #1885013
      
      Example output:
      ./ldb dump_wal --walfile=/tmp/leveldbtest-5907/dbbench/026122.log --header
      Sequence,Count,ByteSize
      49981,1,100033
      49981,1,100033
      49982,1,100033
      49981,1,100033
      49982,1,100033
      49983,1,100033
      49981,1,100033
      49982,1,100033
      49983,1,100033
      49984,1,100033
      49981,1,100033
      49982,1,100033
      
      Test Plan:
      Works. Ran:
      ./ldb read_wal --wal-file=/tmp/leveldbtest-5907/dbbench/000078.log --header
      
      Reviewers: dhruba, heyongqiang
      
      Reviewed By: dhruba
      
      CC: emayanke, leveldb, zshao
      
      Differential Revision: https://reviews.facebook.net/D6675
  5. 15 Nov, 2012 3 commits
  6. 14 Nov, 2012 4 commits
    • Push release 1.5.5.fb. · 0f590af6
      Dhruba Borthakur committed
      Summary:
      
      Test Plan:
      
      Reviewers:
      
      CC:
      
      Task ID: #
      
      Blame Rev:
    • Make sse compilation optional. · 33cf6f3b
      Dhruba Borthakur committed
      Summary:
      The fbcode compilation was always switching on msse by default.
      This patch keeps the same behaviour but allows the compilation
      process to switch off msse if needed.
      
      If one does not want to use sse, then do the following:
      export USE_SSE=0
      make clean all
      
      Test Plan: make clean all
      
      Reviewers: heyongqiang
      
      Reviewed By: heyongqiang
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D6717
    • Improved CompactionFilter api: pass in an opaque argument to CompactionFilter invocation. · 5d16e503
      Dhruba Borthakur committed
      Summary:
      There are applications that operate on multiple leveldb instances.
      These applications would like to pass in an opaque type for each
      leveldb instance, and this type should be passed back to the application
      with every invocation of the CompactionFilter api.

      Test Plan: Enhanced unit test for opaque parameter to CompactionFilter.
      
      Reviewers: heyongqiang
      
      Reviewed By: heyongqiang
      
      CC: MarkCallaghan, sheki, emayanke
      
      Differential Revision: https://reviews.facebook.net/D6711
    • Fix asserts so that "make check OPT=-g" works on performance branch · 43d9a822
      Dhruba Borthakur committed
      Summary:
      Compilation used to fail with the error:
      db/version_set.cc:1773: error: ‘number_of_files_to_sort_’ is not a member of ‘leveldb::VersionSet’
      
      I created a new method called CheckConsistencyForDeletes() so that
      all the high cost checking is done only when OPT=-g is specified.
      
      I also fixed a bug in PickCompactionBySize that was triggered when
      OPT=-g was switched on. The base_index in the compaction record
      was not set correctly.
      
      Test Plan: make check OPT=-g
      
      Differential Revision: https://reviews.facebook.net/D6687
  7. 13 Nov, 2012 3 commits
    • The db_bench utility was broken in 1.5.4.fb because of a signed-unsigned comparison. · a785e029
      Dhruba Borthakur committed
      Summary:
      The db_bench utility was broken in 1.5.4.fb because of a
      signed-unsigned comparison.

      The static variable FLAGS_min_level_to_compress was recently
      changed from int to 'unsigned int' but it is initialized to a
      negative value -1.
      
      The segfault is of this type:
      Program received signal SIGSEGV, Segmentation fault.
      Open (this=0x7fffffffdee0) at db/db_bench.cc:939
      939	db/db_bench.cc: No such file or directory.
      (gdb) where
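
      The pitfall can be reproduced in isolation. The sketch below is not
      the db_bench code: flag_min_level and both helpers are hypothetical
      names, used only to show what happens when an unsigned variable is
      initialized to -1 and then compared.

```cpp
#include <cassert>
#include <limits>

// Hypothetical flag mirroring the "unsigned initialized to -1" pattern.
static unsigned int flag_min_level = -1;  // wraps around to UINT_MAX

// True if `level` is below the flag. With the wrapped value this holds
// for every realistic level number.
bool BelowMinLevel(unsigned int level) {
  return level < flag_min_level;
}

// Spells out what a mixed `int < unsigned int` comparison does:
// the signed operand is converted to unsigned before comparing.
bool SignedLessThanUnsigned(int lhs, unsigned int rhs) {
  return static_cast<unsigned int>(lhs) < rhs;
}
```

      Here -1 compared against 1u is not "less than", because -1 becomes the
      largest unsigned value. Any loop or array index bounded by such a flag
      can run far past the intended range.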
      
      Test Plan: run db_bench with no options.
      
      Reviewers: heyongqiang
      
      Reviewed By: heyongqiang
      
      CC: MarkCallaghan, emayanke, sheki
      
      Differential Revision: https://reviews.facebook.net/D6663
    • Introducing "database reopens" into the stress test. Database will reopen... · e6262617
      amayank committed
      Introducing "database reopens" into the stress test. The database will reopen after a specified (configurable) number of iterations of each thread, at which point the threads wait for the database to reopen.

      Summary: FLAGS_reopen (configurable) specifies the number of times the database is to be reopened. FLAGS_ops_per_thread is divided into points based on that reopen field. At these points all threads come together to wait for the database to reopen. Each thread "votes" for the database to reopen and when all have voted, the database reopens.
      
      Test Plan: make all;./db_stress
      
      Reviewers: dhruba, MarkCallaghan, sheki, asad, heyongqiang
      
      Reviewed By: dhruba
      
      Differential Revision: https://reviews.facebook.net/D6627
    • Fix test failure of reduce_num_levels · c64796fd
      heyongqiang committed
      Summary:
      I changed the reduce_num_levels logic to avoid "compactRange()" call if the current number of levels in use (levels that contain files) is smaller than the new num of levels.
      And that change breaks the assert in reduce_levels_test
      
      Test Plan: run reduce_levels_test
      
      Reviewers: dhruba, MarkCallaghan
      
      Reviewed By: dhruba
      
      CC: emayanke, sheki
      
      Differential Revision: https://reviews.facebook.net/D6651
  8. 11 Nov, 2012 1 commit
    • Compilation error while compiling with OPT=-g · 9c6c232e
      Dhruba Borthakur committed
      Summary:
      make clean check OPT=-g fails
      leveldb::DBStatistics::getTickerCount(leveldb::Tickers)’:
      ./db/db_statistics.h:34: error: ‘MAX_NO_TICKERS’ was not declared in this scope
      util/ldb_cmd.cc:255: warning: left shift count >= width of type
      
      Test Plan:
      make clean check OPT=-g
      
      Reviewers:
      
      CC:
      
      Task ID: #
      
      Blame Rev:
  9. 10 Nov, 2012 3 commits
  10. 09 Nov, 2012 2 commits
    • Introducing deletes for stress test · 9e97bfdc
      amayank committed
      Summary: Stress test modified to do deletes and later verify them
      
      Test Plan: running the test: db_stress
      
      Reviewers: dhruba, heyongqiang, asad, sheki, MarkCallaghan
      
      Reviewed By: dhruba
      
      Differential Revision: https://reviews.facebook.net/D6567
    • stat's collection in leveldb · 391885c4
      Abhishek Kona committed
      Summary:
      Prototype stat's collection. Diff is a good estimate of what
      the final code will look like.
      A few assumptions:
        * Used a global static instance of the statistics object. Plan to pass
          it to each internal function. Static allows metrics only at app
          level.
        * The Tickers do not do any locking. They depend on the mutex at each
          function of LevelDB. If we ever remove the mutex, we should change
          this too. The other option is to use atomic objects anyway, as there
          won't be any contention because they will always be acquired by
          only one thread.
        * The counters are dumb: they only increment through the lifecycle.
          Plan to use ods etc. to get last5min stats etc.
      
      Test Plan:
      made changes in db_bench
      Ran ./db_bench --statistics=1 --num=10000 --cache_size=5000
      This will print the cache hit/miss stats.
      
      Reviewers: dhruba, heyongqiang
      
      Differential Revision: https://reviews.facebook.net/D6441
  11. 08 Nov, 2012 5 commits
    • Move filesize-based-sorting to outside the Mutex · 95dda378
      Dhruba Borthakur committed
      Summary:
      When a new version is created, we sort all the files at every
      level based on their size. This is necessary because we want
      to compact the largest file first. The sorting takes quite a
      bit of CPU.
      
      Moved the sorting code to be outside the mutex. Also, the
      earlier code was sorting files at all levels but we do not
      need to sort the highest-number level because those files
      are never the cause of any compaction. To reduce sorting
      costs, we sort only the first few files in each level
      because it is likely that those are the only files in that
      level that will be picked for compaction.
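
      Sorting only the first few files of a level can be sketched with
      std::partial_sort. This is an assumption made for illustration: the
      commit message does not say which algorithm the patch uses, and
      FileInfo and SortLargestFirst are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal file record; not the real FileMetaData.
struct FileInfo {
  std::uint64_t number;
  std::uint64_t size;
};

// Moves the k largest files to the front of `files`, largest first.
// The order of the remaining elements is unspecified, which is what
// makes this cheaper than sorting the whole level.
void SortLargestFirst(std::vector<FileInfo>& files, std::size_t k) {
  k = std::min(k, files.size());
  std::partial_sort(files.begin(),
                    files.begin() + static_cast<std::ptrdiff_t>(k),
                    files.end(),
                    [](const FileInfo& a, const FileInfo& b) {
                      return a.size > b.size;
                    });
}
```

      For a level with n files this costs roughly O(n log k) instead of
      O(n log n), which matters when n is in the hundreds of thousands.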
      
      At steady state, I have seen that this patch increases
      throughput from 1500 writes/sec to 1700 writes/sec at the
      end of a 72 hour run. The CPU saving from not sorting the
      last level was not noticeable in this test run because
      there were only 100K files in the highest numbered level.
      I expect the CPU saving to be significant when the number of
      files is much higher.
      
      This is mostly an early preview and not ready for rigorous review.
      
      With this patch, the writes/sec is now bottlenecked not by the sorting code but by GetOverlappingInputs. I am working on a patch to optimize GetOverlappingInputs.
      
      Test Plan: make check
      
      Reviewers: MarkCallaghan, heyongqiang
      
      Reviewed By: heyongqiang
      
      Differential Revision: https://reviews.facebook.net/D6411
    • Fixed compilation error in previous merge. · 18cb6004
      Dhruba Borthakur committed
      Summary:
      Fixed compilation error in previous merge.
      
      Test Plan:
      
      Reviewers:
      
      CC:
      
      Task ID: #
      
      Blame Rev:
    • Merge branch 'master' into performance · 8143062e
      Dhruba Borthakur committed
      Conflicts:
      	db/db_impl.cc
      	db/version_set.cc
      	util/options.cc
    • Add a readonly db · 3fcf533e
      heyongqiang committed
      Summary: as subject
      
      Test Plan: run db_bench readrandom
      
      Reviewers: dhruba
      
      Reviewed By: dhruba
      
      CC: MarkCallaghan, emayanke, sheki
      
      Differential Revision: https://reviews.facebook.net/D6495
    • Avoid doing an exhaustive search when looking for overlapping files. · 9b87a2ba
      Dhruba Borthakur committed
      Summary:
      Version::GetOverlappingInputs() is called multiple times in
      the compaction code path. Each invocation does a binary search
      for overlapping files in the specified key range.
      This patch remembers the offset of an overlapped file when
      GetOverlappingInputs() is called the first time within
      a compaction run. Succeeding calls to GetOverlappingInputs()
      use the remembered index to avoid the binary search.

      I measured that 1000 iterations of GetOverlappingInputs
      take around 4500 microseconds without this patch. If I use
      this patch with the hint on every invocation, then 1000
      iterations take about 3900 microseconds.
      
      Test Plan: make check OPT=-g
      
      Reviewers: heyongqiang
      
      Reviewed By: heyongqiang
      
      CC: MarkCallaghan, emayanke, sheki
      
      Differential Revision: https://reviews.facebook.net/D6513
  12. 07 Nov, 2012 2 commits
  13. 06 Nov, 2012 3 commits
    • Merge branch 'master' into performance · 5f91868c
      Dhruba Borthakur committed
      Conflicts:
      	db/version_set.cc
      	util/options.cc
    • The method GetOverlappingInputs should use binary search. · cb7a0022
      Dhruba Borthakur committed
      Summary:
      The method Version::GetOverlappingInputs used a sequential search
      to map a key range to a set of files. But the files are arranged
      in ascending order of key, so a binary search is more effective.
      
      This patch implements Version::GetOverlappingInputsBinarySearch
      that finds one file that corresponds to the specified key range
      and then iterates backwards and forwards to find all overlapping
      files.
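
      The approach described above (find one overlapping file by binary
      search, then walk backwards and forwards) can be sketched as follows.
      This is a simplified illustration, not the patch itself: FileRange and
      OverlappingFiles are hypothetical names, keys are compared as plain
      strings rather than through a leveldb comparator, and the files are
      assumed to be sorted by key with no overlap between them.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical file record with an inclusive key range.
struct FileRange {
  std::string smallest;
  std::string largest;
};

// Returns the indices of all files whose range intersects [begin, end].
std::vector<std::size_t> OverlappingFiles(const std::vector<FileRange>& files,
                                          const std::string& begin,
                                          const std::string& end) {
  std::vector<std::size_t> result;
  std::size_t lo = 0, hi = files.size(), hit = files.size();
  // Binary search for any one file that overlaps the range.
  while (lo < hi) {
    const std::size_t mid = lo + (hi - lo) / 2;
    if (files[mid].largest < begin) {
      lo = mid + 1;  // file lies entirely before the range
    } else if (files[mid].smallest > end) {
      hi = mid;      // file lies entirely after the range
    } else {
      hit = mid;
      break;
    }
  }
  if (hit == files.size()) return result;  // nothing overlaps
  // Extend backwards and forwards from the hit.
  std::size_t first = hit, last = hit;
  while (first > 0 && files[first - 1].largest >= begin) --first;
  while (last + 1 < files.size() && files[last + 1].smallest <= end) ++last;
  for (std::size_t i = first; i <= last; ++i) result.push_back(i);
  return result;
}
```

      The search step is O(log n) in the number of files, and the two walks
      touch only the files that actually overlap, instead of scanning the
      whole level.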
      
      This patch is critical for making compactions efficient, especially
      when there are thousands of files in a single level.
      
      I measured that 1000 iterations of TEST_MaxNextLevelOverlappingBytes
      takes 16000 microseconds without this patch. With this patch, the
      same method takes about 4600 microseconds.
      
      Test Plan: Almost all unit tests in db_test use this method to look up keys.
      
      Reviewers: heyongqiang
      
      Reviewed By: heyongqiang
      
      CC: MarkCallaghan, emayanke, sheki
      
      Differential Revision: https://reviews.facebook.net/D6465
    • Ability to invoke application hook for every key during compaction. · 5273c814
      Dhruba Borthakur committed
      Summary:
      There are certain use-cases where the application intends to
      delete older keys after a certain time period has expired.
      One option for those applications is to periodically scan the
      entire database and delete appropriate keys.
      
      A better way is to allow the application to hook into the
      compaction process. This patch allows the application to set
      a method callback for every key that is being compacted. If
      this method returns true, then the key is not preserved in
      the output of the compaction.
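
      The callback contract described above (return true and the key is
      dropped from the compaction output) can be sketched with a toy
      in-memory "compaction". DropKeyFn and CompactWithFilter are
      hypothetical names for illustration; the actual callback signature of
      the patch is not shown in this commit message.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Hypothetical hook: return true to drop the key from the output.
using DropKeyFn =
    std::function<bool(const std::string& key, const std::string& value)>;

// Toy stand-in for a compaction pass: copies every entry of `input`
// to the output unless the application's hook asks to drop it.
std::map<std::string, std::string> CompactWithFilter(
    const std::map<std::string, std::string>& input,
    const DropKeyFn& should_drop) {
  std::map<std::string, std::string> output;
  for (const auto& entry : input) {
    if (should_drop && should_drop(entry.first, entry.second)) {
      continue;  // hook returned true: the key is not preserved
    }
    output.insert(entry);
  }
  return output;
}
```

      An expiry policy then becomes a hook that inspects each value (for
      example, an embedded timestamp) and returns true for stale entries,
      with no separate scan of the database.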
      
      Test Plan:
      This is mostly to preview the proposed new public api.
      Since it is a public api, please do due diligence on reviewing it.
      
      I will be writing test cases for this api in my next version of
      this patch.
      
      Reviewers: MarkCallaghan, heyongqiang
      
      Reviewed By: heyongqiang
      
      CC: sheki, adsharma
      
      Differential Revision: https://reviews.facebook.net/D6285