1. 08 11月, 2014 2 次提交
    • Y
      Fixed compile error in db/compaction.cc and db/compaction_picker.cc · 642ac9d8
      Yueh-Hsuan Chiang 提交于
      Summary:
      Fixed compile error in db/compaction.cc and db/compaction_picker.cc
      
      Test Plan:
      make
      642ac9d8
    • Y
      CompactFiles, EventListener and GetDatabaseMetaData · 28c82ff1
      Yueh-Hsuan Chiang 提交于
      Summary:
      This diff adds three sets of APIs to RocksDB.
      
      = GetColumnFamilyMetaData =
      * This APIs allow users to obtain the current state of a RocksDB instance on one column family.
      * See GetColumnFamilyMetaData in include/rocksdb/db.h
      
      = EventListener =
      * A virtual class that allows users to implement a set of
        call-back functions which will be called when specific
        events of a RocksDB instance happens.
      * To register EventListener, simply insert an EventListener to ColumnFamilyOptions::listeners
      
      = CompactFiles =
      * CompactFiles API inputs a set of file numbers and an output level, and RocksDB
        will try to compact those files into the specified level.
      
      = Example =
      * Example code can be found in example/compact_files_example.cc, which implements
        a simple external compactor using EventListener, GetColumnFamilyMetaData, and
        CompactFiles API.
      
      Test Plan:
      listener_test
      compactor_test
      example/compact_files_example
      export ROCKSDB_TESTS=CompactFiles
      db_test
      export ROCKSDB_TESTS=MetaData
      db_test
      
      Reviewers: ljin, igor, rven, sdong
      
      Reviewed By: sdong
      
      Subscribers: MarkCallaghan, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D24705
      28c82ff1
  2. 05 11月, 2014 1 次提交
  3. 01 11月, 2014 1 次提交
    • I
      Turn on -Wshadow · 9f7fc3ac
      Igor Canadi 提交于
      Summary:
      ...and fix all the errors :)
      
      Jim suggested turning on -Wshadow because it helped him fix number of critical bugs in fbcode. I think it's a good idea to be -Wshadow clean.
      
      Test Plan: compiles
      
      Reviewers: yhchiang, rven, sdong, ljin
      
      Reviewed By: ljin
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D27711
      9f7fc3ac
  4. 30 10月, 2014 2 次提交
    • S
      Make CompactionPicker more easily tested · 76d1c28e
      sdong 提交于
      Summary:
      Make compaction picker easier to test.
      The basic idea is to separate a minimum subcomponent of Version to VersionStorageInfo, which just responsible to LSM tree. A stub VersionStorageInfo can then be easily created and passed into compaction picker so that we can check the outputs.
      
      It now passes most tests. Still two things need to be done:
      (1) deal with the FIFO compaction's file size.
      (2) write an example test to make sure the interface can do the job.
      
      Add a compaction_picker_test to make sure compaction picker codes can be easily unit tested.
      
      Test Plan:
      Pass all unit tests and compaction_picker_test
      
      Reviewers: yhchiang, rven, igor, ljin
      
      Reviewed By: ljin
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D27639
      76d1c28e
    • Y
      Apply InfoLogLevel to the logs in db/compaction_picker.cc · cda9943f
      Yueh-Hsuan Chiang 提交于
      Summary: Apply InfoLogLevel to the logs in db/compaction_picker.cc
      
      Test Plan: make
      
      Reviewers: ljin, sdong, igor
      
      Reviewed By: igor
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D27837
      cda9943f
  5. 29 10月, 2014 2 次提交
  6. 24 10月, 2014 1 次提交
  7. 02 10月, 2014 1 次提交
    • L
      make compaction related options changeable · 5ec53f3e
      Lei Jin 提交于
      Summary:
      make compaction related options changeable. Most of changes are tedious,
      following the same convention: grabs MutableCFOptions at the beginning
      of compaction under mutex, then pass it throughout the job and register
      it in SuperVersion at the end.
      
      Test Plan: make all check
      
      Reviewers: igor, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D23349
      5ec53f3e
  8. 01 10月, 2014 1 次提交
  9. 27 9月, 2014 1 次提交
  10. 24 9月, 2014 1 次提交
  11. 10 9月, 2014 1 次提交
  12. 05 9月, 2014 1 次提交
  13. 23 8月, 2014 1 次提交
    • I
      Fix concurrency issue in CompactionPicker · 42ea7952
      Igor Canadi 提交于
      Summary:
      I am currently working on a project that uses RocksDB. While debugging some perf issues, I came up across interesting compaction concurrency issue. Namely, I had 15 idle threads and a good comapction to do, but CompactionPicker returned "Compaction nothing to do". Here's how Internal stats looked:
      
          2014/08/22-08:08:04.551982 7fc7fc3f5700 ------- DUMPING STATS -------
          2014/08/22-08:08:04.552000 7fc7fc3f5700
          ** Compaction Stats [default] **
          Level   Files   Size(MB) Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) RW-Amp W-Amp Rd(MB/s) Wr(MB/s)  Rn(cnt) Rnp1(cnt) Wnp1(cnt) Wnew(cnt)  Comp(sec) Comp(cnt) Avg(sec) Stall(sec) Stall(cnt) Avg(ms)
          ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
            L0     7/5        353   1.0      0.0     0.0      0.0       2.3      2.3    0.0   0.0      0.0      9.4        0         0         0         0        247        46    5.359       8.53          1 8526.25
            L1     2/2         86   1.3      2.6     1.9      0.7       2.6      1.9    2.7   1.3     24.3     24.0       39        19        71        52        109        11    9.938       0.00          0    0.00
            L2    26/0        833   1.3      5.7     1.7      4.0       5.2      1.2    6.3   3.0     15.6     14.2       47       112       147        35        373        44    8.468       0.00          0    0.00
            L3    12/0        505   0.1      0.0     0.0      0.0       0.0      0.0    0.0   0.0      0.0      0.0        0         0         0         0          0         0    0.000       0.00          0    0.00
           Sum    47/7       1778   0.0      8.3     3.6      4.6      10.0      5.4    8.1   4.4     11.6     14.1       86       131       218        87        728       101    7.212       8.53          1 8526.25
           Int     0/0          0   0.0      2.4     0.8      1.6       2.7      1.2   11.5   6.1     12.0     13.6       20        43        63        20        203        23    8.845       0.00          0    0.00
          Flush(GB): accumulative 2.266, interval 0.444
          Stalls(secs): 0.000 level0_slowdown, 0.000 level0_numfiles, 8.526 memtable_compaction, 0.000 leveln_slowdown_soft, 0.000 leveln_slowdown_hard
          Stalls(count): 0 level0_slowdown, 0 level0_numfiles, 1 memtable_compaction, 0 leveln_slowdown_soft, 0 leveln_slowdown_hard
      
          ** DB Stats **
          Uptime(secs): 336.8 total, 60.4 interval
          Cumulative writes: 61584000 writes, 6480589 batches, 9.5 writes per batch, 1.39 GB user ingest
          Cumulative WAL: 0 writes, 0 syncs, 0.00 writes per sync, 0.00 GB written
          Interval writes: 11235257 writes, 1175050 batches, 9.6 writes per batch, 259.9 MB user ingest
          Interval WAL: 0 writes, 0 syncs, 0.00 writes per sync, 0.00 MB written
      
      To see what happened, go here: https://github.com/facebook/rocksdb/blob/47b452cfcf9b1487d41f886a98bc0d6f95587e90/db/compaction_picker.cc#L430
      * The for loop started with level 1, because it has the worst score.
      * PickCompactionBySize on L429 returned nullptr because all files were being compacted
      * ExpandWhileOverlapping(c) returned true (because that's what it does when it gets nullptr!?)
      * for loop break-ed, never trying compactions for level 2 :( :(
      
      This bug was present at least since January. I have no idea how we didn't find this sooner.
      
      Test Plan:
      Unit testing compaction picker is hard. I tested this by running my service and observing L0->L1 and L2->L3 compactions in parallel. However, for long-term, I opened the task #4968469. @yhchiang is currently refactoring CompactionPicker, hopefully the new version will be unit-testable ;)
      
      Here's how my compactions look like after the patch:
      
          2014/08/22-08:50:02.166699 7f3400ffb700 ------- DUMPING STATS -------
          2014/08/22-08:50:02.166722 7f3400ffb700
          ** Compaction Stats [default] **
          Level   Files   Size(MB) Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) RW-Amp W-Amp Rd(MB/s) Wr(MB/s)  Rn(cnt) Rnp1(cnt) Wnp1(cnt) Wnew(cnt)  Comp(sec) Comp(cnt) Avg(sec) Stall(sec) Stall(cnt) Avg(ms)
          ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
            L0     8/5        404   1.5      0.0     0.0      0.0       4.3      4.3    0.0   0.0      0.0      9.6        0         0         0         0        463        88    5.260       0.00          0    0.00
            L1     2/2         60   0.9      4.8     3.9      0.8       4.7      3.9    2.4   1.2     23.9     23.6       80        23       131       108        204        19   10.747       0.00          0    0.00
            L2    23/3        697   1.0     11.6     3.5      8.1      10.9      2.8    6.4   3.1     17.7     16.6       95       242       317        75        669        92    7.268       0.00          0    0.00
            L3    58/14      2207   0.3      6.2     1.6      4.6       5.9      1.3    7.4   3.6     14.6     13.9       43       121       159        38        436        36   12.106       0.00          0    0.00
           Sum    91/24      3368   0.0     22.5     9.1     13.5      25.8     12.4   11.2   6.0     13.0     14.9      218       386       607       221       1772       235    7.538       0.00          0    0.00
           Int     0/0          0   0.0      3.2     0.9      2.3       3.6      1.3   15.3   8.0     12.4     13.7       24        66        89        23        266        27    9.838       0.00          0    0.00
          Flush(GB): accumulative 4.336, interval 0.444
          Stalls(secs): 0.000 level0_slowdown, 0.000 level0_numfiles, 0.000 memtable_compaction, 0.000 leveln_slowdown_soft, 0.000 leveln_slowdown_hard
          Stalls(count): 0 level0_slowdown, 0 level0_numfiles, 0 memtable_compaction, 0 leveln_slowdown_soft, 0 leveln_slowdown_hard
      
          ** DB Stats **
          Uptime(secs): 577.7 total, 60.1 interval
          Cumulative writes: 116960736 writes, 11966220 batches, 9.8 writes per batch, 2.64 GB user ingest
          Cumulative WAL: 0 writes, 0 syncs, 0.00 writes per sync, 0.00 GB written
          Interval writes: 11643735 writes, 1206136 batches, 9.7 writes per batch, 269.2 MB user ingest
          Interval WAL: 0 writes, 0 syncs, 0.00 writes per sync, 0.00 MB written
      
      Yay for concurrent L0->L1 and L2->L3 compactions!
      
      Reviewers: sdong, yhchiang, ljin
      
      Reviewed By: yhchiang
      
      Subscribers: yhchiang, leveldb
      
      Differential Revision: https://reviews.facebook.net/D22305
      42ea7952
  14. 14 8月, 2014 1 次提交
  15. 12 8月, 2014 1 次提交
    • M
      Changes to support unity build: · 93e6b5e9
      miguelportilla 提交于
      * Script for building the unity.cc file via Makefile
      * Unity executable Makefile target for testing builds
      * Source code changes to fix compilation of unity build
      93e6b5e9
  16. 29 7月, 2014 1 次提交
    • L
      make statistics forward-able · 40fa8a4c
      Lei Jin 提交于
      Summary:
      Make StatisticsImpl being able to forward stats to provided statistics
      implementation. The main purpose is to allow us to collect internal
      stats in the future even when user supplies custom statistics
      implementation. It avoids intrumenting 2 sets of stats collection code.
      One immediate use case is tuning advisor, which needs to collect some
      internal stats, users may not be interested.
      
      Test Plan:
      ran db_bench and see stats show up at the end of run
      Will run make all check since some tests rely on statistics
      
      Reviewers: yhchiang, sdong, igor
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D20145
      40fa8a4c
  17. 22 7月, 2014 1 次提交
  18. 17 7月, 2014 1 次提交
  19. 16 7月, 2014 1 次提交
  20. 10 7月, 2014 1 次提交
    • Y
      Some fixes on size compensation logic for deletion entry in compaction · 70828557
      Yueh-Hsuan Chiang 提交于
      Summary:
      This patch include two fixes:
      1. newly created Version will now takes the aggregated stats for average-value-size from the latest Version.
      2. compensated size of a file is now computed only for newly created / loaded file, this addresses the issue where files are already sorted by their compensated file size but might sometimes observe some out-of-order due to later update on compensated file size.
      
      Test Plan:
      export ROCKSDB_TESTS=CompactionDele
      ./db_test
      
      Reviewers: ljin, igor, sdong
      
      Reviewed By: sdong
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D19557
      70828557
  21. 03 7月, 2014 2 次提交
    • S
      Support Multiple DB paths (without having an interface to expose to users) · 2459f7ec
      sdong 提交于
      Summary:
      In this patch, we allow RocksDB to support multiple DB paths internally.
      No user interface is supported yet so this patch is silent to users.
      
      Test Plan: make all check
      
      Reviewers: igor, haobo, ljin, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D18921
      2459f7ec
    • I
      Centralize compression decision to compaction picker · f146cab2
      Igor Canadi 提交于
      Summary:
      Before this diff, we're deciding enable_compression in CompactionPicker and then we're deciding final compression type in DBImpl. This is kind of confusing.
      
      After the diff, the final compression type will be decided in CompactionPicker.
      
      The reason for this is that I want CompactFiles() to specify output compression type, so that people can mix and match compression styles in their compaction algorithms. This diff makes it much easier to do that.
      
      Test Plan: make check
      
      Reviewers: dhruba, haobo, sdong, yhchiang, ljin
      
      Reviewed By: ljin
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D19137
      f146cab2
  22. 01 7月, 2014 2 次提交
    • I
      Fix compile error · f5d4df1c
      Igor Canadi 提交于
      f5d4df1c
    • I
      No need for files_by_size_ in universal compaction · a2e0d890
      Igor Canadi 提交于
      Summary: files_by_size_ is sorted by time in case of universal compaction. However, Version::files_ is also sorted by time. So no need for files_by_size_
      
      Test Plan:
      1) make check with the change
      2) make check with `assert(last_index == c->input_version_->files_[level].size() - 1);` in compaction picker
      
      Reviewers: dhruba, haobo, yhchiang, sdong, ljin
      
      Reviewed By: ljin
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D19125
      a2e0d890
  23. 28 6月, 2014 1 次提交
  24. 25 6月, 2014 1 次提交
    • Y
      Allow compaction to reclaim storage more effectively. · e813f5b6
      Yueh-Hsuan Chiang 提交于
      Summary:
      This diff allows compaction to reclaim storage more effectively.
      In the current design, compactions are mainly triggered based on
      the file sizes.  However, since deletion entries does not have
      value, files which have many deletion entries are less likely
      to be compacted.  As a result, it may took a while to make
      deletion entries to be compacted.
      
      This diff address issue by compensating the size of deletion
      entries during compaction process: the size of each deletion
      entry in the compaction process is augmented by 2x average
      value size.  The diff applies to both leveled and universal
      compacitons.
      
      Test Plan:
      develop CompactionDeletionTrigger
      make db_test
      ./db_test
      
      Reviewers: haobo, igor, ljin, sdong
      
      Reviewed By: sdong
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D19029
      e813f5b6
  25. 20 6月, 2014 1 次提交
    • I
      Remove seek compaction · d4a84233
      Igor Canadi 提交于
      Summary:
      As discussed in our internal group, we don't get much use of seek compaction at the moment, while it's making code more complicated and slower in some cases.
      
      This diff removes seek compaction and (hopefully) all code that was introduced to support seek compaction.
      
      There is one test case that relied on didIO information. I'll try to find another way to implement it.
      
      Test Plan: make check
      
      Reviewers: sdong, haobo, yhchiang, ljin, dhruba
      
      Reviewed By: ljin
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D19161
      d4a84233
  26. 17 6月, 2014 1 次提交
    • S
      Refactor: group metadata needed to open an SST file to a separate copyable struct · cadc1adf
      sdong 提交于
      Summary:
      We added multiple fields to FileMetaData recently and are planning to add more.
      This refactoring separate the minimum information for accessing the file. This object is copyable (FileMetaData is not copyable since the ref counter). I hope this refactoring can enable further improvements:
      
      (1) use it to design a more efficient data structure to speed up read queries.
      (2) in the future, when we add information of storage level, we can easily do the encoding, instead of enlarge this structure, which might expand memory work set for file meta data.
      
      The definition is same as current EncodedFileMetaData used in two level iterator, so now the logic in two level iterator is easier to understand.
      
      Test Plan: make all check
      
      Reviewers: haobo, igor, ljin
      
      Reviewed By: ljin
      
      Subscribers: leveldb, dhruba, yhchiang
      
      Differential Revision: https://reviews.facebook.net/D18933
      cadc1adf
  27. 22 5月, 2014 1 次提交
    • I
      FIFO compaction style · 6de6a066
      Igor Canadi 提交于
      Summary:
      Introducing new compaction style -- FIFO.
      
      FIFO compaction style has write amplification of 1 (+1 for WAL) and it deletes the oldest files when the total DB size exceeds pre-configured values.
      
      FIFO compaction style is suited for storing high-frequency event logs.
      
      Test Plan: Added a unit test
      
      Reviewers: dhruba, haobo, sdong
      
      Reviewed By: dhruba
      
      Subscribers: alberts, leveldb
      
      Differential Revision: https://reviews.facebook.net/D18765
      6de6a066
  28. 25 4月, 2014 1 次提交
    • I
      Column family logging · ad3cd39c
      Igor Canadi 提交于
      Summary:
      Now that we have column families involved, we need to add extra context to every log message. They now start with "[column family name] log message"
      
      Also added some logging that I think would be useful, like level summary after every flush (I often needed that when going through the logs).
      
      Test Plan: make check + ran db_bench to confirm I'm happy with log output
      
      Reviewers: dhruba, haobo, ljin, yhchiang, sdong
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D18303
      ad3cd39c
  29. 04 4月, 2014 2 次提交
  30. 20 3月, 2014 2 次提交
    • I
      ComputeCompactionScore in CompactionPicker · fcd5c5e8
      Igor Canadi 提交于
      Summary:
      As it turns out, we need the call to ComputeCompactionScore (previously: Finalize) in CompactionPicker.
      
      The issue caused a deadlock in db_stress: http://ci-builds.fb.com/job/rocksdb_crashtest/290/console
      
      The last two lines before a deadlock were:
      2014/03/18-22:43:41.481029 7facafbee700 (Original Log Time 2014/03/18-22:43:41.480989) Compaction nothing to do
      2014/03/18-22:43:41.481041 7faccf7fc700 wait for fewer level0 files...
      
      "Compaction nothing to do" and other thread waiting for fewer level0 files. Hm hm.
      
      I moved the pre-sorting to SaveTo, which should fix both the original and the new issue.
      
      Test Plan: make check for now, will run db_stress in jenkins
      
      Reviewers: dhruba, haobo, sdong
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D17037
      fcd5c5e8
    • I
      Don't compact with zero input files · e493f2f5
      Igor Canadi 提交于
      Summary:
      We have an issue with internal service trying to run compaction with zero input files:
      2014/02/07-02:26:58.386531 7f79117ec700 Compaction start summary: Base version 1420 Base level 3, seek compaction:0, inputs:[ϛ~^Qy^?],[]
      2014/02/07-02:26:58.386539 7f79117ec700 Compacted 0@3 + 0@4 files => 0 bytes
      
      There are two issues:
      * inputsummary is printing out junk
      * it's constantly retrying (since I guess madeProgress is true), so it prints out a lot of data in the LOG file (40GB in one day).
      
      I read through the Level compaction picker and added some failure condition if input[0] is empty. I think PickCompaction() should not return compaction with zero input files with this change. I'm not confident enough to add an assertion though :)
      
      Test Plan: make check
      
      Reviewers: dhruba, haobo, sdong, kailiu
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D16005
      e493f2f5
  31. 19 3月, 2014 1 次提交
    • I
      Don't Finalize in CompactionPicker · 758fa8c3
      Igor Canadi 提交于
      Summary:
      Finalize re-sorts (read: mutates) the files_ in Version* and it is called by CompactionPicker during normal runtime. At the same time, this same Version* lives in the SuperVersion* and is accessed without the mutex in GetImpl() code path.
      
      Mutating the files_ in one thread and reading the same files_ in another thread is a bad idea. It caused this issue: http://ci-builds.fb.com/job/rocksdb_crashtest/285/console
      
      Long-term, we need to be more careful with method contracts and clearly document what state can be mutated when. Now that we are much faster because we don't lock in GetImpl(), we keep running into data races that were not a problem before when we were slower. db_stress has been very helpful in detecting those.
      
      Short-term, I removed Finalize() from CompactionPicker.
      
      Note: I believe this is an issue in current 2.7 version running in production.
      
      Test Plan:
      make check
      Will also run db_stress to see if issue is gone
      
      Reviewers: sdong, ljin, dhruba, haobo
      
      Reviewed By: sdong
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D16983
      758fa8c3
  32. 13 3月, 2014 1 次提交
    • I
      [CF] Code cleanup part 1 · fb2346fc
      Igor Canadi 提交于
      Summary:
      I'm cleaning up some code preparing for the big diff review tomorrow. This is the first part of the cleanup.
      
      Changes are mostly cosmetic. The goal is to decrease amount of code difference between columnfamilies and master branch.
      
      This diff also fixes race condition when dropping column family.
      
      Test Plan: Ran db_stress with variety of parameters
      
      Reviewers: dhruba, haobo
      
      Differential Revision: https://reviews.facebook.net/D16833
      fb2346fc
  33. 11 3月, 2014 1 次提交
    • H
      [RocksDB] LogBuffer Cleanup · 66da4679
      Haobo Xu 提交于
      Summary: Moved LogBuffer class to an internal header. Removed some unneccesary indirection. Enabled log buffer for BackgroundCallFlush. Forced log buffer flush right after Unlock to improve time ordering of info log.
      
      Test Plan: make check; db_bench compare LOG output
      
      Reviewers: sdong
      
      Reviewed By: sdong
      
      CC: leveldb, igor
      
      Differential Revision: https://reviews.facebook.net/D16707
      66da4679