1. 02 10月, 2014 1 次提交
    • L
      make compaction related options changeable · 5ec53f3e
      Lei Jin 提交于
      Summary:
      make compaction related options changeable. Most of changes are tedious,
      following the same convention: grabs MutableCFOptions at the beginning
      of compaction under mutex, then pass it throughout the job and register
      it in SuperVersion at the end.
      
      Test Plan: make all check
      
      Reviewers: igor, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D23349
      5ec53f3e
  2. 30 9月, 2014 1 次提交
    • L
      use GetContext to replace callback function pointer · 2faf49d5
      Lei Jin 提交于
      Summary:
      Intead of passing callback function pointer and its arg on Table::Get()
      interface, passing GetContext. This makes the interface cleaner and
      possible better perf. Also adding a fast pass for SaveValue()
      
      Test Plan: make all check
      
      Reviewers: igor, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D24057
      2faf49d5
  3. 26 9月, 2014 1 次提交
    • L
      CompactedDBImpl · 3c680061
      Lei Jin 提交于
      Summary:
      Add a CompactedDBImpl that will enabled when calling OpenForReadOnly()
      and the DB only has one level (>0) of files. As a performan comparison,
      CuckooTable performs 2.1M/s with CompactedDBImpl vs. 1.78M/s with
      ReadOnlyDBImpl.
      
      Test Plan: db_bench
      
      Reviewers: yhchiang, igor, sdong
      
      Reviewed By: sdong
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D23553
      3c680061
  4. 24 9月, 2014 1 次提交
  5. 09 9月, 2014 2 次提交
    • L
      rename version_set options_ to db_options_ to avoid confusion · 9b0f7ffa
      Lei Jin 提交于
      Summary: as title
      
      Test Plan: make release
      
      Reviewers: sdong, yhchiang, igor
      
      Reviewed By: igor
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D23007
      9b0f7ffa
    • I
      Push- instead of pull-model for managing Write stalls · a2bb7c3c
      Igor Canadi 提交于
      Summary:
      Introducing WriteController, which is a source of truth about per-DB write delays. Let's define an DB epoch as a period where there are no flushes and compactions (i.e. new epoch is started when flush or compaction finishes). Each epoch can either:
      * proceed with all writes without delay
      * delay all writes by fixed time
      * stop all writes
      
      The three modes are recomputed at each epoch change (flush, compaction), rather than on every write (which is currently the case).
      
      When we have a lot of column families, our current pull behavior adds a big overhead, since we need to loop over every column family for every write. With new push model, overhead on Write code-path is minimal.
      
      This is just the start. Next step is to also take care of stalls introduced by slow memtable flushes. The final goal is to eliminate function MakeRoomForWrite(), which currently needs to be called for every column family by every write.
      
      Test Plan: make check for now. I'll add some unit tests later. Also, perf test.
      
      Reviewers: dhruba, yhchiang, MarkCallaghan, sdong, ljin
      
      Reviewed By: ljin
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D22791
      a2bb7c3c
  6. 05 9月, 2014 1 次提交
  7. 07 8月, 2014 1 次提交
    • S
      Add DB property "rocksdb.estimate-table-readers-mem" · 1242bfca
      sdong 提交于
      Summary:
      Add a DB Property "rocksdb.estimate-table-readers-mem" to return estimated memory usage by all loaded table readers, other than allocated from block cache.
      
      Refactor the property codes to allow getting property from a version, with DB mutex not acquired.
      
      Test Plan: Add several checks of this new property in existing codes for various cases.
      
      Reviewers: yhchiang, ljin
      
      Reviewed By: ljin
      
      Subscribers: xjin, igor, leveldb
      
      Differential Revision: https://reviews.facebook.net/D20733
      1242bfca
  8. 29 7月, 2014 1 次提交
    • S
      Add DB property estimated number of keys · f6784766
      sdong 提交于
      Summary: Add a DB property of estimated number of live keys, by adding number of entries of all mem tables and all files, subtracted by all deletions in all files.
      
      Test Plan: Add the case in unit tests
      
      Reviewers: hobbymanyp, ljin
      
      Reviewed By: ljin
      
      Subscribers: MarkCallaghan, yoshinorim, leveldb, igor, dhruba
      
      Differential Revision: https://reviews.facebook.net/D20631
      f6784766
  9. 22 7月, 2014 1 次提交
    • L
      make internal stats independent of statistics · f6f1533c
      Lei Jin 提交于
      Summary:
      also make it aware of column family
      output from db_bench
      
      ```
      ** Compaction Stats [default] **
      Level Files Size(MB) Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) RW-Amp W-Amp Rd(MB/s) Wr(MB/s)  Rn(cnt) Rnp1(cnt) Wnp1(cnt) Wnew(cnt)  Comp(sec) Comp(cnt) Avg(sec) Stall(sec) Stall(cnt) Avg(ms)
      ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
        L0    14      956   0.9      0.0     0.0      0.0       2.7      2.7    0.0   0.0      0.0    111.6        0         0         0         0         24        40    0.612      75.20     492387    0.15
        L1    21     2001   2.0      5.7     2.0      3.7       5.3      1.6    5.4   2.6     71.2     65.7       31        43        55        12         82         2   41.242      43.72      41183    1.06
        L2   217    18974   1.9     16.5     2.0     14.4      15.1      0.7   15.6   7.4     70.1     64.3       17       182       185         3        241        16   15.052       0.00          0    0.00
        L3  1641   188245   1.8      9.1     1.1      8.0       8.5      0.5   15.4   7.4     61.3     57.2        9        75        76         1        152         9   16.887       0.00          0    0.00
        L4  4447   449025   0.4     13.4     4.8      8.6       9.1      0.5    4.7   1.9     77.8     52.7       38        79       100        21        176        38    4.639       0.00          0    0.00
       Sum  6340   659201   0.0     44.7    10.0     34.7      40.6      6.0   32.0  15.2     67.7     61.6       95       379       416        37        676       105    6.439     118.91     533570    0.22
       Int     0        0   0.0      1.2     0.4      0.8       1.3      0.5    5.2   2.7     59.1     65.6        3         7         9         2         20        10    2.003       0.00          0    0.00
      Stalls(secs): 75.197 level0_slowdown, 0.000 level0_numfiles, 0.000 memtable_compaction, 43.717 leveln_slowdown
      Stalls(count): 492387 level0_slowdown, 0 level0_numfiles, 0 memtable_compaction, 41183 leveln_slowdown
      
      ** DB Stats **
      Uptime(secs): 202.1 total, 13.5 interval
      Cumulative writes: 6291456 writes, 6291456 batches, 1.0 writes per batch, 4.90 ingest GB
      Cumulative WAL: 6291456 writes, 6291456 syncs, 1.00 writes per sync, 4.90 GB written
      Interval writes: 1048576 writes, 1048576 batches, 1.0 writes per batch, 836.0 ingest MB
      Interval WAL: 1048576 writes, 1048576 syncs, 1.00 writes per sync, 0.82 MB written
      
      Test Plan: ran it
      
      Reviewers: sdong, yhchiang, igor
      
      Reviewed By: igor
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D19917
      f6f1533c
  10. 17 7月, 2014 1 次提交
    • F
      store file_indexer info in sequential memory · c11d604a
      Feng Zhu 提交于
      Summary:
        use arena to allocate space for next_level_index_ and level_rb_
        Thus increasing data locality and make Version::Get faster.
      
      Benchmark detail
      Base version: commit d2a727c1
      
      command used:
      ./db_bench --db=/mnt/db/rocksdb --num_levels=6 --key_size=20 --prefix_size=20 --keys_per_prefix=0 --value_size=100 --block_size=4096 --cache_size=17179869184 --cache_numshardbits=6 --compression_type=none --compression_ratio=1 --min_level_to_compress=-1 --disable_seek_compaction=1 --hard_rate_limit=2 --write_buffer_size=134217728 --max_write_buffer_number=2 --level0_file_num_compaction_trigger=8 --target_file_size_base=2097152 --max_bytes_for_level_base=1073741824 --disable_wal=0 --sync=0 --disable_data_sync=1 --verify_checksum=1 --delete_obsolete_files_period_micros=314572800 --max_grandparent_overlap_factor=10 --max_background_compactions=4 --max_background_flushes=0 --level0_slowdown_writes_trigger=16 --level0_stop_writes_trigger=24 --statistics=0 --stats_per_interval=0 --stats_interval=1048576 --histogram=0 --use_plain_table=1 --open_files=-1 --mmap_read=1 --mmap_write=0 --memtablerep=prefix_hash --bloom_bits=10 --bloom_locality=1 --perf_level=0 --benchmarks=fillseq, readrandom,readrandom,readrandom --use_existing_db=0 --num=52428800 --threads=1
      
      Result:
      cpu running percentage:
      Version::Get, improved from 7.98% to 7.42%
      FileIndexer::GetNextLevelIndex, improved from 1.18% to 0.68%.
      
      Test Plan:
        make all check
      
      Reviewers: ljin, haobo, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, igor
      
      Differential Revision: https://reviews.facebook.net/D19845
      c11d604a
  11. 12 7月, 2014 1 次提交
    • F
      use FileLevel in LevelFileNumIterator · 178fd6f9
      Feng Zhu 提交于
      Summary:
        Use FileLevel in LevelFileNumIterator, thus use new version of findFile.
        Old version of findFile function is deleted.
        Write a function in version_set.cc to generate FileLevel from files_.
        Add GenerateFileLevelTest in version_set_test.cc
      
      Test Plan:
        make all check
      
      Reviewers: ljin, haobo, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: igor, dhruba
      
      Differential Revision: https://reviews.facebook.net/D19659
      178fd6f9
  12. 10 7月, 2014 2 次提交
    • F
      create compressed_levels_ in Version, allocate its space using arena. Make... · f697cad1
      Feng Zhu 提交于
      create compressed_levels_ in Version, allocate its space using arena. Make Version::Get, Version::FindFile faster
      
      Summary:
          Define CompressedFileMetaData that just contains fd, smallest_slice, largest_slice. Create compressed_levels_ in Version, the space is allocated using arena
          Thus increase the file meta data locality, speed up "Get" and "FindFile"
      
          benchmark with in-memory tmpfs, could have 4% improvement under "random read" and 2% improvement under "read while writing"
      
      benchmark command:
      ./db_bench --db=/mnt/db/rocksdb --num_levels=6 --key_size=20 --prefix_size=20 --keys_per_prefix=0 --value_size=100 --block_size=4096 --cache_size=17179869184 --cache_numshardbits=6 --compression_type=none --compression_ratio=1 --min_level_to_compress=-1 --disable_seek_compaction=1 --hard_rate_limit=2 --write_buffer_size=134217728 --max_write_buffer_number=2 --level0_file_num_compaction_trigger=8 --target_file_size_base=33554432 --max_bytes_for_level_base=1073741824 --disable_wal=0 --sync=0 --disable_data_sync=1 --verify_checksum=1 --delete_obsolete_files_period_micros=314572800 --max_grandparent_overlap_factor=10 --max_background_compactions=4 --max_background_flushes=0 --level0_slowdown_writes_trigger=16 --level0_stop_writes_trigger=24 --statistics=0 --stats_per_interval=0 --stats_interval=1048576 --histogram=0 --use_plain_table=1 --open_files=-1 --mmap_read=1 --mmap_write=0 --memtablerep=prefix_hash --bloom_bits=10 --bloom_locality=1 --perf_level=0 --benchmarks=readwhilewriting,readwhilewriting,readwhilewriting --use_existing_db=1 --num=52428800 --threads=1 —writes_per_second=81920
      
      Read Random:
      From 1.8363 ms/op, improve to 1.7587 ms/op.
      Read while writing:
      From 2.985 ms/op, improve to 2.924 ms/op.
      
      Test Plan:
          make all check
      
      Reviewers: ljin, haobo, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, igor
      
      Differential Revision: https://reviews.facebook.net/D19419
      f697cad1
    • Y
      Some fixes on size compensation logic for deletion entry in compaction · 70828557
      Yueh-Hsuan Chiang 提交于
      Summary:
      This patch include two fixes:
      1. newly created Version will now takes the aggregated stats for average-value-size from the latest Version.
      2. compensated size of a file is now computed only for newly created / loaded file, this addresses the issue where files are already sorted by their compensated file size but might sometimes observe some out-of-order due to later update on compensated file size.
      
      Test Plan:
      export ROCKSDB_TESTS=CompactionDele
      ./db_test
      
      Reviewers: ljin, igor, sdong
      
      Reviewed By: sdong
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D19557
      70828557
  13. 03 7月, 2014 1 次提交
  14. 01 7月, 2014 1 次提交
    • I
      No need for files_by_size_ in universal compaction · a2e0d890
      Igor Canadi 提交于
      Summary: files_by_size_ is sorted by time in case of universal compaction. However, Version::files_ is also sorted by time. So no need for files_by_size_
      
      Test Plan:
      1) make check with the change
      2) make check with `assert(last_index == c->input_version_->files_[level].size() - 1);` in compaction picker
      
      Reviewers: dhruba, haobo, yhchiang, sdong, ljin
      
      Reviewed By: ljin
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D19125
      a2e0d890
  15. 25 6月, 2014 1 次提交
    • Y
      Allow compaction to reclaim storage more effectively. · e813f5b6
      Yueh-Hsuan Chiang 提交于
      Summary:
      This diff allows compaction to reclaim storage more effectively.
      In the current design, compactions are mainly triggered based on
      the file sizes.  However, since deletion entries does not have
      value, files which have many deletion entries are less likely
      to be compacted.  As a result, it may took a while to make
      deletion entries to be compacted.
      
      This diff address issue by compensating the size of deletion
      entries during compaction process: the size of each deletion
      entry in the compaction process is augmented by 2x average
      value size.  The diff applies to both leveled and universal
      compacitons.
      
      Test Plan:
      develop CompactionDeletionTrigger
      make db_test
      ./db_test
      
      Reviewers: haobo, igor, ljin, sdong
      
      Reviewed By: sdong
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D19029
      e813f5b6
  16. 20 6月, 2014 1 次提交
    • I
      Remove seek compaction · d4a84233
      Igor Canadi 提交于
      Summary:
      As discussed in our internal group, we don't get much use of seek compaction at the moment, while it's making code more complicated and slower in some cases.
      
      This diff removes seek compaction and (hopefully) all code that was introduced to support seek compaction.
      
      There is one test case that relied on didIO information. I'll try to find another way to implement it.
      
      Test Plan: make check
      
      Reviewers: sdong, haobo, yhchiang, ljin, dhruba
      
      Reviewed By: ljin
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D19161
      d4a84233
  17. 14 6月, 2014 1 次提交
  18. 13 6月, 2014 1 次提交
  19. 03 6月, 2014 1 次提交
    • S
      In DB::NewIterator(), try to allocate the whole iterator tree in an arena · df9069d2
      sdong 提交于
      Summary:
      In this patch, try to allocate the whole iterator tree starting from DBIter from an arena
      1. ArenaWrappedDBIter is created when serves as the entry point of an iterator tree, with an arena in it.
      2. Add an option to create iterator from arena for following iterators: DBIter, MergingIterator, MemtableIterator, all mem table's iterators, all table reader's iterators and two level iterator.
      3. MergeIteratorBuilder is created to incrementally build the tree of internal iterators. It is passed to mem table list and version set and add iterators to it.
      
      Limitations:
      (1) Only DB::NewIterator() without tailing uses the arena. Other cases, including readonly DB and compactions are still from malloc
      (2) Two level iterator itself is allocated in arena, but not iterators inside it.
      
      Test Plan: make all check
      
      Reviewers: ljin, haobo
      
      Reviewed By: haobo
      
      Subscribers: leveldb, dhruba, yhchiang, igor
      
      Differential Revision: https://reviews.facebook.net/D18513
      df9069d2
  20. 31 5月, 2014 1 次提交
    • L
      forward iterator · 388d2054
      Lei Jin 提交于
      Summary:
      Forward iterator puts everything together in a flat structure instead of
      a hierarchy of nested iterators. this should simplify the code and
      provide better performance. It also enables more optimization since all
      information are accessiable in one place.
      Init evaluation shows about 6% improvement
      
      Test Plan: db_test and db_bench
      
      Reviewers: dhruba, igor, tnovak, sdong, haobo
      
      Reviewed By: haobo
      
      Subscribers: sdong, leveldb
      
      Differential Revision: https://reviews.facebook.net/D18795
      388d2054
  21. 22 5月, 2014 1 次提交
    • I
      FIFO compaction style · 6de6a066
      Igor Canadi 提交于
      Summary:
      Introducing new compaction style -- FIFO.
      
      FIFO compaction style has write amplification of 1 (+1 for WAL) and it deletes the oldest files when the total DB size exceeds pre-configured values.
      
      FIFO compaction style is suited for storing high-frequency event logs.
      
      Test Plan: Added a unit test
      
      Reviewers: dhruba, haobo, sdong
      
      Reviewed By: dhruba
      
      Subscribers: alberts, leveldb
      
      Differential Revision: https://reviews.facebook.net/D18765
      6de6a066
  22. 27 4月, 2014 1 次提交
  23. 26 4月, 2014 1 次提交
  24. 22 4月, 2014 1 次提交
    • L
      hints for narrowing down FindFile range and avoiding checking unrelevant L0 files · 0f2d7681
      Lei Jin 提交于
      Summary:
      The file tree structure in Version is prebuilt and the range of each file is known.
      On the Get() code path, we do binary search in FindFile() by comparing
      target key with each file's largest key and also check the range for each L0 file.
      With some pre-calculated knowledge, each key comparision that has been done can serve
      as a hint to narrow down further searches:
      (1) If a key falls within a L0 file's range, we can safely skip the next
      file if its range does not overlap with the current one.
      (2) If a key falls within a file's range in level L0 - Ln-1, we should only
      need to binary search in the next level for files that overlap with the current one.
      
      (1) will be able to skip some files depending one the key distribution.
      (2) can greatly reduce the range of binary search, especially for bottom
      levels, given that one file most likely only overlaps with N files from
      the level below (where N is max_bytes_for_level_multiplier). So on level
      L, we will only look at ~N files instead of N^L files.
      
      Some inital results: measured with 500M key DB, when write is light (10k/s = 1.2M/s), this
      improves QPS ~7% on top of blocked bloom. When write is heavier (80k/s =
      9.6M/s), it gives us ~13% improvement.
      
      Test Plan: make all check
      
      Reviewers: haobo, igor, dhruba, sdong, yhchiang
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D17205
      0f2d7681
  25. 18 4月, 2014 1 次提交
    • S
      Minimize accessing multiple objects in Version::Get() · fa430bfd
      sdong 提交于
      Summary:
      One of our profilings shows that Version::Get() sometimes is slow when getting pointer of user comparators or other global objects. In this patch:
      (1) we keep pointers of immutable objects in Version to avoid accesses them though option objects or cfd objects
      (2) table_reader is directly cached in FileMetaData so that table cache don't have to go through handle first to fetch it
      (3) If level 0 has less than 3 files, skip the filtering logic based on SST tables' key range. Smallest and largest key are stored in separated memory locations, which has potential cache misses
      
      Test Plan: make all check
      
      Reviewers: haobo, ljin
      
      Reviewed By: haobo
      
      CC: igor, yhchiang, nkg-, leveldb
      
      Differential Revision: https://reviews.facebook.net/D17739
      fa430bfd
  26. 16 4月, 2014 1 次提交
    • I
      RocksDBLite · 588bca20
      Igor Canadi 提交于
      Summary:
      Introducing RocksDBLite! Removes all the non-essential features and reduces the binary size. This effort should help our adoption on mobile.
      
      Binary size when compiling for IOS (`TARGET_OS=IOS m static_lib`) is down to 9MB from 15MB (without stripping)
      
      Test Plan: compiles :)
      
      Reviewers: dhruba, haobo, ljin, sdong, yhchiang
      
      Reviewed By: yhchiang
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D17835
      588bca20
  27. 10 4月, 2014 1 次提交
  28. 20 3月, 2014 1 次提交
    • I
      ComputeCompactionScore in CompactionPicker · fcd5c5e8
      Igor Canadi 提交于
      Summary:
      As it turns out, we need the call to ComputeCompactionScore (previously: Finalize) in CompactionPicker.
      
      The issue caused a deadlock in db_stress: http://ci-builds.fb.com/job/rocksdb_crashtest/290/console
      
      The last two lines before a deadlock were:
      2014/03/18-22:43:41.481029 7facafbee700 (Original Log Time 2014/03/18-22:43:41.480989) Compaction nothing to do
      2014/03/18-22:43:41.481041 7faccf7fc700 wait for fewer level0 files...
      
      "Compaction nothing to do" and other thread waiting for fewer level0 files. Hm hm.
      
      I moved the pre-sorting to SaveTo, which should fix both the original and the new issue.
      
      Test Plan: make check for now, will run db_stress in jenkins
      
      Reviewers: dhruba, haobo, sdong
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D17037
      fcd5c5e8
  29. 19 3月, 2014 1 次提交
    • I
      Don't Finalize in CompactionPicker · 758fa8c3
      Igor Canadi 提交于
      Summary:
      Finalize re-sorts (read: mutates) the files_ in Version* and it is called by CompactionPicker during normal runtime. At the same time, this same Version* lives in the SuperVersion* and is accessed without the mutex in GetImpl() code path.
      
      Mutating the files_ in one thread and reading the same files_ in another thread is a bad idea. It caused this issue: http://ci-builds.fb.com/job/rocksdb_crashtest/285/console
      
      Long-term, we need to be more careful with method contracts and clearly document what state can be mutated when. Now that we are much faster because we don't lock in GetImpl(), we keep running into data races that were not a problem before when we were slower. db_stress has been very helpful in detecting those.
      
      Short-term, I removed Finalize() from CompactionPicker.
      
      Note: I believe this is an issue in current 2.7 version running in production.
      
      Test Plan:
      make check
      Will also run db_stress to see if issue is gone
      
      Reviewers: sdong, ljin, dhruba, haobo
      
      Reviewed By: sdong
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D16983
      758fa8c3
  30. 18 3月, 2014 1 次提交
    • I
      Fix race condition in manifest roll · ae25742a
      Igor Canadi 提交于
      Summary:
      When the manifest is getting rolled the following happens:
      1) manifest_file_number_ is assigned to a new manifest number (even though the old one is still current)
      2) mutex is unlocked
      3) SetCurrentFile() creates temporary file manifest_file_number_.dbtmp
      4) SetCurrentFile() renames manifest_file_number_.dbtmp to CURRENT
      5) mutex is locked
      
      If FindObsoleteFiles happens between (3) and (4) it will:
      1) Delete manifest_file_number_.dbtmp (because it's not in pending_outputs_)
      2) Delete old manifest (because the manifest_file_number_ already points to a new one)
      
      I introduce the concept of prev_manifest_file_number_ that will avoid the race condition.
      
      However, we should discuss the future of MANIFEST file rolling. We found some race conditions with it last week and who knows how many more are there. Nobody is using it in production because we don't trust the implementation. Should we even support it?
      
      Test Plan: make check
      
      Reviewers: ljin, dhruba, haobo, sdong
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D16929
      ae25742a
  31. 14 3月, 2014 1 次提交
  32. 11 3月, 2014 1 次提交
    • H
      [RocksDB] LogBuffer Cleanup · 66da4679
      Haobo Xu 提交于
      Summary: Moved LogBuffer class to an internal header. Removed some unneccesary indirection. Enabled log buffer for BackgroundCallFlush. Forced log buffer flush right after Unlock to improve time ordering of info log.
      
      Test Plan: make check; db_bench compare LOG output
      
      Reviewers: sdong
      
      Reviewed By: sdong
      
      CC: leveldb, igor
      
      Differential Revision: https://reviews.facebook.net/D16707
      66da4679
  33. 06 3月, 2014 1 次提交
    • S
      Buffer info logs when picking compactions and write them out after releasing the mutex · ecb1ffa2
      sdong 提交于
      Summary: Now while the background thread is picking compactions, it writes out multiple info_logs, especially for universal compaction, which introduces a chance of waiting log writing in mutex, which is bad. To remove this risk, write all those info logs to a buffer and flush it after releasing the mutex.
      
      Test Plan:
      make all check
      check the log lines while running some tests that trigger compactions.
      
      Reviewers: haobo, igor, dhruba
      
      Reviewed By: dhruba
      
      CC: i.am.jin.lei, dhruba, yhchiang, leveldb, nkg-
      
      Differential Revision: https://reviews.facebook.net/D16515
      ecb1ffa2
  34. 01 3月, 2014 1 次提交
    • I
      [CF] Rething LogAndApply for column families · 8ea21a77
      Igor Canadi 提交于
      Summary:
      I though I might get away with as little changes to LogAndApply() as possible. It turns out this is not the case.
      
      This diff introduces different behavior of LogAndApply() for three cases:
      1. column family add
      2. column family drop
      3. no-column family manipulation
      
      (1) and (2) don't support group commit yet.
      
      There were a lot of problems with old version od LogAndApply, detected by db_stress. The biggest was non-atomicity of manifest writes and metadata changes (i.e. if column family add is in manifest, it also has to be in in-memory data structure).
      
      Test Plan: db_stress
      
      Reviewers: dhruba, haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D16491
      8ea21a77
  35. 27 2月, 2014 1 次提交
  36. 26 2月, 2014 1 次提交
    • I
      [CF] Log deletion in column families · 5ad7ee03
      Igor Canadi 提交于
      Summary:
      * Added unit test that verifies that obsolete files are deleted.
      * Advance log number for empty column family when cutting log file.
      * MinLogNumber() bug fix! (caught by the new unit test)
      
      Test Plan: unit test
      
      Reviewers: dhruba, haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D16311
      5ad7ee03
  37. 14 2月, 2014 1 次提交
  38. 13 2月, 2014 1 次提交
    • I
      [CF] Rethinking ColumnFamilyHandle and fix to dropping column families · b06840aa
      Igor Canadi 提交于
      Summary:
      The change to the public behavior:
      * When opening a DB or creating new column family client gets a ColumnFamilyHandle.
      * As long as column family handle is alive, client can do whatever he wants with it, even drop it
      * Dropped column family can still be read from (using the column family handle)
      * Added a new call CloseColumnFamily(). Client has to close all column families that he has opened before deleting the DB
      * As soon as column family is closed, any calls to DB using that column family handle will fail (also any outstanding calls)
      
      Internally:
      * Ref-counting ColumnFamilyData
      * New thread-safety for ColumnFamilySet
      * Dropped column families are now completely dropped and their memory cleaned-up
      
      Test Plan: added some tests to column_family_test
      
      Reviewers: dhruba, haobo, kailiu, sdong
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D16101
      b06840aa