1. 05 Aug 2014, 1 commit
  2. 31 Jul 2014, 1 commit
  3. 29 Jul 2014, 2 commits
    • Add DB property estimated number of keys · f6784766
      Committed by sdong
      Summary: Add a DB property for the estimated number of live keys, computed by summing the number of entries in all mem tables and all files and subtracting the number of deletions in all files. (See the usage sketch after this entry.)
      
      Test Plan: Add the case in unit tests
      
      Reviewers: hobbymanyp, ljin
      
      Reviewed By: ljin
      
      Subscribers: MarkCallaghan, yoshinorim, leveldb, igor, dhruba
      
      Differential Revision: https://reviews.facebook.net/D20631
      f6784766
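      A minimal usage sketch for reading this property from application code, assuming it is exposed under the name "rocksdb.estimate-num-keys" and read via DB::GetProperty; treat the name and the setup below as assumptions, not a definitive reference.

          // Hedged sketch: query the estimated-number-of-keys DB property.
          #include <cassert>
          #include <iostream>
          #include <string>
          #include "rocksdb/db.h"

          int main() {
            rocksdb::DB* db = nullptr;
            rocksdb::Options options;
            options.create_if_missing = true;
            rocksdb::Status s = rocksdb::DB::Open(options, "/tmp/estimate_demo", &db);
            assert(s.ok());

            // Write a few keys so the estimate has something to count.
            for (int i = 0; i < 1000; ++i) {
              db->Put(rocksdb::WriteOptions(), "key" + std::to_string(i), "value");
            }

            // The value is approximate: entry counts minus deletion counts, not a scan.
            std::string num_keys;
            if (db->GetProperty("rocksdb.estimate-num-keys", &num_keys)) {
              std::cout << "estimated live keys: " << num_keys << std::endl;
            }

            delete db;
            return 0;
          }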
    • make statistics forward-able · 40fa8a4c
      Committed by Lei Jin
      Summary:
      Make StatisticsImpl able to forward stats to a user-provided statistics
      implementation. The main purpose is to allow us to collect internal
      stats in the future even when the user supplies a custom statistics
      implementation; it avoids instrumenting two sets of stats collection code.
      One immediate use case is the tuning advisor, which needs to collect some
      internal stats that users may not be interested in. (See the sketch after this entry.)
      
      Test Plan:
      Ran db_bench and saw stats show up at the end of the run.
      Will run make all check, since some tests rely on statistics.
      
      Reviewers: yhchiang, sdong, igor
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D20145
      40fa8a4c
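      A generic sketch of the forwarding pattern, assuming a simplified statistics interface (the real rocksdb::Statistics API has more methods and different signatures): the internal collector always updates its own counters and forwards each tick to the user-supplied implementation when one is present.

          #include <cstdint>
          #include <memory>

          // Simplified stand-in for a statistics interface (assumption, not the
          // actual rocksdb::Statistics class).
          class StatsSink {
           public:
            virtual ~StatsSink() {}
            virtual void RecordTick(uint32_t ticker, uint64_t count) = 0;
          };

          // Internal collector: always records into its own counters and, if a
          // user-supplied sink exists, forwards the same tick to it. This avoids
          // instrumenting two separate sets of stats-collection call sites.
          class ForwardingStats : public StatsSink {
           public:
            explicit ForwardingStats(std::shared_ptr<StatsSink> user_sink)
                : user_sink_(user_sink) {}

            void RecordTick(uint32_t ticker, uint64_t count) override {
              internal_[ticker % kNumTickers] += count;   // internal bookkeeping
              if (user_sink_) {
                user_sink_->RecordTick(ticker, count);    // forward to user impl
              }
            }

           private:
            static const uint32_t kNumTickers = 256;
            uint64_t internal_[kNumTickers] = {};         // simplified counters
            std::shared_ptr<StatsSink> user_sink_;
          };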
  4. 25 Jul 2014, 1 commit
  5. 22 Jul 2014, 1 commit
  6. 18 Jul 2014, 2 commits
    • Add MaxInputLevel() to CompactionPicker · 052ddbe0
      Committed by Yueh-Hsuan Chiang
      Summary:
      Having if-then branches for different compaction strategies is considered
      hacky and makes CompactionPicker less pluggable.  This diff removes two
      such if-then branches in version_set.cc by adding MaxInputLevel() to
      CompactionPicker (see the sketch after this entry).
      
          // Given the current number of levels, returns the lowest allowed level
          // for compaction input.
          virtual int MaxInputLevel(int current_num_levels) const;
      
      Test Plan:
      make db_test
      export ROCKSDB_TESTS=Compaction
      ./db_test
      
      Reviewers: igor, sdong, ljin
      
      Reviewed By: ljin
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D19971
      052ddbe0
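      A sketch of how the virtual hook replaces per-strategy branches: version_set.cc can call picker->MaxInputLevel(num_levels) instead of branching on the compaction style. The class names mirror ones used in RocksDB, but the method bodies below are assumptions about the intended behavior, not the actual implementation.

          class CompactionPicker {
           public:
            virtual ~CompactionPicker() {}
            // Given the current number of levels, returns the lowest allowed level
            // for compaction input.
            virtual int MaxInputLevel(int current_num_levels) const = 0;
          };

          class LevelCompactionPicker : public CompactionPicker {
           public:
            // Leveled compaction may pick input from any level except the last one.
            int MaxInputLevel(int current_num_levels) const override {
              return current_num_levels - 2;
            }
          };

          class UniversalCompactionPicker : public CompactionPicker {
           public:
            // Universal compaction keeps all files in level 0, so input is always L0.
            int MaxInputLevel(int /*current_num_levels*/) const override {
              return 0;
            }
          };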
    • Guarding files_ attribute with #ifndef NDEBUG guard in FilePicker class. · 0d57e3ad
      Committed by Radheshyam Balasundaram
      Summary: Adding #ifndef NDEBUG guards around the files_ attribute of the FilePicker class. This attribute is used only in debug mode. This fixes the static_lib build on Mac. (See the sketch after this entry.)
      
      Test Plan:
      make static_lib on Mac
      make check all on the devserver
      
      Reviewers: ljin, igor, sdong
      
      Reviewed By: sdong
      
      Differential Revision: https://reviews.facebook.net/D20163
      0d57e3ad
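      A sketch of the guard pattern, under the assumption that files_ is consulted only by assertions; FilePicker and FileMetaData below are stand-ins, not the real declarations.

          #include <vector>

          struct FileMetaData;  // stand-in forward declaration

          class FilePicker {
           public:
            explicit FilePicker(std::vector<FileMetaData*>* files)
          #ifndef NDEBUG
                : files_(files)
          #endif
            {
              (void)files;  // avoid an unused-parameter warning in release builds
            }

           private:
          #ifndef NDEBUG
            // Only debug assertions read this member; guarding it keeps release
            // builds (e.g. clang on Mac with -Wunused-private-field) warning-free.
            std::vector<FileMetaData*>* files_;
          #endif
          };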
  7. 17 Jul 2014, 3 commits
    • Add struct CompactionInputFiles to manage compaction input files. · 296e3407
      Committed by Yueh-Hsuan Chiang
      Summary: Add struct CompactionInputFiles to manage compaction input files (see the sketch after this entry).
      
      Test Plan:
      export ROCKSDB_TESTS=Compact
      make db_test
      ./db_test
      
      Reviewers: ljin, igor, sdong
      
      Reviewed By: sdong
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D20061
      296e3407
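      A sketch of what such a struct plausibly looks like, grouping a level number with the files chosen from that level; the exact fields and helpers are assumptions based on the description above.

          #include <cstddef>
          #include <vector>

          struct FileMetaData;  // stand-in for the real metadata type

          // One compaction input: a level plus the files picked from that level.
          struct CompactionInputFiles {
            int level = 0;
            std::vector<FileMetaData*> files;

            bool empty() const { return files.empty(); }
            size_t size() const { return files.size(); }
          };

          // A compaction can then own a vector of CompactionInputFiles, one entry
          // per input level, instead of parallel arrays of levels and file lists.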
    • Refactoring Version::Get() · 0418e66e
      Committed by Radheshyam Balasundaram
      Summary: Refactoring the Version::Get() method to move the file-picking logic into a separate class.
      
      Test Plan: make check all
      
      Reviewers: igor, sdong, ljin
      
      Reviewed By: ljin
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D19713
      0418e66e
    • store file_indexer info in sequential memory · c11d604a
      Committed by Feng Zhu
      Summary:
        Use an arena to allocate space for next_level_index_ and level_rb_,
        thus increasing data locality and making Version::Get faster
        (see the sketch after this entry).
      
      Benchmark detail
      Base version: commit d2a727c1
      
      command used:
      ./db_bench --db=/mnt/db/rocksdb --num_levels=6 --key_size=20 --prefix_size=20 --keys_per_prefix=0 --value_size=100 --block_size=4096 --cache_size=17179869184 --cache_numshardbits=6 --compression_type=none --compression_ratio=1 --min_level_to_compress=-1 --disable_seek_compaction=1 --hard_rate_limit=2 --write_buffer_size=134217728 --max_write_buffer_number=2 --level0_file_num_compaction_trigger=8 --target_file_size_base=2097152 --max_bytes_for_level_base=1073741824 --disable_wal=0 --sync=0 --disable_data_sync=1 --verify_checksum=1 --delete_obsolete_files_period_micros=314572800 --max_grandparent_overlap_factor=10 --max_background_compactions=4 --max_background_flushes=0 --level0_slowdown_writes_trigger=16 --level0_stop_writes_trigger=24 --statistics=0 --stats_per_interval=0 --stats_interval=1048576 --histogram=0 --use_plain_table=1 --open_files=-1 --mmap_read=1 --mmap_write=0 --memtablerep=prefix_hash --bloom_bits=10 --bloom_locality=1 --perf_level=0 --benchmarks=fillseq, readrandom,readrandom,readrandom --use_existing_db=0 --num=52428800 --threads=1
      
      Result:
      CPU time percentage:
      Version::Get, improved from 7.98% to 7.42%
      FileIndexer::GetNextLevelIndex, improved from 1.18% to 0.68%.
      
      Test Plan:
        make all check
      
      Reviewers: ljin, haobo, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, igor
      
      Differential Revision: https://reviews.facebook.net/D19845
      c11d604a
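      A sketch of the idea: carve both index arrays out of one contiguous block so lookups touch adjacent memory, instead of making separate heap allocations per level. The arena shown is a minimal stand-in, not the real rocksdb Arena class, and the field names are only loosely borrowed from the summary above.

          #include <cstddef>
          #include <cstdint>
          #include <vector>

          // Minimal bump-style allocator standing in for an arena.
          class SimpleArena {
           public:
            char* AllocateAligned(size_t bytes) {
              blocks_.emplace_back(bytes);
              return blocks_.back().data();
            }
           private:
            std::vector<std::vector<char>> blocks_;
          };

          // Per-level index data placed in sequential memory for better locality.
          struct LevelIndex {
            int32_t* next_level_index = nullptr;  // search hint into the level below
            int32_t* level_rb = nullptr;          // right-bound hint per file
          };

          LevelIndex MakeLevelIndex(SimpleArena* arena, size_t num_files) {
            LevelIndex idx;
            // A single allocation holds both arrays back to back.
            char* mem = arena->AllocateAligned(2 * num_files * sizeof(int32_t));
            idx.next_level_index = reinterpret_cast<int32_t*>(mem);
            idx.level_rb = idx.next_level_index + num_files;
            return idx;
          }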
  8. 16 Jul 2014, 1 commit
  9. 12 Jul 2014, 1 commit
    • use FileLevel in LevelFileNumIterator · 178fd6f9
      Committed by Feng Zhu
      Summary:
        Use FileLevel in LevelFileNumIterator, thus using the new version of findFile;
        the old version of the findFile function is deleted.
        Write a function in version_set.cc to generate FileLevel from files_.
        Add GenerateFileLevelTest to version_set_test.cc.
      
      Test Plan:
        make all check
      
      Reviewers: ljin, haobo, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: igor, dhruba
      
      Differential Revision: https://reviews.facebook.net/D19659
      178fd6f9
  10. 10 Jul 2014, 2 commits
    • create compressed_levels_ in Version, allocate its space using arena. Make Version::Get, Version::FindFile faster · f697cad1
      Committed by Feng Zhu

      Summary:
          Define CompressedFileMetaData, which contains just fd, smallest_slice, and largest_slice. Create compressed_levels_ in Version; the space is allocated using an arena.
          This increases file metadata locality and speeds up "Get" and "FindFile" (see the sketch after this entry).

          Benchmarks on in-memory tmpfs show about a 4% improvement under "random read" and a 2% improvement under "read while writing".
      
      benchmark command:
      ./db_bench --db=/mnt/db/rocksdb --num_levels=6 --key_size=20 --prefix_size=20 --keys_per_prefix=0 --value_size=100 --block_size=4096 --cache_size=17179869184 --cache_numshardbits=6 --compression_type=none --compression_ratio=1 --min_level_to_compress=-1 --disable_seek_compaction=1 --hard_rate_limit=2 --write_buffer_size=134217728 --max_write_buffer_number=2 --level0_file_num_compaction_trigger=8 --target_file_size_base=33554432 --max_bytes_for_level_base=1073741824 --disable_wal=0 --sync=0 --disable_data_sync=1 --verify_checksum=1 --delete_obsolete_files_period_micros=314572800 --max_grandparent_overlap_factor=10 --max_background_compactions=4 --max_background_flushes=0 --level0_slowdown_writes_trigger=16 --level0_stop_writes_trigger=24 --statistics=0 --stats_per_interval=0 --stats_interval=1048576 --histogram=0 --use_plain_table=1 --open_files=-1 --mmap_read=1 --mmap_write=0 --memtablerep=prefix_hash --bloom_bits=10 --bloom_locality=1 --perf_level=0 --benchmarks=readwhilewriting,readwhilewriting,readwhilewriting --use_existing_db=1 --num=52428800 --threads=1 —writes_per_second=81920
      
      Read Random:
      From 1.8363 ms/op, improved to 1.7587 ms/op.
      Read while writing:
      From 2.985 ms/op, improved to 2.924 ms/op.
      
      Test Plan:
          make all check
      
      Reviewers: ljin, haobo, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, igor
      
      Differential Revision: https://reviews.facebook.net/D19419
      f697cad1
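      A sketch of the data layout described above: a trimmed-down per-file record plus a per-level flat array allocated from one contiguous block. The field names follow the summary; everything else is an assumption, not the actual RocksDB definition.

          #include <cstddef>
          #include <cstdint>

          // Stand-ins for the real types.
          struct FileDescriptor { uint64_t file_number = 0; uint64_t file_size = 0; };
          struct Slice { const char* data = nullptr; size_t size = 0; };

          // Trimmed-down per-file record: just what Get()/FindFile() need.
          struct CompressedFileMetaData {
            FileDescriptor fd;
            Slice smallest_slice;  // smallest key of the file
            Slice largest_slice;   // largest key of the file
          };

          // One level's files stored contiguously, so a binary search over the
          // level walks adjacent memory instead of chasing per-file pointers.
          struct CompressedLevel {
            CompressedFileMetaData* files = nullptr;  // arena-allocated array
            size_t num_files = 0;
          };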
    • Some fixes on size compensation logic for deletion entries in compaction · 70828557
      Committed by Yueh-Hsuan Chiang
      Summary:
      This patch includes two fixes:
      1. A newly created Version now takes the aggregated stats for average value size from the latest Version.
      2. The compensated size of a file is now computed only for newly created / loaded files. This addresses the issue where files are sorted by compensated file size but could sometimes appear out of order due to later updates to the compensated file size.
      
      Test Plan:
      export ROCKSDB_TESTS=CompactionDele
      ./db_test
      
      Reviewers: ljin, igor, sdong
      
      Reviewed By: sdong
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D19557
      70828557
  11. 04 Jul 2014, 1 commit
  12. 03 Jul 2014, 1 commit
  13. 01 Jul 2014, 1 commit
    • No need for files_by_size_ in universal compaction · a2e0d890
      Committed by Igor Canadi
      Summary: files_by_size_ is sorted by time in the case of universal compaction. However, Version::files_ is also sorted by time, so there is no need for files_by_size_.
      
      Test Plan:
      1) make check with the change
      2) make check with `assert(last_index == c->input_version_->files_[level].size() - 1);` in compaction picker
      
      Reviewers: dhruba, haobo, yhchiang, sdong, ljin
      
      Reviewed By: ljin
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D19125
      a2e0d890
  14. 25 Jun 2014, 1 commit
    • Allow compaction to reclaim storage more effectively. · e813f5b6
      Committed by Yueh-Hsuan Chiang
      Summary:
      This diff allows compaction to reclaim storage more effectively.
      In the current design, compactions are mainly triggered based on
      file sizes.  However, since deletion entries do not carry a
      value, files which have many deletion entries are less likely
      to be compacted.  As a result, it may take a while for
      deletion entries to be compacted.

      This diff addresses the issue by compensating the size of deletion
      entries during the compaction process: the size of each deletion
      entry is augmented by 2x the average value size.  The diff applies
      to both leveled and universal compactions (see the sketch after
      this entry).
      
      Test Plan:
      develop CompactionDeletionTrigger
      make db_test
      ./db_test
      
      Reviewers: haobo, igor, ljin, sdong
      
      Reviewed By: sdong
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D19029
      e813f5b6
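      A sketch of the compensation rule as described: each deletion entry is credited with twice the average value size when sizing a file for compaction priority. The helper below is hypothetical; the real bookkeeping lives in the version and compaction-picker code.

          #include <cstdint>

          // Hypothetical helper illustrating the compensation rule.
          // raw_file_size:  on-disk size of the SST file.
          // num_deletions:  number of deletion entries in the file.
          // avg_value_size: aggregated average value size from recent stats.
          uint64_t CompensatedFileSize(uint64_t raw_file_size,
                                       uint64_t num_deletions,
                                       uint64_t avg_value_size) {
            // Deletion entries carry no value, so credit each one with twice the
            // average value size; files full of tombstones then sort as "bigger"
            // and get compacted (and their space reclaimed) sooner.
            return raw_file_size + 2 * avg_value_size * num_deletions;
          }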
  15. 20 Jun 2014, 2 commits
    • Remove seek compaction · d4a84233
      Committed by Igor Canadi
      Summary:
      As discussed in our internal group, we don't get much use of seek compaction at the moment, while it's making code more complicated and slower in some cases.
      
      This diff removes seek compaction and (hopefully) all code that was introduced to support seek compaction.
      
      There is one test case that relied on didIO information. I'll try to find another way to implement it.
      
      Test Plan: make check
      
      Reviewers: sdong, haobo, yhchiang, ljin, dhruba
      
      Reviewed By: ljin
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D19161
      d4a84233
    • Use same sorting for all level 0 files · 107e08ba
      Committed by Igor Canadi
      Summary:
      We decided that one of the long term goals is to unify level and universal compaction.
      
      As a small first step, I'm unifying level 0 sorting methods.
      
      Previously, we used to sort level 0 files in level compaction by file number and in universal compaction by sequence number.
      
      But it turns out that in level compaction, sorting by file number is exactly the same as sorting by sequence number.
      
      Test Plan:
      Ran make check with a bunch of asserts to verify the sorting order is exactly the same.
      Also, make check with this patch
      
      Reviewers: haobo, yhchiang, ljin, dhruba, sdong
      
      Reviewed By: sdong
      
      Subscribers: leveldb
      
      Differential Revision: https://reviews.facebook.net/D19131
      107e08ba
  16. 17 Jun 2014, 1 commit
    • Refactor: group metadata needed to open an SST file into a separate copyable struct · cadc1adf
      Committed by sdong
      Summary:
      We added multiple fields to FileMetaData recently and are planning to add more.
      This refactoring separates out the minimum information needed to access the file. The new object is copyable (FileMetaData is not copyable because of its ref counter). I hope this refactoring can enable further improvements:

      (1) use it to design a more efficient data structure to speed up read queries.
      (2) in the future, when we add storage-level information, we can easily do the encoding there instead of enlarging this structure, which might expand the memory working set for file metadata.

      The definition is the same as the current EncodedFileMetaData used in the two level iterator, so the logic in the two level iterator is now easier to understand. (See the sketch after this entry.)
      
      Test Plan: make all check
      
      Reviewers: haobo, igor, ljin
      
      Reviewed By: ljin
      
      Subscribers: leveldb, dhruba, yhchiang
      
      Differential Revision: https://reviews.facebook.net/D18933
      cadc1adf
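      A sketch of the split: a small, copyable descriptor holding only what is needed to locate and open the SST file, kept separate from the ref-counted FileMetaData. The field set shown is an assumption based on the summary, not the actual definition.

          #include <cstdint>

          class TableReader;  // opened table handle, owned elsewhere

          // Copyable: everything required to locate and open the file, plus an
          // optional cached reader. No reference counter, so it can be embedded
          // by value in flat arrays or iterator state.
          struct FileDescriptor {
            uint64_t file_number = 0;
            uint64_t file_size = 0;
            TableReader* table_reader = nullptr;  // cached, may be null
          };

          // The non-copyable bookkeeping (refs, stats, key bounds, ...) stays in
          // FileMetaData, which simply embeds a FileDescriptor.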
  17. 14 Jun 2014, 1 commit
  18. 13 Jun 2014, 1 commit
  19. 03 Jun 2014, 1 commit
    • In DB::NewIterator(), try to allocate the whole iterator tree in an arena · df9069d2
      Committed by sdong
      Summary:
      In this patch, we try to allocate the whole iterator tree, starting from DBIter, from an arena:
      1. ArenaWrappedDBIter is created when it serves as the entry point of an iterator tree, with an arena in it.
      2. Add an option to create iterators from an arena for the following iterators: DBIter, MergingIterator, MemtableIterator, all mem table iterators, all table reader iterators, and the two level iterator.
      3. MergeIteratorBuilder is created to incrementally build the tree of internal iterators. It is passed to the mem table list and the version set, which add iterators to it. (See the sketch after this entry.)

      Limitations:
      (1) Only DB::NewIterator() without tailing uses the arena. Other cases, including the read-only DB and compactions, still allocate from malloc.
      (2) The two level iterator itself is allocated in the arena, but not the iterators inside it.
      
      Test Plan: make all check
      
      Reviewers: ljin, haobo
      
      Reviewed By: haobo
      
      Subscribers: leveldb, dhruba, yhchiang, igor
      
      Differential Revision: https://reviews.facebook.net/D18513
      df9069d2
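      A sketch of the allocation pattern, assuming an arena with an Allocate method: iterators are constructed with placement new into arena memory, must not be deleted individually, and are reclaimed all at once when the entry-point iterator (and its arena) is destroyed.

          #include <cstddef>
          #include <new>
          #include <vector>

          // Minimal arena stand-in (the real rocksdb Arena differs).
          class SimpleArena {
           public:
            char* Allocate(size_t bytes) {
              blocks_.emplace_back(bytes);
              return blocks_.back().data();
            }
           private:
            std::vector<std::vector<char>> blocks_;
          };

          struct Iterator {
            virtual ~Iterator() {}
            virtual void SeekToFirst() = 0;
          };

          struct MemtableIterator : public Iterator {
            void SeekToFirst() override {}
          };

          // Build one child iterator inside the arena instead of on the heap.
          // Callers must NOT delete the result; tearing down the arena (plus an
          // explicit destructor call if the iterator owns resources) releases it.
          Iterator* NewMemtableIterator(SimpleArena* arena) {
            char* mem = arena->Allocate(sizeof(MemtableIterator));
            return new (mem) MemtableIterator();  // placement new, no heap alloc
          }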
  20. 22 May 2014, 1 commit
    • FIFO compaction style · 6de6a066
      Committed by Igor Canadi
      Summary:
      Introducing new compaction style -- FIFO.
      
      FIFO compaction style has write amplification of 1 (+1 for WAL) and it deletes the oldest files when the total DB size exceeds pre-configured values.
      
      FIFO compaction style is suited for storing high-frequency event logs (see the usage sketch after this entry).
      
      Test Plan: Added a unit test
      
      Reviewers: dhruba, haobo, sdong
      
      Reviewed By: dhruba
      
      Subscribers: alberts, leveldb
      
      Differential Revision: https://reviews.facebook.net/D18765
      6de6a066
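      A short usage sketch. The compaction_style and compaction_options_fifo fields are part of the public Options as I understand them, but treat the exact field names and the threshold value below as assumptions.

          #include <cassert>
          #include "rocksdb/db.h"
          #include "rocksdb/options.h"

          int main() {
            rocksdb::Options options;
            options.create_if_missing = true;
            options.compaction_style = rocksdb::kCompactionStyleFIFO;
            // Once the total SST size exceeds this budget, the oldest files are dropped.
            options.compaction_options_fifo.max_table_files_size =
                1024 * 1024 * 1024;  // 1 GB budget, e.g. for an event-log workload

            rocksdb::DB* db = nullptr;
            rocksdb::Status s = rocksdb::DB::Open(options, "/tmp/fifo_demo", &db);
            assert(s.ok());
            db->Put(rocksdb::WriteOptions(), "event-0001", "payload");
            delete db;
            return 0;
          }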
  21. 15 May 2014, 1 commit
    • Clean up compaction logging · f4574449
      Committed by Igor Canadi
      Summary: Cleaned up compaction logging a little bit. Now file sizes are easier to read. Also, removed the trailing space.
      
      Test Plan:
      Verified that I'm happy with the logging output:
      
              files_size[#33(seq=101,sz=98KB,0) #31(seq=81,sz=159KB,0) #26(seq=0,sz=637KB,0)]
      
      Reviewers: sdong, haobo, dhruba
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D18549
      f4574449
  22. 07 May 2014, 1 commit
    • fsync directory after creating current file in NewDB() · 9efbd85a
      Committed by sdong
      Summary: One of our users reported CURRENT file corruption. The machine was rebooted during that time, which is the only thing I can think of that could cause CURRENT file corruption. Just add this paranoid check. (See the sketch after this entry.)
      
      Test Plan: make all check
      
      Reviewers: haobo, igor
      
      Reviewed By: haobo
      
      CC: yhchiang, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D18495
      9efbd85a
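      A POSIX sketch of the idea: after creating and syncing the CURRENT file, also fsync its parent directory so the directory entry itself survives a crash or reboot. The helper name, paths, and error handling are illustrative only.

          #include <fcntl.h>
          #include <unistd.h>
          #include <cstdio>
          #include <string>

          // Durably publish a freshly created file by also syncing its directory.
          bool SyncParentDir(const std::string& dir_path) {
            int dir_fd = open(dir_path.c_str(), O_RDONLY);
            if (dir_fd < 0) {
              perror("open dir");
              return false;
            }
            // fsync on the directory fd persists the directory entry (the file's
            // name-to-inode mapping), not just the file's contents.
            bool ok = (fsync(dir_fd) == 0);
            close(dir_fd);
            return ok;
          }

          // Usage after writing and fsync()ing e.g. <dbpath>/CURRENT:
          //   SyncParentDir("/path/to/db");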
  23. 01 May 2014, 1 commit
  24. 26 Apr 2014, 3 commits
  25. 25 Apr 2014, 1 commit
    • Column family logging · ad3cd39c
      Committed by Igor Canadi
      Summary:
      Now that we have column families involved, we need to add extra context to every log message. They now start with "[column family name] log message"
      
      Also added some logging that I think would be useful, like level summary after every flush (I often needed that when going through the logs).
      
      Test Plan: make check + ran db_bench to confirm I'm happy with log output
      
      Reviewers: dhruba, haobo, ljin, yhchiang, sdong
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D18303
      ad3cd39c
  26. 22 Apr 2014, 1 commit
    • hints for narrowing down FindFile range and avoiding checking irrelevant L0 files · 0f2d7681
      Committed by Lei Jin
      Summary:
      The file tree structure in a Version is prebuilt and the range of each file is known.
      On the Get() code path, we do binary search in FindFile() by comparing the
      target key with each file's largest key, and we also check the range of each L0 file.
      With some pre-calculated knowledge, each key comparison that has been done can serve
      as a hint to narrow down further searches:
      (1) If a key falls within an L0 file's range, we can safely skip the next
      file if its range does not overlap with the current one.
      (2) If a key falls within a file's range in level L0 - Ln-1, we should only
      need to binary search in the next level for files that overlap with the current one.

      (1) will be able to skip some files depending on the key distribution.
      (2) can greatly reduce the range of the binary search, especially for bottom
      levels, given that one file most likely only overlaps with N files from
      the level below (where N is max_bytes_for_level_multiplier). So on level
      L, we will only look at ~N files instead of N^L files. (See the sketch after this entry.)

      Some initial results: measured with a 500M-key DB, when the write load is light (10k/s = 1.2M/s), this
      improves QPS by ~7% on top of blocked bloom. When the write load is heavier (80k/s =
      9.6M/s), it gives us a ~13% improvement.
      
      Test Plan: make all check
      
      Reviewers: haobo, igor, dhruba, sdong, yhchiang
      
      Reviewed By: haobo
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D17205
      0f2d7681
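      A sketch of hint (2) above: once the search key is known to fall inside file i of level L, the binary search on level L+1 can be restricted to a precomputed window of files that overlap file i. The index structures here are hypothetical stand-ins; the real code keeps this information in FileIndexer.

          #include <cstddef>
          #include <string>
          #include <vector>

          struct FileRange {
            std::string smallest;
            std::string largest;
          };

          // Precomputed per-file hint: which slice of the next level can overlap.
          struct NextLevelHint {
            size_t left = 0;   // first candidate index in the next level
            size_t right = 0;  // one past the last candidate index
          };

          // Binary search for the first file whose largest key >= target, but only
          // within [hint.left, hint.right) instead of over the whole level.
          size_t FindFileWithHint(const std::vector<FileRange>& level,
                                  const NextLevelHint& hint,
                                  const std::string& target) {
            size_t lo = hint.left, hi = hint.right;
            while (lo < hi) {
              size_t mid = lo + (hi - lo) / 2;
              if (level[mid].largest < target) {
                lo = mid + 1;   // target is to the right of this file
              } else {
                hi = mid;       // this file (or one to its left) may contain target
              }
            }
            return lo;  // == hint.right means no candidate in the window
          }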
  27. 18 Apr 2014, 2 commits
    • Fix bugs introduced by D17961 · 65179225
      Committed by sdong
      Summary:
      D17961 has two bugs:
      (1) the two level iterator fails to populate FileMetaData.table_reader, causing a performance regression.
      (2) the table cache handles the !status.ok() case in the wrong place, causing a seg fault that shouldn't happen.
      
      Test Plan: make all check
      
      Reviewers: ljin, igor, haobo
      
      Reviewed By: ljin
      
      CC: yhchiang, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D17991
      65179225
    • Minimize accessing multiple objects in Version::Get() · fa430bfd
      Committed by sdong
      Summary:
      One of our profilings shows that Version::Get() is sometimes slow when fetching pointers to user comparators or other global objects. In this patch:
      (1) we keep pointers to immutable objects in Version to avoid accessing them through option objects or cfd objects
      (2) table_reader is directly cached in FileMetaData so that the table cache doesn't have to go through a handle first to fetch it
      (3) if level 0 has fewer than 3 files, we skip the filtering logic based on the SST tables' key ranges. The smallest and largest keys are stored in separate memory locations, which can cause cache misses
      
      Test Plan: make all check
      
      Reviewers: haobo, ljin
      
      Reviewed By: haobo
      
      CC: igor, yhchiang, nkg-, leveldb
      
      Differential Revision: https://reviews.facebook.net/D17739
      fa430bfd
  28. 16 Apr 2014, 2 commits
    • RocksDBLite · 588bca20
      Committed by Igor Canadi
      Summary:
      Introducing RocksDBLite! Removes all the non-essential features and reduces the binary size. This effort should help our adoption on mobile.
      
      Binary size when compiling for IOS (`TARGET_OS=IOS m static_lib`) is down to 9MB from 15MB (without stripping)
      
      Test Plan: compiles :)
      
      Reviewers: dhruba, haobo, ljin, sdong, yhchiang
      
      Reviewed By: yhchiang
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D17835
      588bca20
    • Don't roll empty logs · e6acb874
      Committed by Igor Canadi
      Summary:
      With multiple column families, especially when manual Flush is executed, we might roll the log file, although the current log file is empty (no data has been written to the log).
      
      After the diff, we won't create a new log file if the current one is empty.
      
      Next, I will write an algorithm that will flush column families that reference old log files (i.e., that weren't flushed in a while)
      
      Test Plan: Added a unit test. Confirmed that the unit test fails on master.
      
      Reviewers: dhruba, haobo, ljin, sdong
      
      Reviewed By: ljin
      
      CC: leveldb
      
      Differential Revision: https://reviews.facebook.net/D17631
      e6acb874
  29. 10 Apr 2014, 2 commits