1. 19 Oct, 2016 2 commits
    • A
      not split file in compaction on level 0 · 52c9808c
      Aaron Gao committed
      Summary: We should not split output files on level 0 during compaction, because doing so fails the subsequent verification of seqno order on level 0.
      
      Test Plan: check with filldeterministic in db_bench
      
      Reviewers: yhchiang, andrewkr
      
      Reviewed By: andrewkr
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D65193
    • A
      Compaction Support for Range Deletion · 6fbe96ba
      Andrew Kryczka committed
      Summary:
      This diff introduces RangeDelAggregator, which takes ownership of iterators
      provided to it via AddTombstones(). The tombstones are organized in a two-level
      map (snapshot stripe -> begin key -> tombstone). Tombstone creation avoids data
      copy by holding Slices returned by the iterator, which remain valid thanks to pinning.
      
      For compaction, we create a hierarchical range tombstone iterator with structure
      matching the iterator over compaction input data. An aggregator based on that
      iterator is used by CompactionIterator to determine which keys are covered by
      range tombstones. In case of merge operand, the same aggregator is used by
      MergeHelper. Upon finishing each file in the compaction, relevant range tombstones
      are added to the output file's range tombstone metablock and file boundaries are
      updated accordingly.
      
      To check whether a key is covered by range tombstone, RangeDelAggregator::ShouldDelete()
      considers tombstones in the key's snapshot stripe. When this function is used outside of
      compaction, it also checks newer stripes, which can contain covering tombstones. Currently
      the intra-stripe check involves a linear scan; however, in the future we plan to collapse ranges
      within a stripe such that binary search can be used.
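
The stripe-based coverage check described above can be sketched roughly as follows. This is a simplified illustration, not the actual RangeDelAggregator: the names `RangeTombstone`, `TombstoneStripes`, and `StripeOf` are hypothetical, keys are plain `std::string` rather than Slices, and a stripe is identified by the index of the first snapshot whose sequence number is at least the entry's.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical tombstone: deletes keys in [begin, end) written before seq.
struct RangeTombstone {
  std::string begin;
  std::string end;
  uint64_t seq;
};

// Two-level map: snapshot stripe -> begin key -> tombstone.
class TombstoneStripes {
 public:
  explicit TombstoneStripes(std::vector<uint64_t> snapshots)
      : snapshots_(std::move(snapshots)) {
    std::sort(snapshots_.begin(), snapshots_.end());
  }

  void Add(const RangeTombstone& t) {
    stripes_[StripeOf(t.seq)].emplace(t.begin, t);
  }

  // Checks the key's own stripe; newer stripes are checked only on request
  // (the non-compaction case in the summary above).
  bool ShouldDelete(const std::string& key, uint64_t seq,
                    bool check_newer_stripes) const {
    const std::size_t first = StripeOf(seq);
    for (auto it = stripes_.lower_bound(first); it != stripes_.end(); ++it) {
      if (!check_newer_stripes && it->first != first) break;
      // Linear scan within the stripe, as in the current design.
      for (const auto& entry : it->second) {
        const RangeTombstone& t = entry.second;
        if (t.begin <= key && key < t.end && seq < t.seq) return true;
      }
    }
    return false;
  }

 private:
  // Index of the first snapshot with sequence number >= seq.
  std::size_t StripeOf(uint64_t seq) const {
    return static_cast<std::size_t>(
        std::lower_bound(snapshots_.begin(), snapshots_.end(), seq) -
        snapshots_.begin());
  }

  std::vector<uint64_t> snapshots_;
  std::map<std::size_t, std::multimap<std::string, RangeTombstone>> stripes_;
};
```

With snapshots at 10 and 20 and a tombstone `["b", "d")` at sequence number 15, a key `"c"` at sequence number 5 lives in an older stripe, so it is reported as covered only when `check_newer_stripes` is true.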
      
      RangeDelAggregator::AddToBuilder() adds all range tombstones in the table's key-range
      to a new table's range tombstone meta-block. Since range tombstones may fall in the gap
      between files, we may need to extend some files' key-ranges. The strategy is (1) first file
      extends as far left as possible and other files do not extend left, (2) all files extend right
      until either the start of the next file or the end of the last range tombstone in the gap,
      whichever comes first.
      
      One other notable change is adding release/move semantics to ScopedArenaIterator
      such that it can be used to transfer ownership of an arena-allocated iterator, similar to
      how unique_ptr is used for malloc'd data.
      
      Depends on D61473
      
      Test Plan: compaction_iterator_test, mock_table, end-to-end tests in D63927
      
      Reviewers: sdong, IslamAbdelRahman, wanning, yhchiang, lightmark
      
      Reviewed By: lightmark
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D62205
  2. 24 Sep, 2016 2 commits
    • A
      not cut compaction output when compact to level 0 · c2a62a4c
      Aaron Gao committed
      Summary: we should not call ShouldStopBefore() in compaction when the compaction targets level 0. Otherwise, CheckConsistency will fail the assertion of seq number check on level 0.
      
      Test Plan:
      make all check -j64
      I also manually tested this by using db_bench to compact files to level 0. Without this one-line change, the assertion fails and multiple files are generated on level 0 after compaction.
      
      Reviewers: yhchiang, andrewkr, yiwu, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D64269
    • Y
      Split DBOptions into ImmutableDBOptions and MutableDBOptions · 9ed928e7
      Yi Wu committed
      Summary: Use ImmutableDBOptions/MutableDBOptions internally and DBOptions only for user-facing APIs. MutableDBOptions is barely a placeholder for now. I'll start to move options to MutableDBOptions in following diffs.
      
      Test Plan:
        make all check
      
      Reviewers: yhchiang, IslamAbdelRahman, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D64065
  3. 18 Sep, 2016 1 commit
  4. 17 Sep, 2016 1 commit
    • Y
      Remove ColumnFamilyData::options() · 0a88f38b
      Yi Wu committed
      Summary: One more small refactor before I split DBOptions into mutable and immutable parts.
      
      Test Plan: existing unit tests.
      
      Reviewers: yhchiang, IslamAbdelRahman, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D64047
  5. 03 Sep, 2016 1 commit
  6. 02 Sep, 2016 1 commit
    • S
      Merge options source_compaction_factor, max_grandparent_overlap_bytes and... · 32149059
      sdong committed
      Merge options source_compaction_factor, max_grandparent_overlap_bytes and expanded_compaction_factor into max_compaction_bytes
      
      Summary: To reduce number of options, merge source_compaction_factor, max_grandparent_overlap_bytes and expanded_compaction_factor into max_compaction_bytes.
      
      Test Plan: Add two new unit tests. Run all existing tests, including jtest.
      
      Reviewers: yhchiang, igor, IslamAbdelRahman
      
      Reviewed By: IslamAbdelRahman
      
      Subscribers: leveldb, andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D59829
  7. 16 Aug, 2016 1 commit
    • A
      Single Delete Mismatch and Fallthrough statistics · 2fc2fd92
      Anirban Rahut committed
      Summary:
      Added 2 statistics in compaction job statistics, to
      identify if single deletes are not meeting a matching key
      (fallthrough) or single deletes are meeting a merge, delete or
      another single delete (i.e. not the expected case of put).
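
A minimal sketch of how the two cases could be told apart is shown below. It is illustrative only: `EntryType`, `SingleDeleteStats`, and `RecordSingleDelete` are hypothetical names, and the real compaction code works on internal keys rather than a bare type pointer.

```cpp
#include <cstdint>

// Hypothetical entry kinds that a single delete can meet during compaction.
enum class EntryType { kPut, kMerge, kDelete, kSingleDelete };

// Hypothetical counters mirroring the two statistics described above.
struct SingleDeleteStats {
  uint64_t fallthrough = 0;  // no matching older entry was found
  uint64_t mismatch = 0;     // matched something other than a put
};

// next_older is the type of the next older entry for the same user key,
// or nullptr when the compaction input has no such entry.
inline void RecordSingleDelete(const EntryType* next_older,
                               SingleDeleteStats* stats) {
  if (next_older == nullptr) {
    ++stats->fallthrough;
  } else if (*next_older != EntryType::kPut) {
    ++stats->mismatch;
  }
  // A put is the expected case and is not counted.
}
```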
      
      Test Plan: Tested the statistics using write_stress and compaction_job_stats_test
      
      Reviewers: sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba
      
      Differential Revision: https://reviews.facebook.net/D61749
  8. 18 May, 2016 1 commit
    • A
      [rocksdb] make more options dynamic · 43afd72b
      Aaron Gao committed
      Summary:
      make more ColumnFamilyOptions dynamic:
      - compression
      - soft_pending_compaction_bytes_limit
      - hard_pending_compaction_bytes_limit
      - min_partial_merge_operands
      - report_bg_io_stats
      - paranoid_file_checks
      
      Test Plan:
      Add sanity check in `db_test.cc` for all above options except for soft_pending_compaction_bytes_limit and hard_pending_compaction_bytes_limit.
      All passed.
      
      Reviewers: andrewkr, sdong, IslamAbdelRahman
      
      Reviewed By: IslamAbdelRahman
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D57519
  9. 30 Apr, 2016 2 commits
  10. 28 Apr, 2016 1 commit
    • A
      Shared dictionary compression using reference block · 843d2e31
      Andrew Kryczka committed
      Summary:
      This adds a new metablock containing a shared dictionary that is used
      to compress all data blocks in the SST file. The size of the shared dictionary
      is configurable in CompressionOptions and defaults to 0. It's currently only
      used for zlib/lz4/lz4hc, but the block will be stored in the SST regardless of
      the compression type if the user chooses a nonzero dictionary size.
      
      During compaction, computes the dictionary by randomly sampling the first
      output file in each subcompaction. It pre-computes the intervals to sample
      by assuming the output file will have the maximum allowable length. In case
      the file is smaller, some of the pre-computed sampling intervals can be beyond
      end-of-file, in which case we skip over those samples and the dictionary will
      be a bit smaller. After the dictionary is generated using the first file in a
      subcompaction, it is loaded into the compression library before writing each
      block in each subsequent file of that subcompaction.
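
The sampling scheme described above might look roughly like the sketch below. It is an assumption-laden illustration, not the code from this diff: `PlanSampleOffsets` and `BuildDictionary` are hypothetical names, the file is modeled as an in-memory string, and a sample that would run past end-of-file is skipped entirely.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <string>
#include <vector>

// Pre-compute sample start offsets, assuming the output file reaches its
// maximum allowed size. The seed parameter exists only to keep the sketch
// reproducible.
std::vector<std::size_t> PlanSampleOffsets(std::size_t max_file_size,
                                           std::size_t dict_bytes,
                                           std::size_t sample_len,
                                           std::uint64_t seed) {
  assert(sample_len > 0 && max_file_size >= sample_len);
  std::mt19937_64 rng(seed);
  std::uniform_int_distribution<std::size_t> pick(0,
                                                  max_file_size - sample_len);
  std::vector<std::size_t> offsets(dict_bytes / sample_len);
  for (auto& off : offsets) off = pick(rng);
  std::sort(offsets.begin(), offsets.end());
  return offsets;
}

// Concatenate the planned samples. Samples that fall beyond the actual end
// of the file are skipped, so the dictionary may come out smaller.
std::string BuildDictionary(const std::string& file_data,
                            const std::vector<std::size_t>& offsets,
                            std::size_t sample_len) {
  std::string dict;
  for (std::size_t off : offsets) {
    if (off + sample_len > file_data.size()) continue;  // beyond end-of-file
    dict.append(file_data, off, sample_len);
  }
  return dict;
}
```

Planning the offsets up front means the first output file can be sampled in a single pass while it is written, at the cost of a smaller dictionary when that file turns out short.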
      
      On the read path, gets the dictionary from the metablock, if it exists. Then,
      loads that dictionary into the compression library before reading each block.
      
      Test Plan: new unit test
      
      Reviewers: yhchiang, IslamAbdelRahman, cyan, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, yoshinorim, kradhakrishnan, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D52287
  11. 12 Apr, 2016 1 commit
    • Y
      Relax an assertion in Compaction::ShouldStopBefore · 13e6c8e9
      Yueh-Hsuan Chiang committed
      Summary:
      In some cases, two consecutive SST files can share the same boundary
      key. However, the assertion in Compaction::ShouldStopBefore excludes
      that possibility.
      
      This patch fixes the issue by relaxing the assertion to allow the equal case.
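
As a rough illustration of the relaxed condition (not the actual assertion in Compaction::ShouldStopBefore), the check below accepts neighbouring files whose boundary keys are equal. `FileBoundary` and `BoundariesAreOrdered` are hypothetical names, and keys are compared bytewise for simplicity.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical view of an SST file's key range.
struct FileBoundary {
  std::string smallest;
  std::string largest;
};

// Returns true if every file ends at or before the start of the next one.
// The strict form would reject equality; the relaxed form allows it.
bool BoundariesAreOrdered(const std::vector<FileBoundary>& files) {
  for (std::size_t i = 1; i < files.size(); ++i) {
    if (files[i - 1].largest.compare(files[i].smallest) > 0) return false;
  }
  return true;
}
```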
      
      Test Plan: rocksdb tests
      
      Reviewers: IslamAbdelRahman, kradhakrishnan, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D55875
  12. 07 Apr, 2016 1 commit
    • A
      Embed column family name in SST file · 2391ef72
      Andrew Kryczka committed
      Summary:
      Added the column family name to the properties block. This property
      is omitted only if the property is unavailable, such as when RepairDB()
      writes SST files.
      
      In a next diff, I will change RepairDB to use this new property for
      deciding to which column family an existing SST file belongs. If this
      property is missing, it will add it to the "unknown" column family (same
      as its existing behavior).
      
      Test Plan:
      New unit test:
      
        $ ./db_table_properties_test --gtest_filter=DBTablePropertiesTest.GetColumnFamilyNameProperty
      
      Reviewers: IslamAbdelRahman, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: andrewkr, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D55605
  13. 25 Mar, 2016 1 commit
    • Y
      Fix data race issue when sub-compaction is used in CompactionJob · be9816b3
      Yueh-Hsuan Chiang committed
      Summary:
      When subcompaction is used, all subcompactions share the same Compaction
      pointer in CompactionJob, while each subcompaction keeps its own mutable
      stats in SubcompactionState. However, some mutable state is still
      stored in the shared Compaction object.
      
      This patch makes three changes:
      
      1. Make the shared Compaction pointer const so that it can never be modified
         during the compaction.
      2. Move the necessary state from Compaction to SubcompactionState.
      3. Make functions of Compaction const if they do not modify
         its internal state.
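
The ownership split can be sketched as below. This is a simplified illustration under assumed names: the members shown (`max_output_file_size`, `AddOutput`, and the counters) are hypothetical and much smaller than the real classes.

```cpp
#include <cstdint>

// Shared, read-only description of the compaction. All accessors are const,
// so a `const Compaction*` is enough for every subcompaction.
class Compaction {
 public:
  Compaction(int output_level, uint64_t max_output_file_size)
      : output_level_(output_level),
        max_output_file_size_(max_output_file_size) {}

  int output_level() const { return output_level_; }
  uint64_t max_output_file_size() const { return max_output_file_size_; }

 private:
  int output_level_;
  uint64_t max_output_file_size_;
};

// Per-subcompaction mutable state. Nothing here is shared, so each
// subcompaction can update its own instance without synchronization.
struct SubcompactionState {
  const Compaction* compaction;
  uint64_t current_output_bytes = 0;
  uint64_t num_output_files = 0;

  explicit SubcompactionState(const Compaction* c) : compaction(c) {}

  // Starts a new output file when the current one would grow too large.
  void AddOutput(uint64_t bytes) {
    if (num_output_files == 0 ||
        current_output_bytes + bytes > compaction->max_output_file_size()) {
      ++num_output_files;
      current_output_bytes = 0;
    }
    current_output_bytes += bytes;
  }
};
```

Because the shared object is only reachable through a const pointer, an accidental write from a subcompaction becomes a compile error rather than a data race.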
      
      Test Plan: rocksdb and MyRocks test
      
      Reviewers: sdong, kradhakrishnan, andrewkr, IslamAbdelRahman
      
      Reviewed By: IslamAbdelRahman
      
      Subscribers: andrewkr, dhruba, yoshinorim, gunnarku, leveldb
      
      Differential Revision: https://reviews.facebook.net/D55923
  14. 03 Mar, 2016 1 commit
  15. 18 Feb, 2016 1 commit
  16. 10 Feb, 2016 1 commit
  17. 29 Jan, 2016 1 commit
  18. 27 Jan, 2016 1 commit
    • A
      [directory includes cleanup] Remove util->db dependency for ThreadStatusUtil · acd7d586
      Andrew Kryczka committed
      Summary:
      We can avoid the dependency by forward-declaring ColumnFamilyData and
      then treating it as a black box. That means callers of ThreadStatusUtil need to
      explicitly provide more options, even if they can be derived from the
      ColumnFamilyData, since ThreadStatusUtil doesn't include the definition.
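
The forward-declaration technique can be sketched as follows. The names `ThreadStatusSketch` and `SetColumnFamily` and their parameters are hypothetical stand-ins, not the actual ThreadStatusUtil interface. The point is only that the utility code compiles without ever seeing the definition of ColumnFamilyData.

```cpp
#include <string>

// Forward declaration only: this translation unit never includes the
// definition, so the pointer is an opaque handle here.
class ColumnFamilyData;

// Hypothetical per-thread status record.
struct ThreadStatusSketch {
  const ColumnFamilyData* cfd = nullptr;  // stored and compared, never dereferenced
  std::string cf_name;
  bool tracking_enabled = false;
};

// The caller passes explicitly what could otherwise be derived from the
// ColumnFamilyData, because this code cannot look inside it.
inline void SetColumnFamily(ThreadStatusSketch* status,
                            const ColumnFamilyData* cfd,
                            const std::string& cf_name,
                            bool enable_thread_tracking) {
  status->cfd = cfd;
  status->cf_name = enable_thread_tracking ? cf_name : std::string();
  status->tracking_enabled = enable_thread_tracking;
}
```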
      
      This is part of a series of diffs to eliminate circular dependencies between
      directories (e.g., db/* files depending on util/* files and vice-versa).
      
      Test Plan:
        $ ./db_test --gtest_filter=DBTest.GetThreadStatus
        $ make -j32 commit-prereq
      
      Reviewers: sdong, yhchiang, IslamAbdelRahman
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D53361
  19. 22 Dec, 2015 2 commits
  20. 12 Dec, 2015 1 commit
  21. 09 Dec, 2015 2 commits
  22. 08 Dec, 2015 1 commit
    • A
      Support marking snapshots for write-conflict checking · ec704aaf
      agiardullo committed
      Summary:
      D50475 enables using SST files for transaction write-conflict checking.  In order for this to work, we need to make sure not to compact out SingleDeletes when there is an earlier transaction snapshot(D50295).  If there is a long-held snapshot, this could reduce the benefit of the SingleDelete optimization.
      
      This diff allows Transactions to mark snapshots as being used for write-conflict checking.  Then, during compaction, we will be able to optimize SingleDeletes better in the future.
      
      This diff adds a flag to SnapshotImpl which is used by Transactions.  This diff also passes the earliest write-conflict snapshot's sequence number to CompactionIterator.  This diff does not actually change Compaction (after this diff is pushed, D50295 will be able to use this information).
      
      Test Plan: no behavior change, ran existing tests
      
      Reviewers: rven, kradhakrishnan, yhchiang, IslamAbdelRahman, sdong
      
      Reviewed By: sdong
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D51183
  23. 31 Oct, 2015 1 commit
  24. 30 Oct, 2015 1 commit
  25. 20 Oct, 2015 1 commit
    • I
      Fix iOS build · 4e07c99a
      Igor Canadi committed
      Summary: We don't yet have a CI build for iOS, so our iOS compile gets broken sometimes. Most of the errors are from assumption that size_t is 64-bit, while it's actually 32-bit on some (all?) iOS platforms. This diff fixes the compile.
      
      Test Plan:
      TARGET_OS=IOS make static_lib
      
      Observe there are no warnings
      
      Reviewers: sdong, anthony, IslamAbdelRahman, kradhakrishnan, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D49029
  26. 17 Oct, 2015 1 commit
    • S
      Add more kill points · 277dea78
      sdong committed
      Summary:
      Add kill points in:
      1. after creating a file
      2. before writing a manifest record
      3. before syncing manifest
      4. before creating a new current file
      5. after creating a new current file
      
      Test Plan: Run all current tests.
      
      Reviewers: yhchiang, igor, anthony, IslamAbdelRahman, rven, kradhakrishnan
      
      Reviewed By: kradhakrishnan
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D48855
  27. 14 Oct, 2015 1 commit
    • S
      Separate InternalIterator from Iterator · 35ad531b
      sdong committed
      Summary:
      Separate a new class, InternalIterator, from class Iterator for look-ups done internally, which also means it operates on keys with sequence ID and type.
      
      This change will enable potential future optimizations but for now InternalIterator's functions are still the same as Iterator's.
      At the same time, separate the cleanup function to a separate class and let both of InternalIterator and Iterator inherit from it.
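
A minimal sketch of the resulting class layout is given below. It is a simplification under stated assumptions: cleanup callbacks are modeled with `std::function`, keys are `std::string`, the iterator interfaces are cut down to two methods, and `EmptyInternalIterator` is a hypothetical helper added for illustration.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Shared base that owns cleanup callbacks and runs them on destruction.
class Cleanable {
 public:
  using CleanupFunction = std::function<void()>;

  Cleanable() = default;
  Cleanable(const Cleanable&) = delete;
  Cleanable& operator=(const Cleanable&) = delete;
  virtual ~Cleanable() {
    for (auto& fn : cleanups_) fn();
  }

  void RegisterCleanup(CleanupFunction fn) {
    cleanups_.push_back(std::move(fn));
  }

 private:
  std::vector<CleanupFunction> cleanups_;
};

// User-facing iterator: key() is the user key.
class Iterator : public Cleanable {
 public:
  virtual bool Valid() const = 0;
  virtual std::string key() const = 0;
};

// Internal iterator: key() carries the user key plus sequence ID and type.
class InternalIterator : public Cleanable {
 public:
  virtual bool Valid() const = 0;
  virtual std::string key() const = 0;
};

// Hypothetical trivial implementation used only to exercise the base class.
class EmptyInternalIterator : public InternalIterator {
 public:
  bool Valid() const override { return false; }
  std::string key() const override { return std::string(); }
};
```

Keeping the two interfaces as separate types lets the compiler reject code that passes an internal iterator where a user-facing one is expected, even while their method sets remain identical.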
      
      Test Plan: Run all existing tests.
      
      Reviewers: igor, yhchiang, anthony, kradhakrishnan, IslamAbdelRahman, rven
      
      Reviewed By: rven
      
      Subscribers: leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D48549
  28. 10 Oct, 2015 2 commits
    • A
      Passing table properties to compaction callback · 3d07b815
      Alexey Maykov committed
      Summary: It would be nice to have access to table properties in compaction callbacks. In the MyRocks project, this makes it possible to update optimizer statistics online.
      
      Test Plan: ran the unit test. Ran myrocks with the new way of collecting stats.
      
      Reviewers: igor, rven, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D48267
    • S
      Pass column family ID to table property collector · 776bd8d5
      sdong committed
      Summary: Pass column family ID through TablePropertiesCollectorFactory::CreateTablePropertiesCollector() so that users can identify which column family this file is for and handle it differently.
      
      Test Plan: Add unit test scenarios in tests related to table properties collectors to verify the information passed in is correct.
      
      Reviewers: rven, yhchiang, anthony, kradhakrishnan, igor, IslamAbdelRahman
      
      Reviewed By: IslamAbdelRahman
      
      Subscribers: yoshinorim, leveldb, dhruba
      
      Differential Revision: https://reviews.facebook.net/D48411
  29. 08 Oct, 2015 1 commit
    • I
      Compaction filter on merge operands · d80ce7f9
      Igor Canadi committed
      Summary:
      Since Andres' internship is over, I took over https://reviews.facebook.net/D42555 and rebased and simplified it a bit.
      
      The behavior in this diff is a bit simpler than in D42555:
      * only merge operands are passed through FilterMergeValue(). If the filter function returns true, the merge operand is ignored
      * compaction filter is *not* called on: 1) results of merge operations and 2) base values that are getting merged with merge operands (the second case was also true in previous diff)
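
The behavior in the two bullets above can be sketched as follows. This is an illustration with hypothetical names (`MergeOperandFilter`, `FilterMergeOperands`), not the real compaction filter interface: the callback sees only merge operands, and an operand is dropped when the callback returns true.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical callback: return true to drop the operand.
using MergeOperandFilter =
    std::function<bool(const std::string& key, const std::string& operand)>;

// Applies the filter to merge operands only. Base values and merge results
// are deliberately not passed through it.
std::vector<std::string> FilterMergeOperands(
    const std::string& key, const std::vector<std::string>& operands,
    const MergeOperandFilter& should_drop) {
  std::vector<std::string> kept;
  for (const auto& operand : operands) {
    if (!should_drop(key, operand)) kept.push_back(operand);
  }
  return kept;
}
```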
      
      Do we also need a compaction filter to get called on merge results?
      
      Test Plan: make && make check
      
      Reviewers: lovro, tnovak, rven, yhchiang, sdong
      
      Reviewed By: sdong
      
      Subscribers: noetzli, kolmike, leveldb, dhruba, sdong
      
      Differential Revision: https://reviews.facebook.net/D47847
  30. 16 Sep, 2015 1 commit
    • A
      Add compaction time to log output · 5ba3297d
      Ari Ekmekji committed
      Summary:
      Although compaction time is recorded in the statistics,
      it is helpful to include this value in the log output corresponding
      to the end of compaction.
      
      Test Plan: make all && make check
      
      Reviewers: yhchiang, sdong, igor, noetzli, MarkCallaghan
      
      Reviewed By: MarkCallaghan
      
      Subscribers: dhruba
      
      Differential Revision: https://reviews.facebook.net/D47007
  31. 15 Sep, 2015 1 commit
    • A
      Add counters for L0 stall while L0-L1 compaction is taking place · 03ddce9a
      Ari Ekmekji committed
      Summary:
      Although there are currently counters to keep track of the
      stall caused by having too many L0 files, they do not distinguish
      whether, when that stall occurs, (A) an L0-L1 compaction is taking
      place to try to mitigate it, or (B) no L0-L1 compaction has been scheduled
      at the moment. This diff adds a counter for (A) so that the nature of L0
      stalls can be better understood.
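
One way to picture the extra counter is the sketch below, where `L0StallCounters` and `RecordL0Stall` are hypothetical names rather than the statistics actually added by this diff.

```cpp
#include <cstdint>

// Hypothetical counters for stalls caused by too many L0 files.
struct L0StallCounters {
  uint64_t total = 0;                  // every L0 stall
  uint64_t with_l0_l1_compaction = 0;  // case (A): an L0-L1 compaction was running
};

// Records one stall. Case (B) is not stored separately; it can be derived
// as total - with_l0_l1_compaction.
inline void RecordL0Stall(bool l0_l1_compaction_running,
                          L0StallCounters* counters) {
  ++counters->total;
  if (l0_l1_compaction_running) ++counters->with_l0_l1_compaction;
}
```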
      
      Test Plan: make all && make check
      
      Reviewers: sdong, igor, anthony, noetzli, yhchiang
      
      Reviewed By: yhchiang
      
      Subscribers: MarkCallaghan, dhruba
      
      Differential Revision: https://reviews.facebook.net/D46749
  32. 11 Sep, 2015 2 commits
    • A
      Refactored common code of Builder/CompactionJob out into a CompactionIterator · 8aa1f151
      Andres Noetzli committed
      Summary:
      Builder and CompactionJob share a lot of fairly complex code. This patch
      refactors this code into a separate class, the CompactionIterator. Because the
      shared code is fairly complex, this patch hopefully improves maintainability.
      While there is a lot of potential for further improvements, the patch is
      intentionally pretty close to the original structure because the change is
      already complex enough.
      
      Test Plan: make clean all check && ./db_stress
      
      Reviewers: rven, anthony, yhchiang, sdong, igor
      
      Reviewed By: igor
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D46197
    • A
      Determine boundaries of subcompactions · 3c37b3cc
      Ari Ekmekji committed
      Summary:
      Up to this point, the subcompactions that make up a compaction
      job have been divided based on the key range of the L1 files, and each
      subcompaction has handled the key range of only one file. However
      DBOption.max_subcompactions allows the user to designate how many
      subcompactions at most to perform. This patch updates the
      CompactionJob::GetSubcompactionBoundaries() to determine these
      divisions accordingly based on that option and other input/system factors.
      
      The current approach orders the starting and/or ending keys of certain
      compaction input files and then generates a histogram to approximate the
      size covered by the key range between each consecutive pair of keys. Then
      it groups these ranges into groups so that the sizes are approximately equal
      to one another. The approach has also been adapted to work for universal
      compaction as well instead of just for level-based compaction as it was before.
      
      These subcompactions are then executed in parallel by locally spawning
      threads, one for each. The results are then aggregated and the compaction
      completed.
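
The grouping step can be sketched with a simple greedy pass, shown below. This is an assumed simplification, not CompactionJob::GetSubcompactionBoundaries() itself: `GroupRanges` is a hypothetical helper that takes the approximate size of each consecutive key range and returns, for each group, the exclusive end index of its last range.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Groups consecutive ranges into at most max_groups groups of roughly equal
// total size. Returns the exclusive end index of each group.
std::vector<std::size_t> GroupRanges(const std::vector<uint64_t>& sizes,
                                     std::size_t max_groups) {
  std::vector<std::size_t> ends;
  if (sizes.empty() || max_groups == 0) return ends;

  const uint64_t total =
      std::accumulate(sizes.begin(), sizes.end(), uint64_t{0});
  // Target size per group, rounded up and never zero.
  const uint64_t target =
      std::max<uint64_t>(1, (total + max_groups - 1) / max_groups);

  uint64_t acc = 0;
  for (std::size_t i = 0; i < sizes.size(); ++i) {
    acc += sizes[i];
    const bool last = (i + 1 == sizes.size());
    // Close the group once it reaches the target, but keep one group in
    // reserve so the final range always has somewhere to go.
    if (last || (acc >= target && ends.size() + 1 < max_groups)) {
      ends.push_back(i + 1);
      acc = 0;
    }
  }
  return ends;
}
```

For four ranges of size 10 and a limit of two groups, the result is `{2, 4}`: ranges 0 to 1 form the first subcompaction and ranges 2 to 3 the second.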
      
      Test Plan: make all && make check
      
      Reviewers: yhchiang, anthony, igor, noetzli, sdong
      
      Reviewed By: sdong
      
      Subscribers: MarkCallaghan, dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D43269
  33. 09 Sep, 2015 1 commit
    • A
      Added Equal method to Comparator interface · 6bdc484f
      Andres Noetzli committed
      Summary:
      In some cases, equality comparisons can be done more efficiently than three-way
      comparisons. There are quite a few places in the code where we only care about
      equality. This patch adds an Equal() method that defaults to using the
      Compare() method.
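
The default-plus-override pattern can be sketched as follows. This is a simplified stand-in that uses `std::string` keys instead of Slices, and the two concrete comparator classes are defined here purely for illustration.

```cpp
#include <string>

// Simplified comparator interface: Equal() falls back to Compare() unless a
// subclass provides something cheaper.
class Comparator {
 public:
  virtual ~Comparator() = default;

  // Three-way comparison: negative, zero, or positive.
  virtual int Compare(const std::string& a, const std::string& b) const = 0;

  // Default implementation in terms of Compare().
  virtual bool Equal(const std::string& a, const std::string& b) const {
    return Compare(a, b) == 0;
  }
};

// Overrides Equal() with a direct equality test.
class BytewiseComparator : public Comparator {
 public:
  int Compare(const std::string& a, const std::string& b) const override {
    return a.compare(b);
  }
  bool Equal(const std::string& a, const std::string& b) const override {
    return a == b;
  }
};

// Relies on the inherited default Equal().
class ReverseBytewiseComparator : public Comparator {
 public:
  int Compare(const std::string& a, const std::string& b) const override {
    return b.compare(a);
  }
};
```

Existing comparators keep working unchanged because of the default, and only those with a cheaper equality test need to override `Equal()`.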
      
      Test Plan: make clean all check
      
      Reviewers: rven, anthony, yhchiang, igor, sdong
      
      Reviewed By: igor
      
      Subscribers: dhruba, leveldb
      
      Differential Revision: https://reviews.facebook.net/D46233