    Determine boundaries of subcompactions · 3c37b3cc
    Ari Ekmekji committed
    Summary:
    Until now, the subcompactions that make up a compaction job were
    divided by the key range of the L1 files, with each subcompaction
    handling the key range of exactly one file. However,
    DBOption.max_subcompactions lets the user set the maximum number of
    subcompactions to perform. This patch updates
    CompactionJob::GetSubcompactionBoundaries() to choose these
    divisions based on that option and on other input and system factors.
    
    The current approach sorts the starting and/or ending keys of certain
    compaction input files, then builds a histogram that approximates the
    size covered by the key range between each consecutive pair of keys. It
    then merges these ranges into groups of approximately equal size. The
    approach has also been adapted to work for universal compaction, not
    just for level-based compaction as before.
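
    The grouping step above can be sketched roughly as follows. This is an
    illustrative sketch, not the code in this patch: PickBoundaries, its
    parameters, and the greedy "close a group once it reaches the average
    size" rule are all assumptions. It takes the sorted candidate keys and
    an estimated size for each range they delimit, and returns the keys to
    use as subcompaction boundaries.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not part of RocksDB).
// candidates:  sorted, de-duplicated candidate boundary keys.
// range_sizes: estimated bytes per range; range 0 lies before candidates[0],
//              range i lies between candidates[i-1] and candidates[i], and the
//              last range lies after the final candidate, so
//              range_sizes.size() == candidates.size() + 1.
// Returns at most max_subcompactions - 1 boundary keys.
std::vector<std::string> PickBoundaries(
    const std::vector<std::string>& candidates,
    const std::vector<uint64_t>& range_sizes,
    std::size_t max_subcompactions) {
  assert(range_sizes.size() == candidates.size() + 1);
  std::vector<std::string> boundaries;
  if (max_subcompactions <= 1 || candidates.empty()) {
    return boundaries;  // a single subcompaction covers everything
  }

  uint64_t total = 0;
  for (uint64_t s : range_sizes) total += s;

  // Never ask for more groups than there are ranges.
  const std::size_t groups = std::min(max_subcompactions, range_sizes.size());
  const uint64_t target = (total + groups - 1) / groups;  // ceiling of average
  if (target == 0) return boundaries;  // nothing to balance

  uint64_t acc = 0;
  for (std::size_t i = 0; i < candidates.size(); ++i) {
    acc += range_sizes[i];
    // Close the current group once it reaches the target size, as long as
    // we still have boundaries left to hand out.
    if (acc >= target && boundaries.size() + 1 < groups) {
      boundaries.push_back(candidates[i]);
      acc = 0;
    }
  }
  return boundaries;
}
```

    With candidates {"b", "d", "f"}, four ranges of 10 bytes each, and
    max_subcompactions = 2, this sketch returns the single boundary "d",
    giving two groups of about 20 bytes each.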
    
    These subcompactions are then executed in parallel by locally spawning
    threads, one for each. The results are then aggregated and the compaction
    completed.
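
    A minimal sketch of the "one thread per subcompaction, then aggregate"
    pattern described above, assuming each subcompaction can be modelled as
    a callable that returns its own result. SubResult and RunInParallel are
    hypothetical names for illustration and are not the types or functions
    used in compaction_job.cc.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical per-subcompaction result (not a RocksDB type).
struct SubResult {
  uint64_t bytes_written = 0;
  bool ok = true;
};

// Runs every job on its own locally spawned thread, waits for all of them,
// and folds the per-job results into one aggregate.
SubResult RunInParallel(const std::vector<std::function<SubResult()>>& jobs) {
  std::vector<SubResult> results(jobs.size());
  std::vector<std::thread> threads;
  threads.reserve(jobs.size());

  for (std::size_t i = 0; i < jobs.size(); ++i) {
    // Each thread writes only to its own slot, so no locking is needed.
    threads.emplace_back([&jobs, &results, i] { results[i] = jobs[i](); });
  }
  for (std::thread& t : threads) {
    t.join();
  }

  SubResult total;
  for (const SubResult& r : results) {
    total.bytes_written += r.bytes_written;
    total.ok = total.ok && r.ok;  // the whole job fails if any part failed
  }
  return total;
}
```

    Giving each thread a private result slot keeps the aggregation step
    single-threaded and free of synchronization beyond join().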
    
    Test Plan: make all && make check
    
    Reviewers: yhchiang, anthony, igor, noetzli, sdong
    
    Reviewed By: sdong
    
    Subscribers: MarkCallaghan, dhruba, leveldb
    
    Differential Revision: https://reviews.facebook.net/D43269
compaction_job.cc 49.0 KB