    Integrated blob garbage collection: relocate blobs (#7694) · 51a8dc6d
    Levi Tamasi committed
    Summary:
    The patch adds basic garbage collection support to the integrated BlobDB
    implementation. Valid blobs residing in the oldest blob files are relocated
    as they are encountered during compaction. The threshold that determines
    which blob files qualify is computed based on the configuration option
    `blob_garbage_collection_age_cutoff`, which was introduced in https://github.com/facebook/rocksdb/issues/7661 .
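    As a rough illustration of how such a threshold could be derived, the
    sketch below maps an age cutoff in [0, 1] onto a list of blob file numbers
    sorted oldest first. The helper names `ComputeCutoffFileNumber` and
    `QualifiesForRelocation` are hypothetical and the exact rounding is an
    assumption; this is not a copy of the code in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helper (assumption, not the patch's actual code).
// `blob_files` holds blob file numbers sorted oldest first; `age_cutoff`
// is the fraction of the oldest files that should be eligible for GC.
// Returns the first file number that is NOT eligible; every file with a
// smaller number is eligible.
std::uint64_t ComputeCutoffFileNumber(
    const std::vector<std::uint64_t>& blob_files, double age_cutoff) {
  assert(age_cutoff >= 0.0 && age_cutoff <= 1.0);

  const std::size_t cutoff_index =
      static_cast<std::size_t>(age_cutoff * blob_files.size());

  // A cutoff of 1.0 (or an empty file list) makes every file eligible.
  if (cutoff_index >= blob_files.size()) {
    return std::numeric_limits<std::uint64_t>::max();
  }
  return blob_files[cutoff_index];
}

// A blob is relocated only if it lives in a file older than the cutoff.
bool QualifiesForRelocation(std::uint64_t blob_file_number,
                            std::uint64_t cutoff_file_number) {
  return blob_file_number < cutoff_file_number;
}
```

    With four blob files numbered 10, 12, 15, and 20 and a cutoff of 0.25,
    only blobs in file 10 would be relocated under this sketch.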
    Once a blob is retrieved for the purposes of relocation, it passes through the
    same logic that extracts large values to blob files in general. This means that
    if, for instance, the size threshold for key-value separation (`min_blob_size`)
    got changed or writing blob files got disabled altogether, it is possible for the
    value to be moved back into the LSM tree. In particular, one way to re-inline
    all blob values if needed would be to perform a full manual compaction with
    `enable_blob_files` set to `false`, `enable_blob_garbage_collection` set to
    `true`, and `blob_garbage_collection_age_cutoff` set to `1.0`.
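
    The placement decision described above can be sketched as follows. The
    struct `BlobOptionsSketch`, the enum `ValuePlacement`, and the function
    `PlaceRelocatedValue` are hypothetical names introduced for illustration
    only; the two fields merely mirror the `enable_blob_files` and
    `min_blob_size` options mentioned in the text.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for the two options that drive key-value separation.
struct BlobOptionsSketch {
  bool enable_blob_files;
  std::uint64_t min_blob_size;
};

enum class ValuePlacement { kInlineInLsm, kNewBlobFile };

// Decide where a value retrieved for relocation ends up. The same check
// applies to any large value seen during compaction, which is why changing
// the options can move a blob back into the LSM tree.
ValuePlacement PlaceRelocatedValue(const std::string& value,
                                   const BlobOptionsSketch& opts) {
  if (!opts.enable_blob_files) {
    return ValuePlacement::kInlineInLsm;  // blob writing disabled: re-inline
  }
  if (value.size() < opts.min_blob_size) {
    return ValuePlacement::kInlineInLsm;  // below the size threshold
  }
  return ValuePlacement::kNewBlobFile;  // still qualifies: write a new blob
}
```

    Under this sketch, disabling blob files sends every relocated value back
    to the LSM tree, which matches the re-inlining recipe above.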
    
    Some TODOs that I plan to address in separate PRs:
    
    1) We'll have to measure the amount of new garbage in each blob file and log
    `BlobFileGarbage` entries as part of the compaction job's `VersionEdit`.
    (For the time being, blob files are cleaned up solely based on the
    `oldest_blob_file_number` relationships.)
    2) When compression is used for blobs, the compression type hasn't changed,
    and the blob still qualifies for being written to a blob file, we can simply copy
    the compressed blob to the new file instead of going through decompression
    and compression.
    3) We need to update the formula for computing write amplification to account
    for the amount of data read from blob files as part of GC.
    
    Pull Request resolved: https://github.com/facebook/rocksdb/pull/7694
    
    Test Plan: `make check`
    
    Reviewed By: riversand963
    
    Differential Revision: D25069663
    
    Pulled By: ltamasi
    
    fbshipit-source-id: bdfa8feb09afcf5bca3b4eba2ba72ce2f15cd06a
column_family.cc 60.1 KB