    Fix race condition causing double deletion of ssts · 1f5def16
    Committed by Yanqin Jin
    Summary:
    Interleaved execution of a background compaction thread calling `FindObsoleteFiles (no full scan) / PurgeObsoleteFiles` and a user thread calling `FindObsoleteFiles (full scan) / PurgeObsoleteFiles` can lead to a race condition in which RocksDB attempts to delete the same file twice. The second attempt fails and returns `IO error`. This may happen to other file types as well, but this PR targets sst files only.
    Also add a unit test to verify that this PR fixes the issue.
    
    The newly added unit test `obsolete_files_test` has a test case for this scenario, implemented in `ObsoleteFilesTest#RaceForObsoleteFileDeletion`. `TestSyncPoint`s are used to coordinate the interleaving of the `user_thread` and the background compaction thread. They execute as follows:
    ```
    timeline              user_thread                background_compaction thread
    t1   |                                          FindObsoleteFiles(full_scan=false)
    t2   |     FindObsoleteFiles(full_scan=true)
    t3   |                                          PurgeObsoleteFiles
    t4   |     PurgeObsoleteFiles
         V
    ```
    When `user_thread` invokes `FindObsoleteFiles` with full scan, it collects ALL files in the RocksDB directory, including the ones that the background compaction thread has already collected in its job context. `user_thread` then sees an IO error when trying to delete these files in `PurgeObsoleteFiles`, because the background compaction thread has already deleted them in its own `PurgeObsoleteFiles` call.
    To fix this, we make RocksDB remember which (SST) files have been found by threads after calling `FindObsoleteFiles` (see `DBImpl#files_grabbed_for_purge_`). Therefore, when another thread calls `FindObsoleteFiles` with full scan, it will not collect such files.
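
The idea can be illustrated with a small standalone sketch. This is not the RocksDB implementation: `ToyDB`, `SimulateInterleaving`, and the `track_grabbed` switch are hypothetical names made up for illustration, and only the member name `grabbed_for_purge_` echoes `DBImpl#files_grabbed_for_purge_`. The sketch replays the t1..t4 interleaving sequentially and counts how many deletions the user thread fails to perform, with and without the bookkeeping.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <set>
#include <utility>
#include <vector>

// Hypothetical stand-in for a DB directory plus purge bookkeeping.
class ToyDB {
 public:
  ToyDB(std::set<uint64_t> on_disk, std::set<uint64_t> live, bool track_grabbed)
      : on_disk_(std::move(on_disk)),
        live_(std::move(live)),
        track_grabbed_(track_grabbed) {}

  // No full scan: the caller already knows which files became obsolete.
  std::vector<uint64_t> FindObsoleteFiles(const std::vector<uint64_t>& known) {
    std::lock_guard<std::mutex> lock(mu_);
    std::vector<uint64_t> out;
    for (uint64_t f : known) {
      if (Grab(f)) out.push_back(f);
    }
    return out;
  }

  // Full scan: every non-live file on disk is a candidate, except files
  // that another caller has already grabbed for purge.
  std::vector<uint64_t> FindObsoleteFilesFullScan() {
    std::lock_guard<std::mutex> lock(mu_);
    std::vector<uint64_t> out;
    for (uint64_t f : on_disk_) {
      if (live_.count(f) == 0 && Grab(f)) out.push_back(f);
    }
    return out;
  }

  // Deletes the candidates; returns how many deletions failed because the
  // file was already gone (the "IO error" case).
  int PurgeObsoleteFiles(const std::vector<uint64_t>& candidates) {
    std::lock_guard<std::mutex> lock(mu_);
    int failures = 0;
    for (uint64_t f : candidates) {
      if (on_disk_.erase(f) == 0) ++failures;
      grabbed_for_purge_.erase(f);
    }
    return failures;
  }

 private:
  // Returns true if the caller may take ownership of file f. Requires mu_.
  bool Grab(uint64_t f) {
    if (!track_grabbed_) return true;  // pre-fix behavior: no bookkeeping
    return grabbed_for_purge_.insert(f).second;
  }

  std::mutex mu_;
  std::set<uint64_t> on_disk_;
  std::set<uint64_t> live_;
  std::set<uint64_t> grabbed_for_purge_;
  bool track_grabbed_;
};

// Replays the t1..t4 timeline and returns the user thread's failure count.
int SimulateInterleaving(bool track_grabbed) {
  ToyDB db({1, 2, 3}, /*live=*/{1}, track_grabbed);
  std::vector<uint64_t> bg = db.FindObsoleteFiles({2});         // t1
  std::vector<uint64_t> user = db.FindObsoleteFilesFullScan();  // t2
  int bg_failures = db.PurgeObsoleteFiles(bg);                  // t3
  assert(bg_failures == 0);  // the first purger always succeeds
  (void)bg_failures;
  return db.PurgeObsoleteFiles(user);                           // t4
}
```

In this sketch, `SimulateInterleaving(false)` reports one failed deletion (file 2 is purged twice), while `SimulateInterleaving(true)` reports none, because the full scan skips the file the background thread has already grabbed.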
    
    ajkr could you take a look and comment? Thanks!
    Closes https://github.com/facebook/rocksdb/pull/3638
    
    Differential Revision: D7384372
    
    Pulled By: riversand963
    
    fbshipit-source-id: 01489516d60012e722ee65a80e1449e589ce26d3