    Fix a recovery corner case (#7621) · 5e794b08
    Cheng Chang committed
    Summary:
    Consider the following sequence of events:
    
    1. The DB flushed an SST with file number N, appended the corresponding entry to the MANIFEST, and tried to sync the MANIFEST.
    2. Syncing the MANIFEST failed and the DB crashed.
    3. The DB tried to recover with this MANIFEST. At that point, no entry for the newly flushed SST was found in the MANIFEST. RocksDB therefore replayed the WAL and tried to flush to an SST file reusing the same file number N. This failed because the file system does not support overwriting an existing file. The DB then deleted this file.
    4. The DB crashed again.
    5. The DB tried to recover. When it read the MANIFEST, there was an entry referencing N.sst, probably because the append in step 1 had finally reached the MANIFEST and become visible. Since N.sst had been deleted in step 3, recovery failed.
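
The five steps above can be replayed with a small toy model. This is only an illustrative sketch, not RocksDB code: `ToyFS`, `simulate_old_behavior`, the two manifest lists, and the file number 7 are all hypothetical names and values chosen for the example.

```python
class ToyFS:
    """Hypothetical file system that refuses to overwrite existing files."""

    def __init__(self):
        self.files = set()

    def create(self, name):
        if name in self.files:
            raise FileExistsError(name)
        self.files.add(name)

    def delete(self, name):
        self.files.discard(name)


def simulate_old_behavior(n=7):
    """Replay steps 1-5 and return the file numbers that the MANIFEST
    references at the end but that no longer exist on disk."""
    fs = ToyFS()
    manifest_visible = []  # entries a recovery can currently read
    manifest_pending = []  # appended, but not yet visible (sync failed)

    # Step 1: flush N.sst and append to the MANIFEST; the sync fails.
    fs.create(f"{n}.sst")
    manifest_pending.append(n)
    # Step 2: crash.

    # Step 3: recovery does not see N in the MANIFEST, so it reuses N.
    if n not in manifest_visible:
        try:
            fs.create(f"{n}.sst")  # fails: overwrite is not supported
        except FileExistsError:
            fs.delete(f"{n}.sst")  # cleanup removes the original file
    # Step 4: crash.

    # Step 5: the delayed append from step 1 becomes visible.
    manifest_visible.extend(manifest_pending)
    return [k for k in manifest_visible if f"{k}.sst" not in fs.files]


missing = simulate_old_behavior()
```

In this model, `missing` ends up containing N, which corresponds to the failed recovery in step 5.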
    
    It is possible that the N.sst created in step 1 is valid. Step 3 would still fail because the MANIFEST was not synced properly in steps 1 and 2, but deleting N.sst makes it impossible for the DB to recover even if the remaining part of the MANIFEST is eventually appended and becomes visible, as in step 5.
    
    After this PR, in step 3, a new MANIFEST is created immediately after recovering from the old one. Recovery then finds that N.sst is not referenced in the MANIFEST, deletes it, and does not reuse N as a file number. In step 5, since the new MANIFEST does not contain N.sst, the recovery failure no longer occurs.
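
The new recovery order can be sketched as a pure function. This is a simplified model under stated assumptions, not the actual implementation: `recover` and its parameters are hypothetical, and advancing the counter past the deleted file is just one way to model "not reusing N".

```python
def recover(manifest_entries, files_on_disk, next_file_number):
    """Toy model of the post-PR recovery in step 3.

    manifest_entries: SST numbers read from the old MANIFEST
    files_on_disk:    SST numbers present in the DB directory
    Returns (new_manifest, surviving_files, next_file_number).
    """
    # 1. Immediately write a fresh MANIFEST with only what was read.
    new_manifest = sorted(manifest_entries)

    # 2. SSTs on disk that the MANIFEST does not reference are deleted.
    orphans = set(files_on_disk) - set(manifest_entries)
    surviving = set(files_on_disk) - orphans

    # 3. Assumption: never hand out a number that an orphan used.
    if orphans:
        next_file_number = max(next_file_number, max(orphans) + 1)

    return new_manifest, surviving, next_file_number


# N = 7 was flushed in step 1, but its MANIFEST entry is not visible.
new_manifest, surviving, next_number = recover({3, 5}, {3, 5, 7}, 7)

# A later recovery reads only new_manifest, so nothing it references is missing.
dangling = [n for n in new_manifest if n not in surviving]
```

Because the later recovery reads the new MANIFEST rather than the old one, a delayed append to the old MANIFEST cannot reintroduce a reference to the deleted file in this model.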
    
    Pull Request resolved: https://github.com/facebook/rocksdb/pull/7621
    
    Test Plan:
    1. Some tests are updated because they assume that a new MANIFEST is created after WAL recovery.
    2. A new unit test is added in db_basic_test to simulate step 3.
    
    Reviewed By: riversand963
    
    Differential Revision: D24668144
    
    Pulled By: cheng-chang
    
    fbshipit-source-id: 90d7487fbad2bc3714f5ede46ea949895b15ae3b