1. 04 Feb 2020 (2 commits)
    • Fix a test failure in error_handler_test (#6367) · 7330ec0f
      Committed by anand76
      Summary:
      Fix an intermittent failure in
      DBErrorHandlingTest.CompactionManifestWriteError due to a race between
      background error recovery and the main test thread calling
      TEST_WaitForCompact().
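
      To sketch the kind of synchronization that removes such a race (illustrative only: `RecoveryDoneListener` is a hypothetical helper, while `EventListener::OnErrorRecoveryCompleted` is the real callback), a test can block until background error recovery has finished before calling TEST_WaitForCompact():

      ```cpp
      #include <condition_variable>
      #include <mutex>

      #include "rocksdb/listener.h"

      // Hypothetical test helper: signals when background error recovery
      // completes, so the main test thread can wait for it instead of
      // racing with the recovery thread.
      class RecoveryDoneListener : public rocksdb::EventListener {
       public:
        void OnErrorRecoveryCompleted(rocksdb::Status /*old_bg_error*/) override {
          std::lock_guard<std::mutex> lk(mu_);
          done_ = true;
          cv_.notify_all();
        }
        // Called from the test body before TEST_WaitForCompact().
        void WaitForRecovery() {
          std::unique_lock<std::mutex> lk(mu_);
          cv_.wait(lk, [this] { return done_; });
        }

       private:
        std::mutex mu_;
        std::condition_variable cv_;
        bool done_ = false;
      };
      ```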
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6367
      
      Test Plan: Run the test using gtest_parallel
      
      Differential Revision: D19713802
      
      Pulled By: anand1976
      
      fbshipit-source-id: 29e35dc26e0984fe8334c083e059f4fa1f335d68
    • Error handler test fix (#6266) · eb4d6af5
      Committed by Huisheng Liu
      Summary:
      MultiDBCompactionError fails when it verifies the number of files on level 0 and level 1 without waiting for compaction to finish.
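
      The corrected pattern looks roughly like the sketch below; it assumes the RocksDB gtest harness (a DBTestBase subclass providing `dbfull()` and `NumTableFilesAtLevel()`), and the asserted counts are placeholders, not the test's actual expectations.

      ```cpp
      // Drain background compactions before asserting on per-level state.
      ASSERT_OK(dbfull()->TEST_WaitForCompact());
      // File counts are only stable once compaction has settled.
      ASSERT_EQ(NumTableFilesAtLevel(0), 0);
      ASSERT_GE(NumTableFilesAtLevel(1), 1);
      ```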
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6266
      
      Differential Revision: D19701639
      
      Pulled By: riversand963
      
      fbshipit-source-id: e96d511bcde705075f073e0b550cebcd2ecfccdc
  2. 31 Jan 2020 (1 commit)
    • Force a new manifest file if append to current one fails (#6331) · fb05b5a6
      Committed by anand76
      Summary:
      Fix for issue https://github.com/facebook/rocksdb/issues/6316
      
      When an append/sync of the manifest file fails due to an IO error such
      as NoSpace, we don't always put the DB in read-only mode. This is true
      for flush and compactions, as well as foreground operations such as column
      family add/drop, CompactFiles, etc. Subsequent changes to the DB will be
      recorded in the same manifest file, which would have a corrupted record
      in the middle due to the previous failure. On the next DB::Open(), it will
      fail to process the full manifest and data will be lost.
      
      To fix this, we reset VersionSet::descriptor_log_ on append/sync
      failure, which will force a new manifest file to be written on the next
      append.
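
      A minimal sketch of the idea, for illustration only: `descriptor_log_` mirrors the real VersionSet member, but the surrounding class and method names here are hypothetical simplifications, not the actual RocksDB code.

      ```cpp
      #include <memory>

      // Releasing the manifest writer after a failed append/sync forces
      // the next LogAndApply() to allocate a brand-new MANIFEST (with a
      // fresh file number and a full version snapshot) instead of
      // appending after a corrupted record.
      struct ManifestWriter {
        // Stands in for the log::Writer wrapping the current MANIFEST file.
      };

      class VersionSetSketch {
       public:
        void OnManifestWriteResult(bool io_ok) {
          if (!io_ok) {
            // Next append sees no live writer and starts a new MANIFEST.
            descriptor_log_.reset();
          }
        }

       private:
        std::unique_ptr<ManifestWriter> descriptor_log_;
      };
      ```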
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6331
      
      Test Plan: Add new unit tests in error_handler_test.cc
      
      Differential Revision: D19632951
      
      Pulled By: anand1976
      
      fbshipit-source-id: 68d527cb6e59a94cbbbf9f5a17a7f464381d51e3
  3. 13 Dec 2019 (1 commit)
    • wait pending memtable writes on file ingestion or compact range (#6113) · a8445912
      Committed by Connor
      Summary:
      This PR fixes two unordered_write related issues:
      - an ingestion job may skip a necessary memtable flush (https://github.com/facebook/rocksdb/issues/6026)
      - `CompactRange` may flush the memtable before pending unordered writes have finished:
          1. `CompactRange` triggers a memtable flush but doesn't wait for pending writes
          2. the memtable is flushed while some writes are still pending
          3. the WAL for that memtable is removed (note that the pending writes were recorded in that WAL)
          4. the pending writes land in a newly created memtable
          5. the DB restarts
          6. the previous pending writes are lost: their WAL was removed, but they were never persisted to an SST
      
      **How to solve:**
      - Wait for pending memtable writes before the ingestion job checks the memtable key range (see the sketch below)
      - Wait for pending memtable writes before flushing the memtable
      **Note that `CompactRange` also calls `RangesOverlapWithMemtables` without waiting for pending writes, but I'm not sure whether that affects correctness.**
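
      A minimal, self-contained sketch of the "wait for pending memtable writes" idea follows. `PendingWriteTracker` and its methods are hypothetical; the real change lives inside RocksDB's write path, not in a standalone class.

      ```cpp
      #include <condition_variable>
      #include <mutex>

      // Unordered writers register before touching the memtable, and
      // flush/ingestion block until all of them have landed. This
      // guarantees the flushed memtable is complete, so its WAL can be
      // retired without losing the pending writes described above.
      class PendingWriteTracker {
       public:
        void BeginWrite() {
          std::lock_guard<std::mutex> lk(mu_);
          ++pending_;
        }
        void EndWrite() {
          std::lock_guard<std::mutex> lk(mu_);
          if (--pending_ == 0) {
            cv_.notify_all();
          }
        }
        // Called by flush (or by an ingestion job before it inspects the
        // memtable key range): returns only when no write is in flight.
        void WaitForPendingWrites() {
          std::unique_lock<std::mutex> lk(mu_);
          cv_.wait(lk, [this] { return pending_ == 0; });
        }

       private:
        std::mutex mu_;
        std::condition_variable cv_;
        int pending_ = 0;
      };
      ```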
      
      **Test Plan:**
      make check
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/6113
      
      Differential Revision: D18895674
      
      Pulled By: maysamyabandeh
      
      fbshipit-source-id: da22b4476fc7e06c176020e7cc171eb78189ecaf
  4. 31 May 2019 (1 commit)
  5. 18 Sep 2018 (1 commit)
  6. 16 Sep 2018 (1 commit)
    • Auto recovery from out of space errors (#4164) · a27fce40
      Committed by Anand Ananthabhotla
      Summary:
      This commit implements automatic recovery from a Status::NoSpace() error
      during background operations such as write callback, flush and
      compaction. The broad design is as follows -
      1. Compaction errors are treated as soft errors and don't put the
      database in read-only mode. A compaction is delayed until enough free
      disk space is available to accommodate the compaction outputs, whose
      size is estimated based on the input size. This means that users can
      continue to write, and we rely on the WriteController to delay or stop
      writes if the compaction debt becomes too high due to a persistent
      low-disk-space condition.
      2. Errors during write callback and flush are treated as hard errors,
      i.e. the database is put in read-only mode and goes back to read-write
      mode only after certain recovery actions are taken.
      3. Both types of recovery rely on the SstFileManagerImpl to poll for
      sufficient disk space. We assume that there is a 1-1 mapping between an
      SFM and the underlying OS storage container. For cases where multiple
      DBs are hosted on a single storage container, the user is expected to
      allocate a single SFM instance and use the same one for all the DBs. If
      no SFM is specified by the user, DBImpl::Open() will allocate one, but
      this will be one per DB and each DB will recover independently. The
      recovery implemented by SFM is as follows -
        a) On the first occurrence of an out-of-space error during compaction,
        subsequent compactions will be delayed until the disk free space check
        indicates enough available space. The required space is computed as
        the sum of input sizes.
        b) The free space check requirement will be removed once the amount of
        free space is greater than the space reserved by in-progress
        compactions when the first error occurred
        c) If the out-of-space error is a hard error, a background thread in
        SFM will poll for sufficient headroom before triggering the recovery
        of the database and putting it back in read-write mode. The headroom
        is calculated as the sum of the write_buffer_size of all the DB
        instances associated with the SFM
      4. EventListener callbacks will be called at the start and completion of
      automatic recovery. Users can disable the auto recovery in the start
      callback and later initiate it manually by calling DB::Resume() (see the
      sketch after this list)
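
      To make items 3 and 4 concrete, a hedged usage sketch: one SstFileManager shared by two DBs on the same disk, plus an EventListener observing recovery. `NewSstFileManager`, `Options::sst_file_manager`, `OnErrorRecoveryBegin`, and `OnErrorRecoveryCompleted` match upstream RocksDB headers of this era, but treat exact signatures as approximate; `RecoveryLogger` and the /tmp paths are made up for the example.

      ```cpp
      #include <memory>

      #include "rocksdb/db.h"
      #include "rocksdb/env.h"
      #include "rocksdb/listener.h"
      #include "rocksdb/sst_file_manager.h"

      // Observes the recovery callbacks described in item 4. Setting
      // *auto_recovery = false in OnErrorRecoveryBegin opts out of
      // automatic recovery; the user can then call DB::Resume() manually.
      class RecoveryLogger : public rocksdb::EventListener {
       public:
        void OnErrorRecoveryBegin(rocksdb::BackgroundErrorReason /*reason*/,
                                  rocksdb::Status /*bg_error*/,
                                  bool* auto_recovery) override {
          // Leave *auto_recovery as-is to accept automatic recovery.
          (void)auto_recovery;
        }
        void OnErrorRecoveryCompleted(rocksdb::Status /*old_bg_error*/) override {
          // Recovery finished; the DB is writable again.
        }
      };

      int main() {
        // One SFM per storage device, shared by every DB on that device,
        // so free-space polling sees a single consistent view of the disk.
        std::shared_ptr<rocksdb::SstFileManager> sfm(
            rocksdb::NewSstFileManager(rocksdb::Env::Default()));

        rocksdb::Options options;
        options.create_if_missing = true;
        options.sst_file_manager = sfm;
        options.listeners.emplace_back(std::make_shared<RecoveryLogger>());

        rocksdb::DB* db1 = nullptr;
        rocksdb::DB* db2 = nullptr;
        rocksdb::DB::Open(options, "/tmp/recovery_db1", &db1);
        rocksdb::DB::Open(options, "/tmp/recovery_db2", &db2);  // same SFM

        delete db2;
        delete db1;
        return 0;
      }
      ```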
      
      Todo:
      1. More extensive testing
      2. Add disk full condition to db_stress (follow-on PR)
      Pull Request resolved: https://github.com/facebook/rocksdb/pull/4164
      
      Differential Revision: D9846378
      
      Pulled By: anand1976
      
      fbshipit-source-id: 80ea875dbd7f00205e19c82215ff6e37da10da4a
  7. 29 Jun 2018 (1 commit)
    • Allow DB resume after background errors (#3997) · 52d4c9b7
      Committed by Anand Ananthabhotla
      Summary:
      Currently, if RocksDB encounters an error during a write operation (user-requested or background), it sets DBImpl::bg_error_ and fails all subsequent writes. This PR allows the DB to be resumed for certain classes of errors. It consists of 3 parts -
      1. Introduce Status::Severity in rocksdb::Status to indicate whether a given error can be recovered from or not
      2. Refactor the error handling code so that setting bg_error_ and deciding on severity is in one place
      3. Provide an API for the user to clear the error and resume the DB instance
      
      This whole change is broken up into multiple PRs. Initially, we only allow clearing the error for Status::NoSpace() errors during background flush/compaction. Subsequent PRs will expand this to include more errors and foreground operations such as Put(), and implement a polling mechanism for out-of-space errors.
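
      A short sketch of the user-facing flow, assuming an already-open `rocksdb::DB*`. The helper name `PutWithResume` and the single-retry policy are illustrative; `DB::Resume()` and `Status::IsNoSpace()` are the real APIs described above.

      ```cpp
      #include "rocksdb/db.h"

      // Retry a write after manually resuming from a recoverable
      // background error. Assumes disk space has been freed out of band
      // before Resume() is called.
      rocksdb::Status PutWithResume(rocksdb::DB* db, const rocksdb::Slice& key,
                                    const rocksdb::Slice& value) {
        rocksdb::Status s = db->Put(rocksdb::WriteOptions(), key, value);
        if (s.IsNoSpace()) {
          // Ask the DB to clear bg_error_ and leave read-only mode;
          // Resume() fails if the background error was not recoverable.
          rocksdb::Status resume = db->Resume();
          if (resume.ok()) {
            s = db->Put(rocksdb::WriteOptions(), key, value);  // retry once
          }
        }
        return s;
      }
      ```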
      Closes https://github.com/facebook/rocksdb/pull/3997
      
      Differential Revision: D8653831
      
      Pulled By: anand1976
      
      fbshipit-source-id: 6dc835c76122443a7668497c0226b4f072bc6afd