1. 31 Oct, 2022 1 commit
  2. 04 May, 2022 5 commits
  3. 13 Apr, 2022 1 commit
  4. 20 Mar, 2022 1 commit
  5. 23 Oct, 2021 3 commits
  6. 15 Oct, 2021 1 commit
    • xfs: port the defer ops capture and continue to resource capture · 512edfac
      Committed by Darrick J. Wong
      When log recovery tries to recover a transaction that had log intent
      items attached to it, it has to save certain parts of the transaction
      state (reservation, dfops chain, inodes with no automatic unlock) so
      that it can finish single-stepping the recovered transactions before
      finishing the chains.
      
      This is done with the xfs_defer_ops_capture and xfs_defer_ops_continue
      functions.  Right now they open-code this functionality, so let's port
      this to the formalized resource capture structure that we introduced in
      the previous patch.  This enables us to hold up to two inodes and two
      buffers during log recovery, the same way we do for regular runtime.
      
      With this patch applied, we'll be ready to support atomic extent swap
      swap, which holds two inodes, and logged xattrs, which hold one inode
      and one xattr leaf buffer.
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
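The capture-and-continue idea in the commit above can be illustrated with a toy model. This is a minimal sketch in Python, not kernel code: the `Transaction` and `Capture` classes and the `capture` and `resume` functions are hypothetical names invented for this illustration, and only the limit of two inodes and two buffers comes from the commit message.

```python
from dataclasses import dataclass, field

MAX_HELD_INODES = 2   # limit stated in the commit message
MAX_HELD_BUFFERS = 2  # limit stated in the commit message


@dataclass
class Transaction:
    """Hypothetical stand-in for a transaction; not the kernel structure."""
    block_reservation: int = 0
    dfops: list = field(default_factory=list)
    held_inodes: list = field(default_factory=list)
    held_buffers: list = field(default_factory=list)


@dataclass
class Capture:
    """State saved while the remaining recovered transactions are stepped."""
    block_reservation: int
    dfops: list
    held_inodes: list
    held_buffers: list


def capture(tp):
    """Move the reservation, dfops chain and held resources out of tp."""
    if len(tp.held_inodes) > MAX_HELD_INODES:
        raise ValueError("too many held inodes")
    if len(tp.held_buffers) > MAX_HELD_BUFFERS:
        raise ValueError("too many held buffers")
    cap = Capture(tp.block_reservation, tp.dfops,
                  tp.held_inodes, tp.held_buffers)
    # The original transaction no longer owns any of the captured state.
    tp.block_reservation = 0
    tp.dfops, tp.held_inodes, tp.held_buffers = [], [], []
    return cap


def resume(cap):
    """Build a fresh transaction that owns everything the capture saved."""
    return Transaction(cap.block_reservation, cap.dfops,
                       cap.held_inodes, cap.held_buffers)
```

In this model an operation that holds two inodes is captured once and later continued in a new transaction with the same reservation and chain, while a third held inode is rejected up front.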
  7. 10 Aug, 2021 2 commits
  8. 09 Apr, 2021 1 commit
  9. 23 Jan, 2021 2 commits
  10. 10 Dec, 2020 4 commits
  11. 07 Oct, 2020 5 commits
    • xfs: periodically relog deferred intent items · 4e919af7
      Committed by Darrick J. Wong
      There's a subtle design flaw in the deferred log item code that can lead
      to pinning the log tail.  Taking up the defer ops chain examples from
      the previous commit, we can get trapped in sequences like this:
      
      Caller hands us a transaction t0 with D0-D3 attached.  The defer ops
      chain will look like the following if the transaction rolls succeed:
      
      t1: D0(t0), D1(t0), D2(t0), D3(t0)
      t2: d4(t1), d5(t1), D1(t0), D2(t0), D3(t0)
      t3: d5(t1), D1(t0), D2(t0), D3(t0)
      ...
      t9: d9(t7), D3(t0)
      t10: D3(t0)
      t11: d10(t10), d11(t10)
      t12: d11(t10)
      
      In transaction 9, we finish d9 and try to roll to t10 while holding onto
      an intent item for D3 that we logged in t0.
      
      The previous commit changed the order in which we place new defer ops in
      the defer ops processing chain to reduce the maximum chain length.  Now
      make xfs_defer_finish_noroll capable of relogging the entire chain
      periodically so that we can always move the log tail forward.  Most
      chains will never get relogged, except for operations that generate very
      long chains (large extents containing many blocks with different sharing
      levels) or are on filesystems with small logs and a lot of ongoing
      metadata updates.
      
      Callers are now required to ensure that the transaction reservation is
      large enough to handle logging done items and new intent items for the
      maximum possible chain length.  Most callers are careful to keep the
      chain lengths low, so the overhead should be minimal.
      
      The decision to relog an intent item is made based on whether the intent
      was logged in a previous checkpoint, since there's no point in relogging
      an intent into the same checkpoint.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
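The relogging rule described above (relog an intent only if it was last logged in an earlier checkpoint) can be sketched as a toy model. This is an illustrative Python sketch under stated assumptions, not the kernel implementation: the `Intent` class, the integer checkpoint numbers, and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Intent:
    """Hypothetical pending intent item in a defer ops chain."""
    name: str
    logged_in_checkpoint: int  # checkpoint in which it was last logged


def log_tail(pending, current_checkpoint):
    """Oldest checkpoint still pinned by a pending intent in this model."""
    return min((i.logged_in_checkpoint for i in pending),
               default=current_checkpoint)


def relog_stale_intents(pending, current_checkpoint):
    """Relog every intent that was last logged in an older checkpoint.

    Intents already in the current checkpoint are skipped, since
    relogging them would not move the tail. Returns the relogged names.
    """
    relogged = []
    for intent in pending:
        if intent.logged_in_checkpoint < current_checkpoint:
            intent.logged_in_checkpoint = current_checkpoint
            relogged.append(intent.name)
    return relogged
```

With a chain such as `[Intent("d9", 7), Intent("D3", 0)]` and a current checkpoint of 9, the modelled tail is pinned at 0 by `D3` until the relog pass runs; a second pass in the same checkpoint relogs nothing.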
    • xfs: fix an incore inode UAF in xfs_bui_recover · ff4ab5e0
      Committed by Darrick J. Wong
      In xfs_bui_item_recover, there exists a use-after-free bug with regard
      to the inode that is involved in the bmap replay operation.  If the
      mapping operation does not complete, we call xfs_bmap_unmap_extent to
      create a deferred op to finish the unmapping work, and we retain a
      pointer to the incore inode.
      
      Unfortunately, the very next thing we do is commit the transaction and
      drop the inode.  If reclaim tears down the inode before we try to finish
      the defer ops, we dereference garbage and blow up.  Therefore, create a
      way to join inodes to the defer ops freezer so that we can maintain the
      xfs_inode reference until we're done with the inode.
      
      Note: This imposes the requirement that there be enough memory to keep
      every incore inode in memory throughout recovery.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
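The lifetime problem described above can be modelled with a simple reference count. The following is a toy Python sketch, not kernel code: the `Inode` and `Freezer` classes and their methods are hypothetical, and reclaim is reduced to a flag that is set when the last reference is dropped.

```python
class Inode:
    """Toy in-core inode with a reference count."""

    def __init__(self, ino):
        self.ino = ino
        self.refcount = 1       # reference owned by whoever looked it up
        self.reclaimed = False

    def grab(self):
        if self.reclaimed:
            raise RuntimeError("use after free")
        self.refcount += 1
        return self

    def release(self):
        self.refcount -= 1
        if self.refcount == 0:
            self.reclaimed = True   # reclaim may tear the inode down now


class Freezer:
    """Hypothetical holder for deferred ops and the inodes they need."""

    def __init__(self, dfops):
        self.dfops = dfops          # list of (operation, inode) pairs
        self.inodes = []

    def join_inode(self, ip):
        # Take an extra reference so the inode outlives the caller's release.
        self.inodes.append(ip.grab())

    def finish(self):
        touched = []
        for op, ip in self.dfops:
            if ip.reclaimed:
                raise RuntimeError("use after free")
            touched.append((op, ip.ino))
        # Drop the extra references only after all deferred work is done.
        for ip in self.inodes:
            ip.release()
        self.inodes = []
        return touched
```

In this model, finishing deferred work after the caller has released an unjoined inode raises an error, whereas a joined inode stays alive until `finish` returns.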
    • xfs: clean up xfs_bui_item_recover iget/trans_alloc/ilock ordering · 64a3f331
      Committed by Darrick J. Wong
      In most places in XFS, we have a specific order in which we gather
      resources: grab the inode, allocate a transaction, then lock the inode.
      xfs_bui_item_recover doesn't do it in that order, so fix it to be more
      consistent.  This also makes the error bailout code a bit less weird.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
    • xfs: clean up bmap intent item recovery checking · 919522e8
      Committed by Darrick J. Wong
      The bmap intent item checking code in xfs_bui_item_recover is spread all
      over the function.  We should check the recovered log item at the top
      before we allocate any resources or do anything else, so do that.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
    • xfs: proper replay of deferred ops queued during log recovery · e6fff81e
      Committed by Darrick J. Wong
      When we replay unfinished intent items that have been recovered from the
      log, it's possible that the replay will cause the creation of more
      deferred work items.  As outlined in commit 50995582 ("xfs: log
      recovery should replay deferred ops in order"), later work items have an
      implicit ordering dependency on earlier work items.  Therefore, recovery
      must replay the items (both recovered and created) in the same order
      that they would have been during normal operation.
      
      For log recovery, we enforce this ordering by using an empty transaction
      to collect deferred ops that get created in the process of recovering a
      log intent item to prevent them from being committed before the rest of
      the recovered intent items.  After we finish committing all the
      recovered log items, we allocate a transaction with an enormous block
      reservation, splice our huge list of created deferred ops into that
      transaction, and commit it, thereby finishing all those ops.
      
      This is /really/ hokey -- it's the one place in XFS where we allow
      nested transactions; the splicing of the defer ops list is inelegant
      and has to be done twice per recovery function; and the broken way we
      handle inode pointers and block reservations causes subtle use-after-free
      and allocator problems that will be fixed by this patch and the two
      patches after it.
      
      Therefore, replace the hokey empty transaction with a structure designed
      to capture each chain of deferred ops that are created as part of
      recovering a single unfinished log intent.  Finally, refactor the loop
      that replays those chains to do so using one transaction per chain.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
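The two-phase replay described above (capture each chain created while recovering an intent, then finish the chains one transaction at a time) can be sketched as a toy model. This Python sketch is illustrative only: the function names and the `replay` and `finish_op` callbacks are hypothetical, and placing follow-on work at the front of its own chain is an assumption of the model.

```python
from collections import deque


def recover_intents(recovered_intents, replay):
    """Phase 1: replay each recovered intent in log order.

    `replay(intent)` returns the deferred ops created by the replay.
    Each non-empty result is captured as its own chain instead of
    being finished immediately.
    """
    captured = deque()
    for intent in recovered_intents:
        new_ops = replay(intent)
        if new_ops:
            captured.append(list(new_ops))
    return captured


def finish_captured_chains(captured, finish_op):
    """Phase 2: finish each captured chain in one modelled transaction.

    `finish_op(op)` returns any follow-on ops; the model queues them at
    the front of the same chain. Returns one list of finished ops per
    transaction, in the order the chains were captured.
    """
    transactions = []
    while captured:
        chain = deque(captured.popleft())
        done = []
        while chain:
            op = chain.popleft()
            done.append(op)
            # extendleft reverses its input, so reverse first to keep order.
            chain.extendleft(reversed(finish_op(op)))
        transactions.append(done)
    return transactions
```

Because chains are captured and finished in the order their intents were recovered, work created during recovery never runs ahead of an intent that was logged earlier.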
  12. 23 Sep, 2020 3 commits
  13. 29 Jul, 2020 1 commit
  14. 08 May, 2020 8 commits
  15. 07 May, 2020 1 commit
  16. 05 May, 2020 1 commit