    iomap: don't invalidate folios after writeback errors · e9c3a8e8
    Committed by Darrick J. Wong
    XFS has the unique behavior (as compared to the other Linux filesystems)
    that on writeback errors it will completely invalidate the affected
    folio and force the page cache to reread the contents from disk.  All
    other filesystems leave the page mapped and up to date.
    
    This is a rude awakening for user programs, since (in the case where
    write fails but reread doesn't) file contents will appear to revert to
    old disk contents with no notification other than an EIO on fsync.  This
    might have been annoying back in the days when iomap dealt with one page
    at a time, but with multipage folios, we can now throw away *megabytes*
    worth of data for a single write error.
    
    On *most* Linux filesystems, a program can respond to an EIO on write by
    redirtying the entire file and scheduling it for writeback.  This isn't
    foolproof, since the page that failed writeback is no longer dirty and
    could be evicted, but programs that want to recover properly *also*
    have to detect XFS and regenerate every write they've made to the file.
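
    As a rough illustration of the recovery pattern described above, here is
    a minimal userspace sketch. It assumes the program still holds its own
    copy of the data, so "redirtying" is modeled by rewriting that copy and
    calling fsync again. The helper names write_durably and pwrite_all and
    the retry limit are hypothetical and are not part of this patch.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Write all of buf at offset off. Returns 0 on success, -1 with errno set. */
static int pwrite_all(int fd, const char *buf, size_t len, off_t off)
{
	while (len > 0) {
		ssize_t n = pwrite(fd, buf, len, off);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, try again */
			return -1;
		}
		if (n == 0) {
			errno = EIO;		/* no progress, give up */
			return -1;
		}
		buf += n;
		len -= (size_t)n;
		off += n;
	}
	return 0;
}

/*
 * Hypothetical helper: write buf at offset 0 and fsync it. If fsync
 * reports EIO, rewrite the data from the caller's copy (which dirties
 * the page cache again) and retry, up to max_retries extra attempts.
 */
static int write_durably(int fd, const char *buf, size_t len, int max_retries)
{
	for (int attempt = 0; attempt <= max_retries; attempt++) {
		if (pwrite_all(fd, buf, len, 0) < 0)
			return -1;
		if (fsync(fd) == 0)
			return 0;	/* data reached stable storage */
		if (errno != EIO)
			return -1;	/* not a writeback error */
		/* EIO: loop around and rewrite from our own copy. */
	}
	errno = EIO;
	return -1;
}
```

    Rewriting from a private copy does not depend on what the page cache
    holds after the failure, so this sketch behaves the same whether or not
    the filesystem invalidated the folio. A program that relies on the
    cached contents instead would see the reverted data described above.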
    
    When running xfs/314 on arm64, I noticed a UAF when xfs_discard_folio
    invalidates multipage folios that could be undergoing writeback.  If,
    say, we have a 256K folio caching a mix of written and unwritten
    extents, it's possible that we could start writeback of the first (say)
    64K of the folio and then hit a writeback error on the next 64K.  We
    then free the iop attached to the folio, which is really bad because
    writeback completion on the first 64K will trip over the "blocks per
    folio > 1 && !iop" assertion.
    
    This can't be fixed by only invalidating the folio if writeback fails at
    the start of the folio, since the folio is marked !uptodate, which trips
    other assertions elsewhere.  Get rid of the whole behavior entirely.
    
    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>