    xfs: use iomap new flag for newly allocated delalloc blocks · f65e6fad
    Committed by Brian Foster
    Commit fa7f138a ("xfs: clear delalloc and cache on buffered write
    failure") fixed one regression in the iomap error handling code and
    exposed another. The fundamental problem is that if a buffered write
    is a rewrite of preexisting delalloc blocks and the write fails, the
    failure handling code can punch out preexisting blocks with valid
    file data.
    
    This was reproduced directly by sub-block writes in the LTP
    kernel/syscalls/write/write03 test. A first 100 byte write allocates
    a single block in a file. A subsequent 100 byte write fails and
    punches out the block, including the data successfully written by
    the previous write.
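
The scenario above can be approximated from userspace. This is a minimal sketch, not the LTP test itself: the scratch path, the function name, and the use of an unreadable `mmap()` page to force the second write to fail with `EFAULT` are all assumptions made for illustration.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper. Returns 0 if the data from the first write
 * survives a failed second write, 1 if it was lost or changed, and
 * -1 if the scenario could not be set up.
 */
int failed_rewrite_preserves_data(void)
{
	char path[] = "/tmp/rewrite-XXXXXX";	/* assumed scratch location */
	char wbuf[100], rbuf[100];
	void *bad;
	int fd, ret = -1;

	memset(wbuf, 'w', sizeof(wbuf));
	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);

	/* An unreadable page: copying from it makes write(2) fail. */
	bad = mmap(NULL, 4096, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (bad == MAP_FAILED)
		goto out;

	/* First sub-block write: allocates the block and stores data. */
	if (write(fd, wbuf, sizeof(wbuf)) != (ssize_t)sizeof(wbuf))
		goto out_unmap;

	/* Second sub-block write into the same block: expected to fail. */
	if (write(fd, bad, 100) != -1 || errno != EFAULT)
		goto out_unmap;

	/* The first 100 bytes must still read back unchanged. */
	memset(rbuf, 0, sizeof(rbuf));
	if (pread(fd, rbuf, sizeof(rbuf), 0) != (ssize_t)sizeof(rbuf))
		ret = 1;
	else
		ret = memcmp(wbuf, rbuf, sizeof(wbuf)) ? 1 : 0;

out_unmap:
	munmap(bad, 4096);
out:
	close(fd);
	return ret;
}
```

On a filesystem affected by the problem described here, the read-back would be expected to return zeros instead of the original bytes, so the helper would return 1.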
    
    To address this problem, update the ->iomap_begin() handler to
    distinguish newly allocated delalloc blocks from preexisting
    delalloc blocks via the IOMAP_F_NEW flag. Use this flag in the
    ->iomap_end() handler to decide when a failed or short write should
    punch out delalloc blocks.
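
The decision made in ->iomap_end() can be modeled in plain C. This is a toy sketch under stated assumptions, not the kernel code: `toy_iomap`, `TOY_IOMAP_F_NEW`, `toy_punch_range()` and the block rounding rules are hypothetical stand-ins chosen to show how the flag gates the punch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for IOMAP_F_NEW; the value is illustrative only. */
#define TOY_IOMAP_F_NEW 0x01u

/* Hypothetical, stripped-down mapping descriptor. */
struct toy_iomap {
	unsigned int flags;
};

static uint64_t blocks_round_up(uint64_t bytes, uint64_t bs)
{
	return (bytes + bs - 1) / bs;
}

static uint64_t blocks_round_down(uint64_t bytes, uint64_t bs)
{
	return bytes / bs;
}

/*
 * After a write of `length` bytes at `offset` that only managed to
 * write `written` bytes, compute the block range [*start, *end) that
 * should be punched out. Returns false when nothing may be punched:
 * either the write covered every block, or the blocks were found
 * rather than newly allocated by this write.
 */
bool toy_punch_range(const struct toy_iomap *iomap, uint64_t offset,
		     uint64_t length, uint64_t written, uint64_t block_size,
		     uint64_t *start, uint64_t *end)
{
	/* Assumed rounding: keep any block that received some data. */
	uint64_t s = written ? blocks_round_up(offset + written, block_size)
			     : blocks_round_down(offset, block_size);
	uint64_t e = blocks_round_up(offset + length, block_size);

	if (!(iomap->flags & TOY_IOMAP_F_NEW) || s >= e)
		return false;

	*start = s;
	*end = e;
	return true;
}
```

With a 4096-byte block size, a failed 100-byte write at offset 100 into a mapping without the flag yields no punch range, which is the rewrite case the commit protects. The same failed write into a mapping carrying the flag yields block range [0, 1).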
    
    This introduces the subtle requirement that ->iomap_begin() should
    never combine newly allocated delalloc blocks with existing blocks
    in the resulting iomap descriptor. This can occur when a new
    delalloc reservation merges with a neighboring extent that is part
    of the current write, for example. Therefore, drop the
    post-allocation extent lookup from xfs_bmapi_reserve_delalloc() and
    just return the record inserted into the fork. This ensures only new
    blocks are returned and thus that preexisting delalloc blocks are
    always handled as "found" blocks and not punched out on a failed
    rewrite.
    Reported-by: Xiong Zhou <xzhou@redhat.com>
    Signed-off-by: Brian Foster <bfoster@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
    Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
xfs_iomap.c 32.4 KB