1. 06 6月, 2018 7 次提交
    • D
      xfs: validate btree records on retrieval · 9e6c08d4
      Dave Chinner 提交于
      So we don't check the validity of records as we walk the btree. When
      there are corrupt records in the free space btree (e.g. zero
      startblock/length or beyond EOAG) we just blindly use it and things
      go bad from there. That leads to assert failures on debug kernels
      like this:
      
      XFS: Assertion failed: fs_is_ok, file: fs/xfs/libxfs/xfs_alloc.c, line: 450
      ....
      Call Trace:
       xfs_alloc_fixup_trees+0x368/0x5c0
       xfs_alloc_ag_vextent_near+0x79a/0xe20
       xfs_alloc_ag_vextent+0x1d3/0x330
       xfs_alloc_vextent+0x5e9/0x870
      
      Or crashes like this:
      
      XFS (loop0): xfs_buf_find: daddr 0x7fb28 out of range, EOFS 0x8000
      .....
      BUG: unable to handle kernel NULL pointer dereference at 00000000000000c8
      ....
      Call Trace:
       xfs_bmap_add_extent_hole_real+0x67d/0x930
       xfs_bmapi_write+0x934/0xc90
       xfs_da_grow_inode_int+0x27e/0x2f0
       xfs_dir2_grow_inode+0x55/0x130
       xfs_dir2_sf_to_block+0x94/0x5d0
       xfs_dir2_sf_addname+0xd0/0x590
       xfs_dir_createname+0x168/0x1a0
       xfs_rename+0x658/0x9b0
      
      By checking that free space records pulled from the trees are
      within the valid range, we catch many of these corruptions before
      they can do damage.
      
      This is a generic btree record checking deficiency. We need to
      validate the records we fetch from all the different btrees before
      we use them to catch corruptions like this.
      
      This patch results in a corrupt record emitting an error message and
      returning -EFSCORRUPTED, and the higher layers catch that and abort:
      
       XFS (loop0): Size Freespace BTree record corruption in AG 0 detected!
       XFS (loop0): start block 0x0 block count 0x0
       XFS (loop0): Internal error xfs_trans_cancel at line 1012 of file fs/xfs/xfs_trans.c.  Caller xfs_create+0x42a/0x670
       .....
       Call Trace:
        dump_stack+0x85/0xcb
        xfs_trans_cancel+0x19f/0x1c0
        xfs_create+0x42a/0x670
        xfs_generic_create+0x1f6/0x2c0
        vfs_create+0xf9/0x180
        do_mknodat+0x1f9/0x210
        do_syscall_64+0x5a/0x180
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      .....
       XFS (loop0): xfs_do_force_shutdown(0x8) called from line 1013 of file fs/xfs/xfs_trans.c.  Return address = ffffffff81500868
       XFS (loop0): Corruption of in-memory data detected.  Shutting down filesystem
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      9e6c08d4
    • D
      xfs: push corruption -> ESTALE conversion to xfs_nfs_get_inode() · 29cad0b3
      Dave Chinner 提交于
      In xfs_imap_to_bp(), we convert a -EFSCORRUPTED error to -EINVAL if
      we are doing an untrusted lookup. This is done because we need
      failed filehandle lookups to report -ESTALE to the caller, and it
      does this by converting -EINVAL and -ENOENT errors to -ESTALE.
      
      The squashing of EFSCORRUPTED in imap_to_bp makes it impossible for
      for xfs_iget(UNTRUSTED) callers to determine the difference between
      "inode does not exist" and "corruption detected during lookup". We
      realy need that distinction in places calling xfS_iget(UNTRUSTED),
      so move the filehandle error case handling all the way out to
      xfs_nfs_get_inode() where it is needed.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      29cad0b3
    • D
      xfs: verify root inode more thoroughly · 541b5acc
      Dave Chinner 提交于
      When looking up the root inode at mount time, we don't actually do
      any verification to check that the inode is allocated and accounted
      for correctly in the INOBT. Make the checks on the root inode more
      robust by making it an untrusted lookup. This forces the inode
      lookup to use the inode btree to verify the inode is allocated
      and mapped correctly to disk. This will also have the effect of
      catching a significant number of AGI/INOBT related corruptions in
      AG 0 at mount time.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      541b5acc
    • D
      xfs: verify COW extent size hint is valid in inode verifier · 02a0fda8
      Dave Chinner 提交于
      There are rules for vald extent size hints. We enforce them when
      applications set them, but fuzzers violate those rules and that
      screws us over. Validate COW extent size hint rules in the inode
      verifier to catch this.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      02a0fda8
    • D
      xfs: verify extent size hint is valid in inode verifier · 7d71a671
      Dave Chinner 提交于
      There are rules for vald extent size hints. We enforce them when
      applications set them, but fuzzers violate those rules and that
      screws us over.
      
      This results in alignment assertion failures when setting up
      allocations such as this in direct IO:
      
      XFS: Assertion failed: ap->length, file: fs/xfs/libxfs/xfs_bmap.c, line: 3432
      ....
      Call Trace:
       xfs_bmap_btalloc+0x415/0x910
       xfs_bmapi_write+0x71c/0x12e0
       xfs_iomap_write_direct+0x2a9/0x420
       xfs_file_iomap_begin+0x4dc/0xa70
       iomap_apply+0x43/0x100
       iomap_file_buffered_write+0x62/0x90
       xfs_file_buffered_aio_write+0xba/0x300
       __vfs_write+0xd5/0x150
       vfs_write+0xb6/0x180
       ksys_write+0x45/0xa0
       do_syscall_64+0x5a/0x180
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      And from xfs_db:
      
      core.extsize = 10380288
      
      Which is not an integer multiple of the block size, and so violates
      Rule #7 for setting extent size hints. Validate extent size hint
      rules in the inode verifier to catch this.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      7d71a671
    • D
      xfs: catch bad stripe alignment configurations · fa4ca9c5
      Dave Chinner 提交于
      When stripe alignments are invalid, data alignment algorithms in the
      allocator may not work correctly. Ensure we catch superblocks with
      invalid stripe alignment setups at mount time. These data alignment
      mismatches are now detected at mount time like this:
      
      XFS (loop0): SB stripe unit sanity check failed
      XFS (loop0): Metadata corruption detected at xfs_sb_read_verify+0xab/0x110, xfs_sb block 0xffffffffffffffff
      XFS (loop0): Unmount and run xfs_repair
      XFS (loop0): First 128 bytes of corrupted metadata buffer:
      0000000091c2de02: 58 46 53 42 00 00 10 00 00 00 00 00 00 00 10 00  XFSB............
      0000000023bff869: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      00000000cdd8c893: 17 32 37 15 ff ca 46 3d 9a 17 d3 33 04 b5 f1 a2  .27...F=...3....
      000000009fd2844f: 00 00 00 00 00 00 00 04 00 00 00 00 00 00 06 d0  ................
      0000000088e9b0bb: 00 00 00 00 00 00 06 d1 00 00 00 00 00 00 06 d2  ................
      00000000ff233a20: 00 00 00 01 00 00 10 00 00 00 00 01 00 00 00 00  ................
      000000009db0ac8b: 00 00 03 60 e1 34 02 00 08 00 00 02 00 00 00 00  ...`.4..........
      00000000f7022460: 00 00 00 00 00 00 00 00 0c 09 0b 01 0c 00 00 19  ................
      XFS (loop0): SB validate failed with error -117.
      
      And the mount fails.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      fa4ca9c5
    • D
      iomap: fsync swap files before iterating mappings · 117a148f
      Darrick J. Wong 提交于
      Swap files require that all the file mapping metadata be stable on disk.
      It is insufficient to flush dirty pages in the page cache because that
      won't necessarily result in filesystems pushing all their metadata out
      to disk.  Therefore, call fsync from iomap_swapfile_activate.
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NJan Kara <jack@suse.cz>
      117a148f
  2. 05 6月, 2018 14 次提交
  3. 04 6月, 2018 1 次提交
  4. 02 6月, 2018 18 次提交