1. 21 Feb, 2019 2 commits
    • xfs: fix SEEK_DATA for speculative COW fork preallocation · 60271ab7
      Committed by Christoph Hellwig
      We speculatively allocate extents in the COW fork to reduce
      fragmentation.  But when we write data into such COW fork blocks that
      do not shadow an allocation in the data fork, SEEK_DATA will not
      correctly report it, as it only looks at the data fork extents.
      The only reason why that hasn't been an issue so far is that
      we don't even use these speculative COW fork preallocations over holes
      in the data fork at all for buffered writes, and blocks in the COW
      fork that are written by direct writes are moved into the data
      fork immediately at I/O completion time.
      
      Add a new set of iomap_ops for SEEK_HOLE/SEEK_DATA which looks into
      both the COW and data fork, and reports all COW extents as unwritten
      to the iomap layer.  While this isn't strictly true for COW fork
      extents that were already converted to real extents, the practical
      semantics that you can't read data from them until they are moved
      into the data fork are very similar, and this will force the iomap
      layer into probing the extents for actually present data.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
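The lookup described in the commit message above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: `struct extent`, `struct mapping`, `enum map_type`, `lookup` and `seek_map` are hypothetical names, extents are plain sorted arrays, and every data fork extent is treated as readable data. What it shows is the ordering of the checks: the data fork wins when it covers the offset, a COW extent over a data fork hole is reported as unwritten (capped at the next data extent), and otherwise a hole is reported up to whichever extent comes next.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the mapping types reported to the seek code. */
enum map_type { MAP_HOLE, MAP_MAPPED, MAP_UNWRITTEN };

/* One extent of a fork, in file blocks; arrays are sorted and non-overlapping. */
struct extent { long start; long len; };

struct mapping { enum map_type type; long start; long len; };

/* Return the first extent that ends after 'off', or NULL if there is none. */
static const struct extent *lookup(const struct extent *v, size_t n, long off)
{
	for (size_t i = 0; i < n; i++)
		if (v[i].start + v[i].len > off)
			return &v[i];
	return NULL;
}

static long min_long(long a, long b) { return a < b ? a : b; }

/* Describe what a seek starting at 'off' (bounded by 'end') finds first. */
static struct mapping seek_map(const struct extent *data, size_t nd,
			       const struct extent *cow, size_t nc,
			       long off, long end)
{
	const struct extent *d = lookup(data, nd, off);
	const struct extent *c = lookup(cow, nc, off);
	long next = end;	/* end of the range being described */
	struct mapping m = { MAP_HOLE, off, 0 };

	assert(off < end);

	if (d && d->start <= off) {
		/* The data fork covers 'off': report it directly. */
		m.type = MAP_MAPPED;
		next = min_long(next, d->start + d->len);
	} else {
		/* Hole in the data fork, capped at the next data extent. */
		if (d)
			next = min_long(next, d->start);
		if (c && c->start <= off) {
			/* COW blocks over the hole: unwritten, so the caller probes. */
			m.type = MAP_UNWRITTEN;
			next = min_long(next, c->start + c->len);
		} else if (c) {
			/* Still a hole, capped at the next COW extent. */
			next = min_long(next, c->start);
		}
	}
	m.len = next - off;
	return m;
}
```

Reporting the COW range as unwritten rather than mapped is the point of the model: an unwritten mapping makes the caller check whether data is actually present instead of trusting the extent map.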
    • xfs: make xfs_bmbt_to_iomap more useful · 16be1433
      Committed by Christoph Hellwig
      Move checking for invalid zero blocks and setting of various iomap flags
      into this helper.  Also make it deal with "raw" delalloc extents to
      avoid clutter in the callers.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
  2. 19 Feb, 2019 1 commit
  3. 18 Feb, 2019 10 commits
  4. 15 Feb, 2019 4 commits
    • xfs: don't ever put nlink > 0 inodes on the unlinked list · c4a6bf7f
      Committed by Darrick J. Wong
      When XFS creates an O_TMPFILE file, the inode is created with nlink = 1,
      put on the unlinked list, and then the VFS sets nlink = 0 in d_tmpfile.
      If we crash before anything logs the inode (it's dirty incore but the
      vfs doesn't tell us it's dirty so we never log that change), the iunlink
      processing part of recovery will then explode with a pile of:
      
      XFS: Assertion failed: VFS_I(ip)->i_nlink == 0, file:
      fs/xfs/xfs_log_recover.c, line: 5072
      
      Worse yet, since nlink is nonzero, the inodes also don't get cleaned up
      and they just leak until the next xfs_repair run.
      
      Therefore, change xfs_iunlink to require that inodes being put on the
      unlinked list have nlink == 0, change the tmpfile callers to instantiate
      inodes that way, and set the nlink to 1 just prior to calling d_tmpfile.
      Fix the comment for xfs_iunlink while we're at it.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
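The ordering that the commit message above describes can be modelled in a few lines of user-space C. This is a sketch of the invariant only, not the XFS code: `struct inode_model` and the `model_*` functions are hypothetical, and `logged_nlink` is an assumed stand-in for the link count that log recovery would see after a crash.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the inode state relevant to the unlinked list. */
struct inode_model {
	unsigned int nlink;		/* in-core link count */
	unsigned int logged_nlink;	/* link count captured when the inode is logged */
	bool on_unlinked_list;
};

/* Models the new rule: only nlink == 0 inodes may join the unlinked list. */
static void model_iunlink(struct inode_model *ip)
{
	assert(ip->nlink == 0);
	ip->on_unlinked_list = true;
	ip->logged_nlink = ip->nlink;	/* what recovery would find after a crash */
}

/* Models d_tmpfile, which drops the link count from 1 to 0. */
static void model_d_tmpfile(struct inode_model *ip)
{
	assert(ip->nlink == 1);
	ip->nlink--;
}

/* Models the tmpfile path after the fix. */
static struct inode_model model_create_tmpfile(void)
{
	struct inode_model ip = { 0, 0, false };

	model_iunlink(&ip);	/* goes on the list with nlink == 0 */
	ip.nlink = 1;		/* bumped just before handing off to the VFS */
	model_d_tmpfile(&ip);	/* the VFS brings it back to 0 */
	return ip;
}
```

In this model the logged state never carries a nonzero link count for an inode on the unlinked list, which is the condition the recovery assertion quoted above checks.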
    • xfs: reserve blocks for ifree transaction during log recovery · 15a268d9
      Committed by Darrick J. Wong
      Log recovery frees all the inodes stored in the unlinked list, which can
      cause expansion of the free inode btree.  The ifree code skips block
      reservations if it thinks there's a per-AG space reservation, but we
      don't set up the reservation until after log recovery, which means that
      a finobt expansion blows up in xfs_trans_mod_sb when we exceed the
      transaction's block reservation.
      
      To fix this, we set the "no finobt reservation" flag to true when we
      create the xfs_mount and only set it to false if we confirm that every
      AG had enough free space to put aside for the finobt.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
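The flag handling described above amounts to a pessimistic default: assume there is no reservation until every AG has confirmed one. A minimal sketch, assuming a hypothetical `struct mount_model` with a single `finobt_nores` field and hypothetical helper names (none of these are the kernel's functions):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the mount-wide "no finobt reservation" flag. */
struct mount_model {
	bool finobt_nores;
};

/* At mount creation, assume no per-AG reservation exists yet. */
static void model_mount_init(struct mount_model *mp)
{
	mp->finobt_nores = true;
}

/*
 * Runs after log recovery. ag_ok[i] says whether AG i had enough free
 * space to set aside for the finobt. The flag is cleared only if all did.
 */
static void model_reserve_all(struct mount_model *mp, const bool *ag_ok, size_t n)
{
	bool all_ok = true;

	assert(n > 0);
	for (size_t i = 0; i < n; i++)
		if (!ag_ok[i])
			all_ok = false;
	mp->finobt_nores = !all_ok;
}

/* Blocks an ifree transaction asks for: none if a per-AG reservation backs it. */
static unsigned int ifree_block_res(const struct mount_model *mp,
				    unsigned int worst_case)
{
	return mp->finobt_nores ? worst_case : 0;
}
```

Between `model_mount_init` and `model_reserve_all`, which is where log recovery sits in this model, every ifree transaction reserves its own blocks.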
    • xfs: rename m_inotbt_nores to m_finobt_nores · e1f6ca11
      Committed by Darrick J. Wong
      Rename this flag variable to imply more strongly that it's related to
      the free inode btree (finobt) operation.  No functional changes.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
    • xfs: don't overflow xattr listent buffer · 3b50086f
      Committed by Darrick J. Wong
      For VFS listxattr calls, xfs_xattr_put_listent calls
      __xfs_xattr_put_listent twice if it sees an attribute
      "trusted.SGI_ACL_FILE": once for that name, and again for
      "system.posix_acl_access".  Unfortunately, if we happen to run out of
      buffer space while emitting the first name, we set count to -1 (so that
      we can feed ERANGE to the caller).  The second invocation doesn't check that
      the context parameters make sense and overwrites the byte before the
      buffer, triggering a KASAN report:
      
      ==================================================================
      BUG: KASAN: slab-out-of-bounds in strncpy+0xb3/0xd0
      Write of size 1 at addr ffff88807fbd317f by task syz/1113
      
      CPU: 3 PID: 1113 Comm: syz Not tainted 5.0.0-rc6-xfsx #rc6
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1ubuntu1 04/01/2014
      Call Trace:
       dump_stack+0xcc/0x180
       print_address_description+0x6c/0x23c
       kasan_report.cold.3+0x1c/0x35
       strncpy+0xb3/0xd0
       __xfs_xattr_put_listent+0x1a9/0x2c0 [xfs]
       xfs_attr_list_int_ilocked+0x11af/0x1800 [xfs]
       xfs_attr_list_int+0x20c/0x2e0 [xfs]
       xfs_vn_listxattr+0x225/0x320 [xfs]
       listxattr+0x11f/0x1b0
       path_listxattr+0xbd/0x130
       do_syscall_64+0x139/0x560
      
      While we're at it we add an assert to the other put_listent to avoid
      this sort of thing ever happening to the attrlist_by_handle code.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
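The failure mode above, where a count of -1 is reused as a buffer offset, can be reproduced in a small user-space model. This is a sketch under assumptions, not the XFS function: `struct list_ctx` and `put_name` are hypothetical, and the early return on a negative count stands in for the kind of sanity check the commit message says was missing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical listing context: count is bytes used so far, or -1 after overflow. */
struct list_ctx {
	char *buf;
	long bufsize;
	long count;
};

/* Append one NUL-terminated name to the listing buffer. */
static void put_name(struct list_ctx *ctx, const char *name)
{
	long need = (long)strlen(name) + 1;

	/* An earlier name already overflowed: never use -1 as an offset. */
	if (ctx->count < 0)
		return;

	if (ctx->count + need > ctx->bufsize) {
		ctx->count = -1;	/* the caller turns this into ERANGE */
		return;
	}
	memcpy(ctx->buf + ctx->count, name, (size_t)need);
	ctx->count += need;
}
```

Without the first check, a second call made with count at -1 can pass the size test (because -1 plus the name length may still fit) and then copy to `buf - 1`, which is the one-byte write before the buffer that the KASAN report above shows.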
  5. 12 Feb, 2019 23 commits