1. 10 5月, 2018 17 次提交
  2. 03 5月, 2018 1 次提交
  3. 18 4月, 2018 4 次提交
  4. 12 4月, 2018 1 次提交
  5. 11 4月, 2018 2 次提交
  6. 10 4月, 2018 3 次提交
  7. 03 4月, 2018 2 次提交
    • D
      xfs: fix intent use-after-free on abort · 0612d116
      Dave Chinner 提交于
      When an intent is aborted during it's initial commit through
      xfs_defer_trans_abort(), there is a use after free. The current
      report is for a RUI  through this path in generic/388:
      
       Freed by task 6274:
        __kasan_slab_free+0x136/0x180
        kmem_cache_free+0xe7/0x4b0
        xfs_trans_free_items+0x198/0x2e0
        __xfs_trans_commit+0x27f/0xcc0
        xfs_trans_roll+0x17b/0x2a0
        xfs_defer_trans_roll+0x6ad/0xe60
        xfs_defer_finish+0x2a6/0x2140
        xfs_alloc_file_space+0x53a/0xf90
        xfs_file_fallocate+0x5c6/0xac0
        vfs_fallocate+0x2f5/0x930
        ioctl_preallocate+0x1dc/0x320
        do_vfs_ioctl+0xfe4/0x1690
      
      The problem is that the RUI has two active references - one in the
      current transaction, and another held by the defer_ops structure
      that is passed to the RUD (intent done) so that both the intent and
      the intent done structures are freed on commit of the intent done.
      
      Hence during abort, we need to release the intent item, because the
      defer_ops reference is released separately via ->abort_intent
      callback. Fix all the intent code to do this correctly.
      Signed-Off-By: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      0612d116
    • C
      xfs: Remove "committed" argument of xfs_dir_ialloc · c959025e
      Chandan Rajendra 提交于
      xfs_dir_ialloc() rolls the current transaction when allocation of a new
      inode required the space manager to perform an allocation and replinish
      the Inode btree.
      
      None of the callers of xfs_dir_ialloc() need to know if the
      transaction was committed. Hence this commit removes the "committed"
      argument of xfs_dir_ialloc.
      Signed-off-by: NChandan Rajendra <chandan@linux.vnet.ibm.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      c959025e
  8. 31 3月, 2018 1 次提交
    • D
      xfs, dax: introduce xfs_dax_aops · 6e2608df
      Dan Williams 提交于
      In preparation for the dax implementation to start associating dax pages
      to inodes via page->mapping, we need to provide a 'struct
      address_space_operations' instance for dax. Otherwise, direct-I/O
      triggers incorrect page cache assumptions and warnings like the
      following:
      
       WARNING: CPU: 27 PID: 1783 at fs/xfs/xfs_aops.c:1468
       xfs_vm_set_page_dirty+0xf3/0x1b0 [xfs]
       [..]
       CPU: 27 PID: 1783 Comm: dma-collision Tainted: G           O 4.15.0-rc2+ #984
       [..]
       Call Trace:
        set_page_dirty_lock+0x40/0x60
        bio_set_pages_dirty+0x37/0x50
        iomap_dio_actor+0x2b7/0x3b0
        ? iomap_dio_zero+0x110/0x110
        iomap_apply+0xa4/0x110
        iomap_dio_rw+0x29e/0x3b0
        ? iomap_dio_zero+0x110/0x110
        ? xfs_file_dio_aio_read+0x7c/0x1a0 [xfs]
        xfs_file_dio_aio_read+0x7c/0x1a0 [xfs]
        xfs_file_read_iter+0xa0/0xc0 [xfs]
        __vfs_read+0xf9/0x170
        vfs_read+0xa6/0x150
        SyS_pread64+0x93/0xb0
        entry_SYSCALL_64_fastpath+0x1f/0x96
      
      ...where the default set_page_dirty() handler assumes that dirty state
      is being tracked in 'struct page' flags.
      
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Matthew Wilcox <mawilcox@microsoft.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Suggested-by: NJan Kara <jack@suse.cz>
      Suggested-by: NDave Chinner <david@fromorbit.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      6e2608df
  9. 30 3月, 2018 1 次提交
  10. 26 3月, 2018 2 次提交
  11. 24 3月, 2018 6 次提交
    • D
      xfs: remove dead inode version setting code · fa4493f0
      Dave Chinner 提交于
      We can only get into the branch if CRCs are enabled, so there's no
      need to check inside the branch for CRCs being enabled....
      Signed-Off-By: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      fa4493f0
    • D
      xfs: catch inode allocation state mismatch corruption · ee457001
      Dave Chinner 提交于
      We recently came across a V4 filesystem causing memory corruption
      due to a newly allocated inode being setup twice and being added to
      the superblock inode list twice. From code inspection, the only way
      this could happen is if a newly allocated inode was not marked as
      free on disk (i.e. di_mode wasn't zero).
      
      Running the metadump on an upstream debug kernel fails during inode
      allocation like so:
      
      XFS: Assertion failed: ip->i_d.di_nblocks == 0, file: fs/xfs/xfs_inod=
      e.c, line: 838
       ------------[ cut here ]------------
      kernel BUG at fs/xfs/xfs_message.c:114!
      invalid opcode: 0000 [#1] PREEMPT SMP
      CPU: 11 PID: 3496 Comm: mkdir Not tainted 4.16.0-rc5-dgc #442
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/0=
      1/2014
      RIP: 0010:assfail+0x28/0x30
      RSP: 0018:ffffc9000236fc80 EFLAGS: 00010202
      RAX: 00000000ffffffea RBX: 0000000000004000 RCX: 0000000000000000
      RDX: 00000000ffffffc0 RSI: 000000000000000a RDI: ffffffff8227211b
      RBP: ffffc9000236fce8 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000bec R11: f000000000000000 R12: ffffc9000236fd30
      R13: ffff8805c76bab80 R14: ffff8805c77ac800 R15: ffff88083fb12e10
      FS:  00007fac8cbff040(0000) GS:ffff88083fd00000(0000) knlGS:0000000000000=
      000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007fffa6783ff8 CR3: 00000005c6e2b003 CR4: 00000000000606e0
      Call Trace:
       xfs_ialloc+0x383/0x570
       xfs_dir_ialloc+0x6a/0x2a0
       xfs_create+0x412/0x670
       xfs_generic_create+0x1f7/0x2c0
       ? capable_wrt_inode_uidgid+0x3f/0x50
       vfs_mkdir+0xfb/0x1b0
       SyS_mkdir+0xcf/0xf0
       do_syscall_64+0x73/0x1a0
       entry_SYSCALL_64_after_hwframe+0x42/0xb7
      
      Extracting the inode number we crashed on from an event trace and
      looking at it with xfs_db:
      
      xfs_db> inode 184452204
      xfs_db> p
      core.magic = 0x494e
      core.mode = 0100644
      core.version = 2
      core.format = 2 (extents)
      core.nlinkv2 = 1
      core.onlink = 0
      .....
      
      Confirms that it is not a free inode on disk. xfs_repair
      also trips over this inode:
      
      .....
      zero length extent (off = 0, fsbno = 0) in ino 184452204
      correcting nextents for inode 184452204
      bad attribute fork in inode 184452204, would clear attr fork
      bad nblocks 1 for inode 184452204, would reset to 0
      bad anextents 1 for inode 184452204, would reset to 0
      imap claims in-use inode 184452204 is free, would correct imap
      would have cleared inode 184452204
      .....
      disconnected inode 184452204, would move to lost+found
      
      And so we have a situation where the directory structure and the
      inobt thinks the inode is free, but the inode on disk thinks it is
      still in use. Where this corruption came from is not possible to
      diagnose, but we can detect it and prevent the kernel from oopsing
      on lookup. The reproducer now results in:
      
      $ sudo mkdir /mnt/scratch/{0,1,2,3,4,5}{0,1,2,3,4,5}
      mkdir: cannot create directory =E2=80=98/mnt/scratch/00=E2=80=99: File ex=
      ists
      mkdir: cannot create directory =E2=80=98/mnt/scratch/01=E2=80=99: File ex=
      ists
      mkdir: cannot create directory =E2=80=98/mnt/scratch/03=E2=80=99: Structu=
      re needs cleaning
      mkdir: cannot create directory =E2=80=98/mnt/scratch/04=E2=80=99: Input/o=
      utput error
      mkdir: cannot create directory =E2=80=98/mnt/scratch/05=E2=80=99: Input/o=
      utput error
      ....
      
      And this corruption shutdown:
      
      [   54.843517] XFS (loop0): Corruption detected! Free inode 0xafe846c not=
       marked free on disk
      [   54.845885] XFS (loop0): Internal error xfs_trans_cancel at line 1023 =
      of file fs/xfs/xfs_trans.c.  Caller xfs_create+0x425/0x670
      [   54.848994] CPU: 10 PID: 3541 Comm: mkdir Not tainted 4.16.0-rc5-dgc #=
      443
      [   54.850753] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIO=
      S 1.10.2-1 04/01/2014
      [   54.852859] Call Trace:
      [   54.853531]  dump_stack+0x85/0xc5
      [   54.854385]  xfs_trans_cancel+0x197/0x1c0
      [   54.855421]  xfs_create+0x425/0x670
      [   54.856314]  xfs_generic_create+0x1f7/0x2c0
      [   54.857390]  ? capable_wrt_inode_uidgid+0x3f/0x50
      [   54.858586]  vfs_mkdir+0xfb/0x1b0
      [   54.859458]  SyS_mkdir+0xcf/0xf0
      [   54.860254]  do_syscall_64+0x73/0x1a0
      [   54.861193]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
      [   54.862492] RIP: 0033:0x7fb73bddf547
      [   54.863358] RSP: 002b:00007ffdaa553338 EFLAGS: 00000246 ORIG_RAX: 0000=
      000000000053
      [   54.865133] RAX: ffffffffffffffda RBX: 00007ffdaa55449a RCX: 00007fb73=
      bddf547
      [   54.866766] RDX: 0000000000000001 RSI: 00000000000001ff RDI: 00007ffda=
      a55449a
      [   54.868432] RBP: 00007ffdaa55449a R08: 00000000000001ff R09: 00005623a=
      8670dd0
      [   54.870110] R10: 00007fb73be72d5b R11: 0000000000000246 R12: 000000000=
      00001ff
      [   54.871752] R13: 00007ffdaa5534b0 R14: 0000000000000000 R15: 00007ffda=
      a553500
      [   54.873429] XFS (loop0): xfs_do_force_shutdown(0x8) called from line 1=
      024 of file fs/xfs/xfs_trans.c.  Return address = ffffffff814cd050
      [   54.882790] XFS (loop0): Corruption of in-memory data detected.  Shutt=
      ing down filesystem
      [   54.884597] XFS (loop0): Please umount the filesystem and rectify the =
      problem(s)
      
      Note that this crash is only possible on v4 filesystemsi or v5
      filesystems mounted with the ikeep mount option. For all other V5
      filesystems, this problem cannot occur because we don't read inodes
      we are allocating from disk - we simply overwrite them with the new
      inode information.
      Signed-Off-By: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com>
      Tested-by: NCarlos Maiolino <cmaiolino@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      ee457001
    • D
      xfs: xfs_scrub_iallocbt_xref_rmap_inodes should use xref_set_corrupt · b83e4c3c
      Darrick J. Wong 提交于
      In xfs_scrub_iallocbt_xref_rmap_inodes we're checking inodes against
      rmap records, so we should use xfs_scrub_btree_xref_set_corrupt if we
      encounter discrepancies here so that we know that it's a cross
      referencing error, not necessarily a corruption in the inobt itself.
      
      The userspace xfs_scrub program will try to repair outright corruptions
      in the agi/inobt prior to phase 3 so that the inode scan will proceed.
      If only a cross-referencing error is noted, the repair program defers
      the repair attempt until it can check the other space metadata at least
      once.
      
      It is therefore essential that the inobt scrubber can correctly
      distinguish between corruptions and "unable to cross-reference something
      else with this inobt".  The same reasoning applies to "xfs: record inode
      buf errors as a xref error in inobt scrubber".
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      b83e4c3c
    • D
      xfs: flag inode corruption if parent ptr doesn't get us a real inode · 5927268f
      Darrick J. Wong 提交于
      If a directory's parent inode pointer doesn't point to an inode, the
      directory should be flagged as corrupt.  Enable IGET_UNTRUSTED here so
      that _iget will return -EINVAL if the inobt does not confirm that the
      inode is present and allocated and we can flag the directory corruption.
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      5927268f
    • D
      xfs: don't accept inode buffers with suspicious unlinked chains · 6a96c565
      Darrick J. Wong 提交于
      When we're verifying inode buffers, sanity-check the unlinked pointer.
      We don't want to run the risk of trying to purge something that's
      obviously broken.
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      6a96c565
    • D
      xfs: move inode extent size hint validation to libxfs · 8bb82bc1
      Darrick J. Wong 提交于
      Extent size hint validation is used by scrub to decide if there's an
      error, and it will be used by repair to decide to remove the hint.
      Since these use the same validation functions, move them to libxfs.
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      8bb82bc1