1. 04 8月, 2014 11 次提交
    • D
      xfs: fix swapext ilock deadlock · 81217683
      Dave Chinner 提交于
      xfs_swap_extents() holds the ilock over a call to
      filemap_write_and_wait(), which can then try to write data and take
      the ilock. That causes a self-deadlock.
      
      Fix the deadlock and clean up the code by separating the locking
      appropriately. Add a lockflags variable to track what locks we are
      holding as we gain and drop them and cleanup the error handling to
      always use "out_unlock" with the lockflags variable.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      81217683
    • D
      xfs: kill xfs_vnode.h · b92cc59f
      Dave Chinner 提交于
      Move the IO flag definitions to xfs_inode.h and kill the header file
      as it is now empty.
      
      Removing the xfs_vnode.h file showed up an implicit header include
      path:
      	xfs_linux.h -> xfs_vnode.h -> xfs_fs.h
      
      And so every xfs header file has been inplicitly been including
      xfs_fs.h where it is needed or not. Hence the removal of xfs_vnode.h
      causes all sorts of build issues because BBTOB() and friends are no
      longer automatically included in the build. This also gets fixed.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      b92cc59f
    • D
      xfs: kill VN_MAPPED · dd8c38ba
      Dave Chinner 提交于
      Only one user, no longer needed.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      dd8c38ba
    • D
      xfs: kill VN_CACHED · 2667c6f9
      Dave Chinner 提交于
      Only has 2 users, has outlived it's usefulness.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      2667c6f9
    • D
      xfs: kill VN_DIRTY() · eac152b4
      Dave Chinner 提交于
      Only one user of the macro and the dirty mapping check is redundant
      so just get rid of it.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      eac152b4
    • D
      xfs: dquot recovery needs verifiers · ad3714b8
      Dave Chinner 提交于
      dquot recovery should add verifiers to the dquot buffers that it
      recovers changes into. Unfortunately, it doesn't attached the
      verifiers to the buffers in a consistent manner. For example,
      xlog_recover_dquot_pass2() reads dquot buffers without a verifier
      and then writes it without ever having attached a verifier to the
      buffer.
      
      Further, dquot buffer recovery may write a dquot buffer that has not
      been modified, or indeed, shoul dbe written because quotas are not
      enabled and hence changes to the buffer were not replayed. In this
      case, we again write buffers without verifiers attached because that
      doesn't happen until after the buffer changes have been replayed.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      ad3714b8
    • D
      xfs: quotacheck leaves dquot buffers without verifiers · 5fd364fe
      Dave Chinner 提交于
      When running xfs/305, I noticed that quotacheck was flushing dquot
      buffers that did not have the xfs_dquot_buf_ops verifiers attached:
      
      XFS (vdb): _xfs_buf_ioapply: no ops on block 0x1dc8/0x1dc8
      ffff880052489000: 44 51 01 04 00 00 65 b8 00 00 00 00 00 00 00 00  DQ....e.........
      ffff880052489010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      ffff880052489020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      ffff880052489030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      CPU: 1 PID: 2376 Comm: mount Not tainted 3.16.0-rc2-dgc+ #306
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
       ffff88006fe38000 ffff88004a0ffae8 ffffffff81cf1cca 0000000000000001
       ffff88004a0ffb88 ffffffff814d50ca 000010004a0ffc70 0000000000000000
       ffff88006be56dc4 0000000000000021 0000000000001dc8 ffff88007c773d80
      Call Trace:
       [<ffffffff81cf1cca>] dump_stack+0x45/0x56
       [<ffffffff814d50ca>] _xfs_buf_ioapply+0x3ca/0x3d0
       [<ffffffff810db520>] ? wake_up_state+0x20/0x20
       [<ffffffff814d51f5>] ? xfs_bdstrat_cb+0x55/0xb0
       [<ffffffff814d513b>] xfs_buf_iorequest+0x6b/0xd0
       [<ffffffff814d51f5>] xfs_bdstrat_cb+0x55/0xb0
       [<ffffffff814d53ab>] __xfs_buf_delwri_submit+0x15b/0x220
       [<ffffffff814d6040>] ? xfs_buf_delwri_submit+0x30/0x90
       [<ffffffff814d6040>] xfs_buf_delwri_submit+0x30/0x90
       [<ffffffff8150f89d>] xfs_qm_quotacheck+0x17d/0x3c0
       [<ffffffff81510591>] xfs_qm_mount_quotas+0x151/0x1e0
       [<ffffffff814ed01c>] xfs_mountfs+0x56c/0x7d0
       [<ffffffff814f0f12>] xfs_fs_fill_super+0x2c2/0x340
       [<ffffffff811c9fe4>] mount_bdev+0x194/0x1d0
       [<ffffffff814f0c50>] ? xfs_finish_flags+0x170/0x170
       [<ffffffff814ef0f5>] xfs_fs_mount+0x15/0x20
       [<ffffffff811ca8c9>] mount_fs+0x39/0x1b0
       [<ffffffff811e4d67>] vfs_kern_mount+0x67/0x120
       [<ffffffff811e757e>] do_mount+0x23e/0xad0
       [<ffffffff8117abde>] ? __get_free_pages+0xe/0x50
       [<ffffffff811e71e6>] ? copy_mount_options+0x36/0x150
       [<ffffffff811e8103>] SyS_mount+0x83/0xc0
       [<ffffffff81cfd40b>] tracesys+0xdd/0xe2
      
      This was caused by dquot buffer readahead not attaching a verifier
      structure to the buffer when readahead was issued, resulting in the
      followup read of the buffer finding a valid buffer and so not
      attaching new verifiers to the buffer as part of the read.
      
      Also, when a verifier failure occurs, we then read the buffer
      without verifiers. Attach the verifiers manually after this read so
      that if the buffer is then written it will be verified that the
      corruption has been repaired.
      
      Further, when flushing a dquot we don't ask for a verifier when
      reading in the dquot buffer the dquot belongs to. Most of the time
      this isn't an issue because the buffer is still cached, but when it
      is not cached it will result in writing the dquot buffer without
      having the verfier attached.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      5fd364fe
    • D
      xfs: ensure verifiers are attached to recovered buffers · 67dc288c
      Dave Chinner 提交于
      Crash testing of CRC enabled filesystems has resulted in a number of
      reports of bad CRCs being detected after the filesystem was mounted.
      Errors such as the following were being seen:
      
      XFS (sdb3): Mounting V5 Filesystem
      XFS (sdb3): Starting recovery (logdev: internal)
      XFS (sdb3): Metadata CRC error detected at xfs_agf_read_verify+0x5a/0x100 [xfs], block 0x1
      XFS (sdb3): Unmount and run xfs_repair
      XFS (sdb3): First 64 bytes of corrupted metadata buffer:
      ffff880136ffd600: 58 41 47 46 00 00 00 01 00 00 00 00 00 0f aa 40  XAGF...........@
      ffff880136ffd610: 00 02 6d 53 00 02 77 f8 00 00 00 00 00 00 00 01  ..mS..w.........
      ffff880136ffd620: 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 03  ................
      ffff880136ffd630: 00 00 00 04 00 08 81 d0 00 08 81 a7 00 00 00 00  ................
      XFS (sdb3): metadata I/O error: block 0x1 ("xfs_trans_read_buf_map") error 74 numblks 1
      
      The errors were typically being seen in AGF, AGI and their related
      btree block buffers some time after log recovery had run. Often it
      wasn't until later subsequent mounts that the problem was
      discovered. The common symptom was a buffer with the correct
      contents, but a CRC and an LSN that matched an older version of the
      contents.
      
      Some debug added to _xfs_buf_ioapply() indicated that buffers were
      being written without verifiers attached to them from log recovery,
      and Jan Kara isolated the cause to log recovery readahead an dit's
      interactions with buffers that had a more recent LSN on disk than
      the transaction being recovered. In this case, the buffer did not
      get a verifier attached, and os when the second phase of log
      recovery ran and recovered EFIs and unlinked inodes, the buffers
      were modified and written without the verifier running. Hence they
      had up to date contents, but stale LSNs and CRCs.
      
      Fix it by attaching verifiers to buffers we skip due to future LSN
      values so they don't escape into the buffer cache without the
      correct verifier attached.
      
      This patch is based on analysis and a patch from Jan Kara.
      
      cc: <stable@vger.kernel.org>
      Reported-by: NJan Kara <jack@suse.cz>
      Reported-by: NFanael Linithien <fanael4@gmail.com>
      Reported-by: NGrozdan <neutrino8@gmail.com>
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      67dc288c
    • D
      xfs: catch buffers written without verifiers attached · 400b9d88
      Dave Chinner 提交于
      We recently had a bug where buffers were slipping through log
      recovery without any verifier attached to them. This was resulting
      in on-disk CRC mismatches for valid data. Add some warning code to
      catch this occurrence so that we catch such bugs during development
      rather than not being aware they exist.
      
      Note that we cannot do this verification unconditionally as non-CRC
      filesystems don't always attach verifiers to the buffers being
      written. e.g. during log recovery we cannot identify all the
      different types of buffers correctly on non-CRC filesystems, so we
      can't attach the correct verifiers in all cases and so we don't
      attach any. Hence we don't want on non-CRC filesystems to avoid
      spamming the logs with false indications.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      400b9d88
    • E
      xfs: avoid false quotacheck after unclean shutdown · 5ef828c4
      Eric Sandeen 提交于
      The commit
      
      83e782e1 xfs: Remove incore use of XFS_OQUOTA_ENFD and XFS_OQUOTA_CHKD
      
      added a new function xfs_sb_quota_from_disk() which swaps
      on-disk XFS_OQUOTA_* flags for in-core XFS_GQUOTA_* and XFS_PQUOTA_*
      flags after the superblock is read.
      
      However, if log recovery is required, the superblock is read again,
      and the modified in-core flags are re-read from disk, so we have
      XFS_OQUOTA_* flags in memory again.  This causes the
      XFS_QM_NEED_QUOTACHECK() test to be true, because the XFS_OQUOTA_CHKD
      is still set, and not XFS_GQUOTA_CHKD or XFS_PQUOTA_CHKD.
      
      Change xfs_sb_from_disk to call xfs_sb_quota_from disk and always
      convert the disk flags to in-memory flags.
      
      Add a lower-level function which can be called with "false" to
      not convert the flags, so that the sb verifier can verify
      exactly what was on disk, per Brian Foster's suggestion.
      Reported-by: NCyril B. <cbay@excellency.fr>
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      5ef828c4
    • B
      xfs: fix rounding error of fiemap length parameter · eedf32bf
      Brian Foster 提交于
      The offset and length parameters are converted from bytes to basic
      blocks by xfs_vn_fiemap(). The BTOBB() converter rounds the value up to
      the nearest basic block. This leads to unexpected behavior when
      unaligned offsets are provided to FIEMAP.
      
      Fix the conversions of byte values to block values to cover the provided
      offsets. Round down the start offset to the nearest basic block.
      Calculate the end offset based on the provided values, round up and
      calculate length based on the start block offset.
      Reported-by: NChandan Rajendra <chandan@linux.vnet.ibm.com>
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      eedf32bf
  2. 25 6月, 2014 4 次提交
    • D
      xfs: global error sign conversion · 2451337d
      Dave Chinner 提交于
      Convert all the errors the core XFs code to negative error signs
      like the rest of the kernel and remove all the sign conversion we
      do in the interface layers.
      
      Errors for conversion (and comparison) found via searches like:
      
      $ git grep " E" fs/xfs
      $ git grep "return E" fs/xfs
      $ git grep " E[A-Z].*;$" fs/xfs
      
      Negation points found via searches like:
      
      $ git grep "= -[a-z,A-Z]" fs/xfs
      $ git grep "return -[a-z,A-D,F-Z]" fs/xfs
      $ git grep " -[a-z].*;" fs/xfs
      
      [ with some bits I missed from Brian Foster ]
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      2451337d
    • D
      libxfs: move source files · 30f712c9
      Dave Chinner 提交于
      Move all the source files that are shared with userspace into
      libxfs/. This is done as one big chunk simpy to get it done
      quickly
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      30f712c9
    • D
      libxfs: move header files · 84be0ffc
      Dave Chinner 提交于
      Move all the header files that are shared with userspace into
      libxfs. This is done as one big chunk simpy to get it done quickly.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      84be0ffc
    • D
      xfs: create libxfs infrastructure · 69116a13
      Dave Chinner 提交于
      To minimise the differences between kernel and userspace code,
      split the kernel code into the same structure as the userspace code.
      That is, the gneric core functionality of XFS is moved to a libxfs/
      directory and treat it as a layering barrier in the XFS code.
      
      This patch introduces the libxfs directory, the build infrastructure
      and an initial source and header file to build. The libxfs directory
      will contain the header files that are needed to build libxfs - most
      of userspace does not care about the location of these header files
      as they are accessed indirectly. Hence keeping them inside libxfs
      makes it easy to track the changes and script the sync process as
      the directory structure will be identical.
      
      To allow this changeover to occur in the kernel code, there are some
      temporary infrastructure in the makefiles to grab the header
      filesystem from both locations. Once all the files are moved,
      modifications will be made in the source code that will make the
      need for these include directives go away.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      
      69116a13
  3. 22 6月, 2014 2 次提交
  4. 12 6月, 2014 1 次提交
    • A
      ->splice_write() via ->write_iter() · 8d020765
      Al Viro 提交于
      iter_file_splice_write() - a ->splice_write() instance that gathers the
      pipe buffers, builds a bio_vec-based iov_iter covering those and feeds
      it to ->write_iter().  A bunch of simple cases coverted to that...
      
      [AV: fixed the braino spotted by Cyrill]
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      8d020765
  5. 11 6月, 2014 1 次提交
    • A
      fs,userns: Change inode_capable to capable_wrt_inode_uidgid · 23adbe12
      Andy Lutomirski 提交于
      The kernel has no concept of capabilities with respect to inodes; inodes
      exist independently of namespaces.  For example, inode_capable(inode,
      CAP_LINUX_IMMUTABLE) would be nonsense.
      
      This patch changes inode_capable to check for uid and gid mappings and
      renames it to capable_wrt_inode_uidgid, which should make it more
      obvious what it does.
      
      Fixes CVE-2014-4014.
      
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Serge Hallyn <serge.hallyn@ubuntu.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      23adbe12
  6. 10 6月, 2014 1 次提交
  7. 06 6月, 2014 20 次提交