1. 07 Mar, 2016 10 commits
  2. 29 Feb, 2016 1 commit
  3. 09 Feb, 2016 11 commits
  4. 08 Feb, 2016 18 commits
    • B
      xfs: fix xfs_log_ticket leak in xfs_end_io() after fs shutdown · af055e37
      Brian Foster committed
      If the filesystem has shut down, xfs_end_io() currently sets an
      error on the ioend and proceeds to ioend destruction. The ioend
      might contain a truncate transaction if the I/O extended the size of
      the file. This transaction is only cleaned up in
      xfs_setfilesize_ioend(), however, which is skipped in this case.
      This results in an xfs_log_ticket leak message when the associated
      slab cache is destroyed (e.g., on rmmod).
      
      This was originally reproduced by xfs/141 on a distro kernel. The
      problem is reproducible on an upstream kernel, but not easily
      detected in current upstream if the xfs_log_ticket cache happens to
      be merged with another cache. This can be reproduced more
      deterministically with the 'slab_nomerge' kernel boot option.
      
      Update xfs_end_io() to proceed with normal end I/O processing after
      an error is set on an ioend due to fs shutdown. The I/O type-based
      processing is already designed to handle an I/O error and ensure
      that the ioend is cleaned up correctly.
      Signed-off-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      
      af055e37
    • B
      xfs: clean up unwritten buffers on write failure · 60630fe6
      Brian Foster committed
      The xfs_vm_write_failed() handler is currently responsible for cleaning
      up any delalloc blocks over the range of a failed write beyond EOF.
      Failure to do so results in warning messages and other inconsistencies
      between buffer and extent state. The ->releasepage() handler currently
      warns in the event of a page being released with either unwritten or
      delalloc buffers, as neither is ever expected by the time a page is
      released.
      
      As has been reproduced by generic/083 on a -bsize=1k fs, it is currently
      possible to trigger the ->releasepage() warning for a page with
      unwritten buffers when a filesystem is near ENOSPC. This is reproduced
      by the following sequence:
      
        $ mkfs.xfs -f -b size=1k -d size=100m <dev>
        $ mount <dev> /mnt/
        $
        $ xfs_io -fc "falloc -k 0 1k" /mnt/file
        $ dd if=/dev/zero of=/mnt/enospc conv=notrunc oflag=append
        $
        $ xfs_io -c "pwrite 512 1k" /mnt/file
        $ xfs_io -d -c "pwrite 16k 1k" /mnt/file
      
      The first pwrite command attempts a block unaligned write across an
      unwritten block and a hole. The delalloc for the hole fails with ENOSPC
      and the subsequent error handling does not clean up the unwritten buffer
      that was instantiated during the first ->get_block() call.
      
      The second pwrite triggers a warning as part of the inode mapping
      invalidation that occurs prior to direct I/O. The releasepage() handler
      detects the unwritten buffer at this time, warns and prevents the
      release of the page.
      
      To deal with this problem, update xfs_vm_write_failed() to clean up
      unwritten as well as delalloc buffers that are beyond EOF and within the
      range of the failed write.
      Signed-off-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      
      60630fe6
    • D
      xfs: move struct xfs_attr_shortform to xfs_da_format.h · 244efeaf
      Darrick J. Wong committed
      Move the shortform attr structure definition to the same place as the
      other attribute structure definitions for consistency and also so that
      xfs/122 verifies the structure size.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      
      244efeaf
    • M
      xfs: Make xfsaild freezeable again · 18f1df4e
      Michal Hocko committed
      Hendrik has reported suspend failures due to xfsaild preventing the
      freezer from settling down.
      Jan 17 19:59:56 linux-6380 kernel: PM: Syncing filesystems ... done.
      Jan 17 19:59:56 linux-6380 kernel: PM: Preparing system for sleep (mem)
      Jan 17 19:59:56 linux-6380 kernel: Freezing user space processes ... (elapsed 0.001 seconds) done.
      Jan 17 19:59:56 linux-6380 kernel: Freezing remaining freezable tasks ...
      Jan 17 19:59:56 linux-6380 kernel: Freezing of tasks failed after 20.002 seconds (1 tasks refusing to freeze, wq_busy=0):
      Jan 17 19:59:56 linux-6380 kernel: xfsaild/dm-5    S 00000000     0  1293      2 0x00000080
      Jan 17 19:59:56 linux-6380 kernel:  f0ef5f00 00000046 00000200 00000000 ffff9022 c02d3800 00000000 00000032
      Jan 17 19:59:56 linux-6380 kernel:  ee0b2400 00000032 f71e0d00 f36fabc0 f0ef2d00 f0ef6000 f0ef2d00 f12f90c0
      Jan 17 19:59:56 linux-6380 kernel:  f0ef5f0c c0844e44 00000000 f0ef5f6c f811e0be 00000000 00000000 f0ef2d00
      Jan 17 19:59:56 linux-6380 kernel: Call Trace:
      Jan 17 19:59:56 linux-6380 kernel:  [<c0844e44>] schedule+0x34/0x90
      Jan 17 19:59:56 linux-6380 kernel:  [<f811e0be>] xfsaild+0x5de/0x600 [xfs]
      Jan 17 19:59:56 linux-6380 kernel:  [<c0286cbb>] kthread+0x9b/0xb0
      Jan 17 19:59:56 linux-6380 kernel:  [<c0848a79>] ret_from_kernel_thread+0x21/0x38
      
      The issue has been there for quite some time, but it was only made
      visible by 24ba16bb ("xfs: clear PF_NOFREEZE for xfsaild
      kthread") because the suspend started seeing xfsaild.
      
      The above commit has missed that the !xfs_ail_min branch might call
      schedule with TASK_INTERRUPTIBLE without calling try_to_freeze so the pm
      suspend would wake up the kernel thread over and over again without any
      progress. What we want here is to use freezable_schedule instead to hide
      the thread from the suspend.
      
      While we are here, also change schedule_timeout to the freezable
      variant to prevent spurious wakeups by suspend.
      
      [dchinner: re-add set_freezable call so the freezer will account properly
       for this kthread.]
      Reported-by: Hendrik Woltersdorf <hendrikw@arcor.de>
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      
      18f1df4e
    • E
      xfs: remove unused function definitions · de0b85a8
      Eric Sandeen committed
      Old leftovers.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      
      de0b85a8
    • C
      xfs: move buffer invalidation to xfs_btree_free_block · edfd9dd5
      Christoph Hellwig committed
      ... instead of leaving it in the methods.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      
      edfd9dd5
    • C
      c46ee8ad
    • C
    • C
      c19b104a
    • C
      xfs: don't use ioends for direct write completions · 273dda76
      Christoph Hellwig committed
      We only need to communicate two bits of information to the direct I/O
      completion handler:
      
       (1) do we need to convert any unwritten extents in the range
       (2) do we need to check if we need to update the inode size based
           on the range passed to the completion handler
      
      We can use the private data passed to the get_block handler and the
      completion handler as a simple bitmask to communicate this information
      instead of the current complicated infrastructure reusing the ioends
      from the buffer I/O path, and thus avoiding a memory allocation and
      a context switch for any non-trivial direct write.  As a nice side
      effect we also decouple the direct I/O path implementation from that
      of the buffered I/O path.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      
      273dda76
    • C
      direct-io: always call ->end_io if non-NULL · 187372a3
      Christoph Hellwig committed
      This way we can pass back errors to the file system, and allow for
      cleanup required for all direct I/O invocations.
      
      Also allow the ->end_io handlers to return errors on their own, so that
      I/O completion errors can be passed on to the callers.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      
      187372a3
    • C
      xfs: Split default quota limits by quota type · be607946
      Carlos Maiolino committed
      Default quotas are set globally for historical reasons. IRIX only
      supported user and project quotas, and the default quota was only
      applied to user quotas.
      
      In Linux, when a default quota is set, all quota types inherit the
      same default value.
      
      A user with a quota limit larger than the default quota value will
      still be limited to the default value, because the group quotas also
      inherit the default quotas, unless the group to which the user belongs
      has a custom quota limit set.
      
      This patch aims to split the default quota value by quota type,
      allowing each quota type to have a different default value.
      
      Default time limits are still set globally. XFS does not set a
      per-user/group timer, but a single global timer. Changing this
      behavior would require changes to the user-space tools and fixes
      for other bugs.
      Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
      Reviewed-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      
      be607946
    • E
      xfs: wire up Q_XGETNEXTQUOTA / get_nextdqblk · 296c24e2
      Eric Sandeen committed
      Add code to allow the Q_XGETNEXTQUOTA quotactl to quickly find
      all active quotas by examining the quota inode, and skipping
      over unallocated or uninitialized regions.
      
      Userspace can then use this interface rather than, e.g., a
      getpwent() loop when asked to report all active quotas.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      
      296c24e2
    • E
      xfs: Factor xfs_seek_hole_data into helper · 8aa7d37e
      Eric Sandeen committed
      Factor xfs_seek_hole_data into an unlocked helper which takes
      an xfs inode rather than a file for internal use.
      
      Also allow specification of "end" - the vfs lseek interface is
      defined such that any offset past eof/i_size shall return -ENXIO,
      but we will use this for quota code which does not maintain i_size,
      and we want to be able to SEEK_DATA past i_size as well.  So the
      lseek path can send in i_size, and the quota code can determine
      its own ending offset.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      
      8aa7d37e
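The role of the caller-supplied "end" can be shown with a small userspace model. This is a sketch under assumptions, not the helper from the patch: the extent array, the `demo_` names, and the linear scan are all made up for illustration. The only behavior it mirrors from the commit message is that a SEEK_DATA-style lookup at or beyond "end" returns -ENXIO, and that the caller decides what "end" is.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical data extent: a byte range [start, start + len). */
struct demo_extent {
    int64_t start;
    int64_t len;
};

/*
 * Return the first offset >= start that holds data and lies below end,
 * or -ENXIO if there is none. Extents must be sorted and non-overlapping.
 */
int64_t demo_seek_data(const struct demo_extent *ext, size_t n,
                       int64_t start, int64_t end)
{
    assert(ext != NULL || n == 0);
    if (start < 0 || start >= end)
        return -ENXIO;
    for (size_t i = 0; i < n; i++) {
        int64_t ext_end = ext[i].start + ext[i].len;
        if (ext_end <= start)
            continue;               /* extent lies entirely before start */
        int64_t pos = ext[i].start > start ? ext[i].start : start;
        return pos < end ? pos : -ENXIO;
    }
    return -ENXIO;
}
```

An lseek-style caller would pass the file size as `end`, while a caller that does not track a file size, such as the quota code mentioned above, could pass its own upper bound and still find data beyond what a size-limited search would report.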
    • E
      xfs: get quota inode from mp & flags rather than dqp · 4d4d9523
      Eric Sandeen committed
      Allow us to get the appropriate quota inode from any
      mp & quota flags, not necessarily associated with a
      particular dqp.  Needed for when we are searching for
      the next active ID with quotas and we want to examine
      the quota inode.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      4d4d9523
    • E
      xfs: don't overflow quota ID when initializing dqblk · a484bcdd
      Eric Sandeen committed
      Quota IDs are unsigned, and so we can pass in values up
      to 2^32-1.  But if we try to initialize a block containing
      values over INT_MAX, curid will overflow and assert.
      
      curid holds a quota ID, so give it the proper
      xfs_dqid_t type (and remove the now-impossible ASSERT).
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      a484bcdd
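To make the overflow concrete, here is a small userspace sketch. `demo_dqid_t` stands in for `xfs_dqid_t`, which is assumed here to be a 32-bit unsigned type, and both helpers are hypothetical rather than code from the patch. The second helper shows the kind of "first ID in this block" arithmetic that stays well defined when done in the unsigned type.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Stand-in for xfs_dqid_t; assumed to be a 32-bit unsigned type. */
typedef uint32_t demo_dqid_t;

/* Return 1 if the ID could be stored in a signed int without changing value. */
int demo_id_fits_in_int(demo_dqid_t id)
{
    return (uint64_t)id <= (uint64_t)INT_MAX;
}

/*
 * Round an ID down to the first ID stored in the same block, where each
 * block holds ids_per_block consecutive IDs. Unsigned arithmetic keeps the
 * result correct across the whole 0 .. 2^32-1 range.
 */
demo_dqid_t demo_first_id_in_block(demo_dqid_t id, uint32_t ids_per_block)
{
    assert(ids_per_block != 0);
    return id - (id % ids_per_block);
}
```

On a platform with a 32-bit `int`, any ID above 2147483647 fails the first check, so holding it in a signed variable would produce a negative value, which is the situation the removed ASSERT was tripping over.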
    • E
      quota: add new quotactl Q_GETNEXTQUOTA · 926132c0
      Eric Sandeen committed
      Q_GETNEXTQUOTA is exactly like Q_GETQUOTA, except that it
      will return quota information for the id equal to or greater
      than the id requested.  In other words, if the requested id has
      no quota, the command will return quota information for the
      next higher id which does have a quota set.  If no higher id
      has an active quota, -ESRCH is returned.
      
      This allows filesystems to do efficient iteration in kernelspace,
      much like extN filesystems do in userspace when asked to report
      all active quotas.
      
      This does require a new data structure for userspace, as the
      current structure does not include an ID for the returned quota
      information.
      
      Today, Ext4 with a hidden quota inode requires getpwent-style
      iterations, and for systems which have, e.g., LDAP backends,
      this can be very slow, or even impossible if iteration is not
      allowed in the configuration.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      
      926132c0
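The "equal to or greater than" lookup described above can be modeled in userspace as follows. This is a minimal sketch under assumptions: the `demo_quota` record, the sorted in-memory table, and both functions are hypothetical and do not reflect the kernel's data structures or the real quotactl interface. It only mirrors the described semantics: return the first active ID at or above the requested one, carry the found ID in the result, and return -ESRCH when no such ID exists.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical quota record; note that it carries the ID that was found. */
struct demo_quota {
    uint32_t id;
    uint64_t blk_hard_limit;
};

/*
 * Find the first record with id >= want in a table sorted by ascending id.
 * Returns 0 and fills *out on success, or -ESRCH if no such record exists.
 */
int demo_get_next_quota(const struct demo_quota *table, size_t n,
                        uint32_t want, struct demo_quota *out)
{
    assert(table != NULL || n == 0);
    size_t lo = 0, hi = n;          /* binary search for the lower bound */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (table[mid].id < want)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (lo == n)
        return -ESRCH;
    *out = table[lo];
    return 0;
}

/* Visit every active quota by restarting the lookup at found id + 1. */
size_t demo_count_active(const struct demo_quota *table, size_t n)
{
    size_t count = 0;
    uint32_t next = 0;
    struct demo_quota q;

    while (demo_get_next_quota(table, n, next, &q) == 0) {
        count++;
        if (q.id == UINT32_MAX)
            break;                  /* avoid wrapping back to 0 */
        next = q.id + 1;
    }
    return count;
}
```

The loop in `demo_count_active` is why the result needs to carry an ID: the caller cannot choose the next starting point without knowing which ID the previous call actually returned.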
    • E
      quota: add new quotactl Q_XGETNEXTQUOTA · 8b375249
      Eric Sandeen committed
      Q_XGETNEXTQUOTA is exactly like Q_XGETQUOTA, except that it
      will return quota information for the id equal to or greater
      than the id requested.  In other words, if the requested id has
      no quota, the command will return quota information for the
      next higher id which does have a quota set.  If no higher id
      has an active quota, -ESRCH is returned.
      
      This allows filesystems to do efficient iteration in kernelspace,
      much like extN filesystems do in userspace when asked to report
      all active quotas.
      
      The patch adds a d_id field to struct qc_dqblk so that we can
      pass back the id of the quota which was found, and return it
      to userspace.
      
      Today, filesystems such as XFS require getpwent-style iterations,
      and for systems which have, e.g., LDAP backends, this can be very
      slow, or even impossible if iteration is not allowed in the
      configuration.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      8b375249