1. 16 Jan, 2010 26 commits
  2. 11 Jan, 2010 4 commits
    • xfs: Ensure we force all busy extents in range to disk · fd45e478
      Committed by Dave Chinner
      When we search for and find a busy extent during allocation we
      force the log out to ensure the extent free transaction is on
      disk before the allocation transaction. The current implementation
      has a subtle bug in it--it does not handle multiple overlapping
      ranges.
      
      That is, if we free lots of little extents into a single
      contiguous extent, then allocate the contiguous extent, the busy
      search code stops searching at the first extent it finds that
      overlaps the allocated range. It then forces the log out to the
      commit LSN of that extent's transaction.
      
      Unfortunately, the other busy ranges might have more recent
      commit LSNs than the first busy extent that is found, and this
      results in xfs_alloc_search_busy() returning before all the
      extent free transactions are on disk for the range being
      allocated. This can lead to potential metadata corruption or
      stale data exposure after a crash because log replay won't replay
      all the extent free transactions that cover the allocation range.
      Modified-by: Alex Elder <aelder@sgi.com>
      
      (Dropped the "found" argument from the xfs_alloc_busysearch trace
      event.)
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
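The fix described in this commit amounts to scanning every busy extent that overlaps the allocation range and forcing the log to the highest commit LSN among them, rather than stopping at the first match. The following is a minimal userspace sketch of that idea only. `struct busy_extent`, `busy_search_first()` and `busy_search_max_lsn()` are hypothetical names chosen for illustration; they are not the kernel's data structures or the actual code in xfs_alloc_search_busy().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one busy extent (freed, but the free may not be on disk). */
struct busy_extent {
    uint64_t start; /* first block of the extent */
    uint64_t len;   /* length in blocks */
    uint64_t lsn;   /* commit LSN of the transaction that freed it */
};

/* Nonzero if [start, start + len) overlaps the busy extent. */
static int overlaps(const struct busy_extent *b, uint64_t start, uint64_t len)
{
    return b->start < start + len && start < b->start + b->len;
}

/* Old behaviour as described: stop at the first overlapping extent. */
static uint64_t busy_search_first(const struct busy_extent *busy, size_t n,
                                  uint64_t start, uint64_t len)
{
    assert(busy != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (overlaps(&busy[i], start, len))
            return busy[i].lsn;
    return 0; /* 0 means "nothing busy in range" in this sketch */
}

/* Fixed behaviour: visit every overlap and keep the most recent commit LSN. */
static uint64_t busy_search_max_lsn(const struct busy_extent *busy, size_t n,
                                    uint64_t start, uint64_t len)
{
    uint64_t max_lsn = 0;

    assert(busy != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (overlaps(&busy[i], start, len) && busy[i].lsn > max_lsn)
            max_lsn = busy[i].lsn;
    return max_lsn;
}
```

With three adjacent busy extents whose LSNs are 5, 9 and 7, the first function reports 5 while the second reports 9. Forcing the log only to 5 would leave the later extent free transactions off disk, which is the failure the commit message describes.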
    • xfs: Don't flush stale inodes · 44e08c45
      Committed by Dave Chinner
      Because inodes remain in cache much longer than inode buffers do
      under memory pressure, we can get the situation where we have
      stale, dirty inodes being reclaimed but the backing storage has
      been freed.  Hence we should never, ever flush XFS_ISTALE inodes
      to disk as there is no guarantee that the backing buffer is in
      cache and still marked stale when the flush occurs.
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      Signed-off-by: Alex Elder <aelder@sgi.com>
    • xfs: fix timestamp handling in xfs_setattr · d6d59bad
      Committed by Christoph Hellwig
      We currently have some rather odd code in xfs_setattr for
      updating the a/c/mtime timestamps:
      
       - first we do a non-transaction update if all three are updated
         together
       - second we implicitly update the ctime for various changes
         instead of relying on the ATTR_CTIME flag
       - third we set the timestamps to the current time instead of the
         arguments in the iattr structure in many cases.
      
      This patch makes sure we update it in a consistent way:
      
       - always transactional
       - ctime is only updated if ATTR_CTIME is set or we do a size
         update, which is a special case
       - always to the times passed in from the caller instead of the
         current time
      
      The only non-size caller of xfs_setattr that doesn't come from
      the VFS is updated to set ATTR_CTIME and pass in a valid ctime
      value.
      Reported-by: Eric Blake <ebb9@byu.net>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
    • xfs: use DECLARE_EVENT_CLASS · ea9a4888
      Committed by Christoph Hellwig
      Using DECLARE_EVENT_CLASS allows us to share trace event code
      instead of duplicating it in the binary.  This was not available
      before 2.6.33 so it had to be done as a separate step once the
      prerequisite was merged.
      
      This only requires changes to xfs_trace.h and the results are
      rather impressive:
      
      hch@brick:~/work/linux-2.6/obj-kvm$ size fs/xfs/xfs.o*
      text	   data	    bss	    dec	    hex	filename
       607732	  41884	   3616	 653232	  9f7b0	fs/xfs/xfs.o
      1026732	  41884	   3808	1072424	 105d28	fs/xfs/xfs.o.old
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
  3. 09 Jan, 2010 1 commit
  4. 18 Dec, 2009 1 commit
  5. 17 Dec, 2009 6 commits
    • XFS: Free buffer pages array unconditionally · 3fc98b1a
      Committed by Dave Chinner
      The code in xfs_free_buf() only attempts to free the b_pages array if the
      buffer is a page cache backed or page allocated buffer. The extra log buffer
      that is used when the log wraps uses pages that are allocated to a different
      log buffer, but it still has a b_pages array allocated when those pages
      are associated with the extra buffer in xfs_buf_associate_memory.
      
      Hence we need to always attempt to free the b_pages array when tearing
      down a buffer, not just on buffers that are explicitly marked as page bearing
      buffers. This fixes a leak detected by the kernel memory leak code.
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      Signed-off-by: Alex Elder <aelder@sgi.com>
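The leak fixed here comes from treating "this buffer owns its pages" and "this buffer has a page pointer array" as the same condition. A minimal userspace sketch of the corrected teardown follows. `struct sk_buf`, `SK_BUF_OWNS_PAGES` and `sk_buf_free()` are hypothetical names for illustration and do not reproduce the real xfs_buf structure or its flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical flag: the buffer allocated its own pages and must release them. */
#define SK_BUF_OWNS_PAGES 0x1u

/* Hypothetical stand-in for a buffer with a page pointer array (b_pages). */
struct sk_buf {
    unsigned int flags;
    void **pages;      /* array of page pointers, always heap allocated here */
    size_t page_count;
};

/* Tear down a buffer; returns how many pages were released. */
static size_t sk_buf_free(struct sk_buf *bp)
{
    size_t freed = 0;

    assert(bp != NULL);
    /* Release the pages themselves only when this buffer owns them. */
    if (bp->flags & SK_BUF_OWNS_PAGES) {
        for (size_t i = 0; i < bp->page_count; i++) {
            free(bp->pages[i]);
            freed++;
        }
    }
    /* Always free the array, even when the pages are borrowed from another buffer. */
    free(bp->pages);
    bp->pages = NULL;
    bp->page_count = 0;
    return freed;
}
```

A buffer that borrows pages from another buffer, like the extra log buffer in the commit message, releases no pages in this sketch but still gets its pointer array freed.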
    • xfs: kill xfs_bmbt_rec_32/64 types · a5f9be58
      Committed by Christoph Hellwig
      For a long time we've always stored bmap btree records in the 64bit format,
      so kill off the dead 32bit type, and make sure the 64bit type is named just
      xfs_bmbt_rec everywhere, without any size postfix.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: Alex Elder <aelder@sgi.com>
    • xfs: improve metadata I/O merging in the elevator · 2ee1abad
      Committed by Dave Chinner
      Change all async metadata buffers to use [READ|WRITE]_META I/O types
      so that the I/O doesn't get issued immediately. This allows merging of
      adjacent metadata requests but still prioritises them over bulk data.
      This shows a 10-15% improvement in sequential create speed of small
      files.
      
      Don't include the log buffers in this classification - leave them as
      sync types so they are issued immediately.
      Signed-off-by: Dave Chinner <dgc@sgi.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
    • xfs: check for not fully initialized inodes in xfs_ireclaim · b44b1126
      Committed by Christoph Hellwig
      Add an assert for inodes not added to the inode cache in xfs_ireclaim,
      to make sure we're not going to introduce something like the
      famous nfsd inode cache bug again.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
    • cleanup blockdev_direct_IO locking · 1e431f5c
      Committed by Christoph Hellwig
      Currently the locking in blockdev_direct_IO is a mess, we have three different
      locking types and very confusing checks for some of them.  The most
      complicated one is DIO_OWN_LOCKING for reads, which happens to not actually be
      used.
      
      This patch gets rid of the DIO_OWN_LOCKING - as mentioned above the read case
      is unused anyway, and the write side is almost identical to DIO_NO_LOCKING.
      The difference is that DIO_NO_LOCKING always sets the create argument for
      the get_blocks callback to zero, but we can easily move that to the actual
      get_blocks callbacks.  There are four users of the DIO_NO_LOCKING mode:
      gfs already ignores the create argument and thus is fine with the new
      version, ocfs2 only errors out if create were ever set, and we can remove
      this dead code now, the block device code only ever uses create for an
      error message if we are fully beyond the device which can never happen,
      and last but not least XFS will need the new behaviour for writes.
      
      Now we can replace the lock_type variable with a flags one, where no flag
      means the DIO_NO_LOCKING behaviour and DIO_LOCKING is kept as the first
      flag.  Separate out the check for not allowing to fill holes into a separate
      flag, although for now both flags always get set at the same time.
      
      Also revamp the documentation of the locking scheme to actually make sense.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
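Replacing the lock_type variable with a flags word can be sketched as two independent bits: one for locking and one for the hole-filling check. The block below is a userspace illustration only. `SK_DIO_LOCKING` mirrors the DIO_LOCKING flag named in the commit message, while `SK_DIO_NO_FILL_HOLES`, `sk_dio_takes_lock()` and `sk_dio_create_arg()` are hypothetical. The condition that the hole check applies to writes below end of file is an assumption, since the commit message only says the check gets its own flag.

```c
#include <assert.h>

/* Sketch flags: no flag set corresponds to the old DIO_NO_LOCKING behaviour. */
enum {
    SK_DIO_LOCKING       = 0x01, /* the direct I/O core takes the locks itself */
    SK_DIO_NO_FILL_HOLES = 0x02, /* hypothetical name: do not fill holes */
};
#define SK_DIO_ALL_FLAGS (SK_DIO_LOCKING | SK_DIO_NO_FILL_HOLES)

/* Whether the core should do its own locking for this request. */
static int sk_dio_takes_lock(int flags)
{
    assert((flags & ~SK_DIO_ALL_FLAGS) == 0);
    return (flags & SK_DIO_LOCKING) != 0;
}

/*
 * Value of the "create" argument handed to the get_blocks callback.
 * Assumption: with the hole flag set, writes below EOF must not allocate.
 */
static int sk_dio_create_arg(int flags, int is_write, int below_eof)
{
    int create = is_write;

    assert((flags & ~SK_DIO_ALL_FLAGS) == 0);
    if ((flags & SK_DIO_NO_FILL_HOLES) && below_eof)
        create = 0;
    return create;
}
```

Because the two bits are tested separately, a caller could in principle request one behaviour without the other, even though the commit message notes that both flags are always set together for now.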
    • sanitize xattr handler prototypes · 431547b3
      Committed by Christoph Hellwig
      Add a flags argument to struct xattr_handler and pass it to all xattr
      handler methods.  This allows using the same methods for multiple
      handlers, e.g. for the ACL methods which perform exactly the same action
      for the access and default ACLs, just using a different underlying
      attribute.  With a little more groundwork it'll also allow sharing the
      methods for the regular user/trusted/secure handlers in extN, ocfs2 and
      jffs2 like it's already done for xfs in this patch.
      
      Also change the inode argument to the handlers to a dentry to allow
      using the handlers mechanism for filesystems that require it later,
      e.g. cifs.
      
      [with GFS2 bits updated by Steven Whitehouse <swhiteho@redhat.com>]
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: James Morris <jmorris@namei.org>
      Acked-by: Joel Becker <joel.becker@oracle.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
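The benefit of a per-handler flags field is that one method can serve several handlers and tell them apart by the value it is passed. The sketch below shows this for the access and default ACL case mentioned in the commit message. It is a userspace illustration under stated assumptions: `struct sk_xattr_handler`, `struct sk_object`, `sk_acl_get()` and `sk_xattr_get()` are hypothetical, the method signature is simplified, and a plain struct stands in for the dentry the real handlers receive.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical ACL type values carried in the handler's flags field. */
#define SK_ACL_TYPE_ACCESS  1
#define SK_ACL_TYPE_DEFAULT 2

/* Hypothetical stand-in for the object the handler operates on. */
struct sk_object {
    char access_acl[32];
    char default_acl[32];
};

/* Simplified handler: the flags value is passed back to every method call. */
struct sk_xattr_handler {
    const char *prefix;
    int flags;
    const char *(*get)(const struct sk_object *obj, int handler_flags);
};

/* One method for both ACL kinds; the flags select the underlying attribute. */
static const char *sk_acl_get(const struct sk_object *obj, int handler_flags)
{
    switch (handler_flags) {
    case SK_ACL_TYPE_ACCESS:
        return obj->access_acl;
    case SK_ACL_TYPE_DEFAULT:
        return obj->default_acl;
    default:
        return NULL;
    }
}

static const struct sk_xattr_handler sk_acl_access_handler = {
    "system.posix_acl_access", SK_ACL_TYPE_ACCESS, sk_acl_get
};
static const struct sk_xattr_handler sk_acl_default_handler = {
    "system.posix_acl_default", SK_ACL_TYPE_DEFAULT, sk_acl_get
};

/* Dispatch helper: always forwards the handler's own flags to its method. */
static const char *sk_xattr_get(const struct sk_xattr_handler *h,
                                const struct sk_object *obj)
{
    assert(h != NULL && h->get != NULL);
    return h->get(obj, h->flags);
}
```

Both handlers point at the same function, so supporting another attribute with identical logic only needs a new table entry with a different flags value.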
  6. 16 Dec, 2009 2 commits
    • direct-io: cleanup blockdev_direct_IO locking · 5fe878ae
      Committed by Christoph Hellwig
      Currently the locking in blockdev_direct_IO is a mess, we have three
      different locking types and very confusing checks for some of them.  The
      most complicated one is DIO_OWN_LOCKING for reads, which happens to not
      actually be used.
      
      This patch gets rid of the DIO_OWN_LOCKING - as mentioned above the read
      case is unused anyway, and the write side is almost identical to
      DIO_NO_LOCKING.  The difference is that DIO_NO_LOCKING always sets the
      create argument for the get_blocks callback to zero, but we can easily
      move that to the actual get_blocks callbacks.  There are four users of the
      DIO_NO_LOCKING mode: gfs already ignores the create argument and thus is
      fine with the new version, ocfs2 only errors out if create were ever set,
      and we can remove this dead code now, the block device code only ever uses
      create for an error message if we are fully beyond the device which can
      never happen, and last but not least XFS will need the new behaviour for
      writes.
      
      Now we can replace the lock_type variable with a flags one, where no flag
      means the DIO_NO_LOCKING behaviour and DIO_LOCKING is kept as the first
      flag.  Separate out the check for not allowing to fill holes into a
      separate flag, although for now both flags always get set at the same
      time.
      
      Also revamp the documentation of the locking scheme to actually make
      sense.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Badari Pulavarty <pbadari@us.ibm.com>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Zach Brown <zach.brown@oracle.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Alex Elder <aelder@sgi.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <joel.becker@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fs/xfs/xfs_log_recover.c: use %pU to print UUIDs · 03daa57c
      Committed by Joe Perches
      Signed-off-by: Joe Perches <joe@perches.com>
      Acked-by: Alex Elder <aelder@sgi.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>