1. 13 1月, 2018 1 次提交
    • D
      xfs: use %px for data pointers when debugging · c9690043
      Darrick J. Wong 提交于
      Starting with commit 57e73442 ("vsprintf: refactor %pK code out of
      pointer"), the behavior of the raw '%p' printk format specifier was
      changed to print a 32-bit hash of the pointer value to avoid leaking
      kernel pointers into dmesg.  For most situations that's good.
      
      This is /undesirable/ behavior when we're trying to debug XFS, however,
      so define a PTR_FMT that prints the actual pointer when we're in debug
      mode.
      
      Note that %p for tracepoints still prints the raw pointer, so in the
      long run we could consider rewriting some of these messages as
      tracepoints.
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      c9690043
  2. 09 1月, 2018 1 次提交
  3. 21 11月, 2017 1 次提交
  4. 07 11月, 2017 1 次提交
  5. 27 10月, 2017 1 次提交
  6. 13 9月, 2017 1 次提交
    • R
      xfs: XFS_IS_REALTIME_INODE() should be false if no rt device present · b31ff3cd
      Richard Wareing 提交于
      If using a kernel with CONFIG_XFS_RT=y and we set the RHINHERIT flag on
      a directory in a filesystem that does not have a realtime device and
      create a new file in that directory, it gets marked as a real time file.
      When data is written and a fsync is issued, the filesystem attempts to
      flush a non-existent rt device during the fsync process.
      
      This results in a crash dereferencing a null buftarg pointer in
      xfs_blkdev_issue_flush():
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
        IP: xfs_blkdev_issue_flush+0xd/0x20
        .....
        Call Trace:
          xfs_file_fsync+0x188/0x1c0
          vfs_fsync_range+0x3b/0xa0
          do_fsync+0x3d/0x70
          SyS_fsync+0x10/0x20
          do_syscall_64+0x4d/0xb0
          entry_SYSCALL64_slow_path+0x25/0x25
      
      Setting RT inode flags does not require special privileges so any
      unprivileged user can cause this oops to occur.  To reproduce, confirm
      kernel is compiled with CONFIG_XFS_RT=y and run:
      
        # mkfs.xfs -f /dev/pmem0
        # mount /dev/pmem0 /mnt/test
        # mkdir /mnt/test/foo
        # xfs_io -c 'chattr +t' /mnt/test/foo
        # xfs_io -f -c 'pwrite 0 5m' -c fsync /mnt/test/foo/bar
      
      Or just run xfstests with MKFS_OPTIONS="-d rtinherit=1" and wait.
      
      Kernels built with CONFIG_XFS_RT=n are not exposed to this bug.
      
      Fixes: f538d4da ("[XFS] write barrier support")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NRichard Wareing <rwareing@fb.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b31ff3cd
  7. 07 7月, 2017 1 次提交
    • D
      xfs: rename MAXPATHLEN to XFS_SYMLINK_MAXLEN · 6eb0b8df
      Darrick J. Wong 提交于
      XFS has a maximum symlink target length of 1024 bytes; this is a
      holdover from the Irix days.  Unfortunately, the constant establishing
      this is 'MAXPATHLEN' and is /not/ the same as the Linux MAXPATHLEN,
      which is 4096.
      
      The kernel enforces its 1024 byte MAXPATHLEN on symlink targets, but
      xfsprogs picks up the (Linux) system 4096 byte MAXPATHLEN, which means
      that xfs_repair doesn't complain about oversized symlinks.
      
      Since this is an on-disk format constraint, put the define in the XFS
      namespace and move everything over to use the new name.
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      6eb0b8df
  8. 20 6月, 2017 1 次提交
    • D
      xfs: remove double-underscore integer types · c8ce540d
      Darrick J. Wong 提交于
      This is a purely mechanical patch that removes the private
      __{u,}int{8,16,32,64}_t typedefs in favor of using the system
      {u,}int{8,16,32,64}_t typedefs.  This is the sed script used to perform
      the transformation and fix the resulting whitespace and indentation
      errors:
      
      s/typedef\t__uint8_t/typedef __uint8_t\t/g
      s/typedef\t__uint/typedef __uint/g
      s/typedef\t__int\([0-9]*\)_t/typedef int\1_t\t/g
      s/__uint8_t\t/__uint8_t\t\t/g
      s/__uint/uint/g
      s/__int\([0-9]*\)_t\t/__int\1_t\t\t/g
      s/__int/int/g
      /^typedef.*int[0-9]*_t;$/d
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      c8ce540d
  9. 05 6月, 2017 3 次提交
  10. 12 4月, 2017 1 次提交
  11. 02 3月, 2017 1 次提交
  12. 18 1月, 2017 1 次提交
  13. 25 12月, 2016 1 次提交
  14. 07 12月, 2016 1 次提交
    • L
      xfs: use rhashtable to track buffer cache · 6031e73a
      Lucas Stach 提交于
      On filesystems with a lot of metadata and in metadata intensive workloads
      xfs_buf_find() is showing up at the top of the CPU cycles trace. Most of
      the CPU time is spent on CPU cache misses while traversing the rbtree.
      
      As the buffer cache does not need any kind of ordering, but fast lookups
      a hashtable is the natural data structure to use. The rhashtable
      infrastructure provides a self-scaling hashtable implementation and
      allows lookups to proceed while the table is going through a resize
      operation.
      
      This reduces the CPU-time spent for the lookups to 1/3 even for small
      filesystems with a relatively small number of cached buffers, with
      possibly much larger gains on higher loaded filesystems.
      
      [dchinner: reduce minimum hash size to an acceptable size for large
      	   filesystems with many AGs with no active use.]
      [dchinner: remove stale rbtree asserts.]
      [dchinner: use xfs_buf_map for compare function argument.]
      [dchinner: make functions static.]
      [dchinner: remove redundant comments.]
      Signed-off-by: NLucas Stach <dev@lynxeye.de>
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      
      6031e73a
  15. 06 10月, 2016 1 次提交
    • D
      xfs: garbage collect old cowextsz reservations · 83104d44
      Darrick J. Wong 提交于
      Trim CoW reservations made on behalf of a cowextsz hint if they get too
      old or we run low on quota, so long as we don't have dirty data awaiting
      writeback or directio operations in progress.
      
      Garbage collection of the cowextsize extents are kept separate from
      prealloc extent reaping because setting the CoW prealloc lifetime to a
      (much) higher value than the regular prealloc extent lifetime has been
      useful for combatting CoW fragmentation on VM hosts where the VMs
      experience bursty write behaviors and we can keep the utilization ratios
      low enough that we don't start to run out of space.  IOWs, it benefits
      us to keep the CoW fork reservations around for as long as we can unless
      we run out of blocks or hit inode reclaim.
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      83104d44
  16. 20 7月, 2016 1 次提交
  17. 05 4月, 2016 1 次提交
    • K
      mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Kirill A. Shutemov 提交于
      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
      ago with promise that one day it will be possible to implement page
      cache with bigger chunks than PAGE_SIZE.
      
      This promise never materialized.  And unlikely will.
      
      We have many places where PAGE_CACHE_SIZE assumed to be equal to
      PAGE_SIZE.  And it's constant source of confusion on whether
      PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.
      
      Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straight-forward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
      
      This patch contains automated changes generated with coccinelle using
      script below.  For some reason, coccinelle doesn't patch header files.
      I've called spatch for them manually.
      
      The only adjustment after coccinelle is revert of changes to
      PAGE_CAHCE_ALIGN definition: we are going to drop it later.
      
      There are few places in the code where coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation also
      will be addressed with the separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
      Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      09cbfeaf
  18. 12 10月, 2015 1 次提交
    • B
      xfs: pass xfsstats structures to handlers and macros · 80529c45
      Bill O'Donnell 提交于
      This patch is the next step toward per-fs xfs stats. The patch makes
      the show and clear routines able to handle any stats structure
      associated with a kobject.
      
      Instead of a single global xfsstats structure, add kobject and a pointer
      to a per-cpu struct xfsstats. Modify the macros that manipulate the stats
      accordingly: XFS_STATS_INC, XFS_STATS_DEC, and XFS_STATS_ADD now access
      xfsstats->xs_stats.
      
      The sysfs functions need to get from the kobject back to the xfsstats
      structure which contains it, and pass the pointer to the ->xs_stats
      percpu structure into the show & clear routines.
      Signed-off-by: NBill O'Donnell <billodo@redhat.com>
      Reviewed-by: NEric Sandeen <sandeen@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      80529c45
  19. 22 6月, 2015 3 次提交
  20. 23 2月, 2015 1 次提交
  21. 28 11月, 2014 1 次提交
  22. 02 10月, 2014 1 次提交
  23. 04 8月, 2014 1 次提交
    • D
      xfs: kill xfs_vnode.h · b92cc59f
      Dave Chinner 提交于
      Move the IO flag definitions to xfs_inode.h and kill the header file
      as it is now empty.
      
      Removing the xfs_vnode.h file showed up an implicit header include
      path:
      	xfs_linux.h -> xfs_vnode.h -> xfs_fs.h
      
      And so every xfs header file has been inplicitly been including
      xfs_fs.h where it is needed or not. Hence the removal of xfs_vnode.h
      causes all sorts of build issues because BBTOB() and friends are no
      longer automatically included in the build. This also gets fixed.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      b92cc59f
  24. 30 7月, 2014 1 次提交
  25. 15 7月, 2014 1 次提交
    • B
      xfs: add xfs_mount sysfs kobject · a31b1d3d
      Brian Foster 提交于
      Embed a base kobject into xfs_mount. This creates a kobject associated
      with each XFS mount and a subdirectory in sysfs with the name of the
      filesystem. The subdirectory lifecycle matches that of the mount. Also
      add the new xfs_sysfs.[c,h] source files with some XFS sysfs
      infrastructure to facilitate attribute creation.
      
      Note that there are currently no attributes exported as part of the
      xfs_mount kobject. It exists solely to serve as a per-mount container
      for child objects.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      a31b1d3d
  26. 22 6月, 2014 1 次提交
  27. 27 2月, 2014 2 次提交
  28. 21 8月, 2013 1 次提交
  29. 16 8月, 2013 1 次提交
  30. 13 8月, 2013 1 次提交
  31. 08 5月, 2013 1 次提交
    • D
      xfs: introduce CONFIG_XFS_WARN · 742ae1e3
      Dave Chinner 提交于
      Running a CONFIG_XFS_DEBUG kernel in production environments is not
      the best idea as it introduces significant overhead, can change
      the behaviour of algorithms (such as allocation) to improve test
      coverage, and (most importantly) panic the machine on non-fatal
      errors.
      
      There are many cases where all we want to do is run a
      kernel with more bounds checking enabled, such as is provided by the
      ASSERT() statements throughout the code, but without all the
      potential overhead and drawbacks.
      
      This patch converts all the ASSERT statements to evaluate as
      WARN_ON(1) statements and hence if they fail dump a warning and a
      stack trace to the log. This has minimal overhead and does not
      change any algorithms, and will allow us to find strange "out of
      bounds" problems more easily on production machines.
      
      There are a few places where assert statements contain debug only
      code. These are converted to be debug-or-warn only code so that we
      still get all the assert checks in the code.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      742ae1e3
  32. 04 4月, 2013 1 次提交
  33. 20 11月, 2012 1 次提交
  34. 09 11月, 2012 1 次提交
  35. 12 10月, 2011 1 次提交
    • C
      xfs: revert to using a kthread for AIL pushing · 0030807c
      Christoph Hellwig 提交于
      Currently we have a few issues with the way the workqueue code is used to
      implement AIL pushing:
      
       - it accidentally uses the same workqueue as the syncer action, and thus
         can be prevented from running if there are enough sync actions active
         in the system.
       - it doesn't use the HIGHPRI flag to queue at the head of the queue of
         work items
      
      At this point I'm not confident enough in getting all the workqueue flags and
      tweaks right to provide a perfectly reliable execution context for AIL
      pushing, which is the most important piece in XFS to make forward progress
      when the log fills.
      
      Revert back to use a kthread per filesystem which fixes all the above issues
      at the cost of having a task struct and stack around for each mounted
      filesystem.  In addition this also gives us much better ways to diagnose
      any issues involving hung AIL pushing and removes a small amount of code.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reported-by: NStefan Priebe <s.priebe@profihost.ag>
      Tested-by: NStefan Priebe <s.priebe@profihost.ag>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NAlex Elder <aelder@sgi.com>
      0030807c