1. 06 Sep, 2019 2 commits
    • xfs: fix missed wakeup on l_flush_wait · cdea5459
      Rik van Riel committed
      The code in xlog_wait uses the spinlock to make adding the task to
      the wait queue, and setting the task state to UNINTERRUPTIBLE atomic
      with respect to the waker.
      
      Doing the wakeup after releasing the spinlock opens up the following
      race condition:
      
      Task 1					task 2
      add task to wait queue
      					wake up task
      set task state to UNINTERRUPTIBLE
      
      This issue was found through code inspection as a result of kworkers
      being observed stuck in UNINTERRUPTIBLE state with an empty
      wait queue. It is rare and largely unreproducible.
      
      Simply moving the spin_unlock to after the wake_up_all results
      in the waker not being able to see a task on the waitqueue before
      it has set its state to UNINTERRUPTIBLE.
      
      This bug dates back to the conversion of this code to generic
      waitqueue infrastructure from a counting semaphore back in 2008,
      which didn't place the wakeups consistently w.r.t. the relevant
      spin locks.
      
      [dchinner: Also fix a similar issue in the shutdown path on
      xc_commit_wait. Update commit log with more details of the issue.]
      
      Fixes: d748c623 ("[XFS] Convert l_flushsema to a sv_t")
      Reported-by: Chris Mason <clm@fb.com>
      Signed-off-by: Rik van Riel <riel@surriel.com>
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
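The race described in this commit can be replayed deterministically with a toy model. The sketch below is not kernel code: `struct task`, `waiter_enqueue()`, `waiter_set_sleeping()`, `waker_wake_all()` and the two `*_interleaving()` helpers are hypothetical names invented for illustration. It only shows why a wakeup that lands between the waiter's two steps leaves a task sleeping on an empty queue, and why serialising the waker against both steps avoids that.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified task and wait-queue model (not kernel types). */
enum task_state { TASK_RUNNING, TASK_UNINTERRUPTIBLE };

struct task {
	enum task_state state;
	bool on_queue;
};

/* Waiter step 1: add the task to the wait queue. */
static void waiter_enqueue(struct task *t)
{
	t->on_queue = true;
}

/* Waiter step 2: mark the task as sleeping. */
static void waiter_set_sleeping(struct task *t)
{
	t->state = TASK_UNINTERRUPTIBLE;
}

/* Waker: if the task is queued, dequeue it and make it runnable. */
static void waker_wake_all(struct task *t)
{
	if (t->on_queue) {
		t->on_queue = false;
		t->state = TASK_RUNNING;
	}
}

/* A task is lost if it is sleeping but no longer on any queue. */
static bool task_is_stuck(const struct task *t)
{
	return t->state == TASK_UNINTERRUPTIBLE && !t->on_queue;
}

/* Wakeup issued outside the lock: it can land between the two waiter steps. */
static bool racy_interleaving(void)
{
	struct task t = { TASK_RUNNING, false };

	waiter_enqueue(&t);
	waker_wake_all(&t);
	waiter_set_sleeping(&t);
	return task_is_stuck(&t);
}

/* Wakeup issued under the lock: it sees both waiter steps or neither. */
static bool locked_interleaving(bool waker_first)
{
	struct task t = { TASK_RUNNING, false };

	if (waker_first)
		waker_wake_all(&t);
	waiter_enqueue(&t);
	waiter_set_sleeping(&t);
	if (!waker_first)
		waker_wake_all(&t);
	return task_is_stuck(&t);
}
```

In the model, only the unlocked ordering ends with a task that is both asleep and off the queue, which matches the symptom reported in the commit message.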
    • xfs: push the AIL in xlog_grant_head_wake · 7c107afb
      Dave Chinner committed
      In the situation where the log is full and the CIL has not recently
      flushed, the AIL push threshold is throttled back to where the
      last write of the head of the log was completed. This is stored in
      log->l_last_sync_lsn. Hence if the CIL holds > 25% of the log space
      pinned by flushes and/or aggregation in progress, we can get the
      situation where the head of the log lags a long way behind the
      reservation grant head.
      
      When this happens, the AIL push target is trimmed back from where
      the reservation grant head wants to push the log tail to, back to
      where the head of the log currently is. This means the push target
      doesn't reach far enough into the log to actually move the tail
      before the transaction reservation goes to sleep.
      
      When the CIL push completes, it moves the log head forward such that
      the AIL push target can now be moved, but that has no mechanism for
      pushing the log tail. Further, if the next tail movement of the log
      is not large enough to wake the waiter (i.e. still not enough space for
      it to have a reservation granted), we don't wake anything up, and
      hence we do not update the AIL push target to take into account the
      head of the log moving and allowing the push target to be moved
      forwards.
      
      To avoid this particular condition, if we fail to wake the first
      waiter on the grant head because we don't have enough space,
      push on the AIL again. This will pick up any movement of the log
      head and allow the push target to move forward due to completion of
      CIL pushing.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
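The wake-up rule in the last paragraph of this commit message can be sketched with a small stand-alone model. Everything here is an assumption made for illustration: `struct waiter`, `struct log_model`, `push_ail()` and `grant_head_wake()` are hypothetical stand-ins, and the AIL push is reduced to a counter. The point is only the control flow: walk the waiters in order, and if the very first one cannot be satisfied, request another AIL push before giving up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the log and its waiters (not kernel structures). */
struct waiter {
	int need_bytes;	/* space this reservation needs */
	bool woken;
};

struct log_model {
	int ail_pushes;		/* how many times an AIL push was requested */
	int last_push_need;	/* need_bytes passed to the most recent push */
};

/* Placeholder for the real AIL push: it only records that it was called. */
static void push_ail(struct log_model *log, int need_bytes)
{
	log->ail_pushes++;
	log->last_push_need = need_bytes;
}

/*
 * Wake waiters in queue order while space remains. Returns true if every
 * waiter was woken. If the first waiter cannot be woken, push the AIL
 * again so the push target can pick up any movement of the log head.
 */
static bool grant_head_wake(struct log_model *log, struct waiter *w,
			    size_t n, int *free_bytes)
{
	bool woken_task = false;

	for (size_t i = 0; i < n; i++) {
		if (*free_bytes < w[i].need_bytes) {
			if (!woken_task)
				push_ail(log, w[i].need_bytes);
			return false;
		}
		*free_bytes -= w[i].need_bytes;
		w[i].woken = true;
		woken_task = true;
	}
	return true;
}
```

With 60 free bytes and a first waiter needing 100, the model wakes nobody and records one push. With 120 free bytes it wakes the first waiter and stops at the second without pushing again, because progress was made.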
  2. 27 Aug, 2019 2 commits
    • xfs: add kmem_alloc_io() · f8f9ee47
      Dave Chinner committed
      Memory we submit for IO needs strict alignment to the
      underlying driver constraints. Worst case, this is 512 bytes. Given
      that all allocations for IO are always a power of 2 multiple of 512
      bytes, the kernel heap provides natural alignment for objects of
      these sizes and that suffices.
      
      Until, of course, memory debugging of some kind is turned on (e.g.
      red zones, poisoning, KASAN) and then the alignment of the heap
      objects is thrown out the window. Then we get weird IO errors and
      data corruption problems because drivers don't validate alignment
      and do the wrong thing when passed unaligned memory buffers in bios.
      
      To fix this, introduce kmem_alloc_io(), which will guarantee at least
      512 byte alignment of buffers for IO, even if memory debugging
      options are turned on. It is assumed that the minimum allocation
      size will be 512 bytes, and that sizes will be power of 2 multiples
      of 512 bytes.
      
      Use this everywhere we allocate buffers for IO.
      
      With this change, the following no longer fails with log recovery
      errors when KASAN is enabled; the failures were due to the brd driver
      not handling unaligned memory buffers:
      
      # mkfs.xfs -f /dev/ram0 ; mount /dev/ram0 /mnt/test
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
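The alignment guarantee described in this commit can be illustrated in userspace. This is a sketch under stated assumptions, not the kernel implementation: `IO_ALIGN`, `io_aligned()`, `io_size_ok()` and `alloc_io()` are hypothetical names, and the "try the ordinary heap, fall back to an aligning allocator" strategy is chosen for illustration rather than taken from the commit message. It uses the standard C11 `aligned_alloc()` for the fallback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define IO_ALIGN 512u	/* worst-case alignment named in the commit message */

/* True if ptr is aligned to IO_ALIGN bytes. */
static bool io_aligned(const void *ptr)
{
	return ((uintptr_t)ptr & (IO_ALIGN - 1)) == 0;
}

/* True if size is a power-of-2 multiple of 512 bytes (512, 1024, 2048, ...). */
static bool io_size_ok(size_t size)
{
	return size >= IO_ALIGN && (size & (size - 1)) == 0;
}

/*
 * Hypothetical userspace analogue: try the ordinary heap first and keep the
 * buffer only if it happens to be aligned; otherwise fall back to an
 * allocator that guarantees the alignment. Release the result with free().
 */
static void *alloc_io(size_t size)
{
	void *ptr;

	if (!io_size_ok(size))
		return NULL;

	ptr = malloc(size);
	if (ptr && io_aligned(ptr))
		return ptr;

	free(ptr);	/* free(NULL) is a no-op */
	return aligned_alloc(IO_ALIGN, size);	/* size is a multiple of IO_ALIGN */
}
```

The explicit check matters because a general-purpose heap makes no promise of 512-byte alignment; the natural alignment the commit message mentions is a property that debugging options can remove.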
    • fs: xfs: Remove KM_NOSLEEP and KM_SLEEP. · 707e0dda
      Tetsuo Handa committed
      Since no caller is using KM_NOSLEEP and no callee branches on KM_SLEEP,
      we can remove KM_NOSLEEP and replace KM_SLEEP with 0.
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
  3. 04 Aug, 2019 1 commit
    • fs: xfs: xfs_log: Don't use KM_MAYFAIL at xfs_log_reserve(). · 294fc7a4
      Tetsuo Handa committed
      When the system is close to OOM, fsync() may fail with -ENOMEM because
      xfs_log_reserve() is using KM_MAYFAIL. It is bad to fail a writeback
      operation due to a user-triggerable OOM condition. Since we are not using
      KM_MAYFAIL at xfs_trans_alloc() before calling xfs_log_reserve(), let's
      use the same flags at xfs_log_reserve().
      
        oom-torture: page allocation failure: order:0, mode:0x46c40(GFP_NOFS|__GFP_NOWARN|__GFP_RETRY_MAYFAIL|__GFP_COMP), nodemask=(null)
        CPU: 7 PID: 1662 Comm: oom-torture Kdump: loaded Not tainted 5.3.0-rc2+ #925
        Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00
        Call Trace:
         dump_stack+0x67/0x95
         warn_alloc+0xa9/0x140
         __alloc_pages_slowpath+0x9a8/0xbce
         __alloc_pages_nodemask+0x372/0x3b0
         alloc_slab_page+0x3a/0x8d0
         new_slab+0x330/0x420
         ___slab_alloc.constprop.94+0x879/0xb00
         __slab_alloc.isra.89.constprop.93+0x43/0x6f
         kmem_cache_alloc+0x331/0x390
         kmem_zone_alloc+0x9f/0x110 [xfs]
         kmem_zone_alloc+0x9f/0x110 [xfs]
         xlog_ticket_alloc+0x33/0xd0 [xfs]
         xfs_log_reserve+0xb4/0x410 [xfs]
         xfs_trans_reserve+0x1d1/0x2b0 [xfs]
         xfs_trans_alloc+0xc9/0x250 [xfs]
         xfs_setfilesize_trans_alloc.isra.27+0x44/0xc0 [xfs]
         xfs_submit_ioend.isra.28+0xa5/0x180 [xfs]
         xfs_vm_writepages+0x76/0xa0 [xfs]
         do_writepages+0x17/0x80
         __filemap_fdatawrite_range+0xc1/0xf0
         file_write_and_wait_range+0x53/0xa0
         xfs_file_fsync+0x87/0x290 [xfs]
         vfs_fsync_range+0x37/0x80
         do_fsync+0x38/0x60
         __x64_sys_fsync+0xf/0x20
         do_syscall_64+0x4a/0x1c0
         entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Fixes: eb01c9cd ("[XFS] Remove the xlog_ticket allocator")
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
  4. 03 Jul, 2019 1 commit
  5. 29 Jun, 2019 18 commits
  6. 24 May, 2019 1 commit
    • xfs: fix broken log reservation debugging · d31d7185
      Darrick J. Wong committed
      xlog_print_tic_res() is supposed to print a human readable string for
      each element of the log ticket reservation array.  Unfortunately, I
      forgot to update the string array when we added rmap & reflink support,
      so the debug message prints "region[3]: (null) - 352 bytes" which isn't
      useful at all.  Add the missing elements and add a build check so that
      we don't forget again to add a string when adding a new XLOG_REG_TYPE.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
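The kind of build check this commit describes can be sketched with a C11 `static_assert`. The enum, the string table and `reg_type_name()` below are hypothetical and much shorter than the real XLOG_REG_TYPE list. The sketch only shows the technique: tie the length of the string array to the highest enum value so that adding a type without a string breaks the build.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical region types; the real XLOG_REG_TYPE_* list is longer. */
enum reg_type {
	REG_TYPE_BFORMAT,
	REG_TYPE_BCHUNK,
	REG_TYPE_RMAP,
	REG_TYPE_REFLINK,
	REG_TYPE_MAX = REG_TYPE_REFLINK
};

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* One printable name per region type, in enum order. */
static const char *const reg_type_str[] = {
	"bformat",
	"bchunk",
	"rmap",
	"reflink",
};

/* Compilation fails if the table and the enum differ in length. */
static_assert(ARRAY_SIZE(reg_type_str) == REG_TYPE_MAX + 1,
	      "reg_type_str is out of sync with enum reg_type");

/* Return the printable name for a region type, or "unknown" if out of range. */
static const char *reg_type_name(int type)
{
	if (type < 0 || type > REG_TYPE_MAX)
		return "unknown";
	return reg_type_str[type];
}
```

Removing the `"reflink"` entry from the table makes the `static_assert` fire at compile time, instead of leaving a debug message that prints a missing name at run time.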
  7. 15 Apr, 2019 1 commit
  8. 03 Aug, 2018 1 commit
  9. 01 Aug, 2018 1 commit
  10. 24 Jul, 2018 2 commits
  11. 09 Jun, 2018 1 commit
  12. 07 Jun, 2018 1 commit
    • xfs: convert to SPDX license tags · 0b61f8a4
      Dave Chinner committed
      Remove the verbose license text from XFS files and replace them
      with SPDX tags. This does not change the license of any of the code,
      merely refers to the common, up-to-date license files in LICENSES/.
      
      This change was mostly scripted. fs/xfs/Makefile and
      fs/xfs/libxfs/xfs_fs.h were modified by hand, the rest were detected
      and modified by the following command:
      
      for f in `git grep -l "GNU General" fs/xfs/` ; do
      	echo $f
      	cat $f | awk -f hdr.awk > $f.new
      	mv -f $f.new $f
      done
      
      And the hdr.awk script that did the modification (including
      detecting the difference between GPL-2.0 and GPL-2.0+ licenses)
      is as follows:
      
      $ cat hdr.awk
      BEGIN {
      	hdr = 1.0
      	tag = "GPL-2.0"
      	str = ""
      }
      
      /^ \* This program is free software/ {
      	hdr = 2.0;
      	next
      }
      
      /any later version./ {
      	tag = "GPL-2.0+"
      	next
      }
      
      /^ \*\// {
      	if (hdr > 0.0) {
      		print "// SPDX-License-Identifier: " tag
      		print str
      		print $0
      		str=""
      		hdr = 0.0
      		next
      	}
      	print $0
      	next
      }
      
      /^ \* / {
      	if (hdr > 1.0)
      		next
      	if (hdr > 0.0) {
      		if (str != "")
      			str = str "\n"
      		str = str $0
      		next
      	}
      	print $0
      	next
      }
      
      /^ \*/ {
      	if (hdr > 0.0)
      		next
      	print $0
      	next
      }
      
      // {
      	if (hdr > 0.0) {
      		if (str != "")
      			str = str "\n"
      		str = str $0
      		next
      	}
      	print $0
      }
      
      END { }
      $
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
  13. 10 May, 2018 2 commits
  14. 10 Apr, 2018 1 commit
  15. 24 Mar, 2018 2 commits
  16. 15 Mar, 2018 3 commits