1. 26 Apr, 2023 2 commits
  2. 19 Apr, 2023 1 commit
    • xfs: don't use BMBT btree split workers for IO completion · 2367be02
      Dave Chinner committed
      mainline inclusion
      from mainline-v6.3-rc1
      commit c85007e2
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I4KIAO
      CVE: NA
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c85007e2e3942da1f9361e4b5a9388ea3a8dcc5b
      
      --------------------------------
      
      When we split a BMBT due to record insertion, we offload it to a
      worker thread because we can be deep in the stack when we try to
      allocate a new block for the BMBT. Allocation can use several
      kilobytes of stack (full memory reclaim, swap and/or IO path can
      end up on the stack during allocation) and we can already be several
      kilobytes deep in the stack when we need to split the BMBT.
      
      A recent workload demonstrated a deadlock in this BMBT split
      offload. It requires several things to happen at once:
      
      1. Two inodes need a BMBT split at the same time; one must be
      unwritten extent conversion from IO completion, the other must be
      from extent allocation.
      
      2. There must be no xfs_alloc_wq worker threads available in the
      worker pool.
      
      3. There must be sustained severe memory shortages such that new
      kworker threads cannot be allocated to the xfs_alloc_wq pool for
      both threads that need split work to be run.
      
      4. The split work from the unwritten extent conversion must run
      first.
      
      5. When the BMBT block allocation runs from the split work, it must
      loop over all AGs and either fail to trylock each AGF, or find that
      each AGF it is able to lock has no space available for a single
      block allocation.
      
      6. The BMBT allocation must then attempt to lock the AGF that the
      second task queued to the rescuer thread already has locked before
      it finds an AGF it can allocate from.
      
      At this point, we have an ABBA deadlock between tasks queued on the
      xfs_alloc_wq rescuer thread and a locked AGF, i.e. the queued task
      holding the AGF lock can't be run by the rescuer thread until the
      task the rescuer thread is running gets the AGF lock....
      
      This is a highly improbable series of events, but there it is.
      
      There are a couple of ways to fix this, but the easiest is to ensure
      that we only punt tasks with a locked AGF that holds enough space
      for the BMBT block allocations to the worker thread.
      
      This works for unwritten extent conversion in IO completion (which
      doesn't have a locked AGF and space reservations) because we have
      tight control over the IO completion stack. It is typically only 6
      functions deep when xfs_btree_split() is called because we've
      already offloaded the IO completion work to a worker thread and
      hence we don't need to worry about stack overruns here.
      
      The other place we can be called for a BMBT split without a
      preceding allocation is __xfs_bunmapi() when punching out the
      center of an existing extent. We don't remove extents in the IO
      path, so these operations don't tend to be called with a lot of
      stack consumed. Hence we don't really need to ship the split off to
      a worker thread in these cases, either.
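
The dispatch rule described above can be sketched as a small decision function. This is only an illustration under stated assumptions: `split_ctx`, `choose_split_path` and the enum values are hypothetical names, not the kernel's, and the two booleans stand in for state that the real code derives from the btree cursor and transaction.

```c
#include <stdbool.h>

/* Hypothetical labels for the two ways a split can be executed. */
enum split_path {
	SPLIT_INLINE,	/* run the split on the caller's own stack */
	SPLIT_WORKER,	/* punt the split to an xfs_alloc_wq worker */
};

/* Hypothetical summary of the caller's state at split time. */
struct split_ctx {
	bool is_bmbt;	/* the btree being split is a BMBT */
	bool holds_agf;	/* caller already holds a locked AGF with
			 * enough space for the BMBT block allocations */
};

static enum split_path choose_split_path(const struct split_ctx *ctx)
{
	/* Assumption: only BMBT splits are ever offloaded. */
	if (!ctx->is_bmbt)
		return SPLIT_INLINE;

	/*
	 * No locked AGF: IO completion or __xfs_bunmapi(). The stack is
	 * shallow here, and punting is what made the deadlock possible.
	 */
	if (!ctx->holds_agf)
		return SPLIT_INLINE;

	/* Allocation path: deep stack, AGF already locked, safe to punt. */
	return SPLIT_WORKER;
}
```

With this rule, every task queued to the worker already owns the AGF it will allocate from, so the worker never has to wait on an AGF held by another queued task.
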
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: yangerkun <yangerkun@huaweicloud.com>
      Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
      Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
  3. 12 Apr, 2023 2 commits
  4. 19 Jan, 2023 1 commit
  5. 29 Sep, 2022 2 commits
  6. 18 Mar, 2020 3 commits
    • xfs: support bulk loading of staged btrees · 60e3d707
      Darrick J. Wong committed
      Add a new btree function that enables us to bulk load a btree cursor.
      This will be used by the upcoming online repair patches to generate new
      btrees.  This avoids the programmatic inefficiency of calling
      xfs_btree_insert in a loop (which generates a lot of log traffic) in
      favor of stamping out new btree blocks with ordered buffers, and then
      committing both the new root and scheduling the removal of the old btree
      blocks in a single transaction commit.
      
      The design of this new generic code is based off the btree rebuilding
      code in xfs_repair's phase 5 code, with the explicit goal of enabling us
      to share that code between scrub and repair.  It has the additional
      feature of being able to control btree block loading factors.
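
To make the idea of a block loading factor concrete, here is a minimal sketch of how the shape of a bulk-loaded tree could be computed up front. It is not the kernel's bulk-load interface: `bload_geometry`, `compute_geometry`, `leaf_load` and `node_load` are hypothetical names, and the model simply packs a fixed number of items into every block.

```c
#include <stdint.h>

#define MAX_LEVELS 16	/* hypothetical cap on tree height */

struct bload_geometry {
	unsigned int	height;				/* number of levels */
	uint64_t	blocks_per_level[MAX_LEVELS];	/* [0] is the leaf level */
	uint64_t	total_blocks;
};

static uint64_t div_round_up(uint64_t n, uint64_t d)
{
	return (n + d - 1) / d;
}

/*
 * Compute how many blocks each level needs when leaves hold leaf_load
 * records and interior nodes hold node_load pointers. Returns 0 on
 * success, -1 on bad input or if the tree would exceed MAX_LEVELS.
 */
static int compute_geometry(uint64_t nr_records, unsigned int leaf_load,
			    unsigned int node_load,
			    struct bload_geometry *geo)
{
	uint64_t items = nr_records;
	uint64_t blocks = 0;
	unsigned int load = leaf_load;
	unsigned int level = 0;

	if (!geo || leaf_load < 1 || node_load < 2)
		return -1;

	geo->total_blocks = 0;
	do {
		if (level == MAX_LEVELS)
			return -1;
		/* An empty tree still needs one (empty) root block. */
		blocks = items ? div_round_up(items, load) : 1;
		geo->blocks_per_level[level++] = blocks;
		geo->total_blocks += blocks;
		/* Each block needs one pointer in the level above. */
		items = blocks;
		load = node_load;
	} while (blocks > 1);

	geo->height = level;
	return 0;
}
```

Knowing the block count per level before writing anything is what lets a loader stamp out whole blocks in order instead of inserting records one at a time. A lower load leaves slack in each block for later insertions at the cost of a larger tree.
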
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
    • xfs: introduce fake roots for inode-rooted btrees · 349e1c03
      Darrick J. Wong committed
      Create an in-core fake root for inode-rooted btree types so that callers
      can generate a whole new btree using the upcoming btree bulk load
      function without making the new tree accessible from the rest of the
      filesystem.  It is up to the individual btree type to provide a function
      to create a staged cursor (presumably with the appropriate callouts to
      update the fakeroot) and then commit the staged root back into the
      filesystem.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
    • xfs: introduce fake roots for ag-rooted btrees · e06536a6
      Darrick J. Wong committed
      Create an in-core fake root for AG-rooted btree types so that callers
      can generate a whole new btree using the upcoming btree bulk load
      function without making the new tree accessible from the rest of the
      filesystem.  It is up to the individual btree type to provide a function
      to create a staged cursor (presumably with the appropriate callouts to
      update the fakeroot) and then commit the staged root back into the
      filesystem.
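
The staging pattern that a fake root enables can be illustrated roughly as follows. This is a simplified model, not the kernel's data structures: `tree_root`, `fake_root`, `fake_root_stage` and `fake_root_commit` are hypothetical names chosen for the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: where a btree starts and how tall it is. */
struct tree_root {
	uint64_t	root_block;
	unsigned int	levels;
};

/* Hypothetical in-core fake root: invisible to the live filesystem. */
struct fake_root {
	struct tree_root	staged;
	bool			ready;	/* a complete new tree has been staged */
};

/* Record the root of a freshly built tree in the fake root only. */
static void fake_root_stage(struct fake_root *fr, uint64_t root_block,
			    unsigned int levels)
{
	fr->staged.root_block = root_block;
	fr->staged.levels = levels;
	fr->ready = levels > 0;
}

/*
 * Publish the staged tree by overwriting the live root in one step.
 * Returns 0 on success, -1 if nothing has been staged.
 */
static int fake_root_commit(struct fake_root *fr, struct tree_root *live)
{
	if (!fr->ready)
		return -1;
	*live = fr->staged;
	fr->ready = false;
	return 0;
}
```

Until the commit step runs, readers that follow the live root still see the old tree, which is the isolation property the commit message describes.
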
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
  7. 14 Mar, 2020 3 commits
  8. 12 Mar, 2020 1 commit
  9. 27 Jan, 2020 2 commits
  10. 08 Jan, 2020 1 commit
  11. 19 Nov, 2019 1 commit
  12. 14 Nov, 2019 1 commit
  13. 13 Nov, 2019 1 commit
    • xfs: kill the XFS_WANT_CORRUPT_* macros · f9e03706
      Darrick J. Wong committed
      The XFS_WANT_CORRUPT_* macros conceal subtle side effects such as the
      creation of local variables and redirections of the code flow.  This is
      pretty ugly, so replace them with explicit XFS_IS_CORRUPT tests that
      remove both of those ugly points.  The change was performed with the
      following coccinelle script:
      
      @@
      expression mp, test;
      identifier label;
      @@
      
      - XFS_WANT_CORRUPTED_GOTO(mp, test, label);
      + if (XFS_IS_CORRUPT(mp, !test)) { error = -EFSCORRUPTED; goto label; }
      
      @@
      expression mp, test;
      @@
      
      - XFS_WANT_CORRUPTED_RETURN(mp, test);
      + if (XFS_IS_CORRUPT(mp, !test)) return -EFSCORRUPTED;
      
      @@
      expression mp, lval, rval;
      @@
      
      - XFS_IS_CORRUPT(mp, !(lval == rval))
      + XFS_IS_CORRUPT(mp, lval != rval)
      
      @@
      expression mp, e1, e2;
      @@
      
      - XFS_IS_CORRUPT(mp, !(e1 && e2))
      + XFS_IS_CORRUPT(mp, !e1 || !e2)
      
      @@
      expression e1, e2;
      @@
      
      - !(e1 == e2)
      + e1 != e2
      
      @@
      expression e1, e2, e3, e4, e5, e6;
      @@
      
      - !(e1 == e2 && e3 == e4) || e5 != e6
      + e1 != e2 || e3 != e4 || e5 != e6
      
      @@
      expression e1, e2, e3, e4, e5, e6;
      @@
      
      - !(e1 == e2 || (e3 <= e4 && e5 <= e6))
      + e1 != e2 && (e3 > e4 || e5 > e6)
      
      @@
      expression mp, e1, e2;
      @@
      
      - XFS_IS_CORRUPT(mp, !(e1 <= e2))
      + XFS_IS_CORRUPT(mp, e1 > e2)
      
      @@
      expression mp, e1, e2;
      @@
      
      - XFS_IS_CORRUPT(mp, !(e1 < e2))
      + XFS_IS_CORRUPT(mp, e1 >= e2)
      
      @@
      expression mp, e1;
      @@
      
      - XFS_IS_CORRUPT(mp, !!e1)
      + XFS_IS_CORRUPT(mp, e1)
      
      @@
      expression mp, e1, e2;
      @@
      
      - XFS_IS_CORRUPT(mp, !(e1 || e2))
      + XFS_IS_CORRUPT(mp, !e1 && !e2)
      
      @@
      expression mp, e1, e2, e3, e4;
      @@
      
      - XFS_IS_CORRUPT(mp, !(e1 == e2) && !(e3 == e4))
      + XFS_IS_CORRUPT(mp, e1 != e2 && e3 != e4)
      
      @@
      expression mp, e1, e2, e3, e4;
      @@
      
      - XFS_IS_CORRUPT(mp, !(e1 <= e2) || !(e3 >= e4))
      + XFS_IS_CORRUPT(mp, e1 > e2 || e3 < e4)
      
      @@
      expression mp, e1, e2, e3, e4;
      @@
      
      - XFS_IS_CORRUPT(mp, !(e1 == e2) && !(e3 <= e4))
      + XFS_IS_CORRUPT(mp, e1 != e2 && e3 > e4)
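
Most of the rules above are negation and De Morgan simplifications, so each rewritten condition should have the same truth value as the original. The brute-force check below is a sketch of how that can be confirmed for integer operands over a small range; `rewrites_agree` and `same` are hypothetical helper names, and only the bare conditions are compared, without the XFS_IS_CORRUPT wrapper.

```c
#include <stdbool.h>

/* Compare two conditions by truth value only. */
static bool same(bool a, bool b)
{
	return a == b;
}

/*
 * Return true if every original condition and its rewritten form agree
 * for all integer operands in [lo, hi].
 */
static bool rewrites_agree(int lo, int hi)
{
	for (int e1 = lo; e1 <= hi; e1++)
	for (int e2 = lo; e2 <= hi; e2++)
	for (int e3 = lo; e3 <= hi; e3++)
	for (int e4 = lo; e4 <= hi; e4++)
	for (int e5 = lo; e5 <= hi; e5++)
	for (int e6 = lo; e6 <= hi; e6++) {
		/* Simple negated comparisons. */
		if (!same(!(e1 == e2), e1 != e2))
			return false;
		if (!same(!(e1 <= e2), e1 > e2))
			return false;
		if (!same(!(e1 < e2), e1 >= e2))
			return false;
		if (!same(!!e1, e1 != 0))
			return false;
		/* De Morgan on && and ||. */
		if (!same(!(e1 && e2), !e1 || !e2))
			return false;
		if (!same(!(e1 || e2), !e1 && !e2))
			return false;
		/* The two six-operand rules. */
		if (!same(!(e1 == e2 && e3 == e4) || e5 != e6,
			  e1 != e2 || e3 != e4 || e5 != e6))
			return false;
		if (!same(!(e1 == e2 || (e3 <= e4 && e5 <= e6)),
			  e1 != e2 && (e3 > e4 || e5 > e6)))
			return false;
		/* Combined comparison rules. */
		if (!same(!(e1 == e2) && !(e3 == e4), e1 != e2 && e3 != e4))
			return false;
		if (!same(!(e1 <= e2) || !(e3 >= e4), e1 > e2 || e3 < e4))
			return false;
		if (!same(!(e1 == e2) && !(e3 <= e4), e1 != e2 && e3 > e4))
			return false;
	}
	return true;
}
```

A check like this only covers side-effect-free integer operands; it says nothing about expressions with side effects or about operand types where the comparisons are not total, such as floating-point NaN.
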
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
  14. 12 Nov, 2019 1 commit
  15. 05 Nov, 2019 1 commit
  16. 30 Oct, 2019 1 commit
  17. 30 Aug, 2019 1 commit
  18. 27 Aug, 2019 1 commit
  19. 29 Jun, 2019 2 commits
  20. 13 Jun, 2019 1 commit
  21. 05 Dec, 2018 1 commit
  22. 07 Jun, 2018 1 commit
    • xfs: convert to SPDX license tags · 0b61f8a4
      Dave Chinner committed
      Remove the verbose license text from XFS files and replace them
      with SPDX tags. This does not change the license of any of the code,
      merely refers to the common, up-to-date license files in LICENSES/.
      
      This change was mostly scripted. fs/xfs/Makefile and
      fs/xfs/libxfs/xfs_fs.h were modified by hand, the rest were detected
      and modified by the following command:
      
      for f in `git grep -l "GNU General" fs/xfs/` ; do
      	echo $f
      	cat $f | awk -f hdr.awk > $f.new
      	mv -f $f.new $f
      done
      
      And the hdr.awk script that did the modification (including
      detecting the difference between GPL-2.0 and GPL-2.0+ licenses)
      is as follows:
      
      $ cat hdr.awk
      BEGIN {
      	hdr = 1.0
      	tag = "GPL-2.0"
      	str = ""
      }
      
      /^ \* This program is free software/ {
      	hdr = 2.0;
      	next
      }
      
      /any later version./ {
      	tag = "GPL-2.0+"
      	next
      }
      
      /^ \*\// {
      	if (hdr > 0.0) {
      		print "// SPDX-License-Identifier: " tag
      		print str
      		print $0
      		str=""
      		hdr = 0.0
      		next
      	}
      	print $0
      	next
      }
      
      /^ \* / {
      	if (hdr > 1.0)
      		next
      	if (hdr > 0.0) {
      		if (str != "")
      			str = str "\n"
      		str = str $0
      		next
      	}
      	print $0
      	next
      }
      
      /^ \*/ {
      	if (hdr > 0.0)
      		next
      	print $0
      	next
      }
      
      // {
      	if (hdr > 0.0) {
      		if (str != "")
      			str = str "\n"
      		str = str $0
      		next
      	}
      	print $0
      }
      
      END { }
      $
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
  23. 05 Jun, 2018 5 commits
  24. 16 May, 2018 2 commits
  25. 10 Apr, 2018 1 commit
  26. 12 Mar, 2018 1 commit