Commits · 089558bc7ba785c03815a49c89e28ad9b8de51f9 · openeuler / Kernel

08 Dec, 2021 1 commit

xfs: remove all COW fork extents when remounting readonly · 089558bc

Committed by Darrick J. Wong on Dec 06, 2021

As part of multiple customer escalations due to file data corruption
after copy on write operations, I wrote some fstests that use fsstress
to hammer on COW to shake things loose.  Regrettably, I caught some
filesystem shutdowns due to incorrect rmap operations with the following
loop:

mount <filesystem>				# (0)
fsstress <run only readonly ops> &		# (1)
while true; do
	fsstress <run all ops>
	mount -o remount,ro			# (2)
	fsstress <run only readonly ops>
	mount -o remount,rw			# (3)
done

When (2) happens, notice that (1) is still running.  xfs_remount_ro will
call xfs_blockgc_stop to walk the inode cache to free all the COW
extents, but the blockgc mechanism races with (1)'s reader threads to
take IOLOCKs and loses, which means that it doesn't clean them all out.
Call such a file (A).

When (3) happens, xfs_remount_rw calls xfs_reflink_recover_cow, which
walks the ondisk refcount btree and frees any COW extent that it finds.
This function does not check the inode cache, which means that incore
COW fork of inode (A) is now inconsistent with the ondisk metadata.  If
one of those former COW extents is allocated and mapped into another
file (B) and someone triggers a COW to the stale reservation in (A), A's
dirty data will be written into (B) and once that's done, those blocks
will be transferred to (A)'s data fork without bumping the refcount.

The results are catastrophic -- file (B) and the refcount btree are now
corrupt.  Solve this race by forcing the xfs_blockgc_free_space to run
synchronously, which causes xfs_icwalk to return to inodes that were
skipped because the blockgc code couldn't take the IOLOCK.  This is safe
to do here because the VFS has already prohibited new writer threads.

Fixes: 10ddf64e ("xfs: remove leftover CoW reservations when remounting ro")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
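
The skip-and-retry behaviour described in this commit message can be illustrated with a small user-space model. This is only a sketch, not the kernel code: `toy_inode`, `toy_trylock` and `toy_walk` are hypothetical names, and the "IOLOCK" is simulated by a counter of how many more visits will find it held by a reader. In asynchronous mode the walk gives up on busy inodes and leaves their COW extents behind; in synchronous mode it keeps returning to skipped inodes until none remain.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a cached inode; NOT the kernel's struct xfs_inode. */
struct toy_inode {
	int  lock_busy_passes;	/* visits that will still find the lock held */
	bool has_cow_extents;	/* leftover COW fork reservations attached */
};

/* Simulated trylock: fails while a reader still holds the lock. */
static bool toy_trylock(struct toy_inode *ip)
{
	if (ip->lock_busy_passes > 0) {
		ip->lock_busy_passes--;
		return false;
	}
	return true;
}

/*
 * Walk every inode and drop its COW extents.  If sync is false, inodes
 * whose lock is busy are skipped for good.  If sync is true, the walk
 * repeats until no skipped inode is left.  Returns the number of inodes
 * that still hold COW extents when the walk ends.
 */
static size_t toy_walk(struct toy_inode *inodes, size_t n, bool sync)
{
	size_t remaining;

	do {
		remaining = 0;
		for (size_t i = 0; i < n; i++) {
			struct toy_inode *ip = &inodes[i];

			if (!ip->has_cow_extents)
				continue;
			if (!toy_trylock(ip)) {
				remaining++;	/* skipped: lock was busy */
				continue;
			}
			ip->has_cow_extents = false;
		}
	} while (sync && remaining > 0);

	return remaining;
}
```

In this model, an asynchronous walk over an inode whose lock is busy returns a non-zero count, which corresponds to file (A) keeping a stale COW fork. The synchronous walk terminates here only because the simulated readers eventually release the lock.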


31 Oct, 2021 1 commit

xfs: Remove duplicated include in xfs_super · 0b9007ec

Committed by Wan Jiabing on Oct 25, 2021

Fix the following checkincludes.pl warning:
./fs/xfs/xfs_super.c: xfs_btree.h is included more than once.

The header is already included at line 15, so remove the duplicate.

Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>


23 Oct, 2021 3 commits

xfs: rename xfs_bmap_add_free to xfs_free_extent_later · c201d9ca

Committed by Darrick J. Wong on Oct 12, 2021

xfs_bmap_add_free isn't a block mapping function; it schedules deferred
freeing operations for a later point in a compound transaction chain.
While it's primarily used by bunmapi, its use has expanded beyond that.
Move it to xfs_alloc.c and rename the function since it's now general
freeing functionality. Bring the slab cache bits in line with the
way we handle the other intent items.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>


xfs: create slab caches for frequently-used deferred items · f3c799c2

Committed by Darrick J. Wong on Oct 12, 2021

Create slab caches for the high-level structures that coordinate
deferred intent items, since they're used fairly heavily.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>


xfs: rename _zone variables to _cache · 182696fb

Committed by Darrick J. Wong on Oct 12, 2021

Now that we've gotten rid of the kmem_zone_t typedef, rename the
variables to _cache since that's what they are.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>


20 Oct, 2021 3 commits

xfs: use separate btree cursor cache for each btree type · 9fa47bdc

Committed by Darrick J. Wong on Sep 23, 2021

Now that we have the infrastructure to track the max possible height of
each btree type, we can create a separate slab cache for cursors of each
type of btree.  For smaller indices like the free space btrees, this
means that we can pack more cursors into a slab page, improving slab
utilization.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
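
The packing argument in this commit message is simple arithmetic, sketched below. The helper names and any sizes passed to them are hypothetical and are not taken from the kernel; the calculation also ignores slab metadata and alignment, so it only shows the direction of the effect.

```c
#include <stddef.h>

/*
 * Bytes needed for a cursor that tracks `maxlevels` btree levels:
 * a fixed header plus one per-level record for each level.
 */
static size_t toy_cursor_bytes(size_t header_bytes, size_t level_bytes,
			       unsigned int maxlevels)
{
	return header_bytes + (size_t)maxlevels * level_bytes;
}

/* Whole objects that fit in one slab page (no metadata, no alignment). */
static size_t toy_objs_per_page(size_t page_bytes, size_t obj_bytes)
{
	return obj_bytes ? page_bytes / obj_bytes : 0;
}
```

With made-up sizes of a 64-byte header, 16 bytes per level and a 4096-byte page, a cache sized for 9 levels holds 19 cursors per page, while a cache sized for 3 levels holds 36.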


xfs: dynamically allocate cursors based on maxlevels · c940a0c5

Committed by Darrick J. Wong on Sep 16, 2021

To support future btree code, we need to be able to size btree cursors
dynamically for very large btrees. Switch the maxlevels computation to
use the precomputed values in the superblock, and create cursors that
can handle a certain height. For now, we retain the btree cursor cache
that can handle up to 9-level btrees, though a subsequent patch
introduces separate caches for each btree type, where each cache's
objects will be exactly tall enough to handle the specific btree type.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>


xfs: prepare xfs_btree_cur for dynamic cursor heights · 6ca444cf

Committed by Darrick J. Wong on Sep 16, 2021

Split out the btree level information into a separate struct and put it
at the end of the cursor structure as a VLA.  Files with huge data forks
(and in the future, the realtime rmap btree) will require the ability to
support many more levels than a per-AG btree cursor, which means that
we're going to create per-btree type cursor caches to conserve memory
for the more common case.

Note that a subsequent patch actually introduces dynamic cursor heights.
This one merely rearranges the structure to prepare for that.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
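
The layout this commit message describes, per-level state split into its own struct and placed at the end of the cursor, corresponds to what standard C calls a flexible array member. The sketch below is a user-space illustration under that assumption; `toy_level`, `toy_cursor` and their fields are hypothetical and do not mirror the real `xfs_btree_cur` definition.

```c
#include <stddef.h>
#include <stdlib.h>

/* Per-level state, split out of the cursor (hypothetical fields). */
struct toy_level {
	void *buf;	/* buffer backing this level */
	int   ptr;	/* current record index within this level */
};

/* Toy cursor whose level array is a trailing flexible array member. */
struct toy_cursor {
	unsigned int     maxlevels;	/* capacity of levels[] */
	unsigned int     nlevels;	/* levels currently in use */
	struct toy_level levels[];	/* must be the last member */
};

/* Allocation size for a cursor able to track `maxlevels` levels. */
static size_t toy_cursor_sizeof(unsigned int maxlevels)
{
	return sizeof(struct toy_cursor) +
	       (size_t)maxlevels * sizeof(struct toy_level);
}

/* Allocate a zeroed cursor; returns NULL if the allocation fails. */
static struct toy_cursor *toy_cursor_alloc(unsigned int maxlevels)
{
	struct toy_cursor *cur = calloc(1, toy_cursor_sizeof(maxlevels));

	if (cur)
		cur->maxlevels = maxlevels;
	return cur;
}
```

Because the array comes last, a cursor for a shallow btree and a cursor for a very tall one share the same type and differ only in allocation size. The caller releases the cursor with `free()`.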


27 Aug, 2021 2 commits

dax: remove bdev_dax_supported · bdd3c50d

Committed by Christoph Hellwig on Aug 26, 2021

All callers already have a dax_device obtained from fs_dax_get_by_bdev
at hand, so just pass that to dax_supported() instead of doing another
lookup.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/20210826135510.6293-10-hch@lst.de
Signed-off-by: Dan Williams <dan.j.williams@intel.com>


xfs: factor out a xfs_buftarg_is_dax helper · a384f088

Committed by Christoph Hellwig on Aug 26, 2021

Refactor the DAX setup code in preparation for removing
bdev_dax_supported.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Link: https://lore.kernel.org/r/20210826135510.6293-9-hch@lst.de
Signed-off-by: Dan Williams <dan.j.williams@intel.com>


20 Aug, 2021 7 commits

xfs: introduce xfs_sb_is_v5 helper · d6837c1a