提交 · dba19fb8c8265bd66b689403c62ba87e74382425 · openeuler / Kernel

29 6月, 2023 1 次提交

xfs: fix mounting failed caused by sequencing problem in the log records · dba19fb8

由 yangerkun 提交于 6月 29, 2023

Offering: HULK
hulk inclusion
category: bugfix
bugzilla: 188870, https://gitee.com/openeuler/kernel/issues/I76JSK

--------------------------------

During the test of growfs + power-off, we encountered a mounting failure
issue.
The specific call stack is as follows:

[584505.210179] XFS (loop0): xfs_buf_find: daddr 0x6d6002 out of range,
EOFS 0x6d6000
...
[584505.210739] Call Trace:
[584505.210776]  xfs_buf_get_map+0x44/0x230 [xfs]
[584505.210780]  ? trace_event_buffer_commit+0x57/0x140
[584505.210818]  xfs_buf_read_map+0x54/0x280 [xfs]
[584505.210858]  ? xlog_recover_items_pass2+0x53/0xb0 [xfs]
[584505.210899]  xlog_recover_buf_commit_pass2+0x112/0x440 [xfs]
[584505.210939]  ? xlog_recover_items_pass2+0x53/0xb0 [xfs]
[584505.210980]  xlog_recover_items_pass2+0x53/0xb0 [xfs]
[584505.211020]  xlog_recover_commit_trans+0x2ca/0x320 [xfs]
[584505.211061]  xlog_recovery_process_trans+0xc6/0xf0 [xfs]
[584505.211101]  xlog_recover_process_data+0x9e/0x110 [xfs]
[584505.211141]  xlog_do_recovery_pass+0x3b4/0x5c0 [xfs]
[584505.211181]  xlog_do_log_recovery+0x5e/0x80 [xfs]
[584505.211223]  xlog_do_recover+0x33/0x1a0 [xfs]
[584505.211262]  xlog_recover+0xd7/0x170 [xfs]
[584505.211303]  xfs_log_mount+0x217/0x2b0 [xfs]
[584505.211341]  xfs_mountfs+0x3da/0x870 [xfs]
[584505.211384]  xfs_fc_fill_super+0x3fa/0x7a0 [xfs]
[584505.211428]  ? xfs_setup_devices+0x80/0x80 [xfs]
[584505.211432]  get_tree_bdev+0x16f/0x260
[584505.211434]  vfs_get_tree+0x25/0xc0
[584505.211436]  do_new_mount+0x156/0x1b0
[584505.211438]  __se_sys_mount+0x165/0x1d0
[584505.211440]  do_syscall_64+0x33/0x40
[584505.211442]  entry_SYSCALL_64_after_hwframe+0x61/0xc6

After analyzing the log records, we have discovered the following
content:

============================================================================
cycle: 173  version: 2    lsn: 173,2742 tail_lsn: 173,1243
length of Log Record: 25600 prev offset: 2702   num ops: 258
uuid: fb958458-48a3-4c76-ae23-7a1cf3053065   format: little endian linux
h_size: 32768
----------------------------------------------------------------------------
...
----------------------------------------------------------------------------
Oper (100): tid: 1c010724  len: 24  clientid: TRANS  flags: none
BUF:  #regs: 2   start blkno: 7168002 (0x6d6002)  len: 1  bmap size: 1
flags: 0x3800
Oper (101): tid: 1c010724  len: 128  clientid: TRANS  flags: none
AGI Buffer: XAGI
ver: 1  seq#: 28  len: 2048  cnt: 0  root: 3
level: 1  free#: 0x0  newino: 0x140
bucket[0 - 3]: 0xffffffff 0xffffffff 0xffffffff 0xffffffff
bucket[4 - 7]: 0xffffffff 0xffffffff 0xffffffff 0xffffffff
bucket[8 - 11]: 0xffffffff 0xffffffff 0xffffffff 0xffffffff
bucket[12 - 15]: 0xffffffff 0xffffffff 0xffffffff 0xffffffff
bucket[16 - 19]: 0xffffffff
----------------------------------------------------------------------------
...
----------------------------------------------------------------------------
Oper (108): tid: 1c010724  len: 24  clientid: TRANS  flags: none
BUF:  #regs: 2   start blkno: 0 (0x0)  len: 1  bmap size: 1  flags:
0x9000
Oper (109): tid: 1c010724  len: 384  clientid: TRANS  flags: none
SUPER BLOCK Buffer:
icount: 6360863066640355328  ifree: 898048  fdblks: 0  frext: 0
----------------------------------------------------------------------------
...

We found that in the log records, the modification transaction for the
expanded block is before the growfs transaction, which leads to
verification
failure during log replay.

We need to ensure that when replaying logs, transactions related to the
superblock are replayed first.
Signed-off-by: NWu Guanghao <wuguanghao3@huawei.com>
Signed-off-by: Nyangerkun <yangerkun@huawei.com>
Signed-off-by: NLong Li <leo.lilong@huawei.com>

dba19fb8

07 6月, 2023 2 次提交

xfs: log shutdown triggers should only shut down the log · 50092d28

由 Dave Chinner 提交于 6月 07, 2023

mainline inclusion
from mainline-v5.17-rc6
commit b5f17bec
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I76JSK
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b5f17bec1213a3ed2f4d79ad4c566e00cabe2a9b

--------------------------------

We've got a mess on our hands.

1. xfs_trans_commit() cannot cancel transactions because the mount is
shut down - that causes dirty, aborted, unlogged log items to sit
unpinned in memory and potentially get written to disk before the
log is shut down. Hence xfs_trans_commit() can only abort
transactions when xlog_is_shutdown() is true.

2. xfs_force_shutdown() is used in places to cause the current
modification to be aborted via xfs_trans_commit() because it may be
impractical or impossible to cancel the transaction directly, and
hence xfs_trans_commit() must cancel transactions when
xfs_is_shutdown() is true in this situation. But we can't do that
because of #1.

3. Log IO errors cause log shutdowns by calling xfs_force_shutdown()
to shut down the mount and then the log from log IO completion.

4. xfs_force_shutdown() can result in a log force being issued,
which has to wait for log IO completion before it will mark the log
as shut down. If #3 races with some other shutdown trigger that runs
a log force, we rely on xfs_force_shutdown() silently ignoring #3
and avoiding shutting down the log until the failed log force
completes.

5. To ensure #2 always works, we have to ensure that
xfs_force_shutdown() does not return until the the log is shut down.
But in the case of #4, this will result in a deadlock because the
log Io completion will block waiting for a log force to complete
which is blocked waiting for log IO to complete....

So the very first thing we have to do here to untangle this mess is
dissociate log shutdown triggers from mount shutdowns. We already
have xlog_forced_shutdown, which will atomically transistion to the
log a shutdown state. Due to internal asserts it cannot be called
multiple times, but was done simply because the only place that
could call it was xfs_do_force_shutdown() (i.e. the mount shutdown!)
and that could only call it once and once only. So the first thing
we do is remove the asserts.

We then convert all the internal log shutdown triggers to call
xlog_force_shutdown() directly instead of xfs_force_shutdown(). This
allows the log shutdown triggers to shut down the log without
needing to care about mount based shutdown constraints. This means
we shut down the log independently of the mount and the mount may
not notice this until it's next attempt to read or modify metadata.
At that point (e.g. xfs_trans_commit()) it will see that the log is
shutdown, error out and shutdown the mount.

To ensure that all the unmount behaviours and asserts track
correctly as a result of a log shutdown, propagate the shutdown up
to the mount if it is not already set. This keeps the mount and log
state in sync, and saves a huge amount of hassle where code fails
because of a log shutdown but only checks for mount shutdowns and
hence ends up doing the wrong thing. Cleaning up that mess is
an exercise for another day.

This enables us to address the other problems noted above in
followup patches.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
Signed-off-by: NLong Li <leo.lilong@huawei.com>

50092d28

xfs: shutdown in intent recovery has non-intent items in the AIL · ce25eef7

由 Dave Chinner 提交于 6月 07, 2023

mainline inclusion
from mainline-v5.17-rc6
commit ab9c81ef
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I76JSK
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ab9c81ef321f90dd208b1d4809c196c2794e4b15

--------------------------------

generic/388 triggered a failure in RUI recovery due to a corrupted
btree record and the system then locked up hard due to a subsequent
assert failure while holding a spinlock cancelling intents:

 XFS (pmem1): Corruption of in-memory data (0x8) detected at xfs_do_force_shutdown+0x1a/0x20 (fs/xfs/xfs_trans.c:964).  Shutting down filesystem.
 XFS (pmem1): Please unmount the filesystem and rectify the problem(s)
 XFS: Assertion failed: !xlog_item_is_intent(lip), file: fs/xfs/xfs_log_recover.c, line: 2632
 Call Trace:
  <TASK>
  xlog_recover_cancel_intents.isra.0+0xd1/0x120
  xlog_recover_finish+0xb9/0x110
  xfs_log_mount_finish+0x15a/0x1e0
  xfs_mountfs+0x540/0x910
  xfs_fs_fill_super+0x476/0x830
  get_tree_bdev+0x171/0x270
  ? xfs_init_fs_context+0x1e0/0x1e0
  xfs_fs_get_tree+0x15/0x20
  vfs_get_tree+0x24/0xc0
  path_mount+0x304/0xba0
  ? putname+0x55/0x60
  __x64_sys_mount+0x108/0x140
  do_syscall_64+0x35/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xae

Essentially, there's dirty metadata in the AIL from intent recovery
transactions, so when we go to cancel the remaining intents we assume
that all objects after the first non-intent log item in the AIL are
not intents.

This is not true. Intent recovery can log new intents to continue
the operations the original intent could not complete in a single
transaction. The new intents are committed before they are deferred,
which means if the CIL commits in the background they will get
inserted into the AIL at the head.

Hence if we shut down the filesystem while processing intent
recovery, the AIL may have new intents active at the current head.
Hence this check:

                /*
                 * We're done when we see something other than an intent.
                 * There should be no intents left in the AIL now.
                 */
                if (!xlog_item_is_intent(lip)) {
                        for (; lip; lip = xfs_trans_ail_cursor_next(ailp, &cur))
                                ASSERT(!xlog_item_is_intent(lip));
                        break;
                }

in both xlog_recover_process_intents() and
log_recover_cancel_intents() is simply not valid. It was valid back
when we only had EFI/EFD intents and didn't chain intents, but it
hasn't been valid ever since intent recovery could create and commit
new intents.

Given that crashing the mount task like this pretty much prevents
diagnosing what went wrong that lead to the initial failure that
triggered intent cancellation, just remove the checks altogether.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
Signed-off-by: NLong Li <leo.lilong@huawei.com>

ce25eef7

06 6月, 2023 4 次提交

xfs: introduce xfs_sb_is_v5 helper · 5c55caf2

由 Dave Chinner 提交于 6月 06, 2023

mainline inclusion
from mainline-v5.14-rc4
commit d6837c1a
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4KIAO
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d6837c1aab42e70141fd3875ba05eb69ffb220f0

--------------------------------

Rather than open coding XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_5
checks everywhere, add a simple wrapper to encapsulate this and make
the code easier to read.

This allows us to remove the xfs_sb_version_has_v3inode() wrapper
which is only used in xfs_format.h now and is just a version number
check.

There are a couple of places where we should be checking the mount
feature bits rather than the superblock version (e.g. remount), so
those are converted to use xfs_has_crc(mp) instead.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
Signed-off-by: NLong Li <leo.lilong@huawei.com>

5c55caf2

xfs: convert remaining mount flags to state flags · 19e16281

由 Dave Chinner 提交于 6月 06, 2023

mainline inclusion
from mainline-v5.14-rc4
commit 2e973b2c
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4KIAO
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2e973b2cd4cdb993be94cca4c33f532f1ed05316

--------------------------------

The remaining mount flags kept in m_flags are actually runtime state
flags. These change dynamically, so they really should be updated
atomically so we don't potentially lose an update due to racing
modifications.

Convert these remaining flags to be stored in m_opstate and use
atomic bitops to set and clear the flags. This also adds a couple of
simple wrappers for common state checks - read only and shutdown.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
Signed-off-by: NDarrick J. Wong <djwong@kernel.org>

Conflicts:
	fs/xfs/xfs_mount.c
	fs/xfs/xfs_super.c
Signed-off-by: NLong Li <leo.lilong@huawei.com>

19e16281

xfs: replace xfs_sb_version checks with feature flag checks · 5f4623da

由 Dave Chinner 提交于 6月 06, 2023

mainline inclusion
from mainline-v5.14-rc4
commit 38c26bfd
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4KIAO
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=38c26bfd90e1999650d5ef40f90d721f05916643

--------------------------------

Convert the xfs_sb_version_hasfoo() to checks against
mp->m_features. Checks of the superblock itself during disk
operations (e.g. in the read/write verifiers and the to/from disk
formatters) are not converted - they operate purely on the
superblock state. Everything else should use the mount features.

Large parts of this conversion were done with sed with commands like
this:

for f in `git grep -l xfs_sb_version_has fs/xfs/*.c`; do
	sed -i -e 's/xfs_sb_version_has\(.*\)(&\(.*\)->m_sb)/xfs_has_\1(\2)/' $f
done

With manual cleanups for things like "xfs_has_extflgbit" and other
little inconsistencies in naming.

The result is ia lot less typing to check features and an XFS binary
size reduced by a bit over 3kB:

$ size -t fs/xfs/built-in.a
	text	   data	    bss	    dec	    hex	filenam
before	1130866  311352     484 1442702  16038e (TOTALS)
after	1127727  311352     484 1439563  15f74b (TOTALS)
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
Signed-off-by: NDarrick J. Wong <djwong@kernel.org>

Conflicts:
	fs/xfs/libxfs/xfs_btree.c
	fs/xfs/libxfs/xfs_ialloc.c
	fs/xfs/xfs_buf_item.c
	fs/xfs/xfs_fsmap.c
	fs/xfs/xfs_fsops.c
	fs/xfs/xfs_inode.c
	fs/xfs/xfs_ioctl.c
	fs/xfs/xfs_itable.c
	fs/xfs/xfs_log.c
	fs/xfs/xfs_refcount_item.c
	fs/xfs/xfs_rmap_item.c
	fs/xfs/xfs_rtalloc.c
Signed-off-by: NLong Li <leo.lilong@huawei.com>

5f4623da

xfs: reflect sb features in xfs_mount · fd78d2ab

由 Dave Chinner 提交于 6月 06, 2023

mainline inclusion
from mainline-v5.14-rc4
commit a1d86e8d
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4KIAO
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a1d86e8dec8c1325d301c9d5594bb794bc428fc3

--------------------------------

Currently on-disk feature checks require decoding the superblock
fileds and so can be non-trivial. We have almost 400 hundred
individual feature checks in the XFS code, so this is a significant
amount of code. To reduce runtime check overhead, pre-process all
the version flags into a features field in the xfs_mount at mount
time so we can convert all the feature checks to a simple flag
check.

There is also a need to convert the dynamic feature flags to update
the m_features field. This is required for attr, attr2 and quota
features. New xfs_mount based wrappers are added for this.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
Signed-off-by: NDarrick J. Wong <djwong@kernel.org>

Conflicts:
	fs/xfs/libxfs/xfs_sb.c
Signed-off-by: NLong Li <leo.lilong@huawei.com>

fd78d2ab

19 4月, 2023 2 次提交

xfs: convert buf_cancel_table allocation to kmalloc_array · cb02d6fe

由 Darrick J. Wong 提交于 4月 18, 2023

mainline inclusion
from mainline-v5.19-rc1
commit 910bbdf2
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4KIAO
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=910bbdf2f4d7df46781bc9b723048f5ebed3d0d7

--------------------------------

While we're messing around with how recovery allocates and frees the
buffer cancellation table, convert the allocation to use kmalloc_array
instead of the old kmem_alloc APIs, and make it handle a null return,
even though that's not likely.
Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NDave Chinner <dchinner@redhat.com>
Signed-off-by: NDave Chinner <david@fromorbit.com>

Conflicts:
	fs/xfs/libxfs/xfs_log_recover.h
Signed-off-by: Nyangerkun <yangerkun@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

cb02d6fe

xfs: refactor buffer cancellation table allocation · ad99191a

由 Darrick J. Wong 提交于 4月 18, 2023

mainline inclusion
from mainline-v5.19-rc1
commit 27232349
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4KIAO
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2723234923b3294dbcf6019c288c87465e927ed4

--------------------------------

Move the code that allocates and frees the buffer cancellation tables
used by log recovery into the file that actually uses the tables.  This
is a precursor to some cleanups and a memory leak fix.
Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NDave Chinner <dchinner@redhat.com>
Signed-off-by: NDave Chinner <david@fromorbit.com>

Conflicts:
	fs/xfs/libxfs/xfs_log_recover.h
	fs/xfs/xfs_log_priv.h
Signed-off-by: Nyangerkun <yangerkun@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>

ad99191a

21 11月, 2022 2 次提交

xfs: flush inode gc workqueue before clearing agi bucket · ef4894f0

由 Zhang Yi 提交于 11月 21, 2022

mainline inclusion
from mainline-v5.19-rc2
commit 04a98a03
category: bugfix
bugzilla: 187526,https://gitee.com/openeuler/kernel/issues/I4KIAO

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=04a98a036cf8b810dda172a9dcfcbd783bf63655

--------------------------------

In the procedure of recover AGI unlinked lists, if something bad
happenes on one of the unlinked inode in the bucket list, we would call
xlog_recover_clear_agi_bucket() to clear the whole unlinked bucket list,
not the unlinked inodes after the bad one. If we have already added some
inodes to the gc workqueue before the bad inode in the list, we could
get below error when freeing those inodes, and finaly fail to complete
the log recover procedure.

 XFS (ram0): Internal error xfs_iunlink_remove at line 2456 of file
 fs/xfs/xfs_inode.c.  Caller xfs_ifree+0xb0/0x360 [xfs]

The problem is xlog_recover_clear_agi_bucket() clear the bucket list, so
the gc worker fail to check the agino in xfs_verify_agino(). Fix this by
flush workqueue before clearing the bucket.

Fixes: ab23a776 ("xfs: per-cpu deferred inode inactivation queues")
Signed-off-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
Signed-off-by: NDave Chinner <david@fromorbit.com>
Signed-off-by: NGuo Xuenan <guoxuenan@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

ef4894f0

xfs: only run COW extent recovery when there are no live extents · 8efeef76

由 Darrick J. Wong 提交于 11月 21, 2022

mainline inclusion
from mainline-v5.16-rc5
commit 7993f1a4
category: bugfix
bugzilla: 186901,https://gitee.com/openeuler/kernel/issues/I4KIAO

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7993f1a431bc5271369d359941485a9340658ac3

--------------------------------

As part of multiple customer escalations due to file data corruption
after copy on write operations, I wrote some fstests that use fsstress
to hammer on COW to shake things loose.  Regrettably, I caught some
filesystem shutdowns due to incorrect rmap operations with the following
loop:

mount <filesystem>				# (0)
fsstress <run only readonly ops> &		# (1)
while true; do
	fsstress <run all ops>
	mount -o remount,ro			# (2)
	fsstress <run only readonly ops>
	mount -o remount,rw			# (3)
done

When (2) happens, notice that (1) is still running.  xfs_remount_ro will
call xfs_blockgc_stop to walk the inode cache to free all the COW
extents, but the blockgc mechanism races with (1)'s reader threads to
take IOLOCKs and loses, which means that it doesn't clean them all out.
Call such a file (A).

When (3) happens, xfs_remount_rw calls xfs_reflink_recover_cow, which
walks the ondisk refcount btree and frees any COW extent that it finds.
This function does not check the inode cache, which means that incore
COW forks of inode (A) is now inconsistent with the ondisk metadata.  If
one of those former COW extents are allocated and mapped into another
file (B) and someone triggers a COW to the stale reservation in (A), A's
dirty data will be written into (B) and once that's done, those blocks
will be transferred to (A)'s data fork without bumping the refcount.

The results are catastrophic -- file (B) and the refcount btree are now
corrupt.  In the first patch, we fixed the race condition in (2) so that
(A) will always flush the COW fork.  In this second patch, we move the
_recover_cow call to the initial mount call in (0) for safety.

As mentioned previously, xfs_reflink_recover_cow walks the refcount
btree looking for COW staging extents, and frees them.  This was
intended to be run at mount time (when we know there are no live inodes)
to clean up any leftover staging events that may have been left behind
during an unclean shutdown.  As a time "optimization" for readonly
mounts, we deferred this to the ro->rw transition, not realizing that
any failure to clean all COW forks during a rw->ro transition would
result in catastrophic corruption.

Therefore, remove this optimization and only run the recovery routine
when we're guaranteed not to have any COW staging extents anywhere,
which means we always run this at mount time.  While we're at it, move
the callsite to xfs_log_mount_finish because any refcount btree
expansion (however unlikely given that we're removing records from the
right side of the index) must be fed by a per-AG reservation, which
doesn't exist in its current location.

Fixes: 174edb0e ("xfs: store in-progress CoW allocations in the refcount btree")
Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
Reviewed-by: NChandan Babu R <chandan.babu@oracle.com>
Reviewed-by: NDave Chinner <dchinner@redhat.com>
Signed-off-by: NGuo Xuenan <guoxuenan@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

8efeef76

09 3月, 2022 3 次提交

xfs: convert log flags to an operational state field · d5ca9d5d

由 Dave Chinner 提交于 3月 09, 2022

mainline-inclusion
from mainline-v5.14-rc4
commit e1d06e5f
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4V7IK
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e1d06e5f668a403f48538f0d6b163edfd4342adf

-------------------------------------------------

log->l_flags doesn't actually contain "flags" as such, it contains
operational state information that can change at runtime. For the
shutdown state, this at least should be an atomic bit because
it is read without holding locks in many places and so using atomic
bitops for the state field modifications makes sense.

This allows us to use things like test_and_set_bit() on state
changes (e.g. setting XLOG_TAIL_WARN) to avoid races in setting the
state when we aren't holding locks.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
Signed-off-by: NLihong Kou <koulihong@huawei.com>
Reviewed-by: Nguoxuenan <guoxuenan@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

d5ca9d5d

xfs: move recovery needed state updates to xfs_log_mount_finish · 32a8357e

由 Dave Chinner 提交于 3月 09, 2022

mainline-inclusion
from mainline-v5.14-rc4
commit fd67d8a0
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4V7IK
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fd67d8a07208ab06560287b7b9334c2d50b7d6d7

-------------------------------------------------

xfs_log_mount_finish() needs to know if recovery is needed or not to
make decisions on whether to flush the log and AIL.  Move the
handling of the NEED_RECOVERY state out to this function rather than
needing a temporary variable to store this state over the call to
xlog_recover_finish().
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
Signed-off-by: NLihong Kou <koulihong@huawei.com>
Reviewed-by: Nguoxuenan <guoxuenan@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

32a8357e

xfs: convert XLOG_FORCED_SHUTDOWN() to xlog_is_shutdown() · 8804a686

由 Dave Chinner 提交于 3月 09, 2022

mainline-inclusion
from mainline-v5.14-rc4
commit 2039a272
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4V7IK
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2039a272300b949c05888428877317b834c0b1fb

-------------------------------------------------

Make it less shouty and a static inline before adding more calls
through the log code.

Also convert internal log code that uses XFS_FORCED_SHUTDOWN(mount)
to use xlog_is_shutdown(log) as well.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
Signed-off-by: NLihong Kou <koulihong@huawei.com>
Reviewed-by: Nguoxuenan <guoxuenan@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

8804a686

07 1月, 2022 1 次提交

xfs: per-cpu deferred inode inactivation queues · 705fccfb

由 Dave Chinner 提交于 1月 07, 2022

mainline-inclusion
from mainline-v5.14-rc4
commit ab23a776
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4KIAO
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ab23a7768739a23d21d8a16ca37dff96b1ca957a

-------------------------------------------------

Move inode inactivation to background work contexts so that it no
longer runs in the context that releases the final reference to an
inode. This will allow process work that ends up blocking on
inactivation to continue doing work while the filesytem processes
the inactivation in the background.

A typical demonstration of this is unlinking an inode with lots of
extents. The extents are removed during inactivation, so this blocks
the process that unlinked the inode from the directory structure. By
moving the inactivation to the background process, the userspace
applicaiton can keep working (e.g. unlinking the next inode in the
directory) while the inactivation work on the previous inode is
done by a different CPU.

The implementation of the queue is relatively simple. We use a
per-cpu lockless linked list (llist) to queue inodes for
inactivation without requiring serialisation mechanisms, and a work
item to allow the queue to be processed by a CPU bound worker
thread. We also keep a count of the queue depth so that we can
trigger work after a number of deferred inactivations have been
queued.

The use of a bound workqueue with a single work depth allows the
workqueue to run one work item per CPU. We queue the work item on
the CPU we are currently running on, and so this essentially gives
us affine per-cpu worker threads for the per-cpu queues. THis
maintains the effective CPU affinity that occurs within XFS at the
AG level due to all objects in a directory being local to an AG.
Hence inactivation work tends to run on the same CPU that last
accessed all the objects that inactivation accesses and this
maintains hot CPU caches for unlink workloads.

A depth of 32 inodes was chosen to match the number of inodes in an
inode cluster buffer. This hopefully allows sequential
allocation/unlink behaviours to defering inactivation of all the
inodes in a single cluster buffer at a time, further helping
maintain hot CPU and buffer cache accesses while running
inactivations.

A hard per-cpu queue throttle of 256 inode has been set to avoid
runaway queuing when inodes that take a long to time inactivate are
being processed. For example, when unlinking inodes with large
numbers of extents that can take a lot of processing to free.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
[djwong: tweak comments and tracepoints, convert opflags to state bits]
Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
Signed-off-by: NLihong Kou <koulihong@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

705fccfb

27 12月, 2021 4 次提交

xfs: replace kmem_alloc_large() with kvmalloc() · a210d784

由 Dave Chinner 提交于 12月 27, 2021

mainline-inclusion
from mainline-v5.14-rc4
commit d634525d
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4KIAO
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d634525db63e9e946c3229fb93c8d9b763afbaf3

-------------------------------------------------

There is no reason for this wrapper existing anymore. All the places
that use KM_NOFS allocation are within transaction contexts and
hence covered by memalloc_nofs_save/restore contexts. Hence we don't
need any special handling of vmalloc for large IOs anymore and
so special casing this code isn't necessary.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
Signed-off-by: NGuo Xuenan <guoxuenan@huawei.com>
Reviewed-by: NLihong Kou <koulihong@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

a210d784

xfs: remove kmem_alloc_io() · eb5cf1cc

由 Dave Chinner 提交于 12月 27, 2021

mainline-inclusion
from mainline-v5.14-rc4
commit 98fe2c3c
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4KIAO
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=98fe2c3cef21b784e2efd1d9d891430d95b4f073

-------------------------------------------------

Since commit 59bb4798 ("mm, sl[aou]b: guarantee natural alignment
for kmalloc(power-of-two)"), the core slab code now guarantees slab
alignment in all situations sufficient for IO purposes (i.e. minimum
of 512 byte alignment of >= 512 byte sized heap allocations) we no
longer need the workaround in the XFS code to provide this
guarantee.

Replace the use of kmem_alloc_io() with kmem_alloc() or
kmem_alloc_large() appropriately, and remove the kmem_alloc_io()
interface altogether.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
Signed-off-by: NGuo Xuenan <guoxuenan@huawei.com>
Reviewed-by: NLihong Kou <koulihong@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

eb5cf1cc

mm: Add kvrealloc() · 97316767

由 Dave Chinner 提交于 12月 27, 2021

mainline-inclusion
from mainline-v5.14-rc4
commit de2860f4
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4KIAO
CVE: NA

Reference:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=de2860f4636256836450c6543be744a50118fc66

-------------------------------------------------

During log recovery of an XFS filesystem with 64kB directory
buffers, rebuilding a buffer split across two log records results
in a memory allocation warning from krealloc like this:

xfs filesystem being mounted at /mnt/scratch supports timestamps until 2038 (0x7fffffff)
XFS (dm-0): Unmounting Filesystem
XFS (dm-0): Mounting V5 Filesystem
XFS (dm-0): Starting recovery (logdev: internal)
------------[ cut here ]------------
WARNING: CPU: 5 PID: 3435170 at mm/page_alloc.c:3539 get_page_from_freelist+0xdee/0xe40
.....
RIP: 0010:get_page_from_freelist+0xdee/0xe40
Call Trace:
 ? complete+0x3f/0x50
 __alloc_pages+0x16f/0x300
 alloc_pages+0x87/0x110
 kmalloc_order+0x2c/0x90
 kmalloc_order_trace+0x1d/0x90
 __kmalloc_track_caller+0x215/0x270
 ? xlog_recover_add_to_cont_trans+0x63/0x1f0
 krealloc+0x54/0xb0
 xlog_recover_add_to_cont_trans+0x63/0x1f0
 xlog_recovery_process_trans+0xc1/0xd0
 xlog_recover_process_ophdr+0x86/0x130
 xlog_recover_process_data+0x9f/0x160
 xlog_recover_process+0xa2/0x120
 xlog_do_recovery_pass+0x40b/0x7d0
 ? __irq_work_queue_local+0x4f/0x60
 ? irq_work_queue+0x3a/0x50
 xlog_do_log_recovery+0x70/0x150
 xlog_do_recover+0x38/0x1d0
 xlog_recover+0xd8/0x170
 xfs_log_mount+0x181/0x300
 xfs_mountfs+0x4a1/0x9b0
 xfs_fs_fill_super+0x3c0/0x7b0
 get_tree_bdev+0x171/0x270
 ? suffix_kstrtoint.constprop.0+0xf0/0xf0
 xfs_fs_get_tree+0x15/0x20
 vfs_get_tree+0x24/0xc0
 path_mount+0x2f5/0xaf0
 __x64_sys_mount+0x108/0x140
 do_syscall_64+0x3a/0x70
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Essentially, we are taking a multi-order allocation from kmem_alloc()
(which has an open coded no fail, no warn loop) and then
reallocating it out to 64kB using krealloc(__GFP_NOFAIL) and that is
then triggering the above warning.

This is a regression caused by converting this code from an open
coded no fail/no warn reallocation loop to using __GFP_NOFAIL.

What we actually need here is kvrealloc(), so that if contiguous
page allocation fails we fall back to vmalloc() and we don't
get nasty warnings happening in XFS.

Fixes: 771915c4 ("xfs: remove kmem_realloc()")
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Acked-by: NMel Gorman <mgorman@techsingularity.net>
Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
Signed-off-by: NGuo Xuenan <guoxuenan@huawei.com>
Reviewed-by: NLihong Kou <koulihong@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

97316767

xfs: force the log offline when log intent item recovery fails · 6bbb9a25

由 Darrick J. Wong 提交于 12月 27, 2021

mainline-inclusion
from mainline-v5.13-rc4
commit 4e6b8270
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4KIAO
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4e6b8270c820c8c57a73f869799a0af2b56eff3e

-------------------------------------------------

If any part of log intent item recovery fails, we should shut down the
log immediately to stop the log from writing a clean unmount record to
disk, because the metadata is not consistent.  The inability to cancel a
dirty transaction catches most of these cases, but there are a few
things that have slipped through the cracks, such as ENOSPC from a
transaction allocation, or runtime errors that result in cancellation of
a non-dirty transaction.

This solves some weird behaviors reported by customers where a system
goes down, the first mount fails, the second succeeds, but then the fs
goes down later because of inconsistent metadata.
Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NGuo Xuenan <guoxuenan@huawei.com>
Reviewed-by: NLihong Kou <koulihong@huawei.com>
Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>

6bbb9a25

22 10月, 2020 1 次提交

xfs: cancel intents immediately if process_intents fails · 2e76f188

由 Darrick J. Wong 提交于 10月 19, 2020

If processing recovered log intent items fails, we need to cancel all
the unprocessed recovered items immediately so that a subsequent AIL
push in the bail out path won't get wedged on the pinned intent items
that didn't get processed.

This can happen if the log contains (1) an intent that gets and releases
an inode, (2) an intent that cannot be recovered successfully, and (3)
some third intent item. When recovery of (2) fails, we leave (3) pinned
in memory. Inode reclamation is called in the error-out path of
xfs_mountfs before xfs_log_cancel_mount. Reclamation calls
xfs_ail_push_all_sync, which gets stuck waiting for (3).

Therefore, call xlog_recover_cancel_intents if _process_intents fails.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>

2e76f188

07 10月, 2020 5 次提交

xfs: fix an incore inode UAF in xfs_bui_recover · ff4ab5e0

由 Darrick J. Wong 提交于 9月 25, 2020

In xfs_bui_item_recover, there exists a use-after-free bug with regards
to the inode that is involved in the bmap replay operation.  If the
mapping operation does not complete, we call xfs_bmap_unmap_extent to
create a deferred op to finish the unmapping work, and we retain a
pointer to the incore inode.

Unfortunately, the very next thing we do is commit the transaction and
drop the inode.  If reclaim tears down the inode before we try to finish
the defer ops, we dereference garbage and blow up.  Therefore, create a
way to join inodes to the defer ops freezer so that we can maintain the
xfs_inode reference until we're done with the inode.

Note: This imposes the requirement that there be enough memory to keep
every incore inode in memory throughout recovery.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>

ff4ab5e0

xfs: xfs_defer_capture should absorb remaining transaction reservation · 929b92f6

由 Darrick J. Wong 提交于 9月 25, 2020

When xfs_defer_capture extracts the deferred ops and transaction state
from a transaction, it should record the transaction reservation type
from the old transaction so that when we continue the dfops chain, we
still use the same reservation parameters.

Doing this means that the log item recovery functions get to determine
the transaction reservation instead of abusing tr_itruncate in yet
another part of xfs.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>

929b92f6

xfs: xfs_defer_capture should absorb remaining block reservations · 4f9a60c4

由 Darrick J. Wong 提交于 9月 25, 2020

When xfs_defer_capture extracts the deferred ops and transaction state
from a transaction, it should record the remaining block reservations so
that when we continue the dfops chain, we can reserve the same number of
blocks to use.  We capture the reservations for both data and realtime
volumes.

This adds the requirement that every log intent item recovery function
must be careful to reserve enough blocks to handle both itself and all
defer ops that it can queue.  On the other hand, this enables us to do
away with the handwaving block estimation nonsense that was going on in
xlog_finish_defer_ops.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NBrian Foster <bfoster@redhat.com>

4f9a60c4

xfs: proper replay of deferred ops queued during log recovery · e6fff81e

由 Darrick J. Wong 提交于 9月 25, 2020

When we replay unfinished intent items that have been recovered from the
log, it's possible that the replay will cause the creation of more
deferred work items. As outlined in commit 50995582 ("xfs: log
recovery should replay deferred ops in order"), later work items have an
implicit ordering dependency on earlier work items. Therefore, recovery
must replay the items (both recovered and created) in the same order
that they would have been during normal operation.

For log recovery, we enforce this ordering by using an empty transaction
to collect deferred ops that get created in the process of recovering a
log intent item to prevent them from being committed before the rest of
the recovered intent items. After we finish committing all the
recovered log items, we allocate a transaction with an enormous block
reservation, splice our huge list of created deferred ops into that
transaction, and commit it, thereby finishing all those ops.

This is /really/ hokey -- it's the one place in XFS where we allow
nested transactions; the splicing of the defer ops list is is inelegant
and has to be done twice per recovery function; and the broken way we
handle inode pointers and block reservations cause subtle use-after-free
and allocator problems that will be fixed by this patch and the two
patches after it.

Therefore, replace the hokey empty transaction with a structure designed
to capture each chain of deferred ops that are created as part of
recovering a single unfinished log intent. Finally, refactor the loop
that replays those chains to do so using one transaction per chain.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>

e6fff81e

xfs: remove XFS_LI_RECOVERED · 901219bb

由 Darrick J. Wong 提交于 9月 28, 2020

The ->iop_recover method of a log intent item removes the recovered
intent item from the AIL by logging an intent done item and committing
the transaction, so it's superfluous to have this flag check.  Nothing
else uses it, so get rid of the flag entirely.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>

901219bb

26 9月, 2020 1 次提交

xfs: do the assert for all the log done items in xfs_trans_cancel · d6b8fc6c

由 Kaixu Xia 提交于 9月 23, 2020

We should do the assert for all the log intent-done items if they appear
here. This patch detect intent-done items by the fact that their item ops
don't have iop_unpin and iop_push methods and also move the helper
xlog_item_is_intent to xfs_trans.h.
Signed-off-by: NKaixu Xia <kaixuxia@tencent.com>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

d6b8fc6c

24 9月, 2020 1 次提交

xfs: clean up calculation of LR header blocks · 0c771b99

由 Gao Xiang 提交于 9月 22, 2020

Let's use DIV_ROUND_UP() to calculate log record header
blocks as what did in xlog_get_iclog_buffer_size() and
wrap up a common helper for log recovery.
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Signed-off-by: NGao Xiang <hsiangkao@redhat.com>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

0c771b99

23 9月, 2020 1 次提交

xfs: avoid LR buffer overrun due to crafted h_len · f692d09e

由 Gao Xiang 提交于 9月 22, 2020

Currently, crafted h_len has been blocked for the log
header of the tail block in commit a70f9fe5 ("xfs:
detect and handle invalid iclog size set by mkfs").

However, each log record could still have crafted h_len
and cause log record buffer overrun. So let's check
h_len vs buffer size for each log record as well.
Signed-off-by: NGao Xiang <hsiangkao@redhat.com>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>

f692d09e

16 9月, 2020 5 次提交

xfs: reuse _xfs_buf_read for re-reading the superblock · 26e32875

由 Christoph Hellwig 提交于 9月 01, 2020

Instead of poking deeply into buffer cache internals when re-reading the
superblock during log recovery just generalize _xfs_buf_read and use it
there.  Note that we don't have to explicitly set up the ops as they
must be set from the initial read.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

26e32875

xfs: remove xfs_getsb · b3f8e08c

由 Christoph Hellwig 提交于 9月 01, 2020

Merge xfs_getsb into its only caller, and clean that one up a little bit
as well.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

b3f8e08c

xfs: remove xlog_recover_iodone · 22c10589

由 Christoph Hellwig 提交于 9月 01, 2020

The log recovery I/O completion handler does not substancially differ from
the normal one except for the fact that it:

 a) never retries failed writes
 b) can have log items that aren't on the AIL
 c) never has inode/dquot log items attached and thus don't need to
    handle them

Add conditionals for (a) and (b) to the ioend code, while (c) doesn't
need special handling anyway.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

22c10589

xfs: fold xfs_buf_ioend_finish into xfs_ioend · 6a7584b1

由 Christoph Hellwig 提交于 9月 01, 2020

No need to keep a separate helper for this logic.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

6a7584b1

xfs: refactor xfs_buf_ioend · 23fb5a93

由 Christoph Hellwig 提交于 9月 01, 2020

Move the log recovery I/O completion handling entirely into the log
recovery code, and re-arrange the normal I/O completion handler flow
to prepare to lifting more logic into common code in the next commits.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

23fb5a93

07 9月, 2020 1 次提交

xfs: remove kmem_realloc() · 771915c4

由 Carlos Maiolino 提交于 8月 26, 2020

Remove kmem_realloc() function and convert its users to use MM API
directly (krealloc())
Signed-off-by: NCarlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

771915c4

05 8月, 2020 1 次提交

xfs: delete duplicated words + other fixes · b63da6c8

由 Randy Dunlap 提交于 8月 05, 2020

Delete repeated words in fs/xfs/.
{we, that, the, a, to, fork}
Change "it it" to "it is" in one location.
Signed-off-by: NRandy Dunlap <rdunlap@infradead.org>
To: linux-fsdevel@vger.kernel.org
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: linux-xfs@vger.kernel.org
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

b63da6c8

07 7月, 2020 1 次提交

xfs: mark log recovery buffers for completion · 9fe5c77c

由 Dave Chinner 提交于 6月 29, 2020

Log recovery has it's own buffer write completion handler for
buffers that it directly recovers. Convert these to direct calls by
flagging these buffers as being log recovery buffers. The flag will
get cleared by the log recovery IO completion routine, so it will
never leak out of log recovery.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

9fe5c77c

08 5月, 2020 4 次提交

xfs: remove unnecessary includes from xfs_log_recover.c · 6ea670ad

由 Darrick J. Wong 提交于 5月 01, 2020

Remove unnecessary includes from the log recovery code.
Suggested-by: NChristoph Hellwig <hch@infradead.org>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NChandan Babu R <chandanrlinux@gmail.com>

6ea670ad

xfs: move log recovery buffer cancellation code to xfs_buf_item_recover.c · 17d29bf2

由 Darrick J. Wong 提交于 5月 01, 2020

Move the helpers that handle incore buffer cancellation records to
xfs_buf_item_recover.c since they're not directly related to the main
log recovery machinery.  No functional changes.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NChandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>

17d29bf2

xfs: hoist setting of XFS_LI_RECOVERED to caller · cc560a5a

由 Darrick J. Wong 提交于 5月 01, 2020

The only purpose of XFS_LI_RECOVERED is to prevent log recovery from
trying to replay recovered intents more than once.  Therefore, we can
move the bit setting up to the ->iop_recover caller.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NChandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>

cc560a5a

xfs: refactor intent item iop_recover calls · 96b60f82

由 Darrick J. Wong 提交于 5月 01, 2020

Now that we've made the recovered item tests all the same, we can hoist
the test and the ail locking code to the ->iop_recover caller and call
the recovery function directly.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NChandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>

96b60f82

openeuler / Kernel 大约 2 年 前同步成功

openeuler / Kernel
大约 2 年前同步成功