提交 · fd920008784ead369e79c2be2f8d9cc736e306ca · openeuler / Kernel

04 5月, 2022 1 次提交

xfs: Set up infrastructure for log attribute replay · fd920008

由 Allison Henderson 提交于 5月 04, 2022

Currently attributes are modified directly across one or more
transactions. But they are not logged or replayed in the event of an
error. The goal of log attr replay is to enable logging and replaying
of attribute operations using the existing delayed operations
infrastructure. This will later enable the attributes to become part of
larger multi part operations that also must first be recorded to the
log. This is mostly of interest in the scheme of parent pointers which
would need to maintain an attribute containing parent inode information
any time an inode is moved, created, or removed. Parent pointers would
then be of interest to any feature that would need to quickly derive an
inode path from the mount point. Online scrub, nfs lookups and fs grow
or shrink operations are all features that could take advantage of this.

This patch adds two new log item types for setting or removing
attributes as deferred operations. The xfs_attri_log_item will log an
intent to set or remove an attribute. The corresponding
xfs_attrd_log_item holds a reference to the xfs_attri_log_item and is
freed once the transaction is done. Both log items use a generic
xfs_attr_log_format structure that contains the attribute name, value,
flags, inode, and an op_flag that indicates if the operations is a set
or remove.

[dchinner: added extra little bits needed for intent whiteouts]
Signed-off-by: NAllison Henderson <allison.henderson@oracle.com>
Reviewed-by: NChandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
Signed-off-by: NDave Chinner <david@fromorbit.com>

fd920008

23 10月, 2021 1 次提交

xfs: create slab caches for frequently-used deferred items · f3c799c2

由 Darrick J. Wong 提交于 10月 12, 2021

Create slab caches for the high-level structures that coordinate
deferred intent items, since they're used fairly heavily.
Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
Reviewed-by: NChandan Babu R <chandan.babu@oracle.com>

f3c799c2

15 10月, 2021 2 次提交

xfs: port the defer ops capture and continue to resource capture · 512edfac

由 Darrick J. Wong 提交于 9月 16, 2021

When log recovery tries to recover a transaction that had log intent
items attached to it, it has to save certain parts of the transaction
state (reservation, dfops chain, inodes with no automatic unlock) so
that it can finish single-stepping the recovered transactions before
finishing the chains.

This is done with the xfs_defer_ops_capture and xfs_defer_ops_continue
functions.  Right now they open-code this functionality, so let's port
this to the formalized resource capture structure that we introduced in
the previous patch.  This enables us to hold up to two inodes and two
buffers during log recovery, the same way we do for regular runtime.

With this patch applied, we'll be ready to support atomic extent swap
which holds two inodes; and logged xattrs which holds one inode and one
xattr leaf buffer.
Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
Reviewed-by: NAllison Henderson <allison.henderson@oracle.com>

512edfac

xfs: formalize the process of holding onto resources across a defer roll · c5db9f93

由 Darrick J. Wong 提交于 9月 16, 2021

Transaction users are allowed to flag up to two buffers and two inodes
for ownership preservation across a deferred transaction roll. Hoist
the variables and code responsible for this out of xfs_defer_trans_roll
so that we can use it for the defer capture mechanism.
Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
Reviewed-by: NAllison Henderson <allison.henderson@oracle.com>

c5db9f93

07 10月, 2020 4 次提交

xfs: fix an incore inode UAF in xfs_bui_recover · ff4ab5e0

由 Darrick J. Wong 提交于 9月 25, 2020

In xfs_bui_item_recover, there exists a use-after-free bug with regards
to the inode that is involved in the bmap replay operation.  If the
mapping operation does not complete, we call xfs_bmap_unmap_extent to
create a deferred op to finish the unmapping work, and we retain a
pointer to the incore inode.

Unfortunately, the very next thing we do is commit the transaction and
drop the inode.  If reclaim tears down the inode before we try to finish
the defer ops, we dereference garbage and blow up.  Therefore, create a
way to join inodes to the defer ops freezer so that we can maintain the
xfs_inode reference until we're done with the inode.

Note: This imposes the requirement that there be enough memory to keep
every incore inode in memory throughout recovery.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>

ff4ab5e0

xfs: xfs_defer_capture should absorb remaining transaction reservation · 929b92f6

由 Darrick J. Wong 提交于 9月 25, 2020

When xfs_defer_capture extracts the deferred ops and transaction state
from a transaction, it should record the transaction reservation type
from the old transaction so that when we continue the dfops chain, we
still use the same reservation parameters.

Doing this means that the log item recovery functions get to determine
the transaction reservation instead of abusing tr_itruncate in yet
another part of xfs.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>

929b92f6

xfs: xfs_defer_capture should absorb remaining block reservations · 4f9a60c4

由 Darrick J. Wong 提交于 9月 25, 2020

When xfs_defer_capture extracts the deferred ops and transaction state
from a transaction, it should record the remaining block reservations so
that when we continue the dfops chain, we can reserve the same number of
blocks to use.  We capture the reservations for both data and realtime
volumes.

This adds the requirement that every log intent item recovery function
must be careful to reserve enough blocks to handle both itself and all
defer ops that it can queue.  On the other hand, this enables us to do
away with the handwaving block estimation nonsense that was going on in
xlog_finish_defer_ops.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NBrian Foster <bfoster@redhat.com>

4f9a60c4

xfs: proper replay of deferred ops queued during log recovery · e6fff81e

由 Darrick J. Wong 提交于 9月 25, 2020

When we replay unfinished intent items that have been recovered from the
log, it's possible that the replay will cause the creation of more
deferred work items. As outlined in commit 50995582 ("xfs: log
recovery should replay deferred ops in order"), later work items have an
implicit ordering dependency on earlier work items. Therefore, recovery
must replay the items (both recovered and created) in the same order
that they would have been during normal operation.

For log recovery, we enforce this ordering by using an empty transaction
to collect deferred ops that get created in the process of recovering a
log intent item to prevent them from being committed before the rest of
the recovered intent items. After we finish committing all the
recovered log items, we allocate a transaction with an enormous block
reservation, splice our huge list of created deferred ops into that
transaction, and commit it, thereby finishing all those ops.

This is /really/ hokey -- it's the one place in XFS where we allow
nested transactions; the splicing of the defer ops list is is inelegant
and has to be done twice per recovery function; and the broken way we
handle inode pointers and block reservations cause subtle use-after-free
and allocator problems that will be fixed by this patch and the two
patches after it.

Therefore, replace the hokey empty transaction with a structure designed
to capture each chain of deferred ops that are created as part of
recovering a single unfinished log intent. Finally, refactor the loop
that replays those chains to do so using one transaction per chain.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>

e6fff81e

23 9月, 2020 1 次提交

xfs: log new intent items created as part of finishing recovered intent items · 93293bcb

由 Darrick J. Wong 提交于 9月 21, 2020

During a code inspection, I found a serious bug in the log intent item
recovery code when an intent item cannot complete all the work and
decides to requeue itself to get that done. When this happens, the
item recovery creates a new incore deferred op representing the
remaining work and attaches it to the transaction that it allocated. At
the end of _item_recover, it moves the entire chain of deferred ops to
the dummy parent_tp that xlog_recover_process_intents passed to it, but
fail to log a new intent item for the remaining work before committing
the transaction for the single unit of work.

xlog_finish_defer_ops logs those new intent items once recovery has
finished dealing with the intent items that it recovered, but this isn't
sufficient. If the log is forced to disk after a recovered log item
decides to requeue itself and the system goes down before we call
xlog_finish_defer_ops, the second log recovery will never see the new
intent item and therefore has no idea that there was more work to do.
It will finish recovery leaving the filesystem in a corrupted state.

The same logic applies to /any/ deferred ops added during intent item
recovery, not just the one handling the remaining work.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NDave Chinner <dchinner@redhat.com>

93293bcb

14 5月, 2020 1 次提交

xfs: Use the correct style for SPDX License Identifier · 508578f2

由 Nishad Kamdar 提交于 5月 12, 2020

This patch corrects the SPDX License Identifier style in header files
related to XFS File System support. For C header files
Documentation/process/license-rules.rst mandates C-like comments.
(opposed to C source files where C++ style should be used).

Changes made by using a script provided by Joe Perches here:
https://lkml.org/lkml/2019/2/7/46.
Suggested-by: NJoe Perches <joe@perches.com>
Signed-off-by: NNishad Kamdar <nishadkamdar@gmail.com>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

508578f2

05 5月, 2020 6 次提交

xfs: spell out the parameter name for ->cancel_item · 2f88f1ef

由 Christoph Hellwig 提交于 4月 30, 2020

Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

2f88f1ef

xfs: use a xfs_btree_cur for the ->finish_cleanup state · 3ec1b26c

由 Christoph Hellwig 提交于 4月 30, 2020

Given how XFS is all based around btrees it doesn't make much sense
to offer a totally generic state when we can just use the btree cursor.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

3ec1b26c

xfs: turn dfp_done into a xfs_log_item · f09d167c

由 Christoph Hellwig 提交于 4月 30, 2020

All defer op instance place their own extension of the log item into
the dfp_done field.  Replace that with a xfs_log_item to improve type
safety and make the code easier to follow.

Also use the opportunity to improve the ->finish_item calling conventions
to place the done log item as the higher level structure before the
list_entry used for the individual items.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

f09d167c

xfs: turn dfp_intent into a xfs_log_item · 13a83333

由 Christoph Hellwig 提交于 4月 30, 2020

All defer op instance place their own extension of the log item into
the dfp_intent field.  Replace that with a xfs_log_item to improve type
safety and make the code easier to follow.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

13a83333

xfs: merge the ->diff_items defer op into ->create_intent · d367a868

由 Christoph Hellwig 提交于 4月 30, 2020

This avoids a per-item indirect call, and also simplifies the interface
a bit.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

d367a868

xfs: merge the ->log_item defer op into ->create_intent · c1f09188

由 Christoph Hellwig 提交于 4月 30, 2020

These are aways called together, and my merging them we reduce the amount
of indirect calls, improve type safety and in general clean up the code
a bit.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

c1f09188

13 12月, 2018 2 次提交

xfs: streamline defer op type handling · 02b100fb

由 Darrick J. Wong 提交于 12月 12, 2018

There's no need to bundle a pointer to the defer op type into the defer
op control structure.  Instead, store the defer op type enum, which
enables us to shorten some of the lines.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NEric Sandeen <sandeen@redhat.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>

02b100fb

xfs: idiotproof defer op type configuration · bc9f2b7c

由 Darrick J. Wong 提交于 12月 12, 2018

Recently, we forgot to port a new defer op type to xfsprogs, which
caused us some userspace pain.  Reorganize the way we make libxfs
clients supply defer op type information so that all type information
has to be provided at build time instead of risky runtime dynamic
configuration.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NEric Sandeen <sandeen@redhat.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>

bc9f2b7c

03 8月, 2018 10 次提交

xfs: fold dfops into the transaction · 9d9e6233