- 10 12月, 2020 1 次提交
-
-
由 Darrick J. Wong 提交于
When we recover a bmap intent from the log, we need to validate its contents before we try to replay them. Hoist the checking code into a separate function in preparation to refactor this code to use validation helpers. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBrian Foster <bfoster@redhat.com>
-
- 07 10月, 2020 5 次提交
-
-
由 Darrick J. Wong 提交于
There's a subtle design flaw in the deferred log item code that can lead to pinning the log tail. Taking up the defer ops chain examples from the previous commit, we can get trapped in sequences like this: Caller hands us a transaction t0 with D0-D3 attached. The defer ops chain will look like the following if the transaction rolls succeed: t1: D0(t0), D1(t0), D2(t0), D3(t0) t2: d4(t1), d5(t1), D1(t0), D2(t0), D3(t0) t3: d5(t1), D1(t0), D2(t0), D3(t0) ... t9: d9(t7), D3(t0) t10: D3(t0) t11: d10(t10), d11(t10) t12: d11(t10) In transaction 9, we finish d9 and try to roll to t10 while holding onto an intent item for D3 that we logged in t0. The previous commit changed the order in which we place new defer ops in the defer ops processing chain to reduce the maximum chain length. Now make xfs_defer_finish_noroll capable of relogging the entire chain periodically so that we can always move the log tail forward. Most chains will never get relogged, except for operations that generate very long chains (large extents containing many blocks with different sharing levels) or are on filesystems with small logs and a lot of ongoing metadata updates. Callers are now required to ensure that the transaction reservation is large enough to handle logging done items and new intent items for the maximum possible chain length. Most callers are careful to keep the chain lengths low, so the overhead should be minimal. The decision to relog an intent item is made based on whether the intent was logged in a previous checkpoint, since there's no point in relogging an intent into the same checkpoint. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NBrian Foster <bfoster@redhat.com>
-
由 Darrick J. Wong 提交于
In xfs_bui_item_recover, there exists a use-after-free bug with regards to the inode that is involved in the bmap replay operation. If the mapping operation does not complete, we call xfs_bmap_unmap_extent to create a deferred op to finish the unmapping work, and we retain a pointer to the incore inode. Unfortunately, the very next thing we do is commit the transaction and drop the inode. If reclaim tears down the inode before we try to finish the defer ops, we dereference garbage and blow up. Therefore, create a way to join inodes to the defer ops freezer so that we can maintain the xfs_inode reference until we're done with the inode. Note: This imposes the requirement that there be enough memory to keep every incore inode in memory throughout recovery. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
由 Darrick J. Wong 提交于
In most places in XFS, we have a specific order in which we gather resources: grab the inode, allocate a transaction, then lock the inode. xfs_bui_item_recover doesn't do it in that order, so fix it to be more consistent. This also makes the error bailout code a bit less weird. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBrian Foster <bfoster@redhat.com>
-
由 Darrick J. Wong 提交于
The bmap intent item checking code in xfs_bui_item_recover is spread all over the function. We should check the recovered log item at the top before we allocate any resources or do anything else, so do that. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
由 Darrick J. Wong 提交于
When we replay unfinished intent items that have been recovered from the log, it's possible that the replay will cause the creation of more deferred work items. As outlined in commit 50995582 ("xfs: log recovery should replay deferred ops in order"), later work items have an implicit ordering dependency on earlier work items. Therefore, recovery must replay the items (both recovered and created) in the same order that they would have been during normal operation. For log recovery, we enforce this ordering by using an empty transaction to collect deferred ops that get created in the process of recovering a log intent item to prevent them from being committed before the rest of the recovered intent items. After we finish committing all the recovered log items, we allocate a transaction with an enormous block reservation, splice our huge list of created deferred ops into that transaction, and commit it, thereby finishing all those ops. This is /really/ hokey -- it's the one place in XFS where we allow nested transactions; the splicing of the defer ops list is is inelegant and has to be done twice per recovery function; and the broken way we handle inode pointers and block reservations cause subtle use-after-free and allocator problems that will be fixed by this patch and the two patches after it. Therefore, replace the hokey empty transaction with a structure designed to capture each chain of deferred ops that are created as part of recovering a single unfinished log intent. Finally, refactor the loop that replays those chains to do so using one transaction per chain. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
- 23 9月, 2020 3 次提交
-
-
由 Darrick J. Wong 提交于
Nowadays, log recovery will call ->release on the recovered intent items if recovery fails. Therefore, it's redundant to release them from inside the ->recover functions when they're about to return an error. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDave Chinner <dchinner@redhat.com>
-
由 Darrick J. Wong 提交于
In the bmap intent item recovery code, we must be careful to attach the inode to its dquots (if quotas are enabled) so that a change in the shape of the bmap btree doesn't cause the quota counters to be incorrect. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDave Chinner <dchinner@redhat.com>
-
由 Darrick J. Wong 提交于
During a code inspection, I found a serious bug in the log intent item recovery code when an intent item cannot complete all the work and decides to requeue itself to get that done. When this happens, the item recovery creates a new incore deferred op representing the remaining work and attaches it to the transaction that it allocated. At the end of _item_recover, it moves the entire chain of deferred ops to the dummy parent_tp that xlog_recover_process_intents passed to it, but fail to log a new intent item for the remaining work before committing the transaction for the single unit of work. xlog_finish_defer_ops logs those new intent items once recovery has finished dealing with the intent items that it recovered, but this isn't sufficient. If the log is forced to disk after a recovered log item decides to requeue itself and the system goes down before we call xlog_finish_defer_ops, the second log recovery will never see the new intent item and therefore has no idea that there was more work to do. It will finish recovery leaving the filesystem in a corrupted state. The same logic applies to /any/ deferred ops added during intent item recovery, not just the one handling the remaining work. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDave Chinner <dchinner@redhat.com>
-
- 29 7月, 2020 1 次提交
-
-
由 Carlos Maiolino 提交于
Use kmem_cache_zalloc() directly. With the exception of xlog_ticket_alloc() which will be dealt on the next patch for readability. Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NCarlos Maiolino <cmaiolino@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NDave Chinner <dchinner@redhat.com>
-
- 08 5月, 2020 8 次提交
-
-
由 Darrick J. Wong 提交于
The only purpose of XFS_LI_RECOVERED is to prevent log recovery from trying to replay recovered intents more than once. Therefore, we can move the bit setting up to the ->iop_recover caller. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChandan Babu R <chandanrlinux@gmail.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
由 Darrick J. Wong 提交于
Now that we've made the recovered item tests all the same, we can hoist the test and the ail locking code to the ->iop_recover caller and call the recovery function directly. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChandan Babu R <chandanrlinux@gmail.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
由 Darrick J. Wong 提交于
Rename XFS_{EFI,BUI,RUI,CUI}_RECOVERED to XFS_LI_RECOVERED so that we track recovery status in the log item, then get rid of the now unused flags fields in each of those log item types. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChandan Babu R <chandanrlinux@gmail.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
由 Darrick J. Wong 提交于
During recovery, every intent that we recover from the log has to be added to the AIL. Replace the open-coded addition with a helper. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NChandan Babu R <chandanrlinux@gmail.com>
-
由 Darrick J. Wong 提交于
Replace the open-coded AIL item walking with a proper helper when we're trying to release an intent item that has been finished. We add a new ->iop_match method to decide if an intent item matches a supplied ID. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChandan Babu R <chandanrlinux@gmail.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
由 Darrick J. Wong 提交于
Move the code that processes the log items created from the recovered log items into the per-item source code files and use dispatch functions to call them. No functional changes. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChandan Babu R <chandanrlinux@gmail.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
由 Darrick J. Wong 提交于
Move the bmap update intent and intent-done pass2 commit code into the per-item source code files and use dispatch functions to call them. We do these one at a time because there's a lot of code to move. No functional changes. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChandan Babu R <chandanrlinux@gmail.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
由 Darrick J. Wong 提交于
Create a generic dispatch structure to delegate recovery of different log item types into various code modules. This will enable us to move code specific to a particular log item type out of xfs_log_recover.c and into the log item source. The first operation we virtualize is the log item sorting. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChandan Babu R <chandanrlinux@gmail.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
- 07 5月, 2020 1 次提交
-
-
由 Brian Foster 提交于
Various intent log items call xfs_trans_ail_remove() with a log I/O error shutdown type, but this helper historically checks whether an item is in the AIL before calling xfs_trans_ail_delete(). This means the shutdown check is essentially a no-op for users of xfs_trans_ail_remove(). It is possible that some items might not be AIL resident when the AIL remove attempt occurs, but this should be isolated to cases where the filesystem has already shutdown. For example, this includes abort of the transaction committing the intent and I/O error of the iclog buffer committing the intent to the log. Therefore, update these callsites to use xfs_trans_ail_delete() to provide AIL state validation for the common path of items being released and removed when associated done items commit to the physical log. Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NAllison Collins <allison.henderson@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
- 05 5月, 2020 5 次提交
-
-
由 Christoph Hellwig 提交于
Given how XFS is all based around btrees it doesn't make much sense to offer a totally generic state when we can just use the btree cursor. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
All defer op instance place their own extension of the log item into the dfp_done field. Replace that with a xfs_log_item to improve type safety and make the code easier to follow. Also use the opportunity to improve the ->finish_item calling conventions to place the done log item as the higher level structure before the list_entry used for the individual items. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
All defer op instance place their own extension of the log item into the dfp_intent field. Replace that with a xfs_log_item to improve type safety and make the code easier to follow. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
This avoids a per-item indirect call, and also simplifies the interface a bit. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
These are aways called together, and my merging them we reduce the amount of indirect calls, improve type safety and in general clean up the code a bit. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
- 19 11月, 2019 1 次提交
-
-
由 Carlos Maiolino 提交于
We can remove it now, without needing to rework the KM_ flags. Use kmem_cache_free() directly. Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NCarlos Maiolino <cmaiolino@redhat.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
- 11 11月, 2019 1 次提交
-
-
由 Darrick J. Wong 提交于
Convert EIO to EFSCORRUPTED in the logging code when we can determine that the log contents are invalid. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
- 05 11月, 2019 1 次提交
-
-
由 Darrick J. Wong 提交于
Make sure we log something to dmesg whenever we return -EFSCORRUPTED up the call stack. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
- 28 8月, 2019 1 次提交
-
-
由 Darrick J. Wong 提交于
Remove the return value from the functions that schedule deferred bmap operations since they never fail and do not return status. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NDave Chinner <dchinner@redhat.com>
-
- 27 8月, 2019 1 次提交
-
-
由 Tetsuo Handa 提交于
Since no caller is using KM_NOSLEEP and no callee branches on KM_SLEEP, we can remove KM_NOSLEEP and replace KM_SLEEP with 0. Signed-off-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
- 29 6月, 2019 8 次提交
-
-
由 Eric Sandeen 提交于
There are many, many xfs header files which are included but unneeded (or included twice) in the xfs code, so remove them. nb: xfs_linux.h includes about 9 headers for everyone, so those explicit includes get removed by this. I'm not sure what the preference is, but if we wanted explicit includes everywhere, a followup patch could remove those xfs_*.h includes from xfs_linux.h and move them into the files that need them. Or it could be left as-is. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
Keep all bmap item related code together. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
There is no good reason to keep these two functions separate. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
We have various items that are released from ->iop_comitting. Add a flag to just call ->iop_release from the commit path to avoid tons of boilerplate code. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
The iop_unlock method is called when comitting or cancelling a transaction. In the latter case, the transaction may or may not be aborted. While there is no known problem with the current code in practice, this implementation is limited in that any log item implementation that might want to differentiate between a commit and a cancellation must rely on the aborted state. The aborted bit is only set when the cancelled transaction is dirty, however. This means that there is no way to distinguish between a commit and a clean transaction cancellation. For example, intent log items currently rely on this distinction. The log item is either transferred to the CIL on commit or released on transaction cancel. There is currently no possibility for a clean intent log item in a transaction, but if that state is ever introduced a cancel of such a transaction will immediately result in memory leaks of the associated log item(s). This is an interface deficiency and landmine. To clean this up, replace the iop_unlock method with an iop_release method that is specific to transaction cancel. The existing iop_committing method occurs at the same time as iop_unlock in the commit path and there is no need for two separate callbacks here. Overload the iop_committing method with the current commit time iop_unlock implementations to eliminate the need for the latter and further simplify the interface. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
Just check if they are present first. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Darrick J. Wong 提交于
The inode geometry structure isn't related to ondisk format; it's support for the mount structure. Move it to xfs_shared.h. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
- 03 8月, 2018 3 次提交
-
-
由 Brian Foster 提交于
The majority of remaining references to struct xfs_defer_ops in XFS are associated with xfs_defer_add(). At this point, there are no more external xfs_defer_ops users left. All instances of xfs_defer_ops are embedded in the transaction, which means we can safely pass the transaction down to the dfops add interface. Update xfs_defer_add() to receive the transaction as a parameter. Various subsystems implement wrappers to allocate and construct the context specific data structures for the associated deferred operation type. Update these to also carry the transaction down as needed and clean up unused dfops parameters along the way. This removes most of the remaining references to struct xfs_defer_ops throughout the code and facilitates removal of the structure. Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> [darrick: fix unused variable warnings with ftrace disabled] Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Brian Foster 提交于
The dfops infrastructure ->finish_item() callback passes the transaction and dfops as separate parameters. Since dfops is always part of a transaction, the latter parameter is no longer necessary. Remove it from the various callbacks. Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Brian Foster 提交于
All callers pass ->t_dfops of the associated transactions. Refactor the helpers to receive the transactions and facilitate further cleanups between xfs_defer_ops and xfs_trans. Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-