- 31 10月, 2022 1 次提交
-
-
由 Darrick J. Wong 提交于
Starting in 6.1, CONFIG_FORTIFY_SOURCE checks the length parameter of memcpy. Unfortunately, it doesn't handle flex arrays correctly: ------------[ cut here ]------------ memcpy: detected field-spanning write (size 48) of single field "dst_bui_fmt" at fs/xfs/xfs_bmap_item.c:628 (size 16) Fix this by refactoring the xfs_bui_copy_format function to handle the copying of the head and the flex array members separately. While we're at it, fix a minor validation deficiency in the recovery function. Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NAllison Henderson <allison.henderson@oracle.com> Reviewed-by: NDave Chinner <dchinner@redhat.com>
-
- 04 5月, 2022 5 次提交
-
-
由 Dave Chinner 提交于
When we release an intent that a whiteout applies to, it will not have been committed to the journal and so won't be in the AIL. Hence when we drop the last reference to the intent, we do not want to try to remove it from the AIL as that will trigger a filesystem shutdown. Hence make the removal of intents from the AIL conditional on them actually being in the AIL so we do the correct thing. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NAllison Henderson <allison.henderson@oracle.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
由 Dave Chinner 提交于
To apply a whiteout to an intent item when an intent done item is committed, we need to be able to retrieve the intent item from the the intent done item. Add a log item op method for doing this, and wire all the intent done items up to it. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NAllison Henderson <allison.henderson@oracle.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
由 Dave Chinner 提交于
Intent whiteouts will require extra work to be done during transaction commit if the transaction contains an intent done item. To determine if a transaction contains an intent done item, we want to avoid having to walk all the items in the transaction to check if they are intent done items. Hence when we add an intent done item to a transaction, tag the transaction to indicate that it contains such an item. We don't tag the transaction when the defer ops is relogging an intent to move it forward in the log. Whiteouts will never apply to these cases, so we don't need to bother looking for them. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NAllison Henderson <allison.henderson@oracle.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
由 Dave Chinner 提交于
We currently have a couple of helper functions that try to infer whether the log item is an intent or intent done item from the combinations of operations it supports. This is incredibly fragile and not very efficient as it requires checking specific combinations of ops. We need to be able to identify intent and intent done items quickly and easily in upcoming patches, so simply add intent and intent done type flags to the log item ops flags. These are static flags to begin with, so intent items should have been typed like this from the start. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NAllison Henderson <allison.henderson@oracle.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
由 Dave Chinner 提交于
Ever since we added shadown format buffers to the log items, log items need to handle the item being released with shadow buffers attached. Due to the fact this requirement was added at the same time we added new rmap/reflink intents, we missed the cleanup of those items. In theory, this means shadow buffers can be leaked in a very small window when a shutdown is initiated. Testing with KASAN shows this leak does not happen in practice - we haven't identified a single leak in several years of shutdown testing since ~v4.8 kernels. However, the intent whiteout cleanup mechanism results in every cancelled intent in exactly the same state as this tiny race window creates and so if intents down clean up shadow buffers on final release we will leak the shadow buffer for just about every intent we create. Hence we start with this patch to close this condition off and ensure that when whiteouts start to be used we don't leak lots of memory. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NAllison Henderson <allison.henderson@oracle.com> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
- 13 4月, 2022 1 次提交
-
-
由 Chandan Babu R 提交于
This commit enables upgrading existing inodes to use large extent counters provided that underlying filesystem's superblock has large extent counter feature enabled. Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NChandan Babu R <chandan.babu@oracle.com>
-
- 20 3月, 2022 1 次提交
-
-
由 Dave Chinner 提交于
Log items belong to the log, not the xfs_mount. Convert the mount pointer in the log item to a xlog pointer in preparation for upcoming log centric changes to the log items. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NChandan Babu R <chandan.babu@oracle.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
-
- 23 10月, 2021 3 次提交
-
-
由 Darrick J. Wong 提交于
Create slab caches for the high-level structures that coordinate deferred intent items, since they're used fairly heavily. Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NChandan Babu R <chandan.babu@oracle.com>
-
由 Darrick J. Wong 提交于
Now that we've gotten rid of the kmem_zone_t typedef, rename the variables to _cache since that's what they are. Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NChandan Babu R <chandan.babu@oracle.com>
-
由 Darrick J. Wong 提交于
Remove these typedefs by referencing kmem_cache directly. Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NChandan Babu R <chandan.babu@oracle.com>
-
- 15 10月, 2021 1 次提交
-
-
由 Darrick J. Wong 提交于
When log recovery tries to recover a transaction that had log intent items attached to it, it has to save certain parts of the transaction state (reservation, dfops chain, inodes with no automatic unlock) so that it can finish single-stepping the recovered transactions before finishing the chains. This is done with the xfs_defer_ops_capture and xfs_defer_ops_continue functions. Right now they open-code this functionality, so let's port this to the formalized resource capture structure that we introduced in the previous patch. This enables us to hold up to two inodes and two buffers during log recovery, the same way we do for regular runtime. With this patch applied, we'll be ready to support atomic extent swap which holds two inodes; and logged xattrs which holds one inode and one xattr leaf buffer. Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NAllison Henderson <allison.henderson@oracle.com>
-
- 10 8月, 2021 2 次提交
-
-
由 Darrick J. Wong 提交于
Hoist the code from xfs_bui_item_recover that igets an inode and marks it as being part of log intent recovery. The next patch will want a common function. Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NAllison Henderson <allison.henderson@oracle.com> Reviewed-by: NChandan Babu R <chandanrlinux@gmail.com>
-
由 Darrick J. Wong 提交于
If we try to recover a log intent item and the operation fails due to filesystem corruption, dump the contents of the item to the log for further analysis. Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NChandan Babu R <chandanrlinux@gmail.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
- 09 4月, 2021 1 次提交
-
-
由 Sami Tolvanen 提交于
list_sort() internally casts the comparison function passed to it to a different type with constant struct list_head pointers, and uses this pointer to call the functions, which trips indirect call Control-Flow Integrity (CFI) checking. Instead of removing the consts, this change defines the list_cmp_func_t type and changes the comparison function types of all list_sort() callers to use const pointers, thus avoiding type mismatches. Suggested-by: NNick Desaulniers <ndesaulniers@google.com> Signed-off-by: NSami Tolvanen <samitolvanen@google.com> Reviewed-by: NNick Desaulniers <ndesaulniers@google.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKees Cook <keescook@chromium.org> Tested-by: NNick Desaulniers <ndesaulniers@google.com> Tested-by: NNathan Chancellor <nathan@kernel.org> Signed-off-by: NKees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210408182843.1754385-10-samitolvanen@google.com
-
- 23 1月, 2021 2 次提交
-
-
由 Chandan Babu R 提交于
The extent mapping the file offset at which a hole has to be inserted will be split into two extents causing extent count to increase by 1. Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NAllison Henderson <allison.henderson@oracle.com> Signed-off-by: NChandan Babu R <chandanrlinux@gmail.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Chandan Babu R 提交于
When adding a new data extent (without modifying an inode's existing extents) the extent count increases only by 1. This commit checks for extent count overflow in such cases. Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NAllison Henderson <allison.henderson@oracle.com> Signed-off-by: NChandan Babu R <chandanrlinux@gmail.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
- 10 12月, 2020 4 次提交
-
-
由 Darrick J. Wong 提交于
Refactor all the open-coded validation of file block ranges into a single helper, and teach the bmap scrubber to check the ranges. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
由 Darrick J. Wong 提交于
Refactor all the open-coded validation of non-static data device extents into a single helper. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com>
-
由 Darrick J. Wong 提交于
The code that validates recovered bmap intent items is kind of a mess -- it doesn't use the standard xfs type validators, and it doesn't check for things that it should. Fix the validator function to use the standard validation helpers and look for more types of obvious errors. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBrian Foster <bfoster@redhat.com>
-
由 Darrick J. Wong 提交于
When we recover a bmap intent from the log, we need to validate its contents before we try to replay them. Hoist the checking code into a separate function in preparation to refactor this code to use validation helpers. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBrian Foster <bfoster@redhat.com>
-
- 07 10月, 2020 5 次提交
-
-
由 Darrick J. Wong 提交于
There's a subtle design flaw in the deferred log item code that can lead to pinning the log tail. Taking up the defer ops chain examples from the previous commit, we can get trapped in sequences like this: Caller hands us a transaction t0 with D0-D3 attached. The defer ops chain will look like the following if the transaction rolls succeed: t1: D0(t0), D1(t0), D2(t0), D3(t0) t2: d4(t1), d5(t1), D1(t0), D2(t0), D3(t0) t3: d5(t1), D1(t0), D2(t0), D3(t0) ... t9: d9(t7), D3(t0) t10: D3(t0) t11: d10(t10), d11(t10) t12: d11(t10) In transaction 9, we finish d9 and try to roll to t10 while holding onto an intent item for D3 that we logged in t0. The previous commit changed the order in which we place new defer ops in the defer ops processing chain to reduce the maximum chain length. Now make xfs_defer_finish_noroll capable of relogging the entire chain periodically so that we can always move the log tail forward. Most chains will never get relogged, except for operations that generate very long chains (large extents containing many blocks with different sharing levels) or are on filesystems with small logs and a lot of ongoing metadata updates. Callers are now required to ensure that the transaction reservation is large enough to handle logging done items and new intent items for the maximum possible chain length. Most callers are careful to keep the chain lengths low, so the overhead should be minimal. The decision to relog an intent item is made based on whether the intent was logged in a previous checkpoint, since there's no point in relogging an intent into the same checkpoint. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NBrian Foster <bfoster@redhat.com>
-
由 Darrick J. Wong 提交于
In xfs_bui_item_recover, there exists a use-after-free bug with regards to the inode that is involved in the bmap replay operation. If the mapping operation does not complete, we call xfs_bmap_unmap_extent to create a deferred op to finish the unmapping work, and we retain a pointer to the incore inode. Unfortunately, the very next thing we do is commit the transaction and drop the inode. If reclaim tears down the inode before we try to finish the defer ops, we dereference garbage and blow up. Therefore, create a way to join inodes to the defer ops freezer so that we can maintain the xfs_inode reference until we're done with the inode. Note: This imposes the requirement that there be enough memory to keep every incore inode in memory throughout recovery. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
由 Darrick J. Wong 提交于
In most places in XFS, we have a specific order in which we gather resources: grab the inode, allocate a transaction, then lock the inode. xfs_bui_item_recover doesn't do it in that order, so fix it to be more consistent. This also makes the error bailout code a bit less weird. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBrian Foster <bfoster@redhat.com>
-
由 Darrick J. Wong 提交于
The bmap intent item checking code in xfs_bui_item_recover is spread all over the function. We should check the recovered log item at the top before we allocate any resources or do anything else, so do that. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
由 Darrick J. Wong 提交于
When we replay unfinished intent items that have been recovered from the log, it's possible that the replay will cause the creation of more deferred work items. As outlined in commit 50995582 ("xfs: log recovery should replay deferred ops in order"), later work items have an implicit ordering dependency on earlier work items. Therefore, recovery must replay the items (both recovered and created) in the same order that they would have been during normal operation. For log recovery, we enforce this ordering by using an empty transaction to collect deferred ops that get created in the process of recovering a log intent item to prevent them from being committed before the rest of the recovered intent items. After we finish committing all the recovered log items, we allocate a transaction with an enormous block reservation, splice our huge list of created deferred ops into that transaction, and commit it, thereby finishing all those ops. This is /really/ hokey -- it's the one place in XFS where we allow nested transactions; the splicing of the defer ops list is is inelegant and has to be done twice per recovery function; and the broken way we handle inode pointers and block reservations cause subtle use-after-free and allocator problems that will be fixed by this patch and the two patches after it. Therefore, replace the hokey empty transaction with a structure designed to capture each chain of deferred ops that are created as part of recovering a single unfinished log intent. Finally, refactor the loop that replays those chains to do so using one transaction per chain. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
- 23 9月, 2020 3 次提交
-
-
由 Darrick J. Wong 提交于
Nowadays, log recovery will call ->release on the recovered intent items if recovery fails. Therefore, it's redundant to release them from inside the ->recover functions when they're about to return an error. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDave Chinner <dchinner@redhat.com>
-
由 Darrick J. Wong 提交于
In the bmap intent item recovery code, we must be careful to attach the inode to its dquots (if quotas are enabled) so that a change in the shape of the bmap btree doesn't cause the quota counters to be incorrect. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDave Chinner <dchinner@redhat.com>
-
由 Darrick J. Wong 提交于
During a code inspection, I found a serious bug in the log intent item recovery code when an intent item cannot complete all the work and decides to requeue itself to get that done. When this happens, the item recovery creates a new incore deferred op representing the remaining work and attaches it to the transaction that it allocated. At the end of _item_recover, it moves the entire chain of deferred ops to the dummy parent_tp that xlog_recover_process_intents passed to it, but fail to log a new intent item for the remaining work before committing the transaction for the single unit of work. xlog_finish_defer_ops logs those new intent items once recovery has finished dealing with the intent items that it recovered, but this isn't sufficient. If the log is forced to disk after a recovered log item decides to requeue itself and the system goes down before we call xlog_finish_defer_ops, the second log recovery will never see the new intent item and therefore has no idea that there was more work to do. It will finish recovery leaving the filesystem in a corrupted state. The same logic applies to /any/ deferred ops added during intent item recovery, not just the one handling the remaining work. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDave Chinner <dchinner@redhat.com>
-
- 29 7月, 2020 1 次提交
-
-
由 Carlos Maiolino 提交于
Use kmem_cache_zalloc() directly. With the exception of xlog_ticket_alloc() which will be dealt on the next patch for readability. Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NCarlos Maiolino <cmaiolino@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NDave Chinner <dchinner@redhat.com>
-
- 08 5月, 2020 8 次提交
-
-
由 Darrick J. Wong 提交于
The only purpose of XFS_LI_RECOVERED is to prevent log recovery from trying to replay recovered intents more than once. Therefore, we can move the bit setting up to the ->iop_recover caller. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChandan Babu R <chandanrlinux@gmail.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
由 Darrick J. Wong 提交于
Now that we've made the recovered item tests all the same, we can hoist the test and the ail locking code to the ->iop_recover caller and call the recovery function directly. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChandan Babu R <chandanrlinux@gmail.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
由 Darrick J. Wong 提交于
Rename XFS_{EFI,BUI,RUI,CUI}_RECOVERED to XFS_LI_RECOVERED so that we track recovery status in the log item, then get rid of the now unused flags fields in each of those log item types. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChandan Babu R <chandanrlinux@gmail.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> -
由 Darrick J. Wong 提交于
During recovery, every intent that we recover from the log has to be added to the AIL. Replace the open-coded addition with a helper. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NChandan Babu R <chandanrlinux@gmail.com>
-
由 Darrick J. Wong 提交于
Replace the open-coded AIL item walking with a proper helper when we're trying to release an intent item that has been finished. We add a new ->iop_match method to decide if an intent item matches a supplied ID. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChandan Babu R <chandanrlinux@gmail.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
由 Darrick J. Wong 提交于
Move the code that processes the log items created from the recovered log items into the per-item source code files and use dispatch functions to call them. No functional changes. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChandan Babu R <chandanrlinux@gmail.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
由 Darrick J. Wong 提交于
Move the bmap update intent and intent-done pass2 commit code into the per-item source code files and use dispatch functions to call them. We do these one at a time because there's a lot of code to move. No functional changes. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChandan Babu R <chandanrlinux@gmail.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
由 Darrick J. Wong 提交于
Create a generic dispatch structure to delegate recovery of different log item types into various code modules. This will enable us to move code specific to a particular log item type out of xfs_log_recover.c and into the log item source. The first operation we virtualize is the log item sorting. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChandan Babu R <chandanrlinux@gmail.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
- 07 5月, 2020 1 次提交
-
-
由 Brian Foster 提交于
Various intent log items call xfs_trans_ail_remove() with a log I/O error shutdown type, but this helper historically checks whether an item is in the AIL before calling xfs_trans_ail_delete(). This means the shutdown check is essentially a no-op for users of xfs_trans_ail_remove(). It is possible that some items might not be AIL resident when the AIL remove attempt occurs, but this should be isolated to cases where the filesystem has already shutdown. For example, this includes abort of the transaction committing the intent and I/O error of the iclog buffer committing the intent to the log. Therefore, update these callsites to use xfs_trans_ail_delete() to provide AIL state validation for the common path of items being released and removed when associated done items commit to the physical log. Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NAllison Collins <allison.henderson@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
- 05 5月, 2020 1 次提交
-
-
由 Christoph Hellwig 提交于
Given how XFS is all based around btrees it doesn't make much sense to offer a totally generic state when we can just use the btree cursor. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-