- 10 6月, 2021 2 次提交
-
-
由 Allison Henderson 提交于
This patch renames the following functions to make the nameing scheme more consistent: xfs_attr_shortform_remove -> xfs_attr_sf_removename xfs_attr_node_remove_name -> xfs_attr_node_removename xfs_attr_set_fmt -> xfs_attr_sf_addname Suggested-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NAllison Henderson <allison.henderson@oracle.com> Reviewed-by: NChandan Babu R <chandanrlinux@gmail.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Allison Henderson 提交于
This ASSERT checks for the state value of RM_SHRINK in the set path which should never happen. Change to ASSERT(0); Suggested-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NAllison Henderson <allison.henderson@oracle.com> Reviewed-by: NChandan Babu R <chandanrlinux@gmail.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
- 09 6月, 2021 6 次提交
-
-
由 Darrick J. Wong 提交于
The xfs_eofblocks structure is no longer well-named -- nowadays it provides optional filtering criteria to any walk of the incore inode cache. Only one of the cache walk goals has anything to do with clearing of speculative post-EOF preallocations, so change the name to be more appropriate. Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NDave Chinner <dchinner@redhat.com>
-
由 Darrick J. Wong 提交于
It's important that the filesystem retain its memory of sick inodes for a little while after problems are found so that reports can be collected about what was wrong. Don't let inode reclamation free sick inodes unless we're unmounting or the fs already went down. Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com>
-
由 Darrick J. Wong 提交于
In preparation for renaming struct xfs_eofblocks to struct xfs_icwalk, change the prefix of the existing XFS_EOF_FLAGS_* flags to XFS_ICWALK_FLAG_ and convert all the existing users. This adds a degree of interface separation between the ioctl definitions and the incore parameters. Since FLAGS_UNION is only used in xfs_icache.c, move it there as a private flag. Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com>
-
由 Darrick J. Wong 提交于
When we decide to mark an inode sick, clear the DONTCACHE flag so that the incore inode will be kept around until memory pressure forces it out of memory. This increases the chances that the sick status will be caught by someone compiling a health report later on. Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com>
-
由 Darrick J. Wong 提交于
While running some fuzz tests on inode metadata, I noticed that the filesystem health report (as provided by xfs_spaceman) failed to report the file corruption even when spaceman was run immediately after running xfs_scrub to detect the corruption. That isn't the intended behavior; one ought to be able to run scrub to detect errors in the ondisk metadata and be able to access to those reports for some time after the scrub. After running the same sequence through an instrumented kernel, I discovered the reason why -- scrub igets the file, scans it, marks it sick, and ireleases the inode. When the VFS lets go of the incore inode, it moves to RECLAIMABLE state. If spaceman igets the incore inode before it moves to RECLAIM state, iget reinitializes the VFS state, clears the sick and checked masks, and hands back the inode. At this point, the caller has the exact same incore inode, but with all the health state erased. In other words, we're erasing the incore inode's health state flags when we've decided NOT to sever the link between the incore inode and the ondisk inode. This is wrong, so we need to remove the lines that zero the fields from xfs_iget_cache_hit. As a precaution, we add the same lines into xfs_reclaim_inode just after we sever the link between incore and ondisk inode. Strictly speaking this isn't necessary because once an inode has gone through reclaim it must go through xfs_inode_alloc (which also zeroes the state) and xfs_iget is careful to check for mismatches between the inode it pulls out of the radix tree and the one it wants. Fixes: 6772c1f1 ("xfs: track metadata health status") Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com>
-
由 Dave Chinner 提交于
From: Dave Chinner <dchinner@redhat.com> Stephen Rothwell reported this compiler warning from linux-next: fs/xfs/libxfs/xfs_ialloc.c: In function 'xfs_difree_finobt': fs/xfs/libxfs/xfs_ialloc.c:2032:20: warning: unused variable 'agi' [-Wunused-variable] 2032 | struct xfs_agi *agi = agbp->b_addr; Which is fallout from agno -> perag conversions that were done in this function. xfs_check_agi_freecount() is the only user of "agi" in xfs_difree_finobt() now, and it only uses the agi to get the current free inode count. We hold that in the perag structure, so there's not need to directly reference the raw AGI to get this information. The btree cursor being passed to xfs_check_agi_freecount() has a reference to the perag being operated on, so use that directly in xfs_check_agi_freecount() rather than passing an AGI. Fixes: 7b13c515 ("xfs: use perag for ialloc btree cursors") Reported-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
-
- 07 6月, 2021 5 次提交
-
-
由 Dave Chinner 提交于
It only has one caller and is now a simple function, so merge it into the caller. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Christoph Hellwig 提交于
Use a single goto label for freeing the buffer and returning an error. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDave Chinner <dchinner@redhat.com>
-
由 Dave Chinner 提交于
Only used in one place, so just open code the logic in the macro. Based on a patch from Christoph Hellwig. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Christoph Hellwig 提交于
Ever since we stopped using the Linux page cache to back XFS buffers there is no need to take the start sector into account for calculating the number of pages in a buffer, as the data always start from the beginning of the buffer. Signed-off-by: NChristoph Hellwig <hch@lst.de> [dgc: modified to suit this series] Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Christoph Hellwig 提交于
->b_offset can only be non-zero for _XBF_KMEM backed buffers, so remove all code dealing with it for page backed buffers. Signed-off-by: NChristoph Hellwig <hch@lst.de> [dgc: modified to fit this patchset] Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
- 04 6月, 2021 15 次提交
-
-
由 Darrick J. Wong 提交于
In preparation for adding another incore inode tree tag, refactor the code that sets and clears tags from the per-AG inode tree and the tree of per-AG structures, and remove the open-coded versions used by the blockgc code. Note: For reclaim, we now rely on the radix tree tags instead of the reclaimable inode count more heavily than we used to. The conversion should be fine, but the logic isn't 100% identical. Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NDave Chinner <dchinner@redhat.com>
-
由 Darrick J. Wong 提交于
Merge these two inode walk loops together, since they're pretty similar now. Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NDave Chinner <dchinner@redhat.com>
-
由 Darrick J. Wong 提交于
Pass a pointer to the actual eofb structure around the inode scanner functions instead of a void pointer, now that none of the functions is used as a callback. Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDave Chinner <dchinner@redhat.com>
-
由 Darrick J. Wong 提交于
Radix tree tags are supposed to be unsigned ints, so fix the callers. Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NDave Chinner <dchinner@redhat.com>
-
由 Darrick J. Wong 提交于
Soon we're going to be adding two new callers to the incore inode walk code: reclaim of incore inodes, and (later) inactivation of inodes. Both states operate on inodes that no longer have any VFS state, so we need to move the xfs_irele calls into the processing functions. In other words, icwalk processing functions are responsible for cleaning up whatever state changes are made by the corresponding icwalk igrab function that picked the inode for processing. Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NDave Chinner <dchinner@redhat.com>
-
由 Darrick J. Wong 提交于
Clean up the definition of which inode states are not eligible for speculative preallocation garbage collecting by creating a private #define. The deferred inactivation patchset will add two new entries to the set of flags-to-ignore, so we want the definition not to end up a cluttered mess. Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NDave Chinner <dchinner@redhat.com>
-
由 Darrick J. Wong 提交于
It turns out that there is a 1:1 mapping between the execute and goal parameters that are passed to xfs_inode_walk_ag: xfs_blockgc_scan_inode <=> XFS_ICWALK_BLOCKGC xfs_dqrele_inode <=> XFS_ICWALK_DQRELE Because of this exact correspondence, we don't need the execute function pointer and can replace it with a direct call. For the price of a forward static declaration, we can eliminate the indirect function call. This likely has a negligible impact on performance (since the execute function runs transactions), but it also simplifies the function signature. Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NDave Chinner <dchinner@redhat.com>
-
由 Darrick J. Wong 提交于
The sole iter_flags is XFS_INODE_WALK_INEW_WAIT, and there are no users. Remove the flag, and the parameter, and all the code that used it. Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NDave Chinner <dchinner@redhat.com>
-
由 Darrick J. Wong 提交于
Move the INEW wait into xfs_dqrele_inode so that we can drop the iter_flags parameter in the next patch. Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NDave Chinner <dchinner@redhat.com>
-
由 Darrick J. Wong 提交于
Disentangle the dqrele_all inode grab code from the "generic" inode walk grabbing code, and and use the opportunity to document why the dqrele grab function does what it does. Since xfs_inode_walk_ag_grab is now only used for blockgc, rename it to reflect that. Ultimately, there will be four reasons to perform a walk of incore inodes: quotaoff dquote releasing (dqrele), garbage collection of speculative preallocations (blockgc), reclamation of incore inodes (reclaim), and deferred inactivation (inodegc). Each of these four have their own slightly different criteria for deciding if they want to handle an inode, so it makes more sense to have four cohesive igrab functions than one confusing parameteric grab function like we do now. Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NDave Chinner <dchinner@redhat.com>
-
由 Darrick J. Wong 提交于
As part of removing the indirect calls and radix tag implementation details from the incore inode walk loop, create an enum to represent the goal of the inode iteration. More immediately, this separate removes the need for the "ICI_NOTAG" define which makes little sense. Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NDave Chinner <dchinner@redhat.com>
-
由 Darrick J. Wong 提交于
Shorten the prefix so that all the incore inode cache walk code has "xfs_icwalk" in the name somewhere. Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NDave Chinner <dchinner@redhat.com>
-
由 Darrick J. Wong 提交于
Move the inode walk functions further down in the file to limit the forward declarations to the two walk functions as we add new code that uses the inode walks. We'll clean them out later (i.e. after the deferred inode inactivation series). Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NDave Chinner <dchinner@redhat.com>
-
由 Darrick J. Wong 提交于
Once we're done with inactivating an inode, we're finished updating metadata for that inode. This means that we can detach the dquots at the end and not have to wait for reclaim to do it for us. Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NDave Chinner <dchinner@redhat.com>
-
由 Darrick J. Wong 提交于
The only external caller of xfs_inode_walk* happens in quotaoff, when we want to walk all the incore inodes to detach the dquots. Move this code to xfs_icache.c so that we can hide xfs_inode_walk as the starting step in more cleanups of inode walks. Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NDave Chinner <dchinner@redhat.com>
-
- 03 6月, 2021 4 次提交
-
-
由 Dave Chinner 提交于
Because this happens at high thread counts on high IOPS devices doing mixed read/write AIO-DIO to a single file at about a million iops: 64.09% 0.21% [kernel] [k] io_submit_one - 63.87% io_submit_one - 44.33% aio_write - 42.70% xfs_file_write_iter - 41.32% xfs_file_dio_write_aligned - 25.51% xfs_file_write_checks - 21.60% _raw_spin_lock - 21.59% do_raw_spin_lock - 19.70% __pv_queued_spin_lock_slowpath This also happens of the IO completion IO path: 22.89% 0.69% [kernel] [k] xfs_dio_write_end_io - 22.49% xfs_dio_write_end_io - 21.79% _raw_spin_lock - 20.97% do_raw_spin_lock - 20.10% __pv_queued_spin_lock_slowpath IOWs, fio is burning ~14 whole CPUs on this spin lock. So, do an unlocked check against inode size first, then if we are at/beyond EOF, take the spinlock and recheck. This makes the spinlock disappear from the overwrite fastpath. I'd like to report that fixing this makes things go faster. It doesn't - it just exposes the the XFS_ILOCK as the next severe contention point doing extent mapping lookups, and that now burns all the 14 CPUs this spinlock was burning. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Christoph Hellwig 提交于
xfs_bmap_set_attrforkoff is only used inside of xfs_bmap.c, so mark it static. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Jiapeng Chong 提交于
Variable busy is set to false, but this value is never read as it is overwritten or not used later on, hence it is a redundant assignment and can be removed. Clean up the following clang-analyzer warning: fs/xfs/libxfs/xfs_alloc.c:1679:2: warning: Value stored to 'busy' is never read [clang-analyzer-deadcode.DeadStores]. Reported-by: NAbaci Robot <abaci@linux.alibaba.com> Signed-off-by: NJiapeng Chong <jiapeng.chong@linux.alibaba.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Shaokun Zhang 提交于
Variable 'xfs_agf_buf_ops', 'xfs_agi_buf_ops', 'xfs_dquot_buf_ops' and 'xfs_symlink_buf_ops' are declared twice, so sort these variables alphabetically and remove the repeated declaration. Cc: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: NShaokun Zhang <zhangshaokun@hisilicon.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
-
- 02 6月, 2021 8 次提交
-
-
由 Dave Chinner 提交于
Almost unused, gets rid of another typedef. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Dave Chinner 提交于
Unlinked lists are held in the perag, and freeing of inodes needs to be passed a perag, too, so look up the perag early in the unlink processing and use it throughout. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NBrian Foster <bfoster@redhat.com>
-
由 Dave Chinner 提交于
Because it's a mess. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Dave Chinner 提交于
Now that we've internalised the two-phase inode allocation, we can now easily make the AG selection and allocation atomic from the perspective of a single perag context. This will ensure AGs going offline/away cannot occur between the selection and allocation steps. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Dave Chinner 提交于
This is just a simple wrapper around the per-ag inode allocation that doesn't need to exist. The internal mechanism to select and allocate within an AG does not need to be exposed outside xfs_ialloc.c, and it being exposed simply makes it harder to follow the code and simplify it. This is simplified by internalising xf_dialloc_select_ag() and xfs_dialloc_ag() into a single xfs_dialloc() function and then xfs_dir_ialloc() can go away. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Dave Chinner 提交于
xfs_dialloc_select_ag() does a lot of repetitive work. It first calls xfs_ialloc_ag_select() to select the AG to start allocation attempts in, which can do up to two entire loops across the perags that inodes can be allocated in. This is simply checking if there is spce available to allocate inodes in an AG, and it returns when it finds the first candidate AG. xfs_dialloc_select_ag() then does it's own iterative walk across all the perags locking the AGIs and trying to allocate inodes from the locked AG. It also doesn't limit the search to mp->m_maxagi, so it will walk all AGs whether they can allocate inodes or not. Hence if we are really low on inodes, we could do almost 3 entire walks across the whole perag range before we find an allocation group we can allocate inodes in or report ENOSPC. Because xfs_ialloc_ag_select() returns on the first candidate AG it finds, we can simply do these checks directly in xfs_dialloc_select_ag() before we lock and try to allocate inodes. This reduces the inode allocation pass down to 2 perag sweeps at most - one for aligned inode cluster allocation and if we can't allocate full, aligned inode clusters anywhere we'll do another pass trying to do sparse inode cluster allocation. This also removes a big chunk of duplicate code. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Dave Chinner 提交于
The only caller of xfs_dialloc_select_ag() will always return -ENOSPC to it's caller if the agbp returned from xfs_dialloc_select_ag() is NULL. IOWs, failure to find a candidate AGI we can allocate inodes from is always an ENOSPC condition, so move this logic up into xfs_dialloc_select_ag() so we can simplify the return logic in this function. xfs_dialloc_select_ag() now only ever returns 0 with a locked agbp, or an error with no agbp. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Dave Chinner 提交于
Now that everything passes a perag, the agno is not needed anymore. Convert all the users to use pag->pag_agno instead and remove the agno from the cursor. This was largely done as an automated search and replace. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-