- 02 6月, 2021 17 次提交
-
-
由 Dave Chinner 提交于
The only caller of xfs_dialloc_select_ag() will always return -ENOSPC to it's caller if the agbp returned from xfs_dialloc_select_ag() is NULL. IOWs, failure to find a candidate AGI we can allocate inodes from is always an ENOSPC condition, so move this logic up into xfs_dialloc_select_ag() so we can simplify the return logic in this function. xfs_dialloc_select_ag() now only ever returns 0 with a locked agbp, or an error with no agbp. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Dave Chinner 提交于
Now that everything passes a perag, the agno is not needed anymore. Convert all the users to use pag->pag_agno instead and remove the agno from the cursor. This was largely done as an automated search and replace. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Dave Chinner 提交于
Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Dave Chinner 提交于
Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Dave Chinner 提交于
Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Dave Chinner 提交于
Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Dave Chinner 提交于
Which will eventually completely replace the agno in it. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NBrian Foster <bfoster@redhat.com>
-
由 Dave Chinner 提交于
Needs a [from, to] ranged AG walk, and the perag to be stuffed into the info structure for callouts to use. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Dave Chinner 提交于
We currently pass an agno from the AG reservation functions to the individual feature accounting functions, which in future may have to do perag lookups to access per-AG state. Instead, pre-emptively plumb the perag through from the highest AG reservation layer to the feature callouts so they won't have to look it up again. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NBrian Foster <bfoster@redhat.com>
-
由 Dave Chinner 提交于
All of the callers of the busy extent API either have perag references available to use so we can pass a perag to the busy extent functions rather than having them have to do unnecessary lookups. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Dave Chinner 提交于
Clean up the last external manual AG walk. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Dave Chinner 提交于
Rather than manually walking the ags and passing agnunbers around, pass the perag for the AG we are currently working on around in the iwalk structure. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Dave Chinner 提交于
Convert the raw walks to an iterator, pulling the current AG out of pag->pag_agno instead of the loop iterator variable. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Dave Chinner 提交于
for_each_perag_tag() is defined in xfs_icache.c for local use. Promote this to xfs_ag.h and define equivalent iteration functions so that we can use them to iterate AGs instead to replace open coded perag walks and perag lookups. We also convert as many of the straight forward open coded AG walks to use these iterators as possible. Anything that is not a direct conversion to an iterator is ignored and will be updated in future commits. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Dave Chinner 提交于
Move the xfs_perag infrastructure to the libxfs files that contain all the per AG infrastructure. This helps set up for passing perags around all the code instead of bare agnos with minimal extra includes for existing files. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Dave Chinner 提交于
The perag structures really need to be defined with the rest of the AG support infrastructure. The struct xfs_perag and init/teardown has been placed in xfs_mount.[ch] because there are differences in the structure between kernel and userspace. Mainly that userspace doesn't have a lot of the internal stuff that the kernel has for caches and discard and other such structures. However, it makes more sense to move this to libxfs than to keep this separation because we are now moving to use struct perags everywhere in the code instead of passing raw agnumber_t values about. Hence we shoudl really move the support infrastructure to libxfs/xfs_ag.[ch]. To do this without breaking userspace, first we need to rearrange the structures and code so that all the kernel specific code is located together. This makes it simple for userspace to ifdef out the all the parts it does not need, minimising the code differences between kernel and userspace. The next commit will do the move... Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Dave Chinner 提交于
They are AG functions, not superblock functions, so move them to the appropriate location. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
-
- 04 5月, 2021 2 次提交
-
-
由 Brian Foster 提交于
The only remaining user of ->io_private is the generic ioend merging infrastructure. The only user of that is XFS, which no longer sets ->io_private or passes an associated merge callback. Remove the unused parameter and the ->io_private field. CC: linux-fsdevel@vger.kernel.org Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Darrick J. Wong 提交于
While running generic/050 with an external log, I observed this warning in dmesg: Trying to write to read-only block-device sda4 (partno 4) WARNING: CPU: 2 PID: 215677 at block/blk-core.c:704 submit_bio_checks+0x256/0x510 Call Trace: submit_bio_noacct+0x2c/0x430 _xfs_buf_ioapply+0x283/0x3c0 [xfs] __xfs_buf_submit+0x6a/0x210 [xfs] xfs_buf_delwri_submit_buffers+0xf8/0x270 [xfs] xfsaild+0x2db/0xc50 [xfs] kthread+0x14b/0x170 I think this happened because we tried to cover the log after a readonly mount, and the AIL tried to write the primary superblock to the data device. The test marks the data device readonly, but it doesn't do the same to the external log device. Therefore, XFS thinks that the log is writable, even though AIL writes whine to dmesg because the data device is read only. Fix this by amending xfs_log_writable to prevent writes when the AIL can't possible write anything into the filesystem. Note: As for the external log or the rt devices being readonly-- xfs_blkdev_get will complain about that if we aren't doing a norecovery mount. Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NBrian Foster <bfoster@redhat.com>
-
- 29 4月, 2021 8 次提交
-
-
由 Darrick J. Wong 提交于
The final parameter of filemap_write_and_wait_range is the end of the range to flush, not the length of the range to flush. Fixes: 46afb062 ("xfs: only flush the unshared range in xfs_reflink_unshare") Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NChandan Babu R <chandanrlinux@gmail.com> Reviewed-by: NBrian Foster <bfoster@redhat.com>
-
由 Brian Foster 提交于
The blocks used for allocation btrees (bnobt and countbt) are technically considered free space. This is because as free space is used, allocbt blocks are removed and naturally become available for traditional allocation. However, this means that a significant portion of free space may consist of in-use btree blocks if free space is severely fragmented. On large filesystems with large perag reservations, this can lead to a rare but nasty condition where a significant amount of physical free space is available, but the majority of actual usable blocks consist of in-use allocbt blocks. We have a record of a (~12TB, 32 AG) filesystem with multiple AGs in a state with ~2.5GB or so free blocks tracked across ~300 total allocbt blocks, but effectively at 100% full because the the free space is entirely consumed by refcountbt perag reservation. Such a large perag reservation is by design on large filesystems. The problem is that because the free space is so fragmented, this AG contributes the 300 or so allocbt blocks to the global counters as free space. If this pattern repeats across enough AGs, the filesystem lands in a state where global block reservation can outrun physical block availability. For example, a streaming buffered write on the affected filesystem continues to allow delayed allocation beyond the point where writeback starts to fail due to physical block allocation failures. The expected behavior is for the delalloc block reservation to fail gracefully with -ENOSPC before physical block allocation failure is a possibility. To address this problem, set aside in-use allocbt blocks at reservation time and thus ensure they cannot be reserved until truly available for physical allocation. This allows alloc btree metadata to continue to reside in free space, but dynamically adjusts reservation availability based on internal state. Note that the logic requires that the allocbt counter is fully populated at reservation time before it is fully effective. We currently rely on the mount time AGF scan in the perag reservation initialization code for this dependency on filesystems where it's most important (i.e. with active perag reservations). Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NChandan Babu R <chandanrlinux@gmail.com> Reviewed-by: NAllison Henderson <allison.henderson@oracle.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Brian Foster 提交于
Introduce an in-core counter to track the sum of all allocbt blocks used by the filesystem. This value is currently tracked per-ag via the ->agf_btreeblks field in the AGF, which also happens to include rmapbt blocks. A global, in-core count of allocbt blocks is required to identify the subset of global ->m_fdblocks that consists of unavailable blocks currently used for allocation btrees. To support this calculation at block reservation time, construct a similar global counter for allocbt blocks, populate it on first read of each AGF and update it as allocbt blocks are used and released. Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NChandan Babu R <chandanrlinux@gmail.com> Reviewed-by: NAllison Henderson <allison.henderson@oracle.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Brian Foster 提交于
perag reservation is enabled at mount time on a per AG basis. The upcoming change to set aside allocbt blocks from block reservation requires a populated allocbt counter as soon as possible after mount to be fully effective against large perag reservations. Therefore as a preparation step, initialize the pagf on all mounts where at least one reservation is active. Note that this already occurs to some degree on most default format filesystems as reservation requirement calculations already depend on the AGF or AGI, depending on the reservation type. Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NChandan Babu R <chandanrlinux@gmail.com> Reviewed-by: NAllison Henderson <allison.henderson@oracle.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Darrick J. Wong 提交于
Since agf_btreeblks didn't exist before the lazysbcount feature, the fs summary count scrubber needs to walk the free space btrees to determine the amount of space being used by those btrees. Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NGao Xiang <hsiangkao@redhat.com>
-
由 Dave Chinner 提交于
Keep the mount superblock counters up to date for !lazysbcount filesystems so that when we log the superblock they do not need updating in any way because they are already correct. It's found by what Zorro reported: 1. mkfs.xfs -f -l lazy-count=0 -m crc=0 $dev 2. mount $dev $mnt 3. fsstress -d $mnt -p 100 -n 1000 (maybe need more or less io load) 4. umount $mnt 5. xfs_repair -n $dev and I've seen no problem with this patch. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reported-by: NZorro Lang <zlang@redhat.com> Reviewed-by: NGao Xiang <hsiangkao@redhat.com> Signed-off-by: NGao Xiang <hsiangkao@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NBrian Foster <bfoster@redhat.com>
-
由 Darrick J. Wong 提交于
The AGF free space btree block counter wasn't added until the lazysbcount feature was added to XFS midway through the life of the V4 format, so ignore the field when checking. Online AGF repair requires rmapbt, so it doesn't need the feature check. Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NDave Chinner <dchinner@redhat.com>
-
由 Darrick J. Wong 提交于
In commit f8f2835a we changed the behavior of XFS to use EFIs to remove blocks from an overfilled AGFL because there were complaints about transaction overruns that stemmed from trying to free multiple blocks in a single transaction. Unfortunately, that commit missed a subtlety in the debug-mode transaction accounting when a realtime volume is attached. If a realtime file undergoes a data fork mapping change such that realtime extents are allocated (or freed) in the same transaction that a data device block is also allocated (or freed), we can trip a debugging assertion. This can happen (for example) if a realtime extent is allocated and it is necessary to reshape the bmbt to hold the new mapping. When we go to allocate a bmbt block from an AG, the first thing the data device block allocator does is ensure that the freelist is the proper length. If the freelist is too long, it will trim the freelist to the proper length. In debug mode, trimming the freelist calls xfs_trans_agflist_delta() to record the decrement in the AG free list count. Prior to f8f28 we would put the free block back in the free space btrees in the same transaction, which calls xfs_trans_agblocks_delta() to record the increment in the AG free block count. Since AGFL blocks are included in the global free block count (fdblocks), there is no corresponding fdblocks update, so the AGFL free satisfies the following condition in xfs_trans_apply_sb_deltas: /* * Check that superblock mods match the mods made to AGF counters. */ ASSERT((tp->t_fdblocks_delta + tp->t_res_fdblocks_delta) == (tp->t_ag_freeblks_delta + tp->t_ag_flist_delta + tp->t_ag_btree_delta)); The comparison here used to be: (X + 0) == ((X+1) + -1 + 0), where X is the number blocks that were allocated. After commit f8f28 we defer the block freeing to the next chained transaction, which means that the calls to xfs_trans_agflist_delta and xfs_trans_agblocks_delta occur in separate transactions. The (first) transaction that shortens the free list trips on the comparison, which has now become: (X + 0) == ((X) + -1 + 0) because we haven't freed the AGFL block yet; we've only logged an intention to free it. When the second transaction (the deferred free) commits, it will evaluate the expression as: (0 + 0) == (1 + 0 + 0) and trip over that in turn. At this point, the astute reader may note that the two commits tagged by this patch have been in the kernel for a long time but haven't generated any bug reports. How is it that the author became aware of this bug? This originally surfaced as an intermittent failure when I was testing realtime rmap, but a different bug report by Zorro Lang reveals the same assertion occuring on !lazysbcount filesystems. The common factor to both reports (and why this problem wasn't previously reported) becomes apparent if we consider when xfs_trans_apply_sb_deltas is called by __xfs_trans_commit(): if (tp->t_flags & XFS_TRANS_SB_DIRTY) xfs_trans_apply_sb_deltas(tp); With a modern lazysbcount filesystem, transactions update only the percpu counters, so they don't need to set XFS_TRANS_SB_DIRTY, hence xfs_trans_apply_sb_deltas is rarely called. However, updates to the count of free realtime extents are not part of lazysbcount, so XFS_TRANS_SB_DIRTY will be set on transactions adding or removing data fork mappings to realtime files; similarly, XFS_TRANS_SB_DIRTY is always set on !lazysbcount filesystems. Dave mentioned in response to an earlier version of this patch: "IIUC, what you are saying is that this debug code is simply not exercised in normal testing and hasn't been for the past decade? And it still won't be exercised on anything other than realtime device testing? "...it was debugging code from 1994 that was largely turned into dead code when lazysbcounters were introduced in 2007. Hence I'm not sure it holds any value anymore." This debugging code isn't especially helpful - you can modify the flcount on one AG and the freeblks of another AG, and it won't trigger. Add the fact that nobody noticed for a decade, and let's just get rid of it (and start testing realtime :P). This bug was found by running generic/051 on either a V4 filesystem lacking lazysbcount; or a V5 filesystem with a realtime volume. Cc: bfoster@redhat.com, zlang@redhat.com Fixes: f8f2835a ("xfs: defer agfl block frees when dfops is available") Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NBrian Foster <bfoster@redhat.com>
-
- 23 4月, 2021 2 次提交
-
-
由 Christoph Hellwig 提交于
Rename struct xfs_legacy_ictimestamp to struct xfs_log_legacy_timestamp as it is a type used for logging timestamps with no relationship to the in-core inode. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Christoph Hellwig 提交于
Rename xfs_ictimestamp_t to xfs_log_timestamp_t as it is a type used for logging timestamps with no relationship to the in-core inode. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
-
- 16 4月, 2021 8 次提交
-
-
由 Darrick J. Wong 提交于
The function was renamed, so get rid of the declaration. Signed-off-by: NDarrick J. Wong <djwong@kernel.org> Reviewed-by: NDave Chinner <dchinner@redhat.com>
-
由 Christoph Hellwig 提交于
The in-memory XFS_IFEXTENTS is now only used to check if an inode with extents still needs the extents to be read into memory before doing operations that need the extent map. Add a new xfs_need_iread_extents helper that returns true for btree format forks that do not have any entries in the in-memory extent btree, and use that instead of checking the XFS_IFEXTENTS flag. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Christoph Hellwig 提交于
Just check for an inline format fork instead of the using the equivalent in-memory XFS_IFINLINE flag. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Christoph Hellwig 提交于
Just check for a btree format fork instead of the using the equivalent in-memory XFS_IFBROOT flag. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Christoph Hellwig 提交于
Stop using the XFS_IFEXTENTS flag, and instead switch on the fork format in xfs_idestroy_fork to decide how to cleanup. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Christoph Hellwig 提交于
Directly return from the subfunctions and avoid the error variable. Also remove the not really needed dp local variable. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Christoph Hellwig 提交于
xfs_bmap_one_block is only called for the attribute fork. Move it to xfs_attr.c, drop the unused whichfork argument and code only executed for the data fork and rename the result to xfs_attr_is_leaf. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Christoph Hellwig 提交于
Move the XFS_IFEXTENTS check from the callers into xfs_iread_extents to simplify the code. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
-
- 12 4月, 2021 1 次提交
-
-
由 Miklos Szeredi 提交于
Use the fileattr API to let the VFS handle locking, permission checking and conversion. Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com> Cc: Darrick J. Wong <djwong@kernel.org>
-
- 10 4月, 2021 2 次提交
-
-
由 Brian Foster 提交于
xfs_setfilesize() is the only remaining caller of the internal __xfs_setfilesize() helper. Fold them into a single function. Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
-
由 Brian Foster 提交于
XFS no longer attaches anthing to ioend->io_private. Remove the unnecessary ->io_private merging code. This removes the only remaining user of xfs_setfilesize_ioend() so remove that function as well. Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <djwong@kernel.org> Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
-