Commits · 855e57a11966722c67e631ae530c03e251fbcf8a · openeuler / Kernel

03 Jun, 2020 1 commit

iomap: convert from readpages to readahead · 9d24a13a

Authored by Matthew Wilcox (Oracle) on Jun 01, 2020

Use the new readahead operation in iomap.  Convert XFS and ZoneFS to use
it.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
Cc: Chao Yu <yuchao0@huawei.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Gao Xiang <gaoxiang25@huawei.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Link: http://lkml.kernel.org/r/20200414150233.24495-26-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


17 Apr, 2020 1 commit

xfs: move inode flush to the sync workqueue · f0f7a674

Authored by Darrick J. Wong on Apr 12, 2020

Move the inode dirty data flushing to a workqueue so that multiple
threads can take advantage of a single thread's flushing work. The
ratelimiting technique used in bdd4ee4 was not successful, because
threads that skipped the inode flush scan due to ratelimiting would
ENOSPC early, which caused occasional (but noticeable) changes in
behavior and sporadic fstest regressions.

Therefore, make all the writer threads wait on a single inode flush,
which eliminates both the stampeding hordes of flushers and the small
window in which a write could fail with ENOSPC because it lost the
ratelimit race even after another thread had freed space.

Fixes: c6425702 ("xfs: ratelimit inode flush on buffered write ENOSPC")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>

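The coalescing idea in this commit message can be sketched as a toy, single-threaded model in plain C. This is an illustration only, not the XFS code: `toy_work`, `toy_queue_work`, and `toy_flush_work` are hypothetical names loosely modeled on workqueue-style calls, and a real waiter would sleep until the worker finishes instead of running the work itself.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a work item; not a kernel structure. */
struct toy_work {
    bool pending;   /* a flush has been requested but has not run yet */
    int  runs;      /* how many times the flush body has executed */
};

/* Request a flush. Returns false if one is already pending, so any
 * number of callers end up sharing a single queued flush. */
static bool toy_queue_work(struct toy_work *w)
{
    if (w->pending)
        return false;
    w->pending = true;
    return true;
}

/* Wait for the pending flush. In this single-threaded model the caller
 * runs it directly; a real waiter would block until the worker is done. */
static void toy_flush_work(struct toy_work *w)
{
    if (!w->pending)
        return;
    w->runs++;
    w->pending = false;
}
```

If three writers call `toy_queue_work()` before anyone calls `toy_flush_work()`, the flush body still runs once, and no writer proceeds without a flush having happened on its behalf.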

13 Apr, 2020 2 commits

xfs: fix partially uninitialized structure in xfs_reflink_remap_extent · c142932c

Authored by Darrick J. Wong on Apr 12, 2020

In the reflink extent remap function, it turns out that uirec (the block
mapping corresponding only to the part of the passed-in mapping that got
unmapped) was not fully initialized.  Specifically, br_state was not
being copied from the passed-in struct to the uirec.  This could lead to
unpredictable results such as the reflinked mapping being marked
unwritten in the destination file.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>

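The class of bug described here, a derived record that copies some fields but forgets one, can be illustrated with a small C sketch. The `toy_irec` struct is a simplified stand-in whose field names follow the commit message, and `toy_trim_mapping` is a hypothetical helper; the actual fix is only described above as copying `br_state`, whereas this sketch copies the whole struct first, which is one way to avoid the problem.

```c
#include <stdint.h>

/* Simplified stand-in for a block mapping record (an assumption for
 * illustration, not the kernel definition). */
struct toy_irec {
    uint64_t br_startoff;    /* file offset of the mapping */
    uint64_t br_startblock;  /* starting disk block */
    uint64_t br_blockcount;  /* length in blocks */
    int      br_state;       /* written vs. unwritten */
};

/* Build a record describing only the first `len` blocks of `in`.
 * Starting from a full struct copy means every field, including
 * br_state, is initialized before individual fields are adjusted. */
static struct toy_irec toy_trim_mapping(const struct toy_irec *in, uint64_t len)
{
    struct toy_irec out = *in;

    if (len < out.br_blockcount)
        out.br_blockcount = len;
    return out;
}
```

Filling `out` field by field and omitting `br_state` would leave that member holding whatever was on the stack, which is the kind of unpredictable result the commit message mentions.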

xfs: acquire superblock freeze protection on eofblocks scans · 4b674b9a

Authored by Brian Foster on Apr 12, 2020

The filesystem freeze sequence in XFS waits on any background
eofblocks or cowblocks scans to complete before the filesystem is
quiesced. At this point, the freezer has already stopped the
transaction subsystem, however, which means a truncate or cowblock
cancellation in progress is likely blocked in transaction
allocation. This results in a deadlock between freeze and the
associated scanner.

Fix this problem by holding superblock write protection across calls
into the block reapers. Since protection for background scans is
acquired from the workqueue task context, trylock to avoid a similar
deadlock between freeze and blocking on the write lock.

Fixes: d6b636eb ("xfs: halt auto-reclamation activities while rebuilding rmap")
Reported-by: Paul Furtado <paulfurtado91@gmail.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Chandan Rajendra <chandanrlinux@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Allison Collins <allison.henderson@oracle.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

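The trylock approach for background scans can be sketched in userspace C with C11 atomics. Everything here is a toy model under stated assumptions: `toy_sb`, `toy_start_write_trylock`, `toy_end_write`, and `toy_background_scan` are hypothetical names, and the real freeze protection is considerably more involved.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy superblock: a freeze flag plus a count of active writers. */
struct toy_sb {
    atomic_bool frozen;
    atomic_int  writers;
};

/* Try to take write protection without blocking. Announce ourselves
 * first, then back out if a freeze is in progress. */
static bool toy_start_write_trylock(struct toy_sb *sb)
{
    atomic_fetch_add(&sb->writers, 1);
    if (atomic_load(&sb->frozen)) {
        atomic_fetch_sub(&sb->writers, 1);
        return false;
    }
    return true;
}

static void toy_end_write(struct toy_sb *sb)
{
    atomic_fetch_sub(&sb->writers, 1);
}

/* Background reaper: skip this pass instead of blocking on the freeze,
 * because the freezer may itself be waiting for this worker to finish. */
static bool toy_background_scan(struct toy_sb *sb, int *reaped)
{
    if (!toy_start_write_trylock(sb))
        return false;
    (*reaped)++;            /* placeholder for the real reclaim work */
    toy_end_write(sb);
    return true;
}
```

Skipping a pass is cheap for a periodic scan, since the work is simply retried on the next run, whereas blocking in the worker is what would recreate the deadlock.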

06 Apr, 2020 2 commits

xfs: reflink should force the log out if mounted with wsync · 5833112d

Authored by Christoph Hellwig on Apr 03, 2020

Reflink should force the log out to disk if the filesystem was mounted
with wsync, the same as most other operations in xfs.

[Note: XFS_MOUNT_WSYNC is set when the admin mounts the filesystem
with either the 'wsync' or 'sync' mount options, which effectively means
that we're classifying reflink/dedupe as IO operations and making them
synchronous when required.]

Fixes: 3fc9f5e4 ("xfs: remove xfs_reflink_remap_range")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
[darrick: add more to the changelog]
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>


xfs: factor out a new xfs_log_force_inode helper · 54fbdd10

Authored by Christoph Hellwig on Apr 03, 2020

Create a new helper to force the log up to the last LSN touching an
inode.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>


02 Apr, 2020 1 commit

xfs: fix inode number overflow in ifree cluster helper · d9fdd0ad

Authored by Brian Foster on Apr 02, 2020

Qian Cai reports seemingly random buffer read verifier errors during
filesystem writeback. This was isolated to a recent patch that
factored out some inode cluster freeing code and happened to cast an
unsigned inode number type to a signed value. If the inode number
value overflows, we can skip marking in-core inodes associated with
the underlying buffer stale at the time the physical inodes are
freed. If such an inode happens to be dirty, xfsaild will eventually
attempt to write it back over non-inode blocks. The invalidation of
the underlying inode buffer causes writeback to read the buffer from
disk. This fails the read verifier (preventing eventual corruption)
if the buffer no longer looks like an inode cluster. Analysis by
Dave Chinner.

Fix up the helper to use the proper type for inode number values.

Fixes: 5806165a ("xfs: factor inode lookup from xfs_ifree_cluster")
Reported-by: Qian Cai <cai@lca.pw>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

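The effect of passing an inode number through a narrower signed type can be shown with a short C sketch. It assumes a 64-bit unsigned inode number type (the `toy_ino_t` typedef is a stand-in) and a 32-bit `int`; `toy_ino_survives_int` is a hypothetical helper, not part of XFS.

```c
#include <stdint.h>

/* Assumption for illustration: inode numbers are 64-bit unsigned. */
typedef uint64_t toy_ino_t;

/* Returns 1 if the inode number is unchanged after a round trip
 * through a signed int, 0 if the narrowing lost information. */
static int toy_ino_survives_int(toy_ino_t ino)
{
    /* Implementation-defined result when ino does not fit in int. */
    int narrowed = (int)ino;

    return (toy_ino_t)narrowed == ino;
}
```

Small inode numbers round-trip cleanly, which is why this kind of bug can stay hidden until a filesystem is large enough to hand out inode numbers above `INT_MAX`; at that point a lookup keyed on the narrowed value refers to the wrong inode or to none at all.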

31 Mar, 2020 2 commits

xfs: remove redundant variable assignment in xfs_symlink() · d8fcb6f1

Authored by Kaixu Xia on Mar 29, 2020

The variables 'udqp' and 'gdqp' have been initialized, so remove
redundant variable assignment in xfs_symlink().

Signed-off-by: Kaixu Xia <kaixuxia@tencent.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>


xfs: ratelimit inode flush on buffered write ENOSPC · c6425702

Authored by Darrick J. Wong on Mar 27, 2020

A customer reported rcu stalls and softlockup warnings on a computer
with many CPU cores and many many more IO threads trying to write to a
filesystem that is totally out of space. Subsequent analysis pointed to
the many many IO threads calling xfs_flush_inodes -> sync_inodes_sb,
which causes a lot of wb_writeback_work to be queued. The writeback
worker spends so much time trying to wake the many many threads waiting
for writeback completion that it trips the softlockup detector, and (in
this case) the system automatically reboots.

In addition, they complain that the lengthy xfs_flush_inodes scan traps
all of those threads in uninterruptible sleep, which hampers their
ability to kill the program or do anything else to escape the situation.

If there are thousands of threads trying to write to files on a full
filesystem, each of those threads will start separate copies of the
inode flush scan. This is kind of pointless since we only need one
scan, so rate limit the inode flush.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

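A minimal sketch of interval-based rate limiting, in plain C, shows what "rate limit the inode flush" means in practice. The `toy_ratelimit` struct and `toy_ratelimit_ok` function are hypothetical and use an abstract tick count supplied by the caller; they are not the kernel's ratelimit API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical rate limiter: allow at most one action per interval. */
struct toy_ratelimit {
    uint64_t interval;  /* minimum ticks between allowed actions */
    uint64_t last;      /* tick of the last allowed action */
    bool     primed;    /* false until the first action is allowed */
};

/* Returns true if the caller may start a flush scan at time `now`,
 * false if a scan was started too recently and should be skipped. */
static bool toy_ratelimit_ok(struct toy_ratelimit *rl, uint64_t now)
{
    if (rl->primed && now - rl->last < rl->interval)
        return false;
    rl->last = now;
    rl->primed = true;
    return true;
}
```

Callers that get `false` skip the scan entirely rather than waiting for the one in progress, so they continue without knowing whether any space has been freed yet.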

29 Mar, 2020 3 commits

xfs: return locked status of inode buffer on xfsaild push · d4bc4c5f

Authored by Brian Foster on Mar 27, 2020

If the inode buffer backing a particular inode is locked,
xfs_iflush() returns -EAGAIN and xfs_inode_item_push() skips the
inode. It still returns success to xfsaild, however, which bypasses
the xfsaild backoff heuristic. Update xfs_inode_item_push() to
return locked status if the inode buffer couldn't be locked.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

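The change in return value can be sketched as a small mapping from the flush result to a push status. The enum and function below are hypothetical toy names chosen for illustration; they only mirror the behavior described in the commit message.

```c
#include <errno.h>

/* Hypothetical push results reported back to the pushing thread. */
enum toy_push_result {
    TOY_ITEM_SUCCESS,  /* item was flushed (or queued for write) */
    TOY_ITEM_LOCKED    /* could not lock the backing buffer; back off */
};

/* Translate the flush return code into a push status. Previously a
 * locked buffer (-EAGAIN) would have been reported as success. */
static enum toy_push_result toy_inode_item_push(int flush_error)
{
    if (flush_error == -EAGAIN)
        return TOY_ITEM_LOCKED;
    return TOY_ITEM_SUCCESS;
}
```

Reporting the locked case separately lets the caller count contended items and apply its backoff heuristic instead of immediately retrying the same list.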

xfs: trylock underlying buffer on dquot flush · 8d3d7e2b

Authored by Brian Foster on Mar 27, 2020

A dquot flush currently blocks on the buffer lock for the underlying
dquot buffer. In turn, this causes xfsaild to block rather than
continue processing other items in the meantime. Update
xfs_qm_dqflush() to trylock the buffer, similar to how inode buffers
are handled, and return -EAGAIN if the lock fails. Fix up any
callers that don't currently handle the error properly.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

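The trylock-and-return pattern described above can be sketched with a toy buffer in plain C. `toy_buf`, `toy_buf_trylock`, and `toy_dqflush` are hypothetical names, and the boolean "lock" only models the control flow; it is not a real synchronization primitive.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy buffer: a lock flag and a counter of completed flushes. */
struct toy_buf {
    bool locked;
    int  writes;
};

/* Non-blocking lock attempt: fail immediately if already held. */
static bool toy_buf_trylock(struct toy_buf *bp)
{
    if (bp->locked)
        return false;
    bp->locked = true;
    return true;
}

/* Flush into the backing buffer, or return -EAGAIN without blocking
 * so the caller can move on to other items and retry later. */
static int toy_dqflush(struct toy_buf *bp)
{
    if (!toy_buf_trylock(bp))
        return -EAGAIN;
    bp->writes++;          /* placeholder for copying data into the buffer */
    bp->locked = false;
    return 0;
}
```

Any caller of such a function has to treat `-EAGAIN` as "try again later" rather than as a hard failure, which is why the commit also adjusts callers that did not handle the error.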

xfs: remove unnecessary ternary from xfs_create · 63337b63

Authored by Kaixu Xia on Mar 27, 2020

Since the "no-allocation" reservations for file creations have
been removed, the resblks value should be larger than zero, so
remove unnecessary ternary conditional.

Signed-off-by: Kaixu Xia <kaixuxia@tencent.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: s/judgment/ternary/]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>


27 Mar, 2020 18 commits

xfs: don't write a corrupt unmount record to force summary counter recalc · 5cc3c006

Authored by Darrick J. Wong on Mar 26, 2020

In commit f467cad9, I added the ability to force a recalculation of
the filesystem summary counters if they seemed incorrect.  This was done
(not entirely correctly) by tweaking the log code to write an unmount
record without the UMOUNT_TRANS flag set.  At next mount, the log
recovery code will fail to find the unmount record and go into recovery,
which triggers the recalculation.

What actually gets written to the log is what ought to be an unmount
record, but without any flags set to indicate what kind of record it
actually is.  This worked to trigger the recalculation, but we shouldn't
write bogus log records when we could simply write nothing.

Fixes: f467cad9 ("xfs: force summary counter recalc at next mount")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>


xfs: factor inode lookup from xfs_ifree_cluster · 5806165a