Commits · 0c6dda7a1cbd587e48bcef1999875e29549c2b41 · openeuler / Kernel

29 Jan, 2018 17 commits

iomap: warn on zero-length mappings · 0c6dda7a

Committed by Darrick J. Wong on Jan 26, 2018

Don't let the iomap callback get away with feeding us a garbage zero
length mapping -- there was a bug in xfs that resulted in those leaking
out to hilarious effect.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
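
The check described above can be pictured with a small user-space sketch. The struct and function names here are hypothetical stand-ins, not the kernel's iomap types; the only idea taken from the message is that a zero-length mapping is warned about and rejected instead of being passed along.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a mapping returned by a filesystem callback. */
struct mapping {
	uint64_t offset;	/* start of the mapping, in bytes */
	uint64_t length;	/* length of the mapping, in bytes */
};

/*
 * Reject a mapping that covers no bytes. Returns 0 if the mapping is
 * usable, or -EIO (after printing a warning) if its length is zero.
 */
static int check_mapping(const struct mapping *m)
{
	if (m->length == 0) {
		fprintf(stderr, "warning: zero-length mapping at offset %llu\n",
			(unsigned long long)m->offset);
		return -EIO;
	}
	return 0;
}
```

Returning an error at the point where the mapping is received keeps the bad value from reaching later arithmetic that assumes forward progress.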


xfs: treat CoW fork operations as delalloc for quota accounting · 4b4c1326

Committed by Darrick J. Wong on Jan 19, 2018

Since the CoW fork only exists in memory, it is incorrect to update the
on-disk quota block counts when we modify the CoW fork.  Unlike the data
fork, even real extents in the CoW fork are only delalloc-style
reservations (on-disk they're owned by the refcountbt) so they must not
be tracked in the on disk quota info.  Ensure the i_delayed_blks
accounting reflects this too.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>


xfs: only grab shared inode locks for source file during reflink · 01c2e13d

Committed by Darrick J. Wong on Jan 18, 2018

Reflink and dedupe operations remap blocks from a source file into a
destination file. The destination file needs exclusive locks on all
levels because we're updating its block map, but the source file isn't
undergoing any block map changes so we can use a shared lock.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>


xfs: allow xfs_lock_two_inodes to take different EXCL/SHARED modes · 7c2d238a

Committed by Darrick J. Wong on Jan 26, 2018

Refactor xfs_lock_two_inodes to take separate locking modes for each
inode.  Specifically, this enables us to take a SHARED lock on one inode
and an EXCL lock on the other.  The lock class (MMAPLOCK/ILOCK) must be
the same for each inode.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
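
A rough sketch of what per-inode lock modes could look like, in user-space C. It only plans the lock order and does not take real locks. The types and the function name are hypothetical, and ordering by inode number is an assumption made here for deadlock avoidance; the message itself does not say how the two inodes are ordered.

```c
#include <assert.h>
#include <stdint.h>

enum lock_mode { LOCK_SHARED, LOCK_EXCL };

/* Hypothetical inode stand-in; only the number matters for ordering. */
struct inode_stub {
	uint64_t ino;
};

/* One planned lock acquisition: which inode, and in which mode. */
struct lock_step {
	const struct inode_stub *ip;
	enum lock_mode mode;
};

/*
 * Fill steps[0..1] with the order in which to take the two locks.
 * Assumption: the lower inode number goes first, so two callers
 * locking the same pair agree on the order. Each inode keeps the
 * mode the caller requested for it.
 */
static void plan_two_locks(const struct inode_stub *a, enum lock_mode a_mode,
			   const struct inode_stub *b, enum lock_mode b_mode,
			   struct lock_step steps[2])
{
	if (a->ino > b->ino) {
		const struct inode_stub *tmp_ip = a;
		enum lock_mode tmp_mode = a_mode;

		a = b;
		a_mode = b_mode;
		b = tmp_ip;
		b_mode = tmp_mode;
	}
	steps[0].ip = a;
	steps[0].mode = a_mode;
	steps[1].ip = b;
	steps[1].mode = b_mode;
}
```

For a reflink-style caller this would mean requesting LOCK_SHARED for the source and LOCK_EXCL for the destination; the mode travels with the inode when the pair is reordered.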


xfs: reflink should break pnfs leases before sharing blocks · 1364b1d4

Committed by Darrick J. Wong on Jan 18, 2018

Before we share blocks between files, we need to break the pnfs leases
on the layout before we start slicing and dicing the block map.  The
structure of this function sets us up for the lock contention reduction
in the next patch.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>


xfs: don't clobber inobt/finobt cursors when xref with rmap · c47b74fb

Committed by Darrick J. Wong on Jan 23, 2018

Even if we can't use the inobt/finobt cursors to count the number of
inode btree blocks, we are never allowed to clobber the cursor of the
btree being checked, so don't do this.  Found by fuzzing level = ones
in xfs/364.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>


xfs: skip CoW writes past EOF when writeback races with truncate · 70c57dcd

Committed by Darrick J. Wong on Jan 24, 2018

Every so often we blow the ASSERT(type != XFS_IO_COW) in xfs_map_blocks
when running fsstress, as we do in generic/269. The cause of this is
writeback racing with truncate -- writeback doesn't take the iolock, so
truncate can sneak in to decrease i_size and truncate page cache while
writeback is gathering buffer heads to schedule writeout.

If we hit this race on a block that has a CoW mapping, we'll get a valid
imap from the CoW fork but the reduced i_size trims the mapping to zero
length (which makes it invalid), so we call xfs_map_blocks to try again.
This doesn't do much anyway, since any mapping we get out of that will
also be invalid, so we might as well skip the assert and just stop.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
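
The effect of a shrinking i_size on a mapping can be illustrated with a toy trim function. This is a user-space sketch with hypothetical names, not the writeback code; it only shows why a mapping that starts at or beyond the new EOF ends up with zero length, at which point the caller can stop instead of retrying.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical byte-range mapping. */
struct extent {
	uint64_t start;	/* first byte covered */
	uint64_t len;	/* number of bytes covered */
};

/*
 * Trim ext so that it ends no later than eof.
 * Returns true if any bytes remain, false if the mapping is now empty
 * and the caller should skip it.
 */
static bool trim_to_eof(struct extent *ext, uint64_t eof)
{
	if (ext->start >= eof) {
		ext->len = 0;		/* entirely past EOF */
		return false;
	}
	if (ext->len > eof - ext->start)
		ext->len = eof - ext->start;
	return ext->len > 0;
}
```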


xfs: preserve i_rdev when recycling a reclaimable inode · acd1d715

Committed by Amir Goldstein on Jan 26, 2018

Commit 66f36464 ("xfs: remove if_rdev") moved storing of rdev
value for special inodes to VFS inodes, but forgot to preserve the
value of i_rdev when recycling a reclaimable xfs_inode.

This was detected by xfstest overlay/017 with index=on mount option
and xfs base fs. The test does a lookup of overlay chardev and blockdev
right after drop caches.

Overlayfs inodes hold a reference on underlying xfs inodes when mount
option index=on is configured. If drop caches reclaims xfs inodes before
it reclaims overlayfs inodes, that can sometimes leave a reclaimable xfs
inode, and that test hits that case quite often.

When that happens, the xfs inode cache remains broken (zero i_rdev)
until the next mount cycle or drop caches.

Fixes: 66f36464 ("xfs: remove if_rdev")
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>


xfs: refactor accounting updates out of xfs_bmap_btalloc · 751f3767

Committed by Darrick J. Wong on Jan 25, 2018

Move all the inode and quota accounting updates out of xfs_bmap_btalloc
in preparation for fixing some quota accounting problems with copy on
write.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>


xfs: refactor inode verifier corruption error printing · 22431bf3

Committed by Darrick J. Wong on Jan 22, 2018

Refactor inode verifier error reporting into a non-libxfs function so
that we aren't encoding the message format in libxfs.  This also
changes the kernel dmesg output to resemble buffer verifier errors
more closely.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>


xfs: make tracepoint inode number format consistent · 67a3f6d0

Committed by Darrick J. Wong on Jan 22, 2018

Fix all the inode number formats to be consistently (0x%llx) in all
trace point definitions.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>


xfs: always zero di_flags2 when we free the inode · beaae8cd

Committed by Darrick J. Wong on Jan 22, 2018

Always zero the di_flags2 field when we free the inode so that we never
end up with an on-disk record for an unallocated inode that also has the
reflink iflag set.  This is in keeping with the general principle that
only files can have the reflink iflag set, even though we'll zero out
di_flags2 if we ever reallocate the inode.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>


xfs: call xfs_qm_dqattach before performing reflink operations · 09ac8623

Committed by Darrick J. Wong on Jan 19, 2018

Ensure that we've attached all the necessary dquots before performing
reflink operations so that quota accounting is accurate.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>


xfs: bmap code cleanup · 6ca30729

Committed by Shan Hai on Jan 23, 2018

Remove the extent size hint and realtime inode relevant code from
the xfs_bmapi_reserve_delalloc since it is not called on the inode
with extent size hint set or on a realtime inode.
Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>


Use list_head infra-structure for buffer's log items list · 643c8c05

Committed by Carlos Maiolino on Jan 24, 2018

Now that the buffer's b_fspriv has been split, replace the current
singly linked list of xfs_log_items with the list_head infrastructure.

Also, remove the xfs_log_item argument from xfs_buf_resubmit_failed_buffers();
the argument is no longer needed now that the log items can be walked
through the list_head in the buffer.
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Bill O'Donnell <billodo@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: minor style cleanups]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
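
For readers unfamiliar with the list_head pattern, here is a minimal user-space imitation of an intrusive doubly linked list. It is a sketch only: the helper names mirror the kernel's style but are reimplemented here, and the `log_item` and `buffer` structs are hypothetical stand-ins, not the xfs structures.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal re-creation of an intrusive circular list node. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Link item in just before head, i.e. at the tail of the list. */
static void list_add_tail(struct list_head *item, struct list_head *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

/* Hypothetical log item with an embedded list node. */
struct log_item {
	int id;
	struct list_head node;
};

/* Hypothetical buffer that owns a list of log items. */
struct buffer {
	struct list_head li_list;
};

/* Recover the containing log_item from its embedded node. */
static struct log_item *item_of(struct list_head *n)
{
	return (struct log_item *)((char *)n - offsetof(struct log_item, node));
}

/* Walk every item attached to the buffer; no item argument is needed. */
static int sum_item_ids(struct buffer *bp)
{
	int sum = 0;

	for (struct list_head *n = bp->li_list.next; n != &bp->li_list; n = n->next)
		sum += item_of(n)->id;
	return sum;
}
```

Because the buffer itself anchors the list, a function that needs all attached items only has to be handed the buffer, which is the reason the message gives for dropping the extra argument.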


Split buffer's b_fspriv field · fb1755a6

Committed by Carlos Maiolino on Jan 24, 2018

Splitting the b_fspriv field into two different fields (b_log_item
and b_li_list) makes it possible to get rid of an old ABI workaround:
the new b_log_item field stores the xfs_buf_log_item separately from
the log items attached to the buffer, which are linked in the new
b_li_list field.

This way, there is no more need to reorder the log items list to place
the buf_log_item at the beginning of the list, which simplifies the
logic that handles buffer IO a bit.

This also opens the possibility to change buffer's log items list into a
proper list_head.

b_log_item field is still defined as a void *, because it is still used
by the log buffers to store xlog_in_core structures, and there is no
need to add an extra field on xfs_buf just for xlog_in_core.
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Bill O'Donnell <billodo@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: minor style changes]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>


Get rid of xfs_buf_log_item_t typedef · 70a20655

Committed by Carlos Maiolino on Jan 24, 2018

Take advantage of the rework of the xfs_buf log items list to get rid of
the typedef for xfs_buf_log_item.

This patch also fixes some indentation alignment issues found along the way.
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Bill O'Donnell <billodo@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>


18 Jan, 2018 23 commits

xfs: fix non-debug build compiler warnings · 75d4a13b

Committed by Darrick J. Wong on Jan 16, 2018

Fix compiler warning on non-debug build
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>


xfs: check sb_agblocks and sb_agblklog when validating superblock · 4bb73d01

Committed by Darrick J. Wong on Jan 16, 2018

Currently, we don't check sb_agblocks or sb_agblklog when we validate
the superblock, which means that we can fuzz garbage values into those
values and the mount succeeds.  This leads to all sorts of UBSAN
warnings in xfs/350 since we can then coerce other parts of xfs into
shifting by ridiculously large values.

Once we've validated agblocks, make sure the agcount makes sense.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
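
As a rough illustration of this kind of geometry validation, the sketch below checks that a log2 field agrees with the block count it describes and that the AG count is consistent with the total size. The struct, the field names, and both rules are assumptions made for illustration (agblklog taken as the rounded-up log2 of agblocks, agcount as the rounded-up quotient of dblocks by agblocks); the exact checks the patch adds are not shown in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Smallest n such that (1 << n) >= v, for v >= 1. */
static unsigned int ceil_log2_u32(uint32_t v)
{
	unsigned int n = 0;

	while (n < 32 && (UINT64_C(1) << n) < v)
		n++;
	return n;
}

/* Hypothetical subset of superblock geometry fields. */
struct sb_geom {
	uint64_t dblocks;	/* total data blocks */
	uint32_t agblocks;	/* blocks per allocation group */
	uint32_t agcount;	/* number of allocation groups */
	uint8_t agblklog;	/* log2 of agblocks, rounded up */
};

/* Return true if the geometry fields are mutually consistent. */
static bool sb_geom_ok(const struct sb_geom *sb)
{
	uint64_t want_agcount;

	if (sb->agblocks == 0)
		return false;
	if (sb->agblklog != ceil_log2_u32(sb->agblocks))
		return false;	/* a fuzzed shift count is caught here */

	/* Only once agblocks is trusted does the agcount check make sense. */
	want_agcount = sb->dblocks / sb->agblocks +
		       (sb->dblocks % sb->agblocks != 0);
	return sb->agcount == want_agcount;
}
```

Validating the shift count up front means later code that shifts by agblklog never sees a value wider than the type it shifts.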


xfs: recheck reflink / dirty page status before freeing CoW reservations · be78ff0e

Committed by Darrick J. Wong on Jan 16, 2018

Eryu Guan reported seeing occasional hangs when running generic/269 with
a new fsstress that supports clonerange/deduperange. The cause of this
hang is an infinite loop when we convert the CoW fork extents from
unwritten to real just prior to writing the pages out; the infinite
loop happens because there's nothing in the CoW fork to convert, and so
it spins forever.

The fundamental issue here is that when we go to perform these CoW fork
conversions, we're supposed to have an extent waiting for us, but the
low space CoW reaper has snuck in and blown them away! There are four
conditions that can dissuade the reaper from touching our file -- no
reflink iflag; dirty page cache; writeback in progress; or directio in
progress. We check the four conditions prior to taking the locks, but
we neglect to recheck them once we have the locks, which is how we end
up whacking the writeback that's in progress.

Therefore, refactor the four checks into a helper function and call it
once again once we have the locks to make sure we really want to reap
the inode. While we're at it, add an ASSERT for this weird condition so
that we'll fail noisily if we ever screw this up again.
Reported-by: Eryu Guan <eguan@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Tested-by: Eryu Guan <eguan@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
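
The check, lock, recheck pattern described above can be sketched in user space as follows. Everything here is hypothetical: the struct, the helper names, and the callback that stands in for lock acquisition (during which another thread may change the file's state). Only the four conditions come from the message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the state the reaper cares about. */
struct file_state {
	bool reflinked;		/* reflink iflag set */
	bool dirty_pages;	/* dirty page cache */
	bool writeback;		/* writeback in progress */
	bool directio;		/* directio in progress */
};

/* The four checks, factored into one helper so they can be repeated. */
static bool can_reap(const struct file_state *f)
{
	return f->reflinked && !f->dirty_pages && !f->writeback && !f->directio;
}

/*
 * Decide whether to reap. take_locks stands in for acquiring the
 * locks; the state may change while it runs, so the helper is called
 * again afterwards.
 */
static bool try_reap(struct file_state *f,
		     void (*take_locks)(struct file_state *))
{
	if (!can_reap(f))
		return false;	/* cheap unlocked check */
	take_locks(f);
	if (!can_reap(f))
		return false;	/* lost a race while waiting for the locks */
	return true;
}

/* Simulation helpers: an uneventful lock, and one that races with writeback. */
static void lock_quietly(struct file_state *f)
{
	(void)f;
}

static void lock_while_writeback_starts(struct file_state *f)
{
	f->writeback = true;
}
```

The unlocked check is only an optimization; the answer that counts is the one computed while holding the locks.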


xfs: check that br_blockcount doesn't overflow · a5f460b1

Committed by Darrick J. Wong on Jan 16, 2018

xfs_bmbt_irec.br_blockcount is declared as xfs_filblks_t, which is an
unsigned 64-bit integer.  Though the bmbt helpers will never set a value
larger than 2^21 (since the underlying on-disk extent record has a
length field that is only 21 bits wide), we should be a little defensive
about checking that a bmbt record doesn't exceed what we're expecting or
overflow into the next AG.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
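
A small sketch of the two bounds mentioned above: the length must fit in a 21-bit field, and the extent must not run past the end of its AG. The macro and function names are hypothetical, and the arguments are simplified to AG-relative block numbers; this is not the scrub code itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Largest length representable in a 21-bit on-disk field. */
#define SKETCH_MAX_EXTENT_LEN ((UINT64_C(1) << 21) - 1)

/*
 * agbno:      first block of the extent, relative to the start of its AG
 * blockcount: length of the extent in blocks
 * agblocks:   number of blocks in the AG
 */
static bool extent_len_ok(uint64_t agbno, uint64_t blockcount,
			  uint64_t agblocks)
{
	if (blockcount > SKETCH_MAX_EXTENT_LEN)
		return false;	/* wider than the on-disk field allows */
	if (agbno >= agblocks)
		return false;	/* starts outside the AG */
	/* Written as a subtraction so the comparison cannot overflow. */
	return blockcount <= agblocks - agbno;
}
```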


xfs: btree format ifork loader should check for zero numrecs · 55e45429

Committed by Darrick J. Wong on Jan 16, 2018

A btree format inode fork with zero records makes no sense, so reject it
if we see it, or else we can miscalculate memory allocations. Found by
zeroes fuzzing {a,u3}.bmbt.numrecs in xfs/{374,378,412} with KASAN.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>


xfs: attr leaf verifier needs to check for obviously bad count · 79a69bf8

Committed by Darrick J. Wong on Jan 16, 2018

In the attribute leaf verifier, we can check for obviously bad values of
firstused and count so that later attempts at lasthash don't run off the
end of the memory buffer.  Found by ones fuzzing hdr.count in xfs/400 with
KASAN.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>


xfs: directory scrubber must walk through data block to offset · ce92d29d

Committed by Darrick J. Wong on Jan 16, 2018

In xfs_scrub_dir_rec, we must walk through the directory block entries
to arrive at the offset given by the hash structure.  If we blindly
trust the hash address, we can end up midway into a directory entry and
stray outside the block.  Found by lastbit fuzzing lents[3].address in
xfs/390 with KASAN enabled.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
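
The idea of walking entries to reach an offset, rather than trusting the offset, can be sketched with variable-length records packed back to back. The layout below (an array of entry lengths starting at offset 0) is a hypothetical simplification and not the directory data block format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * lens[i] is the byte length of entry i; entries are packed back to
 * back starting at offset 0 of a block of block_size bytes.
 * Returns true only if target is exactly the start of some entry.
 */
static bool offset_is_entry_start(const size_t *lens, size_t nr,
				  size_t block_size, size_t target)
{
	size_t off = 0;

	for (size_t i = 0; i < nr; i++) {
		if (off == target)
			return true;
		/* A zero or oversized length means the block is corrupt. */
		if (lens[i] == 0 || lens[i] > block_size - off)
			return false;
		off += lens[i];
		if (off > target)
			return false;	/* target lands inside an entry */
	}
	return false;	/* walked every entry without reaching target */
}
```

Because the walk validates each length before advancing, it never steps outside the block, even when the target offset itself is garbage.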


xfs: don't iunlock unlocked inodes · 638a7174

Committed by Darrick J. Wong on Jan 16, 2018

Don't iunlock an unlocked inode, which can happen if the parent pointer
scrubber bails out with sc->ip unlocked while trying to grab the parent
directory inode.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>


xfs: scrub in-core metadata · cf1b0b8b

Committed by Darrick J. Wong on Jan 16, 2018

Whenever we load a buffer, explicitly re-call the structure verifier to
ensure that memory isn't corrupting things.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>


xfs: cross-reference the block mappings when possible · 561f648a

Committed by Darrick J. Wong on Jan 16, 2018

Use an inode's block mappings to cross-reference inode block counters.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>


xfs: cross-reference the realtime bitmap · 46d9bfb5

Committed by Darrick J. Wong on Jan 16, 2018

While we're scrubbing various btrees, cross-reference the records
with the other metadata.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>


xfs: cross-reference refcount btree during scrub · f6d5fc21

Committed by Darrick J. Wong on Jan 16, 2018

During metadata btree scrub, we should cross-reference with the
reference counts.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>


xfs: cross-reference the rmapbt data with the refcountbt · dbde19da

Committed by Darrick J. Wong on Jan 16, 2018

Cross reference the refcount data with the rmap data to check that the
number of rmaps for a given block matches the refcount of that block, and
that CoW blocks (which are owned entirely by the refcountbt) are tracked
as well.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
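
The invariant being checked, that the number of reverse mappings covering a block equals its reference count, can be shown with a toy in-memory version. The record layout and function names are hypothetical, and a linear scan over an array stands in for the btree lookups the scrubber would actually perform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reverse-mapping record: owner maps [start, start + len). */
struct rmap_rec {
	uint64_t start;
	uint64_t len;
	uint64_t owner;
};

/* Count how many reverse mappings cover the given block. */
static unsigned int count_owners(const struct rmap_rec *recs, size_t nr,
				 uint64_t block)
{
	unsigned int n = 0;

	for (size_t i = 0; i < nr; i++) {
		if (block >= recs[i].start &&
		    block - recs[i].start < recs[i].len)
			n++;
	}
	return n;
}

/* A block's refcount is consistent if it equals its number of owners. */
static bool refcount_matches(const struct rmap_rec *recs, size_t nr,
			     uint64_t block, unsigned int refcount)
{
	return count_owners(recs, nr, block) == refcount;
}
```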


xfs: cross-reference reverse-mapping btree · d852657c

Committed by Darrick J. Wong on Jan 16, 2018

When scrubbing various btrees, we should cross-reference the records
with the reverse mapping btree and ensure that traversing the btree
finds the same number of blocks that the rmapbt thinks are owned by
that btree.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>


xfs: cross-reference inode btrees during scrub · 2e6f2756

Committed by Darrick J. Wong on Jan 16, 2018

Cross-reference the inode btrees with the other metadata when we
scrub the filesystem.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>


xfs: cross-reference bnobt records with cntbt · e1134b12

Committed by Darrick J. Wong on Jan 16, 2018

Scrub should make sure that each bnobt record has a corresponding
cntbt record.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>


xfs: cross-reference with the bnobt · 52dc4b44

Committed by Darrick J. Wong on Jan 16, 2018

When we're scrubbing various btrees, cross-reference the records with
the bnobt to ensure that we don't also think the space is free.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>


xfs: introduce scrubber cross-referencing stubs · 166d7641

Committed by Darrick J. Wong on Jan 16, 2018

Create some stubs that will be used to cross-reference metadata records.
The actual cross-referencing will be filled in by subsequent patches.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>


xfs: check btree block ownership with bnobt/rmapbt when scrubbing btree · 858333dc

Committed by Darrick J. Wong on Jan 16, 2018

When scanning a metadata btree block, cross-reference the block location
with the free space btree and the reverse mapping btree to ensure that
the rmapbt knows about the block and the bnobt does not.  Add a
mechanism to defer checks when we happen to be scanning the bnobt/rmapbt
itself because it's less efficient to repeatedly clone and destroy the
cursor.

This patch provides the framework to make btree block owner checks
happen; the actual meat will be added in subsequent patches.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>


xfs: fix a few erroneous process_error calls in the scrubbers · 9a7e2695

Committed by Darrick J. Wong on Jan 16, 2018

There are a few places where we make a libxfs api call on behalf of some
object other than the one we're scrubbing but inadvertently call the
regular process_error function. When this happens we mark the object
corrupt even though it was corruption in /some other/ object that
actually produced the -EFSCORRUPTED code. The correct output flag for
these situations is SCRUB_OFLAG_XFAIL, not SCRUB_OFLAG_CORRUPT, so fix
this now that we also have a helper to set these.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
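
The distinction between the two output flags can be sketched as below. The flag values, the error constant, and the function are all hypothetical stand-ins chosen for this illustration; the real scrub helpers have different signatures. The sketch only shows the rule from the message: a corruption error hit while examining some other object marks a cross-referencing failure, not corruption of the object being scrubbed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical output flags; the values are illustrative only. */
#define SKETCH_OFLAG_CORRUPT	(1u << 0)	/* object being scrubbed is bad */
#define SKETCH_OFLAG_XFAIL	(1u << 1)	/* cross-referencing failed */

/* Hypothetical stand-in for the kernel's corruption error code. */
#define SKETCH_EFSCORRUPTED	1000

/*
 * Record the outcome of a library call made during scrub.
 * during_xref is true when the call was made on behalf of an object
 * other than the one being scrubbed.
 * Returns 0 if the error was absorbed into flags, else the error.
 */
static int record_error(unsigned int *flags, int error, bool during_xref)
{
	if (error == 0)
		return 0;
	if (error == -SKETCH_EFSCORRUPTED) {
		*flags |= during_xref ? SKETCH_OFLAG_XFAIL
				      : SKETCH_OFLAG_CORRUPT;
		return 0;
	}
	return error;	/* unrelated errors are passed back to the caller */
}
```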


xfs: set up scrub cross-referencing helpers · 64b12563

Committed by Darrick J. Wong on Jan 16, 2018

Create some helper functions that we'll use later to deal with problems
we might encounter while cross referencing metadata with other metadata.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>


xfs: add scrub cross-referencing helpers for the refcount btrees · 49db55ec

Committed by Darrick J. Wong on Jan 16, 2018

Add a couple of functions to the refcount btrees that will be used
to cross-reference metadata against the refcountbt.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>


xfs: add scrub cross-referencing helpers for the rmap btrees · ed7c52d4

Committed by Darrick J. Wong on Jan 16, 2018

Add a couple of functions to the rmap btrees that will be used
to cross-reference metadata against the rmapbt.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

