Commits · cb7d58594412fff106cde550dd9e0a7999cc2a0c · openeuler / Kernel

05 May, 2020 1 commit

xfs: remove the xfs_inode_log_item_t typedef · fd9cbe51

Committed by Christoph Hellwig on Apr 30, 2020

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

19 Mar, 2020 1 commit

xfs: only check the superblock version for dinode size calculation · e9e2eae8

Committed by Christoph Hellwig on Mar 18, 2020

The size of the dinode structure is only dependent on the file system
version, so instead of checking the individual inode version just use
the newly added xfs_sb_version_has_large_dinode helper, and simplify
various calling conventions.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Chandan Rajendra <chandanrlinux@gmail.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

19 Nov, 2019 1 commit

xfs: Remove kmem_zone_free() wrapper · 377bcd5f

Committed by Carlos Maiolino on Nov 14, 2019

We can remove it now, without needing to rework the KM_ flags.

Use kmem_cache_free() directly.
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

05 Nov, 2019 1 commit

xfs: always log corruption errors · a5155b87

Committed by Darrick J. Wong on Nov 02, 2019

Make sure we log something to dmesg whenever we return -EFSCORRUPTED up
the call stack.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

22 Oct, 2019 1 commit

xfs: fix inode fork extent count overflow · 3f8a4f1d

Committed by Dave Chinner on Oct 17, 2019

[commit message is verbose for discussion purposes - will trim it
down later. Some questions about implementation details at the end.]

Zorro Lang recently ran a new test to stress single inode extent
counts now that they are no longer limited by memory allocation.
The test was simply:

# xfs_io -f -c "falloc 0 40t" /mnt/scratch/big-file
# ~/src/xfstests-dev/punch-alternating /mnt/scratch/big-file

This test uncovered a problem where the hole punching operation
appeared to finish with no error, but apparently only created 268M
extents instead of the 10 billion it was supposed to.

Further, trying to punch out extents that should have been present
resulted in success, but no change in the extent count. It looked
like a silent failure.

While running the test and observing the behaviour in real time,
I observed the extent count growing at ~2M extents/minute, and saw
this after about an hour:

# xfs_io -f -c "stat" /mnt/scratch/big-file |grep next ; \
> sleep 60 ; \
> xfs_io -f -c "stat" /mnt/scratch/big-file |grep next
fsxattr.nextents = 127657993
fsxattr.nextents = 129683339
#

And a few minutes later this:

# xfs_io -f -c "stat" /mnt/scratch/big-file |grep next
fsxattr.nextents = 4177861124
#

Ah, what? Where did those 4 billion extra extents suddenly come from?

Stop the workload, unmount, mount:

# xfs_io -f -c "stat" /mnt/scratch/big-file |grep next
fsxattr.nextents = 166044375
#

And it's back at the expected number. i.e. the extent count is
correct on disk, but it's screwed up in memory. I loaded up the
extent list, and immediately:

# xfs_io -f -c "stat" /mnt/scratch/big-file |grep next
fsxattr.nextents = 4192576215
#

It's bad again. So, where does that number come from?
xfs_fill_fsxattr():

                if (ip->i_df.if_flags & XFS_IFEXTENTS)
                        fa->fsx_nextents = xfs_iext_count(&ip->i_df);
                else
                        fa->fsx_nextents = ip->i_d.di_nextents;

And that's the behaviour I just saw in a nutshell. The on disk count
is correct, but once the tree is loaded into memory, it goes whacky.
Clearly there's something wrong with xfs_iext_count():

inline xfs_extnum_t xfs_iext_count(struct xfs_ifork *ifp)
{
        return ifp->if_bytes / sizeof(struct xfs_iext_rec);
}

Simple enough, but 134M extents is 2**27, and that's right about
where things went wrong. A struct xfs_iext_rec is 16 bytes in size,
which means 2**27 * 2**4 = 2**31 and we're right on target for an
integer overflow. And, sure enough:

struct xfs_ifork {
        int                     if_bytes;       /* bytes in if_u1 */
....

Once we get 2**27 extents in a file, we overflow if_bytes and the
in-core extent count goes wrong. And when we reach 2**28 extents,
if_bytes wraps back to zero and things really start to go wrong
there. This is where the silent failure comes from - only the first
2**28 extents can be looked up directly due to the overflow, all the
extents above this index wrap back to somewhere in the first 2**28
extents. Hence with a regular pattern, trying to punch a hole in the
range that didn't have holes mapped to a hole in the first 2**28
extents and so "succeeded" without changing anything. Hence "silent
failure"...

Fix this by converting if_bytes to an int64_t and converting all the
index variables and size calculations to use int64_t types to avoid
overflows in future. Signed integers are still used to enable easy
detection of extent count underflows. This enables scalability of
extent counts to the limits of the on-disk format - MAXEXTNUM
(2**31) extents.
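
The arithmetic above can be reproduced in user space. The following is a minimal sketch, not the kernel code: the function names are hypothetical, the 16-byte record size is taken from the message, and the 32-bit wrap is modelled explicitly in 64-bit arithmetic so that the sketch itself has no undefined behaviour.

```c
#include <assert.h>
#include <stdint.h>

#define REC_SIZE 16	/* bytes per in-core extent record, per the message */

/*
 * Extent count as it would be derived from a signed 32-bit if_bytes.
 * The byte total is wrapped to 32 bits by hand, reinterpreted as signed,
 * divided by the record size, and finally stored in an unsigned 32-bit
 * field, which is how a negative count turns into a huge number.
 */
uint32_t nextents_with_int32_bytes(int64_t true_count)
{
	int64_t bytes;

	assert(true_count >= 0);
	bytes = (true_count * REC_SIZE) % ((int64_t)1 << 32);
	if (bytes >= ((int64_t)1 << 31))
		bytes -= (int64_t)1 << 32;
	return (uint32_t)(bytes / REC_SIZE);
}

/* The same calculation with a 64-bit byte counter: no wrap in this range. */
int64_t nextents_with_int64_bytes(int64_t true_count)
{
	int64_t bytes;

	assert(true_count >= 0);
	bytes = true_count * REC_SIZE;
	return bytes / REC_SIZE;
}
```

Under this model a true count of 166044375 extents yields 4192576215, which matches the pair of values quoted above, and a true count of 2**28 yields zero.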

Current testing is at over 500M extents and still going:

fsxattr.nextents = 517310478
Reported-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

27 Aug, 2019 1 commit

fs: xfs: Remove KM_NOSLEEP and KM_SLEEP. · 707e0dda

Committed by Tetsuo Handa on Aug 26, 2019

Since no caller is using KM_NOSLEEP and no callee branches on KM_SLEEP,
we can remove KM_NOSLEEP and replace KM_SLEEP with 0.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

29 Jun, 2019 2 commits

xfs: remove unused header files · 250d4b4c

Committed by Eric Sandeen on Jun 28, 2019

There are many, many xfs header files which are included but
unneeded (or included twice) in the xfs code, so remove them.

nb: xfs_linux.h includes about 9 headers for everyone, so those
explicit includes get removed by this.  I'm not sure what the
preference is, but if we wanted explicit includes everywhere,
a followup patch could remove those xfs_*.h includes from
xfs_linux.h and move them into the files that need them.
Or it could be left as-is.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

xfs: move xfs_ino_geometry to xfs_shared.h · 5467b34b

Committed by Darrick J. Wong on Jun 28, 2019

The inode geometry structure isn't related to ondisk format; it's
support for the mount structure.  Move it to xfs_shared.h.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

30 Jul, 2018 3 commits

xfs: remove the xfs_ifork_t typedef · 3ba738df

Committed by Christoph Hellwig on Jul 17, 2018

We only have a few more callers left, so seize the opportunity and kill
it off.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

xfs: simplify xfs_idata_realloc · 1216b58b

Committed by Christoph Hellwig on Jul 17, 2018

Streamline the code and take advantage of the fact that kmem_realloc
through krealloc will behave like a normal allocation if passing in a
NULL old pointer.
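
The same property holds for the C library's realloc(), which makes for an easy user-space illustration. This is a sketch only: struct fork_buf and fork_buf_resize() are hypothetical stand-ins, not the kernel's xfs_idata_realloc.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an inode fork's inline data buffer. */
struct fork_buf {
	char	*data;	/* NULL until the first allocation */
	size_t	bytes;	/* current size of the buffer */
};

/* Resize the buffer; returns 0 on success, -1 on allocation failure. */
int fork_buf_resize(struct fork_buf *fb, size_t new_bytes)
{
	char *p;

	assert(fb != NULL);
	if (new_bytes == 0) {
		free(fb->data);
		fb->data = NULL;
		fb->bytes = 0;
		return 0;
	}
	/* realloc(NULL, n) acts like malloc(n): no "first time" branch. */
	p = realloc(fb->data, new_bytes);
	if (!p)
		return -1;
	fb->data = p;
	fb->bytes = new_bytes;
	return 0;
}
```

Because the NULL case needs no special handling, the first allocation and every later resize go through the same path.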
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

xfs: remove if_real_bytes · fcacbc3f

Committed by Christoph Hellwig on Jul 17, 2018

The field is only used for asserts, and to track if we really need to do
realloc when growing the inode fork data.  But the krealloc function
already performs this check internally, so there is no need to keep track
of the real allocation size.

This will free space in the inode fork for keeping a sequence counter of
changes to the extent list.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

07 Jun, 2018 1 commit

xfs: convert to SPDX license tags · 0b61f8a4

Committed by Dave Chinner on Jun 05, 2018

Remove the verbose license text from XFS files and replace them
with SPDX tags. This does not change the license of any of the code,
merely refers to the common, up-to-date license files in LICENSES/

This change was mostly scripted. fs/xfs/Makefile and
fs/xfs/libxfs/xfs_fs.h were modified by hand, the rest were detected
and modified by the following command:

for f in `git grep -l "GNU General" fs/xfs/` ; do
	echo $f
	cat $f | awk -f hdr.awk > $f.new
	mv -f $f.new $f
done

And the hdr.awk script that did the modification (including
detecting the difference between GPL-2.0 and GPL-2.0+ licenses)
is as follows:

$ cat hdr.awk
BEGIN {
	hdr = 1.0
	tag = "GPL-2.0"
	str = ""
}

/^ \* This program is free software/ {
	hdr = 2.0;
	next
}

/any later version./ {
	tag = "GPL-2.0+"
	next
}

/^ \*\// {
	if (hdr > 0.0) {
		print "// SPDX-License-Identifier: " tag
		print str
		print $0
		str=""
		hdr = 0.0
		next
	}
	print $0
	next
}

/^ \* / {
	if (hdr > 1.0)
		next
	if (hdr > 0.0) {
		if (str != "")
			str = str "\n"
		str = str $0
		next
	}
	print $0
	next
}

/^ \*/ {
	if (hdr > 0.0)
		next
	print $0
	next
}

// {
	if (hdr > 0.0) {
		if (str != "")
			str = str "\n"
		str = str $0
		next
	}
	print $0
}

END { }
$
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

24 Mar, 2018 2 commits

xfs: refactor inode verifier error logging · 90a58f95

Committed by Darrick J. Wong on Mar 23, 2018

Refactor some of the inode verifier failure logging call sites to use
the new xfs_inode_verifier_error method which dumps the offending buffer
as well as the code location of the failed check.  This trims the
output, makes it clearer to the admin that repair must be run, and gives
the developers more details to work from.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>

xfs: refactor bmap record validation · 30b0984d

Committed by Darrick J. Wong on Mar 23, 2018

Refactor the bmap validator into a more complete helper that looks for
extents that run off the end of the device, overflow into the next AG,
or have invalid flag states.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>

18 Jan, 2018 1 commit

xfs: btree format ifork loader should check for zero numrecs · 55e45429

Committed by Darrick J. Wong on Jan 16, 2018

A btree format inode fork with zero records makes no sense, so reject it
if we see it, or else we can miscalculate memory allocations. Found by
zeroes fuzzing {a,u3}.bmbt.numrecs in xfs/{374,378,412} with KASAN.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>

09 Jan, 2018 3 commits

xfs: provide a centralized method for verifying inline fork data · 9cfb9b47

Committed by Darrick J. Wong on Jan 08, 2018

Replace the current haphazard dir2 shortform verifier callsites with a
centralized verifier function that can be called either with the default
verifier functions or with a custom set. This helps us strengthen
integrity checking while providing us with flexibility for repair tools.

xfs_repair wants this to be able to supply its own verifier functions
when trying to fix possibly corrupt metadata.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

xfs: refactor short form directory structure verifier function · dc042c2d

Committed by Darrick J. Wong on Jan 08, 2018

Change the short form directory structure verifier function to return
the instruction pointer of a failing check or NULL if everything's ok.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

xfs: move inode fork verifiers to xfs_dinode_verify · 71493b83

Committed by Darrick J. Wong on Jan 08, 2018

Consolidate the fork size and format verifiers to xfs_dinode_verify so
that we can reject bad inodes earlier and in a single place.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

21 Nov, 2017 1 commit

xfs: abstract out dev_t conversions · 274e0a1f

Committed by Christoph Hellwig on Nov 20, 2017

And move them to xfs_linux.h so that xfsprogs can stub them out more
easily.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

07 Nov, 2017 7 commits

xfs: pass struct xfs_bmbt_irec to xfs_bmbt_validate_extent · dac9c9b1

Committed by Christoph Hellwig on Nov 03, 2017

This removed an unaligned load per extent, as well as the manual poking
into the on-disk extent format.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

xfs: remove the nr_extents argument to xfs_iext_insert · 0254c2f2

Committed by Christoph Hellwig on Nov 03, 2017

We only have two places that insert 2 extents at the same time, so unroll
the loop there.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

xfs: use a b+tree for the in-core extent list · 6bdcf26a

Committed by Christoph Hellwig on Nov 03, 2017

Replace the current linear list and the indirection array for the in-core
extent list with a b+tree to avoid the need for larger memory allocations
for the indirection array when lots of extents are present. The current
extent list implementation leads to heavy pressure on the memory
allocator when modifying files with a high extent count, and can lead
to high latencies because of that.

The replacement is a b+tree with a few quirks. The leaf nodes directly
store the extent record in two u64 values. The encoding is a little bit
different from the existing in-core extent records so that the start
offset and length which are required for lookups can be retrieved with
simple mask operations. The inner nodes store a 64-bit key containing
the start offset in the first half of the node, and the pointers to the
next lower level in the second half. In either case we walk the node
from the beginning to the end and do a linear search, as that is more
efficient for the low number of cache lines touched during a search
(2 for the inner nodes, 4 for the leaf nodes) than a binary search.
We store termination markers (zero length for the leaf nodes, an
otherwise impossible high bit for the inner nodes) to terminate the key
list / records instead of storing a count to use the available cache
lines as efficiently as possible.
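
The leaf search described above can be sketched as follows. This is an illustration under stated assumptions, not the kernel implementation: the record is shown decoded rather than packed into two u64 values, RECS_PER_LEAF is an arbitrary value, and struct rec, struct leaf and leaf_lookup() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RECS_PER_LEAF 15	/* assumption: chosen for illustration only */

/* Hypothetical decoded record; the real leaf packs this into two u64s. */
struct rec {
	uint64_t start;	/* start offset in blocks */
	uint64_t len;	/* length in blocks; 0 marks the end of the list */
};

struct leaf {
	struct rec recs[RECS_PER_LEAF];
};

/*
 * Linear scan for the first record that ends after 'offset'.
 * Returns NULL if this leaf holds no such record.
 */
const struct rec *leaf_lookup(const struct leaf *l, uint64_t offset)
{
	assert(l != NULL);
	for (size_t i = 0; i < RECS_PER_LEAF; i++) {
		const struct rec *r = &l->recs[i];

		if (r->len == 0)	/* termination marker, no count field */
			break;
		if (r->start + r->len > offset)
			return r;
	}
	return NULL;
}
```

A zero-length record doubles as the end marker, so the node needs no separate count and every slot can hold a record.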

One quirk of the algorithm is that while we normally split a node half and
half like usual btree implementations we just spill over entries added at
the very end of the list to a new node on its own. This means we get a
100% fill grade for the common cases of bulk insertion when reading an
inode into memory, and when only sequentially appending to a file. The
downside is a slightly higher chance of splits on the first random
insertions.

Both insert and removal manually recurse into the lower levels, but
the bulk deletion of the whole tree is still implemented as a recursive
function call, although one limited by the overall depth and with very
little stack usage in every iteration.

For the first few extents we dynamically grow the list from a single
extent to the next powers of two until we have a first full leaf block
and then start building the actual tree.

The code started out based on the generic lib/btree.c code from Joern
Engel based on earlier work from Peter Zijlstra, but has since been
rewritten beyond recognition.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

xfs: remove support for inlining data/extents into the inode fork · 43518812

Committed by Christoph Hellwig on Nov 03, 2017

Supporting a small bit of data inside the inode fork blows up the fork size
a lot, removing the 32 bytes of inline data halves the effective size of
the inode fork (and it still has a lot of unused padding left), and the
performance of a single kmalloc doesn't show up compared to the size to read
an inode or create one.

It also simplifies the fork management code a lot.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

xfs: introduce the xfs_iext_cursor abstraction · b2b1712a

Committed by Christoph Hellwig on Nov 03, 2017

Add a new xfs_iext_cursor structure to hide the direct extent map
index manipulations. In addition to the existing lookup/get/insert/
remove and update routines new primitives to get the first and last
extent cursor, as well as moving up and down by one extent are
provided.  Also new are convenience helpers to increment/decrement the
cursor and retrieve the new extent, as well as to peek into the
previous/next extent without updating the cursor and last but not
least a macro to iterate over all extents in a fork.

[darrick: rename for_each_iext to for_each_xfs_iext]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

xfs: iterate over extents in xfs_iextents_copy · 71565f4b

Committed by Christoph Hellwig on Nov 03, 2017

This actually makes the function very slightly less efficient for now as we
detour through the expanded irec format between the in-core extent format
and the on-disk one instead of just endian swapping them.  But with the
incore extent btree the in-core one will use a different format and the
representation will be entirely hidden.  It also happens to make the
function a whole lot more readable.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

xfs: pass an on-disk extent to xfs_bmbt_validate_extent · f36bc228

Committed by Christoph Hellwig on Nov 03, 2017

This prepares for getting rid of the current in-memory extent format.
At the end of the series we will change the calling convention again
to pass the xfs_bmbt_irec structure once it is available everywhere.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

27 Oct, 2017 6 commits

xfs: add a new xfs_iext_lookup_extent_before helper · dc56015f

Committed by Christoph Hellwig on Oct 23, 2017

This helper looks up the last extent that covers space before the passed
in block number.  This is useful for truncate and similar operations that
operate backwards over the extent list.  For xfs_bunmapi it also is
a slight optimization as we can return early if there are no extents
at or below the end of the to be truncated range.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

xfs: merge xfs_bmap_read_extents into xfs_iread_extents · 211e95bb

Committed by Christoph Hellwig on Oct 23, 2017

xfs_iread_extents is just a trivial wrapper, there is no good reason
to keep the two separate.

[darrick: minor fixups having left xfs_bmbt_validate_extent intact]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

xfs: remove if_rdev · 66f36464

Committed by Christoph Hellwig on Oct 19, 2017

We can simply use the i_rdev field in the Linux inode and just convert
to and from the XFS dev_t when reading or logging/writing the inode.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

xfs: remove the never fully implemented UUID fork format · 42b67dc6

Committed by Christoph Hellwig on Oct 19, 2017

Remove the dead code dealing with the UUID fork format that was never
implemented in Linux (and neither in IRIX as far as I know).
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

xfs: remove XFS_BMAP_TRACE_EXLIST · e8e0e170

Committed by Christoph Hellwig on Oct 19, 2017

Instead of looping over all extents in some debug-only helper just
insert trace points into the loops that already exist in the calling
functions.

Also split the xfs_extlist trace point into one each for reading and
writing extents from disk.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

xfs: move pre/post-bmap tracing into xfs_iext_update_extent · ca5d8e5b

Committed by Christoph Hellwig on Oct 19, 2017

xfs_iext_update_extent already has basically all the information needed
to centralize the bmap pre/post tracing. We just need to pass inode +
bmap state instead of the inode fork pointer to get all trace annotations.

In addition to covering all the existing trace points this gives us
tracing coverage for the extent shifting operations for free.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

02 Sep, 2017 2 commits

xfs: fix compiler warnings · 7bf7a193

Committed by Darrick J. Wong on Aug 31, 2017

Fix up all the compiler warnings that have crept in.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

xfs: add a xfs_iext_update_extent helper · 67e4e69c

Committed by Christoph Hellwig on Aug 29, 2017

This helper is used to update an extent record based on the extent index,
and can be used to provide a level of abstractions between callers that
want to modify in-core extent records and the details of the extent list
implementation.

Also switch all users of the xfs_bmbt_set_all(xfs_iext_get_ext(...))
pattern to this new helper.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

26 Apr, 2017 1 commit

xfs: simplify validation of the unwritten extent bit · 0c1d9e4a

Committed by Christoph Hellwig on Apr 20, 2017

XFS only supports the unwritten extent bit in the data fork, and only if
the file system has a version 5 superblock or the unwritten extent
feature bit.

We currently have two routines that validate the invariant:
xfs_check_nostate_extents which returns -EFSCORRUPTED when it's not met,
and xfs_validate_extent that triggers an assert in debug builds.

Both of them iterate over all extents of an inode fork when called,
which isn't very efficient.

This patch instead adds a new helper that verifies the invariant one
extent at a time, and calls it from the places where we iterate over
all extents to convert them from or to the in-memory format.  The
callers then return -EFSCORRUPTED when reading invalid extents from
disk, or trigger an assert when writing them to disk.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

04 Apr, 2017 1 commit

xfs: rework the inline directory verifiers · 78420281

Committed by Darrick J. Wong on Apr 03, 2017

The inline directory verifiers should be called on the inode fork data,
which means after iformat_local on the read side, and prior to
ifork_flush on the write side.  This makes the fork verifier more
consistent with the way buffer verifiers work -- i.e. they will operate
on the memory buffer that the code will be reading and writing directly.

Furthermore, revise the verifier function to return -EFSCORRUPTED so
that we don't flood the logs with corruption messages and assert
notices.  This has been a particular problem with xfs/348, which
triggers the XFS_WANT_CORRUPTED_RETURN assertions, which halts the
kernel when CONFIG_XFS_DEBUG=y.  Disk corruption isn't supposed to do
that, at least not in a verifier.
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

29 Mar, 2017 1 commit

xfs: rework the inline directory verifiers · 005c5db8

Committed by Darrick J. Wong on Mar 28, 2017

The inline directory verifiers should be called on the inode fork data,
which means after iformat_local on the read side, and prior to
ifork_flush on the write side.  This makes the fork verifier more
consistent with the way buffer verifiers work -- i.e. they will operate
on the memory buffer that the code will be reading and writing directly.

Furthermore, revise the verifier function to return -EFSCORRUPTED so
that we don't flood the logs with corruption messages and assert
notices.  This has been a particular problem with xfs/348, which
triggers the XFS_WANT_CORRUPTED_RETURN assertions, which halts the
kernel when CONFIG_XFS_DEBUG=y.  Disk corruption isn't supposed to do
that, at least not in a verifier.
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
v2: get the inode d_ops the proper way
v3: describe the bug that this patch fixes; no code changes

15 Mar, 2017 1 commit

xfs: verify inline directory data forks · 630a04e7

Committed by Darrick J. Wong on Mar 15, 2017

When we're reading or writing the data fork of an inline directory,
check the contents to make sure we're not overflowing buffers or eating
garbage data.  xfs/348 corrupts an inline symlink into an inline
directory, triggering a buffer overflow bug.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
---
v2: add more checks consistent with _dir2_sf_check and make the verifier
usable from anywhere.

03 Feb, 2017 2 commits

xfs: check for obviously bad level values in the bmbt root · b3bf607d

Committed by Darrick J. Wong on Feb 02, 2017

We can't handle a bmbt that's taller than BTREE_MAXLEVELS, and there's
no such thing as a zero-level bmbt (for that we have extents format),
so if we see this, send back an error code.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

xfs: fix toctou race when locking an inode to access the data map · 4b5bd5bf

Committed by Darrick J. Wong on Feb 02, 2017

We use di_format and if_flags to decide whether we're grabbing the ilock
in btree mode (btree extents not loaded) or shared mode (anything else),
but the state of those fields can be changed by other threads that are
also trying to load the btree extents -- IFEXTENTS gets set before the
_bmap_read_extents call and cleared if it fails.

We don't actually need to have IFEXTENTS set until after the bmbt
records are successfully loaded and validated, which will fix the race
between multiple threads trying to read the same directory.  The next
patch strengthens directory bmbt validation by refusing to open the
directory if reading the bmbt to start directory readahead fails.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
