Commits · 70a20655339ab90866300e174a47631df49a018a · openeuler / Kernel

29 Jan, 2018 1 commit

Get rid of xfs_buf_log_item_t typedef · 70a20655

Committed by Carlos Maiolino on Jan 24, 2018

Take advantage of the rework of the xfs_buf log items list to get rid of
the typedef for xfs_buf_log_item.

This patch also fixes some indentation alignment issues found along the way.

Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Bill O'Donnell <billodo@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

70a20655

18 Jan, 2018 25 commits

xfs: fix non-debug build compiler warnings · 75d4a13b

Committed by Darrick J. Wong on Jan 16, 2018

Fix compiler warning on non-debug build.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

75d4a13b

xfs: check sb_agblocks and sb_agblklog when validating superblock · 4bb73d01

Committed by Darrick J. Wong on Jan 16, 2018

Currently, we don't check sb_agblocks or sb_agblklog when we validate
the superblock, which means that we can fuzz garbage values into those
fields and the mount succeeds.  This leads to all sorts of UBSAN
warnings in xfs/350 since we can then coerce other parts of xfs into
shifting by ridiculously large values.

Once we've validated agblocks, make sure the agcount makes sense.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>

4bb73d01
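
The kind of check this commit describes can be sketched in plain C. This is a minimal illustration, not the kernel's verifier: the struct and the demo_* names are hypothetical, and the rule that agcount must equal ceil(dblocks / agblocks) is a simplifying assumption (the real code also enforces minimum and maximum AG sizes).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the superblock geometry fields. */
struct demo_sb_geom {
	uint64_t dblocks;   /* total data blocks in the filesystem */
	uint32_t agblocks;  /* blocks per allocation group (AG) */
	uint32_t agcount;   /* number of allocation groups */
	uint8_t  agblklog;  /* expected to be ceil(log2(agblocks)) */
};

/* Smallest n such that (1 << n) >= v, for v >= 1. */
static uint8_t demo_ceil_log2(uint32_t v)
{
	uint8_t n = 0;

	while (((uint64_t)1 << n) < v)
		n++;
	return n;
}

static bool demo_sb_geom_ok(const struct demo_sb_geom *sb)
{
	uint64_t want;

	if (sb->agblocks == 0 || sb->agcount == 0)
		return false;
	/* A bogus agblklog is what later turns into an oversized shift. */
	if (sb->agblklog != demo_ceil_log2(sb->agblocks))
		return false;
	/* Assumption: every AG but the last is full. */
	want = sb->dblocks / sb->agblocks + (sb->dblocks % sb->agblocks != 0);
	return want == sb->agcount;
}
```

Validating agblklog before it is ever used as a shift count is the point of the sketch: a value such as 200 is rejected at mount time instead of reaching a shift expression.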

xfs: recheck reflink / dirty page status before freeing CoW reservations · be78ff0e

Committed by Darrick J. Wong on Jan 16, 2018

Eryu Guan reported seeing occasional hangs when running generic/269 with
a new fsstress that supports clonerange/deduperange. The cause of this
hang is an infinite loop when we convert the CoW fork extents from
unwritten to real just prior to writing the pages out; the infinite
loop happens because there's nothing in the CoW fork to convert, and so
it spins forever.

The fundamental issue here is that when we go to perform these CoW fork
conversions, we're supposed to have an extent waiting for us, but the
low space CoW reaper has snuck in and blown them away! There are four
conditions that can dissuade the reaper from touching our file -- no
reflink iflag; dirty page cache; writeback in progress; or directio in
progress. We check the four conditions prior to taking the locks, but
we neglect to recheck them once we have the locks, which is how we end
up whacking the writeback that's in progress.

Therefore, refactor the four checks into a helper function and call it
once again once we have the locks to make sure we really want to reap
the inode. While we're at it, add an ASSERT for this weird condition so
that we'll fail noisily if we ever screw this up again.

Reported-by: Eryu Guan <eguan@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Tested-by: Eryu Guan <eguan@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>

be78ff0e
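
The check, lock, recheck pattern described in this commit can be sketched as follows. Everything here is hypothetical: the demo_inode structure, the demo_* helpers, and the "racer" callback, which only simulates another task changing the inode while we wait for the lock. It is a userspace illustration of the idea, not the XFS code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical in-memory inode; the fields mirror the four conditions. */
struct demo_inode {
	bool reflinked;
	bool dirty_pages;
	bool writeback;
	bool directio;
	bool locked;
	int  cow_blocks;   /* outstanding CoW reservation */
};

/* The four checks gathered into one helper so they can be repeated. */
static bool demo_can_reap(const struct demo_inode *ip)
{
	return ip->reflinked && !ip->dirty_pages &&
	       !ip->writeback && !ip->directio;
}

/* Stand-in for taking the inode locks; racer runs "while we wait". */
static void demo_lock(struct demo_inode *ip, void (*racer)(struct demo_inode *))
{
	if (racer)
		racer(ip);
	ip->locked = true;
}

static void demo_unlock(struct demo_inode *ip)
{
	ip->locked = false;
}

/* Returns true if the CoW reservation was freed. */
static bool demo_reap_cow(struct demo_inode *ip, void (*racer)(struct demo_inode *))
{
	bool reaped = false;

	if (!demo_can_reap(ip))          /* cheap check before locking */
		return false;
	demo_lock(ip, racer);
	if (demo_can_reap(ip)) {         /* recheck: state may have changed */
		ip->cow_blocks = 0;
		reaped = true;
	}
	demo_unlock(ip);
	return reaped;
}

/* Example racer: writeback starts just before we obtain the lock. */
static void demo_start_writeback(struct demo_inode *ip)
{
	ip->writeback = true;
}
```

Without the second demo_can_reap() call, the reservation would be dropped even though writeback had started, which is the situation the commit message describes.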

xfs: check that br_blockcount doesn't overflow · a5f460b1

Committed by Darrick J. Wong on Jan 16, 2018

xfs_bmbt_irec.br_blockcount is declared as xfs_filblks_t, which is an
unsigned 64-bit integer.  Though the bmbt helpers will never set a value
larger than 2^21 (since the underlying on-disk extent record has a
length field that is only 21 bits wide), we should be a little defensive
about checking that a bmbt record doesn't exceed what we're expecting or
overflow into the next AG.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

a5f460b1
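
A defensive length check of this kind might look like the sketch below. The demo_irec structure is a hypothetical simplification that uses an AG-relative start block; the limit is the largest value a 21-bit field can hold, 2^21 - 1.

```c
#include <stdbool.h>
#include <stdint.h>

/* Largest length representable in a 21-bit on-disk field. */
#define DEMO_MAXEXTLEN ((uint64_t)((1u << 21) - 1))

/* Hypothetical in-core extent record with an AG-relative start block. */
struct demo_irec {
	uint64_t startblock;
	uint64_t blockcount;  /* 64-bit in memory, 21 bits on disk */
};

static bool demo_irec_ok(const struct demo_irec *r, uint64_t agblocks)
{
	if (r->blockcount == 0 || r->blockcount > DEMO_MAXEXTLEN)
		return false;
	if (r->startblock >= agblocks)
		return false;
	/* Written as a subtraction so startblock + blockcount cannot wrap. */
	return r->blockcount <= agblocks - r->startblock;
}
```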

xfs: btree format ifork loader should check for zero numrecs · 55e45429

Committed by Darrick J. Wong on Jan 16, 2018

A btree format inode fork with zero records makes no sense, so reject it
if we see it, or else we can miscalculate memory allocations. Found by
zeroes fuzzing {a,u3}.bmbt.numrecs in xfs/{374,378,412} with KASAN.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>

55e45429

xfs: attr leaf verifier needs to check for obviously bad count · 79a69bf8

Committed by Darrick J. Wong on Jan 16, 2018

In the attribute leaf verifier, we can check for obviously bad values of
firstused and count so that later attempts at lasthash don't run off the
end of the memory buffer.  Found by ones fuzzing hdr.count in xfs/400 with
KASAN.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>

79a69bf8
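
As a rough illustration of "obviously bad" header values, the sketch below bounds a count and a firstused offset against the block size. The header layout, the parameter names, and the exact rules are assumptions made for the example; the real attr leaf format differs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical leaf header; the real on-disk layout differs. */
struct demo_leaf_hdr {
	uint16_t count;      /* number of fixed-size entries after the header */
	uint16_t firstused;  /* byte offset of the first used name/value byte */
};

static bool demo_leaf_hdr_ok(const struct demo_leaf_hdr *h, size_t blksize,
			     size_t hdr_size, size_t entry_size)
{
	if (hdr_size > blksize || entry_size == 0)
		return false;
	/* The entry array must fit between the header and the block end. */
	if (h->count > (blksize - hdr_size) / entry_size)
		return false;
	/* firstused must stay inside the block ... */
	if (h->firstused > blksize)
		return false;
	/* ... and must not point into the header or the entry array. */
	return h->firstused >= hdr_size + (size_t)h->count * entry_size;
}
```

With a count of 0xFFFF (the "ones" fuzz pattern) the second check fails, so no later code indexes an entry beyond the buffer.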

xfs: directory scrubber must walk through data block to offset · ce92d29d

Committed by Darrick J. Wong on Jan 16, 2018

In xfs_scrub_dir_rec, we must walk through the directory block entries
to arrive at the offset given by the hash structure.  If we blindly
trust the hash address, we can end up midway into a directory entry and
stray outside the block.  Found by lastbit fuzzing lents[3].address in
xfs/390 with KASAN enabled.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

ce92d29d
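
The idea of walking entries up to an offset, rather than trusting the offset, can be shown with a toy block format. The format is an assumption made purely for illustration: entries are packed back to back and the first byte of each entry holds its total length. The real directory data block layout is different.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns true only if `target` is the start of a well-formed entry that
 * is reachable by walking from `first`. Toy format: blk[off] is the total
 * length of the entry that starts at off.
 */
static bool demo_offset_is_entry_start(const uint8_t *blk, size_t blksize,
				       size_t first, size_t target)
{
	size_t off = first;

	while (off < blksize && off <= target) {
		size_t len = blk[off];

		/* A zero or oversized length means the block is corrupt. */
		if (len == 0 || len > blksize - off)
			return false;
		if (off == target)
			return true;
		off += len;
	}
	/* We stepped past target (it is mid-entry) or ran off the block. */
	return false;
}
```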

xfs: don't iunlock unlocked inodes · 638a7174

Committed by Darrick J. Wong on Jan 16, 2018

Don't iunlock an unlocked inode, which can happen if the parent pointer
scrubber bails out with sc->ip unlocked while trying to grab the parent
directory inode.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>

638a7174

xfs: scrub in-core metadata · cf1b0b8b

Committed by Darrick J. Wong on Jan 16, 2018

Whenever we load a buffer, explicitly re-call the structure verifier to
ensure that memory isn't corrupting things.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

cf1b0b8b

xfs: cross-reference the block mappings when possible · 561f648a

Committed by Darrick J. Wong on Jan 16, 2018

Use an inode's block mappings to cross-reference inode block counters.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

561f648a

xfs: cross-reference the realtime bitmap · 46d9bfb5

Committed by Darrick J. Wong on Jan 16, 2018

While we're scrubbing various btrees, cross-reference the records
with the other metadata.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

46d9bfb5

xfs: cross-reference refcount btree during scrub · f6d5fc21

Committed by Darrick J. Wong on Jan 16, 2018

During metadata btree scrub, we should cross-reference with the
reference counts.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

f6d5fc21

xfs: cross-reference the rmapbt data with the refcountbt · dbde19da

Committed by Darrick J. Wong on Jan 16, 2018

Cross-reference the refcount data with the rmap data to check that the
number of rmaps for a given block matches the refcount of that block, and
that CoW blocks (which are owned entirely by the refcountbt) are tracked
as well.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

dbde19da

xfs: cross-reference reverse-mapping btree · d852657c

Committed by Darrick J. Wong on Jan 16, 2018

When scrubbing various btrees, we should cross-reference the records
with the reverse mapping btree and ensure that traversing the btree
finds the same number of blocks that the rmapbt thinks are owned by
that btree.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

d852657c

xfs: cross-reference inode btrees during scrub · 2e6f2756

Committed by Darrick J. Wong on Jan 16, 2018

Cross-reference the inode btrees with the other metadata when we
scrub the filesystem.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

2e6f2756

xfs: cross-reference bnobt records with cntbt · e1134b12

Committed by Darrick J. Wong on Jan 16, 2018

Scrub should make sure that each bnobt record has a corresponding
cntbt record.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

e1134b12

xfs: cross-reference with the bnobt · 52dc4b44

Committed by Darrick J. Wong on Jan 16, 2018

When we're scrubbing various btrees, cross-reference the records with
the bnobt to ensure that we don't also think the space is free.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

52dc4b44

xfs: introduce scrubber cross-referencing stubs · 166d7641

Committed by Darrick J. Wong on Jan 16, 2018

Create some stubs that will be used to cross-reference metadata records.
The actual cross-referencing will be filled in by subsequent patches.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

166d7641

xfs: check btree block ownership with bnobt/rmapbt when scrubbing btree · 858333dc

Committed by Darrick J. Wong on Jan 16, 2018

When scanning a metadata btree block, cross-reference the block location
with the free space btree and the reverse mapping btree to ensure that
the rmapbt knows about the block and the bnobt does not.  Add a
mechanism to defer checks when we happen to be scanning the bnobt/rmapbt
itself because it's less efficient to repeatedly clone and destroy the
cursor.

This patch provides the framework to make btree block owner checks
happen; the actual meat will be added in subsequent patches.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

858333dc
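
The ownership rule stated above (the rmapbt knows about the block and the bnobt does not) reduces to two lookups. The sketch below models both btrees as flat arrays of extents, which is an assumption made only to keep the example short; the demo_* names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flat stand-in for one btree record: [start, start + len). */
struct demo_extent {
	uint32_t start;
	uint32_t len;
};

static bool demo_covers(const struct demo_extent *ext, size_t n, uint32_t bno)
{
	for (size_t i = 0; i < n; i++)
		if (bno >= ext[i].start && bno - ext[i].start < ext[i].len)
			return true;
	return false;
}

/* A metadata block is consistent if it is mapped and not also free. */
static bool demo_btree_block_xref_ok(const struct demo_extent *free_ext, size_t nfree,
				     const struct demo_extent *rmap_ext, size_t nrmap,
				     uint32_t bno)
{
	return demo_covers(rmap_ext, nrmap, bno) &&
	       !demo_covers(free_ext, nfree, bno);
}
```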

xfs: fix a few erroneous process_error calls in the scrubbers · 9a7e2695

Committed by Darrick J. Wong on Jan 16, 2018

There are a few places where we make a libxfs api call on behalf of some
object other than the one we're scrubbing but inadvertently call the
regular process_error function. When this happens we mark the object
corrupt even though it was corruption in /some other/ object that
actually produced the -EFSCORRUPTED code. The correct output flag for
these situations is SCRUB_OFLAG_XFAIL, not SCRUB_OFLAG_CORRUPT, so fix
this now that we also have a helper to set these.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

9a7e2695
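
The distinction between the two output flags can be sketched with a pair of error handlers that differ only in the flag they set. The flag values, the error code value, and the demo_* function names are hypothetical; only the split between "this object is corrupt" and "a cross-referenced object failed" comes from the commit message.

```c
#include <stdbool.h>

#define DEMO_EFSCORRUPTED   117        /* hypothetical error code value */
#define DEMO_OFLAG_CORRUPT  (1u << 0)  /* the scrubbed object is corrupt */
#define DEMO_OFLAG_XFAIL    (1u << 1)  /* a cross-reference lookup failed */

/* Returns true if the caller may continue with the current check. */
static bool demo_handle_error(unsigned int *oflags, int *error, unsigned int flag)
{
	if (*error == 0)
		return true;
	if (*error == -DEMO_EFSCORRUPTED) {
		*oflags |= flag;
		*error = 0;  /* reported through the flags instead */
	}
	return false;
}

/* Use when the failing call operated on the object being scrubbed. */
static bool demo_process_error(unsigned int *oflags, int *error)
{
	return demo_handle_error(oflags, error, DEMO_OFLAG_CORRUPT);
}

/* Use when the failing call operated on some other object. */
static bool demo_xref_process_error(unsigned int *oflags, int *error)
{
	return demo_handle_error(oflags, error, DEMO_OFLAG_XFAIL);
}
```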

xfs: set up scrub cross-referencing helpers · 64b12563

Committed by Darrick J. Wong on Jan 16, 2018

Create some helper functions that we'll use later to deal with problems
we might encounter while cross referencing metadata with other metadata.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

64b12563

xfs: add scrub cross-referencing helpers for the refcount btrees · 49db55ec

Committed by Darrick J. Wong on Jan 16, 2018

Add a couple of functions to the refcount btrees that will be used
to cross-reference metadata against the refcountbt.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

49db55ec

xfs: add scrub cross-referencing helpers for the rmap btrees · ed7c52d4

Committed by Darrick J. Wong on Jan 16, 2018

Add a couple of functions to the rmap btrees that will be used
to cross-reference metadata against the rmapbt.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

ed7c52d4

xfs: add scrub cross-referencing helpers for the inode btrees · 2e001266

Committed by Darrick J. Wong on Jan 16, 2018

Add a couple of functions to the inode btrees that will be used
to cross-reference metadata against the inobt.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

2e001266

xfs: add scrub cross-referencing helpers for the free space btrees · ce1d802e

Committed by Darrick J. Wong on Jan 16, 2018

Add a couple of functions to the free space btrees that will be used
to cross-reference metadata against the bnobt/cntbt, and a generic
btree function that provides the real implementation.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

ce1d802e

17 Jan, 2018 1 commit

xfs: cancel tx on xfs_defer_finish() error during xattr set/remove · c4685628

Committed by Brian Foster on Jan 16, 2018

Chris Dunlop reports a problem where an xattr operation fails,
reports the following error to syslog and hangs during unmount:

 ================================================
 [ BUG: lock held when returning to user space! ]
 ...
 ------------------------------------------------
 <PID> is leaving the kernel with locks still held!
 1 lock held by <PID>:
  #0:  (sb_internal){......}, at: [<ffffffffa07692a3>] xfs_trans_alloc+0xe3/0x130 [xfs]

The failure/shutdown occurs during deferred ops processing which
leads to an error return from xfs_defer_finish() via
xfs_attr_leaf_addname(). While the root cause of the failure is
unknown corruption, the cause of the subsequent BUG above and
unmount hang is failure to cancel the transaction before returning
to userspace.

The transaction is not cancelled because the out_defer_cancel error
handling paths in the xfs_attr_[leaf|node]_[add|remove]name()
functions clear args.trans without releasing the transaction. The
callers therefore lose the reference to the transaction and fail to
cancel it.

Since xfs_attr_[set|remove]() always cancel args.trans when != NULL
and xfs_defer_finish()->...->xfs_trans_roll() should always return
with a valid transaction, update the leaf/node xattr functions to
not reset args.trans in the error path responsible for cancelling
deferred ops.

Reported-by: Chris Dunlop <chris@onthe.net.au>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

c4685628
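
The ownership problem described in this commit, a callee clearing the only reference to a transaction that the caller still has to cancel, can be reduced to a small sketch. The structures and demo_* functions are hypothetical stand-ins and do not reproduce the xattr code paths.

```c
#include <stdbool.h>
#include <stddef.h>

struct demo_trans {
	bool cancelled;
};

struct demo_args {
	struct demo_trans *trans;  /* the caller cancels this when non-NULL */
	int ops_done;
};

/* Callee: on error it must leave args->trans alone. */
static int demo_leaf_addname(struct demo_args *args, bool fail)
{
	if (fail) {
		/*
		 * The buggy variant would do "args->trans = NULL;" here,
		 * so the caller could no longer cancel the transaction.
		 */
		return -1;
	}
	args->ops_done++;
	return 0;
}

/* Caller: always cancels whatever transaction is still referenced. */
static int demo_attr_set(struct demo_args *args, bool fail)
{
	int error = demo_leaf_addname(args, fail);

	if (error && args->trans) {
		args->trans->cancelled = true;
		args->trans = NULL;
	}
	return error;
}
```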

13 Jan, 2018 6 commits

xfs: account finobt blocks properly in perag reservation · ad90bb58

Committed by Brian Foster on Jan 12, 2018

XFS started using the perag metadata reservation pool for free inode
btree blocks in commit 76d771b4 ("xfs: use per-AG reservations
for the finobt"). To handle backwards compatibility, finobt blocks
are accounted against the pool so long as the full reservation is
available at mount time. Otherwise the ->m_inotbt_nores flag is set
and the filesystem falls back to the traditional per-transaction
finobt reservation.

This commit has two problems:

- finobt blocks are always accounted against the metadata
  reservation on allocation, regardless of ->m_inotbt_nores state
- finobt blocks are never returned to the reservation pool on free

The first problem affects reflink+finobt filesystems where the full
finobt reservation is not available at mount time. finobt blocks are
essentially stolen from the reflink reservation, putting refcountbt
management at risk of allocation failure. The second problem is an
unconditional leak of metadata reservation whenever finobt is
enabled.

Update the finobt block allocation callouts to consider
->m_inotbt_nores and account blocks appropriately. Blocks should be
consistently accounted against the metadata pool when
->m_inotbt_nores is false and otherwise tagged as RESV_NONE.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

ad90bb58
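
The accounting rule in the last paragraph can be sketched as a single helper that picks the reservation type, used symmetrically on allocation and free. The demo_mount structure, the counter, and the enum names are hypothetical; only the dependence on the nores flag comes from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

enum demo_resv_type { DEMO_RESV_NONE, DEMO_RESV_METADATA };

/* Hypothetical mount state reduced to what the example needs. */
struct demo_mount {
	bool     inotbt_nores;  /* true: full finobt reservation unavailable */
	uint64_t resv_used;     /* blocks charged to the metadata pool */
};

/* One place decides how finobt blocks are accounted. */
static enum demo_resv_type demo_finobt_resv(const struct demo_mount *mp)
{
	return mp->inotbt_nores ? DEMO_RESV_NONE : DEMO_RESV_METADATA;
}

static void demo_finobt_alloc_block(struct demo_mount *mp)
{
	if (demo_finobt_resv(mp) == DEMO_RESV_METADATA)
		mp->resv_used++;
}

/* Freeing returns the block to the pool it was charged against. */
static void demo_finobt_free_block(struct demo_mount *mp)
{
	if (demo_finobt_resv(mp) == DEMO_RESV_METADATA && mp->resv_used > 0)
		mp->resv_used--;
}
```

Routing both paths through the same helper addresses both problems listed above in this toy model: nothing is charged when the flag is set, and every charged block is returned on free.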

xfs: fix check on struct_version for versions 4 or greater · a8789a5a

Committed by Colin Ian King on Jan 12, 2018

It appears that the check for versions 4 or more is incorrect and is
off-by-one. Fix this.

Detected by CoverityScan, CID#1463775 ("Logically dead code")

Fixes: ac503a4c ("xfs: refactor the geometry structure filling function")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

a8789a5a
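
For readers unfamiliar with version-gated structure filling, the sketch below shows the general pattern and how a wrong threshold in it produces dead code. It is an illustration under assumptions: the demo_geom structure and its fields are hypothetical, and the comment describes one way such a slip can happen rather than quoting the actual diff.

```c
/* Hypothetical geometry structure that grew one field per version. */
struct demo_geom {
	int v1_field;
	int v2_field;
	int v3_field;
	int v4_field;
};

/* Fill only the fields that the caller's structure version knows about. */
static int demo_fill_geom(struct demo_geom *geo, int struct_version)
{
	*geo = (struct demo_geom){0};

	geo->v1_field = 1;
	if (struct_version < 2)
		return 0;
	geo->v2_field = 2;
	if (struct_version < 3)
		return 0;
	geo->v3_field = 3;
	/*
	 * If this test repeated the previous threshold (< 3), its return
	 * could never execute, and version 3 callers would be handed
	 * version 4 fields.
	 */
	if (struct_version < 4)
		return 0;
	geo->v4_field = 4;
	return 0;
}
```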

xfs: destroy mutex pag_ici_reclaim_lock before free · 1da06189

Committed by Xiongwei Song on Jan 11, 2018

The mutex pag_ici_reclaim_lock of the xfs_perag_t structure is initialized in
xfs_initialize_perag. If an error occurs in xfs_initialize_perag, or when
resources are freed in xfs_free_perag, we need to destroy the mutex before
freeing the perag.

Signed-off-by: Xiongwei Song <sxwjean@me.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

1da06189
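
The init/destroy pairing can be shown in userspace with POSIX mutexes. This is an analogy, not the kernel code: demo_perag and both functions are hypothetical, and pthread_mutex_* stands in for the kernel mutex API.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-AG structure that embeds a mutex. */
struct demo_perag {
	pthread_mutex_t reclaim_lock;
	int agno;
};

static struct demo_perag *demo_perag_alloc(int agno)
{
	struct demo_perag *pag = calloc(1, sizeof(*pag));

	if (!pag)
		return NULL;
	if (pthread_mutex_init(&pag->reclaim_lock, NULL) != 0) {
		free(pag);  /* the mutex was never initialized: nothing to destroy */
		return NULL;
	}
	pag->agno = agno;
	return pag;
}

/* Every path that frees the structure destroys the mutex first. */
static void demo_perag_free(struct demo_perag *pag)
{
	if (!pag)
		return;
	pthread_mutex_destroy(&pag->reclaim_lock);
	free(pag);
}
```

An error path that runs after a successful demo_perag_alloc() would call demo_perag_free() rather than free() directly, so the destroy step cannot be skipped.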

xfs: use %px for data pointers when debugging · c9690043

Committed by Darrick J. Wong on Jan 09, 2018

Starting with commit 57e73442 ("vsprintf: refactor %pK code out of
pointer"), the behavior of the raw '%p' printk format specifier was
changed to print a 32-bit hash of the pointer value to avoid leaking
kernel pointers into dmesg.  For most situations that's good.

This is /undesirable/ behavior when we're trying to debug XFS, however,
so define a PTR_FMT that prints the actual pointer when we're in debug
mode.

Note that %p for tracepoints still prints the raw pointer, so in the
long run we could consider rewriting some of these messages as
tracepoints.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

c9690043

xfs: use %pS printk format for direct instruction addresses · aff68a55

Committed by Darrick J. Wong on Jan 09, 2018

Use the %pS instead of the %pF printk format specifier for printing
symbols from direct addresses. This is needed for the ia64, ppc64 and
parisc64 architectures.

While we're at it, be consistent with the capitalization of the 'S'.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

aff68a55

xfs: change 0x%p -> %p in print messages · 3d170aa2

Committed by Darrick J. Wong on Jan 09, 2018

Since %p prepends "0x" to the output string, we can drop the prefix.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

3d170aa2

10 Jan, 2018 2 commits

xfs: clarify units in the failed metadata io message · c219b015

Committed by Darrick J. Wong on Jan 08, 2018

If a metadata IO error happens, we report the location of the failed IO
request in units of daddrs. However, the printk message misleads people
into thinking that the units are fs blocks, so fix the reported units.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

c219b015
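
To see why the unit matters, the sketch below converts a daddr, which counts 512-byte basic blocks, into bytes and into a block index for a given block size. The demo_* names are hypothetical, and the block index here is a plain linear one; real XFS fsblock numbers also encode the AG number, which this example ignores.

```c
#include <stdint.h>

#define DEMO_BBSHIFT 9  /* a daddr counts 512-byte basic blocks */

static uint64_t demo_daddr_to_bytes(uint64_t daddr)
{
	return daddr << DEMO_BBSHIFT;
}

/*
 * Linear block index for a filesystem whose block size is
 * (1 << blocklog) bytes. Returns 0 for blocklog < 9, which is not a
 * valid block size in this model.
 */
static uint64_t demo_daddr_to_linear_block(uint64_t daddr, unsigned int blocklog)
{
	if (blocklog < DEMO_BBSHIFT)
		return 0;
	return daddr >> (blocklog - DEMO_BBSHIFT);
}
```

With 4096-byte blocks, daddr 8 is byte offset 4096 and linear block 1, so reading a daddr as a block number would be off by a factor of eight.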

xfs: harden directory integrity checks some more · 46c59736

Committed by Darrick J. Wong on Jan 09, 2018

If a malicious filesystem image contains a block+ format directory
wherein the directory inode's core.mode is set such that
S_ISDIR(core.mode) == 0, and if there are subdirectories of the
corrupted directory, an attempt to traverse up the directory tree will
crash the kernel in __xfs_dir3_data_check.  Running the online scrub's
parent checks will tend to do this.

The crash occurs because the directory inode's d_ops get set to
xfs_dir[23]_nondir_ops (it's not a directory) but the parent pointer
scrubber's indiscriminate call to xfs_readdir proceeds past the ASSERT
if we have non fatal asserts configured.

Fix the null pointer dereference crash in __xfs_dir3_data_check by
looking for S_ISDIR or wrong d_ops; and teach the parent scrubber
to bail out if it is fed a non-directory "parent".

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>

46c59736

09 Jan, 2018 5 commits

xfs: refactor the geometry structure filling function · ac503a4c

Committed by Darrick J. Wong on Jan 08, 2018

Refactor the geometry structure filling function to use the superblock
to fill the fields.  While we're at it, make the function less indenty
and use some whitespace to make the function easier to read.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>

ac503a4c

xfs: hoist xfs_fs_geometry to libxfs · c368ebcd

Committed by Darrick J. Wong on Jan 08, 2018

Move xfs_fs_geometry to libxfs so that we can clean up the fs geometry
reporting in xfsprogs.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>

c368ebcd

xfs: trace log reservations at mount time · b872af2c

Committed by Darrick J. Wong on Jan 08, 2018

At each mount, emit the transaction reservation type information via
tracepoints. This makes it easier to compare the log reservation info
calculated by the kernel and xfsprogs so that we can more easily diagnose
minimum log size failures on freshly formatted filesystems.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>

b872af2c

xfs: dump the first 128 bytes of any corrupt buffer · 9c712a13

Committed by Darrick J. Wong on Jan 08, 2018

Increase the corrupt buffer dump to the first 128 bytes since v5
filesystems have larger block headers than before.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

9c712a13
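
A bounded hex dump of this kind can be sketched in userspace as follows. The function names and the output layout are hypothetical; only the 128-byte limit is taken from the commit message.

```c
#include <stddef.h>
#include <stdio.h>

#define DEMO_DUMP_LEN 128  /* dump at most the first 128 bytes */

static size_t demo_dump_len(size_t buflen)
{
	return buflen < DEMO_DUMP_LEN ? buflen : DEMO_DUMP_LEN;
}

/* Print the start of buf as rows of 16 hex bytes, each with an offset. */
static void demo_hexdump(FILE *out, const unsigned char *buf, size_t buflen)
{
	size_t n = demo_dump_len(buflen);

	for (size_t i = 0; i < n; i++) {
		if (i % 16 == 0)
			fprintf(out, "%08zx:", i);
		fprintf(out, " %02x", buf[i]);
		if (i % 16 == 15 || i + 1 == n)
			fputc('\n', out);
	}
}
```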

xfs: teach error reporting functions to take xfs_failaddr_t · d9418ed0

Committed by Darrick J. Wong on Jan 08, 2018

Convert the two other error reporting functions to take xfs_failaddr_t
when the caller wishes to capture a code pointer instead of the classic
void * pointer.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

d9418ed0
