提交 · 334fd34d76f237c0ee58dfc400d2c4e34d660544 · openeuler / Kernel

03 7月, 2017 1 次提交

vfs: Add page_cache_seek_hole_data helper · 334fd34d

由 Andreas Gruenbacher 提交于 6月 29, 2017

Both ext4 and xfs implement seeking for the next hole or piece of data
in unwritten extents by scanning the page cache, and both versions share
the same bug when iterating the buffers of a page: the start offset into
the page isn't taken into account, so when a page fits more than two
filesystem blocks, things will go wrong.  For example, on a filesystem
with a block size of 1k, the following command will fail:

  xfs_io -f -c "falloc 0 4k" \
            -c "pwrite 1k 1k" \
            -c "pwrite 3k 1k" \
            -c "seek -a -r 0" foo

In this example, neither lseek(fd, 1024, SEEK_HOLE) nor lseek(fd, 2048,
SEEK_DATA) will return the correct result.

Introduce a generic vfs helper for seeking in the page cache that gets
this right.  The next commits will replace the filesystem specific
implementations.
Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
[hch: dropped the export]
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

334fd34d

02 7月, 2017 3 次提交

xfs: remove a whitespace-only line from xfs_fs_get_nextdqblk · 7175a112

由 Christoph Hellwig 提交于 6月 29, 2017

Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

7175a112

xfs: rewrite xfs_dq_get_next_id using xfs_iext_lookup_extent · bda250db

由 Christoph Hellwig 提交于 6月 29, 2017

This goes straight to a single lookup in the extent list and avoids a
roundtrip through two layers that don't add any value for the simple
quoata file that just has data or holes and no page cache, delayed
allocation, unwritten extent or COW fork (which btw, doesn't seem to
be handled by the existing SEEK HOLE/DATA code).
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NEric Sandeen <sandeen@redhat.com>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

bda250db

xfs: Check for m_errortag initialization in xfs_errortag_test · d04c241c

由 Carlos Maiolino 提交于 6月 30, 2017

While adding error injection into IO completion, I notice the lack of
initialization check in xfs_errortag_test(), make the error injection
mechanism unable to be used there.

IO completion is executed a few times before the error injection
mechanism is initialized, so to be safer, make xfs_errortag_test() check
if the errortag is properly initialized.
Signed-off-by: NCarlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

d04c241c

28 6月, 2017 9 次提交

xfs: grab dquots without taking the ilock · 50e0bdbe

由 Darrick J. Wong 提交于 6月 27, 2017

Add a new dqget flag that grabs the dquot without taking the ilock.
This will be used by the scrubber (which will have already grabbed
the ilock) to perform basic sanity checking of the quota data.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>

50e0bdbe

xfs: fix semicolon.cocci warnings · 244e3dea

由 kbuild test robot 提交于 6月 26, 2017

fs/xfs/xfs_log.c:2092:38-39: Unneeded semicolon


 Remove unneeded semicolon.

Generated by: scripts/coccinelle/misc/semicolon.cocci

Fixes: d4ca1d55 ("xfs: dump transaction usage details on log reservation overrun")
CC: Brian Foster <bfoster@redhat.com>
Signed-off-by: NFengguang Wu <fengguang.wu@intel.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

244e3dea

xfs: Don't clear SGID when inheriting ACLs · 8ba35875

由 Jan Kara 提交于 6月 26, 2017

When new directory 'DIR1' is created in a directory 'DIR0' with SGID bit
set, DIR1 is expected to have SGID bit set (and owning group equal to
the owning group of 'DIR0'). However when 'DIR0' also has some default
ACLs that 'DIR1' inherits, setting these ACLs will result in SGID bit on
'DIR1' to get cleared if user is not member of the owning group.

Fix the problem by calling __xfs_set_acl() instead of xfs_set_acl() when
setting up inode in xfs_generic_create(). That prevents SGID bit
clearing and mode is properly set by posix_acl_create() anyway. We also
reorder arguments of __xfs_set_acl() to match the ordering of
xfs_set_acl() to make things consistent.

Fixes: 07393101
CC: stable@vger.kernel.org
CC: Darrick J. Wong <darrick.wong@oracle.com>
CC: linux-xfs@vger.kernel.org
Signed-off-by: NJan Kara <jack@suse.cz>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

8ba35875

xfs: free cowblocks and retry on buffered write ENOSPC · cf2cb784

由 Brian Foster 提交于 6月 20, 2017

XFS runs an eofblocks reclaim scan before returning an ENOSPC error to
userspace for buffered writes. This facilitates aggressive speculative
preallocation without causing user visible side effects such as
premature ENOSPC.

Run a cowblocks scan in the same situation to reclaim lingering COW fork
preallocation throughout the filesystem.
Signed-off-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

cf2cb784

xfs: replace log_badcrc_factor knob with error injection tag · 3e88a007

由 Brian Foster 提交于 6月 27, 2017

Now that error injection tags support dynamic frequency adjustment,
replace the debug mode sysfs knob that controls log record CRC error
injection with an error injection tag.
Signed-off-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

3e88a007

xfs: convert drop_writes to use the errortag mechanism · f8c47250

由 Darrick J. Wong 提交于 6月 20, 2017

We now have enhanced error injection that can control the frequency
with which errors happen, so convert drop_writes to use this.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com>

f8c47250

xfs: remove unneeded parameter from XFS_TEST_ERROR · 9e24cfd0

由 Darrick J. Wong 提交于 6月 20, 2017

Since we moved the injected error frequency controls to the mountpoint,
we can get rid of the last argument to XFS_TEST_ERROR.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com>

9e24cfd0

xfs: expose errortag knobs via sysfs · c6840101

由 Darrick J. Wong 提交于 6月 20, 2017

Creates a /sys/fs/xfs/$dev/errortag/ directory to control the errortag
values directly.  This enables us to control the randomness values,
rather than having to accept the defaults.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com>

c6840101

xfs: make errortag a per-mountpoint structure · 31965ef3

由 Darrick J. Wong 提交于 6月 20, 2017

Remove the xfs_etest structure in favor of a per-mountpoint structure.
This will give us the flexibility to set as many error injection points
as we want, and later enable us to set up sysfs knobs to set the trigger
frequency as we wish.  This comes at a cost of higher memory use, but
unti we hit 1024 injection points (we're at 29) or a lot of mounts this
shouldn't be a huge issue.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com>

31965ef3

25 6月, 2017 1 次提交

xfs: free uncommitted transactions during log recovery · 39775431

由 Brian Foster 提交于 6月 24, 2017

Log recovery allocates in-core transaction and member item data
structures on-demand as it processes the on-disk log. Transactions
are allocated on first encounter on-disk and stored in a hash table
structure where they are easily accessible for subsequent lookups.
Transaction items are also allocated on demand and are attached to
the associated transactions.

When a commit record is encountered in the log, the transaction is
committed to the fs and the in-core structures are freed. If a
filesystem crashes or shuts down before all in-core log buffers are
flushed to the log, however, not all transactions may have commit
records in the log. As expected, the modifications in such an
incomplete transaction are not replayed to the fs. The in-core data
structures for the partial transaction are never freed, however,
resulting in a memory leak.

Update xlog_do_recovery_pass() to first correctly initialize the
hash table array so empty lists can be distinguished from populated
lists on function exit. Update xlog_recover_free_trans() to always
remove the transaction from the list prior to freeing the associated
memory. Finally, walk the hash table of transaction lists as the
last step before it goes out of scope and free any transactions that
may remain on the lists. This prevents a memory leak of partial
transactions in the log.
Signed-off-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

39775431

21 6月, 2017 7 次提交

xfs: don't allow bmap on rt files · 61d819e7

由 Darrick J. Wong 提交于 6月 19, 2017

bmap returns a dumb LBA address but not the block device that goes with
that LBA. Swapfiles don't care about this and will blindly assume that
the data volume is the correct blockdev, which is totally bogus for
files on the rt subvolume. This results in the swap code doing IOs to
arbitrary locations on the data device(!) if the passed in mapping is a
realtime file, so just turn off bmap for rt files.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>

61d819e7

xfs: allow reading of already-locked remote symbolic link · 5da8f2f8

由 Darrick J. Wong 提交于 6月 16, 2017

Expose the readlink variant that doesn't take the inode lock so that
the scrubber can inspect symlink contents.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>

5da8f2f8

xfs: pass along transaction context when reading xattr block buffers · ad017f65

由 Darrick J. Wong 提交于 6月 16, 2017

Teach the extended attribute reading functions to pass along a
transaction context if one was supplied. The extended attribute scrub
code will use transactions to lock buffers and avoid deadlocking with
itself in the case of loops; since it will already have the inode
locked, also create xattr get/list helpers that don't take locks.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>

ad017f65

xfs: pass along transaction context when reading directory block buffers · acb9553c

由 Darrick J. Wong 提交于 6月 16, 2017

Teach the directory reading functions to pass along a transaction context
if one was supplied. The directory scrub code will use transactions to
lock buffers and avoid deadlocking with itself in the case of loops.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>

acb9553c

xfs: return the hash value of a leaf1 directory block · 8e8877e6

由 Darrick J. Wong 提交于 6月 16, 2017

Modify the existing dir leafn lasthash function to enable us to
calculate the highest hash value of a leaf1 block.  This will be used by
the directory scrubbing code to check the sanity of hashes in leaf1
directory blocks.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>

8e8877e6

xfs: refactor the ifork block counting function · e7f5d5ca

由 Darrick J. Wong 提交于 6月 16, 2017

Refactor the inode fork block counting function to count extents for us
at the same time.  This will be used by the bmbt scrubber function.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>

e7f5d5ca

xfs: make _bmap_count_blocks consistent wrt delalloc extent behavior · d29cb3e4

由 Darrick J. Wong 提交于 6月 16, 2017

There is an inconsistency in the way that _bmap_count_blocks deals with
delalloc reservations -- if the specified fork is in extents format,
*count is set to the total number of blocks referenced by the in-core
fork, including delalloc extents.  However, if the fork is in btree
format, *count is set to the number of blocks referenced by the on-disk
fork, which does /not/ include delalloc extents.

For the lone existing caller of _bmap_count_blocks this hasn't been an
issue because the function is only used to count xattr fork blocks
(where there aren't any delalloc reservations).  However, when scrub
comes along it will use this same function to check di_nblocks against
both on-disk extent maps, so we need this behavior to be consistent.

Therefore, fix _bmap_count_leaves not to include delalloc extents and
remove unnecessary parameters.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>

d29cb3e4

20 6月, 2017 9 次提交

xfs: separate function to check if inode shares extents · ea7cdd7b

由 Darrick J. Wong 提交于 6月 16, 2017

Separate the "clear reflink flag" function into one function that checks
if the flag is needed, and a second function that checks and clears the
flag.  The inode scrub code will want to check the necessity of the flag
without clearing it.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>

ea7cdd7b

xfs: reflink find shared should take a transaction · 92ff7285

由 Darrick J. Wong 提交于 6月 16, 2017

Adapt _reflink_find_shared to take an optional transaction pointer.  The
inode scrubber code will need to decide (within transaction context) if
a file has shared blocks.  To avoid buffer deadlocks, we must pass the
tp through to this function's utility calls.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>

92ff7285

xfs: check if an inode is cached and allocated · 378f681c

由 Darrick J. Wong 提交于 6月 19, 2017

Check the inode cache for a particular inode number. If it's in the
cache, check that it's not currently being reclaimed. If it's not being
reclaimed, return zero if the inode is allocated. This function will be
used by various scrubbers to decide if the cache is more up to date
than the disk in terms of checking if an inode is allocated.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>

378f681c

xfs: export _inobt_btrec_to_irec and _ialloc_cluster_alignment for scrub · e936945e

由 Darrick J. Wong 提交于 6月 16, 2017

Create a function to extract an in-core inobt record from a generic
btree_rec union so that scrub will be able to check inobt records
and check inode block alignment.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>

e936945e

xfs: plumb in needed functions for range querying of various btrees · 118bb47e

由 Darrick J. Wong 提交于 6月 16, 2017

Plumb in the pieces (init_high_key, diff_two_keys) necessary to call
query_range on the inode space and block mapping btrees and to extract
raw btree records.  This will eventually be used by the inobt and bmbt
scrubbers.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>

118bb47e

xfs: export various function for the online scrubber · 26788097

由 Darrick J. Wong 提交于 6月 16, 2017

Export various internal functions so that the online scrubber can use
them to check the state of metadata.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>

26788097

xfs: always compile the btree inorder check functions · 38dee376

由 Darrick J. Wong 提交于 6月 16, 2017

The btree record and key inorder check functions will be used by the
btree scrubber code, so make sure they're always built.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>

38dee376

xfs: remove double-underscore integer types · c8ce540d

由 Darrick J. Wong 提交于 6月 16, 2017

This is a purely mechanical patch that removes the private
__{u,}int{8,16,32,64}_t typedefs in favor of using the system
{u,}int{8,16,32,64}_t typedefs.  This is the sed script used to perform
the transformation and fix the resulting whitespace and indentation
errors:

s/typedef\t__uint8_t/typedef __uint8_t\t/g
s/typedef\t__uint/typedef __uint/g
s/typedef\t__int\([0-9]*\)_t/typedef int\1_t\t/g
s/__uint8_t\t/__uint8_t\t\t/g
s/__uint/uint/g
s/__int\([0-9]*\)_t\t/__int\1_t\t\t/g
s/__int/int/g
/^typedef.*int[0-9]*_t;$/d
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>

c8ce540d

xfs: optimize _btree_query_all · 5a4c7334

由 Darrick J. Wong 提交于 6月 16, 2017

Don't bother wandering our way through the leaf nodes when the caller
issues a query_all; just zoom down the left side of the tree and walk
rightwards along level zero.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: NBrian Foster <bfoster@redhat.com>

5a4c7334

19 6月, 2017 10 次提交

xfs: remove bli from AIL before release on transaction abort · 3d4b4a3e

由 Brian Foster 提交于 6月 14, 2017

When a buffer is modified, logged and committed, it ultimately ends
up sitting on the AIL with a dirty bli waiting for metadata
writeback. If another transaction locks and invalidates the buffer
(freeing an inode chunk, for example) in the meantime, the bli is
flagged as stale, the dirty state is cleared and the bli remains in
the AIL.

If a shutdown occurs before the transaction that has invalidated the
buffer is committed, the transaction is ultimately aborted. The log
items are flagged as such and ->iop_unlock() handles the aborted
items. Because the bli is clean (due to the invalidation),
->iop_unlock() unconditionally releases it. The log item may still
reside in the AIL, however, which means the I/O completion handler
may still run and attempt to access it. This results in assert
failure due to the release of the bli while still present in the AIL
and a subsequent NULL dereference and panic in the buffer I/O
completion handling. This can be reproduced by running generic/388
in repetition.

To avoid this problem, update xfs_buf_item_unlock() to first check
whether the bli is aborted and if so, remove it from the AIL before
it is released. This ensures that the bli is no longer accessed
during the shutdown sequence after it has been freed.
Signed-off-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

3d4b4a3e

xfs: release bli from transaction properly on fs shutdown · 79e641ce

由 Brian Foster 提交于 6月 14, 2017

If a filesystem shutdown occurs with a buffer log item in the CIL
and a log force occurs, the ->iop_unpin() handler is generally
expected to tear down the bli properly. This entails freeing the bli
memory and releasing the associated hold on the buffer so it can be
released and the filesystem unmounted.

If this sequence occurs while ->bli_refcount is elevated (i.e.,
another transaction is open and attempting to modify the buffer),
however, ->iop_unpin() may not be responsible for releasing the bli.
Instead, the transaction may release the final ->bli_refcount
reference and thus xfs_trans_brelse() is responsible for tearing
down the bli.

While xfs_trans_brelse() does drop the reference count, it only
attempts to release the bli if it is clean (i.e., not in the
CIL/AIL). If the filesystem is shutdown and the bli is sitting dirty
in the CIL as noted above, this ends up skipping the last
opportunity to release the bli. In turn, this leaves the hold on the
buffer and causes an unmount hang. This can be reproduced by running
generic/388 in repetition.

Update xfs_trans_brelse() to handle this shutdown corner case
correctly. If the final bli reference is dropped and the filesystem
is shutdown, remove the bli from the AIL (if necessary) and release
the bli to drop the buffer hold and ensure an unmount does not hang.
Signed-off-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

79e641ce

xfs: avoid harmless gcc-7 warnings · 0cbe48cc

由 Arnd Bergmann 提交于 6月 14, 2017

gcc-7 flags the use of integer math inside of a condition
as a potential bug:

fs/xfs/xfs_bmap_util.c: In function 'xfs_swap_extents_check_format':
fs/xfs/xfs_bmap_util.c:1619:8: error: '<<' in boolean context, did you mean '<' ? [-Werror=int-in-bool-context]
fs/xfs/xfs_bmap_util.c:1629:8: error: '<<' in boolean context, did you mean '<' ? [-Werror=int-in-bool-context]

There is already a helper function for testing the di_forkoff
field for zero, so let's use that instead to shut up the warning.
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

0cbe48cc

xfs: remove lsn relevant fields from xfs_trans structure and its users · f990fc5a

由 Shan Hai 提交于 6月 14, 2017

The t_lsn is not used anymore and the t_commit_lsn is used as a tmp
storage for the checkpoint sequence number only in the current code.

And the start/commit lsn are tracked as a transaction group tag in
the xfs_cil_ctx instead of a single transaction, so remove them from
the xfs_trans structure and their users to match with the design.
Signed-off-by: NShan Hai <shan.hai@oracle.com>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

f990fc5a

xfs: remove XFS_HSIZE · 3398a400

由 Christoph Hellwig 提交于 6月 14, 2017

XFS_HSIZE is an extremly confusing way to calculate the size of handle_t.
Given that handle_t always only had two sizes, and one of them isn't
even covered by XFS_HSIZE to start with just remove the macro and use
a constant sizeof expression.

Note that XFS_HSIZE isn't used in xfsprogs, xfsdump or xfstests either.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NEric Sandeen <sandeen@sandeen.net>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

3398a400

xfs: dump transaction usage details on log reservation overrun · d4ca1d55

由 Brian Foster 提交于 6月 14, 2017

If a transaction log reservation overrun occurs, the ticket data
associated with the reservation is dumped in xfs_log_commit_cil().
This occurs long after the transaction items and details have been
removed from the transaction and effectively lost. This limited set
of ticket data provides very little information to support debugging
transaction overruns based on the typical report.

To improve transaction log reservation overrun reporting, create a
helper to dump transaction details such as log items, log vector
data, etc., as well as the underlying ticket data for the
transaction. Move the overrun detection from xfs_log_commit_cil() to
xlog_cil_insert_items() so it occurs prior to migration of the
logged items to the CIL. Call the new helper such that it is able to
dump this transaction data before it is lost.

Also, warn on overrun to provide callstack context for the offending
transaction and include a few additional messages from
xlog_cil_insert_items() to display the reservation consumed locally
for overhead such as log vector headers, split region headers and
the context ticket. This provides a complete general breakdown of
the reservation consumption of a transaction when/if it happens to
overrun the reservation.
Signed-off-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

d4ca1d55

xfs: refactor xlog_cil_insert_items() to facilitate transaction dump · e2f23426

由 Brian Foster 提交于 6月 14, 2017

Transaction reservation overrun detection currently occurs too late
to print useful information about the offending transaction.
Ideally, the transaction data is printed before the associated log
items are moved from the transaction to the CIL, which occurs in
xlog_cil_insert_items(), such that details of the items logged by
the transaction are available for analysis.

Refactor xlog_cil_insert_items() to facilitate moving tx overrun
detection to this function. Update the function to track each bit of
extra log reservation stolen from the transaction (i.e., such as for
the CIL context ticket) and perform the log item migration as the
last operation before the CIL lock is released. This creates a
context where the transaction reservation consumption has been fully
calculated when the log items are moved to the CIL. This patch makes
no functional changes.
Signed-off-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

e2f23426

xfs: separate shutdown from ticket reservation print helper · 7d2d5653

由 Brian Foster 提交于 6月 14, 2017

xlog_print_tic_res() pre-dates delayed logging and the committed
items list (CIL) and thus retains some factoring warts, such as hard
coded function names in the output and the fact that it induces a
shutdown.

In preparation for more detailed logging of regular transaction
overrun situations, refactor xlog_print_tic_res() to be slightly
more generic. Reword some of the warning messages and pull the
shutdown into the callers.
Signed-off-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

7d2d5653

xfs: define fatal assert build time tunable · 1040960e

由 Brian Foster 提交于 6月 14, 2017

While configurable at runtime, the DEBUG mode assert failure
behavior is usually either desired or not for a particular
situation. For example, developers using kernel modules may prefer
for fatal asserts to remain disabled across module reloads while QE
engineers doing broad regression testing may prefer to have fatal
asserts enabled on boot to facilitate data collection for bug
reports.

To provide a compromise/convenience for developers, create a Kconfig
option that sets the default value of the DEBUG mode 'bug_on_assert'
sysfs tunable. The default behavior remains to trigger kernel BUGs
on assert failures to preserve existing behavior across kernel
configuration updates with DEBUG mode enabled.
Signed-off-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

1040960e

xfs: define bug_on_assert debug mode sysfs tunable · ccdab3d6

由 Brian Foster 提交于 6月 14, 2017

In DEBUG mode, assert failures unconditionally trigger a kernel BUG.
This is useful in diagnostic situations to panic a system and
collect detailed state information at the time of a failure.

This can also cause problems in cases where DEBUG mode code is
desired but it is preferable not trigger kernel BUGs on assert
failure. For example, during development of new code or during
certain xfstests tests that intentionally cause corruption and test
the kernel for survival (but otherwise may expect to trigger assert
failures).

To provide additional flexibility, create the
<sysfs>/fs/xfs/debug/bug_on_assert tunable to configure assert
failure behavior at runtime. This tunable is only available in DEBUG
mode and is enabled by default to preserve existing default
behavior. When disabled, assert failures in DEBUG mode result in
kernel warnings.
Signed-off-by: NBrian Foster <bfoster@redhat.com>
Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>

ccdab3d6

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功