- 16 11月, 2019 1 次提交
-
-
由 Darrick J. Wong 提交于
Fix a few places where we xlog_alloc_buffer a buffer, hit an error, and then bail out without freeing the buffer. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NBrian Foster <bfoster@redhat.com>
-
- 14 11月, 2019 2 次提交
-
-
由 Pavel Reichl 提交于
Signed-off-by: NPavel Reichl <preichl@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> [darrick: fix some of the comments] Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Darrick J. Wong 提交于
Convert the last of the open coded corruption check and report idioms to use the XFS_IS_CORRUPT macro. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
- 11 11月, 2019 1 次提交
-
-
由 Darrick J. Wong 提交于
Convert EIO to EFSCORRUPTED in the logging code when we can determine that the log contents are invalid. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
- 08 11月, 2019 1 次提交
-
-
由 Darrick J. Wong 提交于
Range check the region counter when we're reassembling regions from log items during log recovery. In the old days ASSERT would halt the kernel, but this isn't true any more so we have to make an explicit error return. Coverity-id: 1132508 Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
- 05 11月, 2019 1 次提交
-
-
由 Darrick J. Wong 提交于
Make sure we log something to dmesg whenever we return -EFSCORRUPTED up the call stack. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
- 07 10月, 2019 1 次提交
-
-
由 Bill O'Donnell 提交于
Guarantee zeroed memory buffers for cases where potential memory leak to disk can occur. In these cases, kmem_alloc is used and doesn't zero the buffer, opening the possibility of information leakage to disk. Use existing infrastucture (xfs_buf_allocate_memory) to obtain the already zeroed buffer from kernel memory. This solution avoids the performance issue that would occur if a wholesale change to replace kmem_alloc with kmem_zalloc was done. Signed-off-by: NBill O'Donnell <billodo@redhat.com> [darrick: fix bitwise complaint about kmflag_mask] Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
- 06 9月, 2019 1 次提交
-
-
由 Dave Chinner 提交于
generic/530 on a machine with enough ram and a non-preemptible kernel can run the AGI processing phase of log recovery enitrely out of cache. This means it never blocks on locks, never waits for IO and runs entirely through the unlinked lists until it either completes or blocks and hangs because it has run out of log space. It runs out of log space because the background CIL push is scheduled but never runs. queue_work() queues the CIL work on the current CPU that is busy, and the workqueue code will not run it on any other CPU. Hence if the unlinked list processing never yields the CPU voluntarily, the push work is delayed indefinitely. This results in the CIL aggregating changes until all the log space is consumed. When the log recoveyr processing evenutally blocks, the CIL flushes but because the last iclog isn't submitted for IO because it isn't full, the CIL flush never completes and nothing ever moves the log head forwards, or indeed inserts anything into the tail of the log, and hence nothing is able to get the log moving again and recovery hangs. There are several problems here, but the two obvious ones from the trace are that: a) log recovery does not yield the CPU for over 4 seconds, b) binding CIL pushes to a single CPU is a really bad idea. This patch addresses just these two aspects of the problem, and are suitable for backporting to work around any issues in older kernels. The more fundamental problem of preventing the CIL from consuming more than 50% of the log without committing will take more invasive and complex work, so will be done as followup work. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
- 27 8月, 2019 2 次提交
-
-
由 Dave Chinner 提交于
Memory we use to submit for IO needs strict alignment to the underlying driver contraints. Worst case, this is 512 bytes. Given that all allocations for IO are always a power of 2 multiple of 512 bytes, the kernel heap provides natural alignment for objects of these sizes and that suffices. Until, of course, memory debugging of some kind is turned on (e.g. red zones, poisoning, KASAN) and then the alignment of the heap objects is thrown out the window. Then we get weird IO errors and data corruption problems because drivers don't validate alignment and do the wrong thing when passed unaligned memory buffers in bios. TO fix this, introduce kmem_alloc_io(), which will guaranteeat least 512 byte alignment of buffers for IO, even if memory debugging options are turned on. It is assumed that the minimum allocation size will be 512 bytes, and that sizes will be power of 2 mulitples of 512 bytes. Use this everywhere we allocate buffers for IO. This no longer fails with log recovery errors when KASAN is enabled due to the brd driver not handling unaligned memory buffers: # mkfs.xfs -f /dev/ram0 ; mount /dev/ram0 /mnt/test Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Tetsuo Handa 提交于
Since no caller is using KM_NOSLEEP and no callee branches on KM_SLEEP, we can remove KM_NOSLEEP and replace KM_SLEEP with 0. Signed-off-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
- 03 7月, 2019 1 次提交
-
-
由 Hariprasad Kelam 提交于
Change return types of below functions as they never fails xfs_log_mount_cancel xlog_recover_cancel xlog_recover_cancel_intents fix below issue reported by coccicheck fs/xfs/xfs_log_recover.c:4886:7-12: Unneeded variable: "error". Return "0" on line 4926 Signed-off-by: NHariprasad Kelam <hariprasad.kelam@gmail.com> Reviewed-by: NEric Sandeen <sandeen@redhat.com> Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
- 29 6月, 2019 8 次提交
-
-
由 Eric Sandeen 提交于
There are many, many xfs header files which are included but unneeded (or included twice) in the xfs code, so remove them. nb: xfs_linux.h includes about 9 headers for everyone, so those explicit includes get removed by this. I'm not sure what the preference is, but if we wanted explicit includes everywhere, a followup patch could remove those xfs_*.h includes from xfs_linux.h and move them into the files that need them. Or it could be left as-is. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
We need to derive the mount pointer from a buffer in a lot of place. Add a direct pointer to short cut the pointer chasing. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
This field is now always idential to b_length. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
Now that we don't use struct xfs_buf to hold log recovery buffer rename the related functions and variables to just talk of a buffer instead of using the bp name that we usually use for xfs_buf related functionality. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
The xfs_buf structure is basically used as a glorified container for a memory allocation in the log recovery code. Replace it with a call to kmem_alloc_large and a simple abstraction to read into or write from it synchronously using chained bios. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
This simplifies both the helper and the callers. We lost a bit of size sanity checking, but that is already covered by KASAN if needed. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
Use the slightly shorter way to get at the buftarg for the log device wherever we can in the log and log recovery code. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
- 12 6月, 2019 3 次提交
-
-
由 Eric Sandeen 提交于
The flags value is always passed as 0 so remove the argument. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Darrick J. Wong 提交于
inode_cluster_size is supposed to represent the size (in bytes) of an inode cluster buffer. We avoid having to handle multiple clusters per filesystem block on filesystems with large blocks by openly rounding this value up to 1 FSB when necessary. However, we never reset inode_cluster_size to reflect this new rounded value, which adds to the potential for mistakes in calculating geometries. Fix this by setting inode_cluster_size to reflect the rounded-up size if needed, and special-case the few places in the sparse inodes code where we actually need the smaller value to validate on-disk metadata. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NDave Chinner <dchinner@redhat.com>
-
由 Darrick J. Wong 提交于
Separate the inode geometry information into a distinct structure. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NDave Chinner <dchinner@redhat.com>
-
- 02 5月, 2019 1 次提交
-
-
由 Eric Sandeen 提交于
There are several functions which have no opportunity to return an error, and don't contain any ASSERTs which could be argued to be better constructed as error cases. So, make them voids to simplify the callers. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NDave Chinner <dchinner@redhat.com>
-
- 19 2月, 2019 1 次提交
-
-
由 Darrick J. Wong 提交于
Create a separate magic16 check function so that we don't run afoul of static checkers. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NBrian Foster <bfoster@redhat.com>
-
- 12 2月, 2019 3 次提交
-
-
由 Darrick J. Wong 提交于
Use xfs_verify_magic to check the magic numbers of inodes. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NBrian Foster <bfoster@redhat.com>
-
由 Brian Foster 提交于
Similar to the inode btree verifier, the same allocation btree verifier structure is shared between the by-bno (bnobt) and by-size (cntbt) btrees. This prevents the ability to distinguish magic values between them. Separate the verifier into two, one for each tree, and assign them appropriately. No functional changes. Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Brian Foster 提交于
The inobt verifier is reused for the inobt and finobt, which prevents the ability to distinguish between magic values on a per-tree basis. Create a separate finobt structure in preparation for changes to enforce the appropriate magic value for the associated tree. This patch has no functional change. Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
- 13 12月, 2018 1 次提交
-
-
由 Darrick J. Wong 提交于
Store the number of inodes and blocks per inode cluster in the mount data so that we don't have to keep recalculating them. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NBrian Foster <bfoster@redhat.com>
-
- 29 9月, 2018 1 次提交
-
-
由 Brian Foster 提交于
One of the first steps of log recovery is to check for the special case of a zeroed log. If the first cycle in the log is zero or the tail portion of the log is zeroed, the head is set to the first instance of cycle 0. xlog_find_zeroed() includes a sanity check that enforces that the first cycle in the log must be 1 if the last cycle is 0. While this is true in most cases, the check is not totally valid because it doesn't consider the case where the filesystem crashed after a partial/out of order log buffer completion that wraps around the end of the physical log. For example, consider a filesystem that has completed most of the first cycle of the log, reaches the end of the physical log and splits the next single log buffer write into two in order to wrap around the end of the log. If these I/Os are reordered, the second (wrapped) I/O completes and the first happens to fail, the log is left in a state where the last cycle of the log is 0 and the first cycle is 2. This causes the xlog_find_zeroed() sanity check to fail and prevents the filesystem from mounting. This situation has been reproduced on particular systems via repeated runs of generic/475. This is an expected state that log recovery already knows how to deal with, however. Since the log is still partially zeroed, the head is detected correctly and points to a valid tail. The subsequent stale block detection clears blocks beyond the head up to the tail (within a maximum range), with the express purpose of clearing such out of order writes. As expected, this removes the out of order cycle 2 blocks at the physical start of the log. In other words, the only thing that prevents a clean mount and recovery of the filesystem in this scenario is the specific (last == 0 && first != 1) sanity check in xlog_find_zeroed(). Since the log head/tail are now independently validated via cycle, log record and CRC checks, this highly specific first cycle check is of dubious value. Remove it and rely on the higher level validation to determine whether log content is sane and recoverable. Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
- 03 8月, 2018 2 次提交
-
-
由 Brian Foster 提交于
All callers pass ->t_dfops of the associated transactions. Refactor the helpers to receive the transactions and facilitate further cleanups between xfs_defer_ops and xfs_trans. Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Brian Foster 提交于
Log intent recovery is the last user of an external (on-stack) dfops. The pattern exists because the dfops is used to collect additional deferred operations queued during the whole recovery sequence. The dfops is finished with a new transaction after intent recovery completes. We already have a mechanism to create an empty, container-like transaction to support the scrub infrastructure. We can reuse that mechanism here to drop the final user of external dfops. This facilitates folding dfops state (i.e., dop_low) into the transaction, the elimination of now unused external dfops support and also eliminates the only caller of __xfs_defer_cancel(). Replace the on-stack dfops with an empty transaction and pass it around to the various helpers that queue and finish deferred operations during intent recovery. Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
- 27 7月, 2018 5 次提交
-
-
由 Darrick J. Wong 提交于
Replace the IRELE macro with a proper function so that we can do proper typechecking and so that we can stop open-coding iput in scrub, which means that we'll be able to ftrace inode lifetimes going through scrub correctly. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com>
-
由 Brian Foster 提交于
Every caller of xfs_defer_finish() now passes the transaction and its associated ->t_dfops. The xfs_defer_ops parameter is therefore no longer necessary and can be removed. Since most xfs_defer_finish() callers also have to consider xfs_defer_cancel() on error, update the latter to also receive the transaction for consistency. The log recovery code contains an outlier case that cancels a dfops directly without an available transaction. Retain an internal wrapper to support this outlier case for the time being. Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NBill O'Donnell <billodo@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Brian Foster 提交于
At this point, the transaction subsystem completely manages deferred items internally such that the common and boilerplate xfs_trans_alloc() -> xfs_defer_init() -> xfs_defer_finish() -> xfs_trans_commit() sequence can be replaced with a simple transaction allocation and commit. Remove all such boilerplate deferred ops code. In doing so, we change each case over to use the dfops in the transaction and specifically eliminate: - The on-stack dfops and associated xfs_defer_init() call, as the internal dfops is initialized on transaction allocation. - xfs_bmap_finish() calls that precede a final xfs_trans_commit() of a transaction. - xfs_defer_cancel() calls in error handlers that precede a transaction cancel. The only deferred ops calls that remain are those that are non-deterministic with respect to the final commit of the associated transaction or are open-coded due to special handling. Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NBill O'Donnell <billodo@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Brian Foster 提交于
bmap and refcount intent processing associates a dfops from the caller with a local transaction to collect all deferred items for post-processing. Use the internal dfops in both of these functions and move the deferred items to the parent dfops before the transaction commits. Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NBill O'Donnell <billodo@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Brian Foster 提交于
xlog_finish_defer_ops() processes the deferred operations collected over the entire intent recovery sequence. We can't xfs_defer_init() here because the dfops is already populated. Attach it manually and eliminate the last caller of xfs_defer_finish() that doesn't pass ->t_dfops. Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBill O'Donnell <billodo@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
- 12 7月, 2018 3 次提交
-
-
由 Brian Foster 提交于
The buffer I/O submission path consists of separate function calls per type. The buffer I/O type is already controlled via buffer state (XBF_ASYNC), however, so there is no real need for separate submission functions. Combine the buffer submission functions into a single function that processes the buffer appropriately based on XBF_ASYNC. Retain an internal helper with a conditional wait parameter to continue to support batched !XBF_ASYNC submission/completion required by delwri queues. Suggested-by: NChristoph Hellwig <hch@infradead.org> Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Brian Foster 提交于
All but one caller of xfs_defer_init() passes in the ->t_firstblock of the associated transaction. The one outlier is xlog_recover_process_intents(), which simply passes a dummy value because a valid pointer is required. This firstblock variable can simply be removed. At this point we could remove the xfs_defer_init() firstblock parameter and initialize ->t_firstblock directly. Even that is not necessary, however, because ->t_firstblock is automatically reinitialized in the new transaction on a transaction roll. Since xfs_defer_init() should never occur more than once on a particular transaction (since the corresponding finish will roll it), replace the reinit from xfs_defer_init() with an assert that verifies the transaction has a NULLFSBLOCK firstblock. Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Brian Foster 提交于
Most callers of xfs_defer_init() immediately attach the dfops structure to a transaction. Add a transaction parameter to eliminate much of this boilerplate code. This also helps self-document the fact that many codepaths now expect a dfops pointer implicitly via xfs_trans->t_dfops. Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
- 09 6月, 2018 1 次提交
-
-
由 Dave Chinner 提交于
do_mod() is a hold-over from when we have different sizes for file offsets and and other internal values for 40 bit XFS filesystems. Hence depending on build flags variables passed to do_mod() could change size. We no longer support those small format filesystems and hence everything is of fixed size theses days, even on 32 bit platforms. As such, we can convert all the do_mod() callers to platform optimised modulus operations as defined by linux/math64.h. Individual conversions depend on the types of variables being used. Signed-Off-By: NDave Chinner <dchinner@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-