- 26 4月, 2017 18 次提交
-
-
由 Christoph Hellwig 提交于
By passing the whole fsmap_head structure and an index we can get the user point annotations right for the embedded variable sized array in struct fsmap_head. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> [darrick: change idx to unsigned int] Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
At least if we want to be able to recognize the pattern. Add a missing byte swap to the corruption injection case in xlog_sync. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
Found by sparse. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
Found by sparse. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
XFS only supports the unwritten extent bit in the data fork, and only if the file system has a version 5 superblock or the unwritten extent feature bit. We currently have two routines that validate the invariant: xfs_check_nostate_extents which return -EFSCORRUPTED when it's not met, and xfs_validate_extent that triggers and assert in debug build. Both of them iterate over all extents of an inode fork when called, which isn't very efficient. This patch instead adds a new helper that verifies the invariant one extent at a time, and calls it from the places where we iterate over all extents to converted them from or two the in-memory format. The callers then return -EFSCORRUPTED when reading invalid extents from disk, or trigger an assert when writing them to disk. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
We only ever use the normal and unwritten states. And the actual ondisk format (this enum isn't despite being in xfs_format.h) only has space for the unwritten bit anyway. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Eric Sandeen 提交于
On some architectures do_div does the pointer compare trick to make sure that we've sent it an unsigned 64-bit number. (Why unsigned? I don't know.) Fix up the few places that squawk about this; in xfs_bmap_wants_extents() we just used a bare int64_t so change that to unsigned. In xfs_adjust_extent_unmap_boundaries() all we wanted was the mod, and we have an xfs-specific function to handle that w/o side effects, which includes proper casting for do_div. In xfs_daddr_to_ag[b]no, we were using the wrong type anyway; XFS_BB_TO_FSBT returns a block in the filesystem, so use xfs_rfsblock_t not xfs_daddr_t, and gain the unsignedness from that type as a bonus. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Eric Sandeen 提交于
The kbuild test robot caught this; in debug code we have another caller of do_div with a 32-bit dividend (j) which is caught now that we are using the kernel-supplied do_div. None of the values used here are 64-bit; just use simple division. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Hou Tao 提交于
The trailing newlines wil lead to extra newlines in the trace file which looks like the following output, so remove them. >kworker/4:1H-1508 [004] .... 47879.101608: xfs_discard_extent: dev 8:0 > >kworker/u16:2-238 [004] .... 47879.101725: xfs_extent_busy_clear: dev 8:0 Signed-off-by: NHou Tao <houtao1@huawei.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> [darrick: fix the getfsmap tracepoints too] Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Brian Foster 提交于
Directory block readahead uses a complex iteration mechanism to map between high-level directory blocks and underlying physical extents. This mechanism attempts to traverse the higher-level dir blocks in a manner that handles multi-fsb directory blocks and simultaneously maintains a reference to the corresponding physical blocks. This logic doesn't handle certain (discontiguous) physical extent layouts correctly with multi-fsb directory blocks. For example, consider the case of a 4k FSB filesystem with a 2 FSB (8k) directory block size and a directory with the following extent layout: EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL 0: [0..7]: 88..95 0 (88..95) 8 1: [8..15]: 80..87 0 (80..87) 8 2: [16..39]: 168..191 0 (168..191) 24 3: [40..63]: 5242952..5242975 1 (72..95) 24 Directory block 0 spans physical extents 0 and 1, dirblk 1 lies entirely within extent 2 and dirblk 2 spans extents 2 and 3. Because extent 2 is larger than the directory block size, the readahead code erroneously assumes the block is contiguous and issues a readahead based on the physical mapping of the first fsb of the dirblk. This results in read verifier failure and a spurious corruption or crc failure, depending on the filesystem format. Further, the subsequent readahead code responsible for walking through the physical table doesn't correctly advance the physical block reference for dirblk 2. Instead of advancing two physical filesystem blocks, the first iteration of the loop advances 1 block (correctly), but the subsequent iteration advances 2 more physical blocks because the next physical extent (extent 3, above) happens to cover more than dirblk 2. At this point, the higher-level directory block walking is completely off the rails of the actual physical layout of the directory for the respective mapping table. Update the contiguous dirblock logic to consider the current offset in the physical extent to avoid issuing directory readahead to unrelated blocks. Also, update the mapping table advancing code to consider the current offset within the current dirblock to avoid advancing the mapping reference too far beyond the dirblock. Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Eric Sandeen 提交于
Carlos had a case where "find" seemed to start spinning forever and never return. This was on a filesystem with non-default multi-fsb (8k) directory blocks, and a fragmented directory with extents like this: 0:[0,133646,2,0] 1:[2,195888,1,0] 2:[3,195890,1,0] 3:[4,195892,1,0] 4:[5,195894,1,0] 5:[6,195896,1,0] 6:[7,195898,1,0] 7:[8,195900,1,0] 8:[9,195902,1,0] 9:[10,195908,1,0] 10:[11,195910,1,0] 11:[12,195912,1,0] 12:[13,195914,1,0] ... i.e. the first extent is a contiguous 2-fsb dir block, but after that it is fragmented into 1 block extents. At the top of the readdir path, we allocate a mapping array which (for this filesystem geometry) can hold 10 extents; see the assignment to map_info->map_size. During readdir, we are therefore able to map extents 0 through 9 above into the array for readahead purposes. If we count by 2, we see that the last mapped index (9) is the first block of a 2-fsb directory block. At the end of xfs_dir2_leaf_readbuf() we have 2 loops to fill more readahead; the outer loop assumes one full dir block is processed each loop iteration, and an inner loop that ensures that this is so by advancing to the next extent until a full directory block is mapped. The problem is that this inner loop may step past the last extent in the mapping array as it tries to reach the end of the directory block. This will read garbage for the extent length, and as a result the loop control variable 'j' may become corrupted and never fail the loop conditional. The number of valid mappings we have in our array is stored in map->map_valid, so stop this inner loop based on that limit. There is an ASSERT at the top of the outer loop for this same condition, but we never made it out of the inner loop, so the ASSERT never fired. Huge appreciation for Carlos for debugging and isolating the problem. Debugged-and-analyzed-by: NCarlos Maiolino <cmaiolino@redhat.com> Signed-off-by: NEric Sandeen <sandeen@redhat.com> Tested-by: NCarlos Maiolino <cmaiolino@redhat.com> Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com> Reviewed-by: NBill O'Donnell <billodo@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
Now that reflink operations don't set the firstblock value we don't need the workarounds for non-NULL firstblock values without a prior allocation. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
The main thing that xfs_bmap_remap_alloc does is fixing the AGFL, similar to what we do in the space allocator. But the reflink code doesn't touch the allocation btree unlike the normal space allocator, so we couldn't care less about the state of the AGFL. So remove xfs_bmap_remap_alloc and just handle the di_nblocks update in the caller. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
Add a new helper to be used for reflink extent list additions instead of funneling them through xfs_bmapi_write and overloading the firstblock member in struct xfs_bmalloca and struct xfs_alloc_args. With some small changes to xfs_bmap_remap_alloc this also means we do not need a xfs_bmalloca structure for this case at all. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
For the reflink case we'd much rather pass the required arguments than faking up a struct xfs_bmalloca. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
We never do COW operations for the attr fork, so don't pretend we handle them. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
bno should be a xfs_fsblock_t, which is 64-bit wides instead of a xfs_aglock_t, which truncates the value to 32 bits. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
- 12 4月, 2017 3 次提交
-
-
由 Brian Foster 提交于
Lockdep complains about use of the iolock in inode reclaim context because it doesn't understand that reclaim has the last reference to the inode, and thus an iolock->reclaim->iolock deadlock is not possible. The iolock is technically not necessary in xfs_inactive() and was only added to appease an assert in xfs_free_eofblocks(), which can be called from other non-reclaim contexts. Therefore, just kill the assert and drop the use of the iolock from reclaim context to quiet lockdep. Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Eric Sandeen 提交于
Long ago, all this gunk was added with a lament about problems with gcc's do_div, and a fun recommendation in the changelog: egcs-2.91.66 is the recommended compiler version for building XFS. All this special stuff was needed to work around an old gcc bug, apparently, and it's been there ever since. There should be no need for this anymore, so remove it. Remove the special 32-bit xfs_do_mod as well; just let the kernel's do_div() handle all this. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Eric Sandeen 提交于
ndquots is a 32-bit value, and we don't care about the remainder; there is no reason to use do_div here, it seems to be the result of a decade+ historical accident. Worse, the do_div implementation in userspace breaks when fed a 32-bit dividend, so we commented it out there in any case. Change to simple division, and then we can change userspace to match, and mandate a 64-bit dividend in the do_div() in userspace as well. Signed-off-by: NEric Sandeen <sandeen@redhat.com>
-
- 07 4月, 2017 2 次提交
-
-
由 Darrick J. Wong 提交于
Apparently FIEMAP for xattrs has been broken since we switched to the iomap backend because of an incorrect check for xattr presence. Also fix the broken locking. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
由 Christoph Hellwig 提交于
No one cares about the low-level helper anymore. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NEric Sandeen <sandeen@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
- 04 4月, 2017 15 次提交
-
-
由 Darrick J. Wong 提交于
Use the realtime bitmap to return free space information via getfsmap. Eventually this will be superseded by the realtime rmapbt code. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NBrian Foster <bfoster@redhat.com>
-
由 Darrick J. Wong 提交于
If the reverse-mapping btree isn't available, fall back to the free space btrees to provide partial reverse mapping information. The online scrub tool can make use of even partial information to speed up the data block scan. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NBrian Foster <bfoster@redhat.com>
-
由 Darrick J. Wong 提交于
Introduce a new ioctl that uses the reverse mapping btree to return information about the physical layout of the filesystem. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NBrian Foster <bfoster@redhat.com>
-
由 Darrick J. Wong 提交于
Add _query_range and _query_all functions to the realtime bitmap allocator. These two functions are similar in usage to the btree functions with the same name and will be used for getfsmap and scrub. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NBrian Foster <bfoster@redhat.com>
-
由 Darrick J. Wong 提交于
Create a helper function that will query all records in a btree. This will be used by the online repair functions to examine every record in a btree to rebuild a second btree. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NBrian Foster <bfoster@redhat.com>
-
由 Darrick J. Wong 提交于
Implement a query_range function for the bnobt and cntbt. This will be used for getfsmap fallback if there is no rmapbt and by the online scrub and repair code. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NBrian Foster <bfoster@redhat.com>
-
由 Darrick J. Wong 提交于
Plumb in the pieces (init_high_key, diff_two_keys) necessary to call query_range on the free space btrees. Remove the debugging asserts so that we can make queries starting from block 0. While we're at it, merge the redundant "if (btnum ==" hunks. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NBrian Foster <bfoster@redhat.com>
-
由 Darrick J. Wong 提交于
In xfs_ioc_getbmap, we should only copy the fields of struct getbmap from userspace, or else we end up copying random stack contents into the kernel. struct getbmap is a strict subset of getbmapx, so a partial structure copy should work fine. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
由 Nikolay Borisov 提交于
This function has been removed ever since at least 3.12-era. No need to keep its declaration in the header so nuke it. Signed-off-by: NNikolay Borisov <nborisov@suse.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Eric Sandeen 提交于
"xfs_iread: validation failed for inode 96 failed" One "failed" seems like enough. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Reviewed-by: NAlex Elder <elder@linaro.org> Reviewed-by: NBill O'Donnell <billodo@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
Opencoding the trivial checks makes it much easier to read (and grep..). Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Christoph Hellwig 提交于
This checks for all the non-normal extent types, including handling both encodings of delayed allocations. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Brian Foster 提交于
The log covering background task used to be part of the xfssyncd workqueue. That workqueue was removed as of commit 5889608d ("xfs: syncd workqueue is no more") and the associated work item scheduled to the xfs-log wq. The latter is used for log buffer I/O completion. Since xfs_log_worker() can invoke a log flush, a deadlock is possible between the xfs-log and xfs-cil workqueues. Consider the following codepath from xfs_log_worker(): xfs_log_worker() xfs_log_force() _xfs_log_force() xlog_cil_force() xlog_cil_force_lsn() xlog_cil_push_now() flush_work() The above is in xfs-log wq context and blocked waiting on the completion of an xfs-cil work item. Concurrently, the cil push in progress can end up blocked here: xlog_cil_push_work() xlog_cil_push() xlog_write() xlog_state_get_iclog_space() xlog_wait(&log->l_flush_wait, ...) The above is in xfs-cil context waiting on log buffer I/O completion, which executes in xfs-log wq context. In this scenario both workqueues are deadlocked waiting on eachother. Add a new workqueue specifically for the high level log covering and ail pushing worker, as was the case prior to commit 5889608d. Diagnosed-by: NDavid Jeffery <djeffery@redhat.com> Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Darrick J. Wong 提交于
Fix a memory exposure problems in inumbers where we allocate an array of structures with holes, fail to zero the holes, then blindly copy the kernel memory contents (junk and all) into userspace. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
由 Calvin Owens 提交于
When punching past EOF on XFS, fallocate(mode=PUNCH_HOLE|KEEP_SIZE) will round the file size up to the nearest multiple of PAGE_SIZE: calvinow@vm-disks/generic-xfs-1 ~$ dd if=/dev/urandom of=test bs=2048 count=1 calvinow@vm-disks/generic-xfs-1 ~$ stat test Size: 2048 Blocks: 8 IO Block: 4096 regular file calvinow@vm-disks/generic-xfs-1 ~$ fallocate -n -l 2048 -o 2048 -p test calvinow@vm-disks/generic-xfs-1 ~$ stat test Size: 4096 Blocks: 8 IO Block: 4096 regular file Commit 3c2bdc91 ("xfs: kill xfs_zero_remaining_bytes") replaced xfs_zero_remaining_bytes() with calls to iomap helpers. The new helpers don't enforce that [pos,offset) lies strictly on [0,i_size) when being called from xfs_free_file_space(), so by "leaking" these ranges into xfs_zero_range() we get this buggy behavior. Fix this by reintroducing the checks xfs_zero_remaining_bytes() did against i_size at the bottom of xfs_free_file_space(). Reported-by: NAaron Gao <gzh@fb.com> Fixes: 3c2bdc91 ("xfs: kill xfs_zero_remaining_bytes") Cc: Christoph Hellwig <hch@lst.de> Cc: Brian Foster <bfoster@redhat.com> Cc: <stable@vger.kernel.org> # 4.8+ Signed-off-by: NCalvin Owens <calvinowens@fb.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
- 29 3月, 2017 1 次提交
-
-
由 Darrick J. Wong 提交于
The inline directory verifiers should be called on the inode fork data, which means after iformat_local on the read side, and prior to ifork_flush on the write side. This makes the fork verifier more consistent with the way buffer verifiers work -- i.e. they will operate on the memory buffer that the code will be reading and writing directly. Furthermore, revise the verifier function to return -EFSCORRUPTED so that we don't flood the logs with corruption messages and assert notices. This has been a particular problem with xfs/348, which triggers the XFS_WANT_CORRUPTED_RETURN assertions, which halts the kernel when CONFIG_XFS_DEBUG=y. Disk corruption isn't supposed to do that, at least not in a verifier. Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> --- v2: get the inode d_ops the proper way v3: describe the bug that this patch fixes; no code changes
-
- 15 3月, 2017 1 次提交
-
-
由 Darrick J. Wong 提交于
When we're reading or writing the data fork of an inline directory, check the contents to make sure we're not overflowing buffers or eating garbage data. xfs/348 corrupts an inline symlink into an inline directory, triggering a buffer overflow bug. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> --- v2: add more checks consistent with _dir2_sf_check and make the verifier usable from anywhere.
-