- 29 5月, 2015 1 次提交
-
-
由 Brian Foster 提交于
Inode allocation from sparse inode records must filter the ir_free mask against ir_holemask. In preparation for this requirement, create a helper to allocate an individual inode from an inode record. Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
- 23 2月, 2015 3 次提交
-
-
由 Eric Sandeen 提交于
Today, if we hit an XFS_WANT_CORRUPTED_RETURN we don't print any information about which filesystem hit it. Passing in the mp allows us to print the filesystem (device) name, which is a pretty critical piece of information. Tested by running fsfuzzer 'til I hit some. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
由 Eric Sandeen 提交于
Today, if we hit an XFS_WANT_CORRUPTED_GOTO we don't print any information about which filesystem hit it. Passing in the mp allows us to print the filesystem (device) name, which is a pretty critical piece of information. Tested by running fsfuzzer 'til I hit some. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
由 Dave Chinner 提交于
XFS has hand-rolled per-cpu counters for the superblock since before there was any generic implementation. There are some warts around the use of them for the inode counter as the hand rolled counter is designed to be accurate at zero, but has no specific accurracy at any other value. This design causes problems for the maximum inode count threshold enforcement, as there is no trigger that balances the counters as they get close tothe maximum threshold. Instead of designing new triggers for balancing, just replace the handrolled per-cpu counter with a generic counter. This enables us to update the counter through the normal superblock modification funtions, but rather than do that we add a xfs_mod_icount() helper function (from Christoph Hellwig) and keep the percpu counter outside the superblock in the struct xfs_mount. This means we still need to initialise the per-cpu counter specifically when we read the superblock, and vice versa when we log/write it, but it does mean that we don't need to change any other code. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
- 04 12月, 2014 1 次提交
-
-
由 Dave Chinner 提交于
After growing a filesystem, XFS can fail to allocate inodes even though there is a large amount of space available in the filesystem for inodes. The issue is caused by a nearly full allocation group having enough free space in it to be considered for inode allocation, but not enough contiguous free space to actually allocation inodes. This situation results in successful selection of the AG for allocation, then failure of the allocation resulting in ENOSPC being reported to the caller. It is caused by two possible issues. Firstly, we only consider the lognest free extent and whether it would fit an inode chunk. If the extent is not correctly aligned, then we can't allocate an inode chunk in it regardless of the fact that it is large enough. This tends to be a permanent error until space in the AG is freed. The second issue is that we don't actually lock the AGI or AGF when we are doing these checks, and so by the time we get to actually allocating the inode chunk the space we thought we had in the AG may have been allocated. This tends to be a spurious error as it requires a race to trigger. Hence this case is ignored in this patch as the reported problem is for permanent errors. The first issue could be addressed by simply taking into account the alignment when checking the longest extent. This, however, would prevent allocation in AGs that have aligned, exact sized extents free. However, this case should be fairly rare compared to the number of allocations that occur near ENOSPC that would trigger this condition. Hence, when selecting the inode AG, take into account the inode cluster alignment when checking the lognest free extent in the AG. If we can't find any AGs with a contiguous free space large enough to be aligned, drop the alignment addition and just try for an AG that has enough contiguous free space available for an inode chunk. This won't prevent issues from occurring, but should avoid situations where other AGs have lots of free space but the selected AG can't allocate due to alignment constraints. Reported-by: NArkadiusz Miskiewicz <arekm@maven.pl> Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
- 01 12月, 2014 1 次提交
-
-
由 kbuild test robot 提交于
fs/xfs/libxfs/xfs_ialloc.c:1141:1-6: WARNING: end returns can be simpified Simplify a trivial if-return sequence. Possibly combine with a preceding function call. Generated by: scripts/coccinelle/misc/simple_return.cocci Signed-off-by: NFengguang Wu <fengguang.wu@intel.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
- 28 11月, 2014 3 次提交
-
-
由 Christoph Hellwig 提交于
Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
由 Christoph Hellwig 提交于
More on-disk format consolidation. A few declarations that weren't on-disk format related move into better suitable spots. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
由 Christoph Hellwig 提交于
More consolidatation for the on-disk format defintions. Note that the XFS_IS_REALTIME_INODE moves to xfs_linux.h instead as it is not related to the on disk format, but depends on a CONFIG_ option. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
- 29 9月, 2014 1 次提交
-
-
由 Dave Chinner 提交于
Sparse warns that we are passing the big-endian valueo f agi_newino to the initial btree lookup function when trying to find a new inode. This is wrong - we need to pass the host order value, not the disk order value. This will adversely affect the next inode allocated, but given that the free inode btree is usually much smaller than the allocated inode btree it is much less likely to be a performance issue if we start the search in the wrong place. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
- 09 9月, 2014 1 次提交
-
-
由 Eric Sandeen 提交于
These were exposed by fsfuzzer runs; without them we fail in various exciting and sometimes convoluted ways when we encounter disk corruption. Without the MAXLEVELS tests we tend to walk off the end of an array in a loop like this: for (i = 0; i < cur->bc_nlevels; i++) { if (cur->bc_bufs[i]) Without the dirblklog test we try to allocate more memory than we could possibly hope for and loop forever: xfs_dabuf_map() nfsb = mp->m_dir_geo->fsbcount; irecs = kmem_zalloc(sizeof(irec) * nfsb, KM_SLEEP... As for the logbsize check, that's the convoluted one. If logbsize is specified at mount time, it's sanitized in xfs_parseargs; in particular it makes sure that it's not > XLOG_MAX_RECORD_BSIZE. If not specified at mount time, it comes from the superblock via sb_logsunit; this is limited to 256k at mkfs time as well; it's copied into m_logbsize in xfs_finish_flags(). However, if for some reason the on-disk value is corrupt and too large, nothing catches it. It's a circuitous path, but that size eventually finds its way to places that make the kernel very unhappy, leading to oopses in xlog_pack_data() because we use the size as an index into iclog->ic_data, but the array is not necessarily that big. Anyway - bounds checking when we read from disk is a good thing! Signed-off-by: NEric Sandeen <sandeen@redhat.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
- 25 6月, 2014 2 次提交
-
-
由 Dave Chinner 提交于
Convert all the errors the core XFs code to negative error signs like the rest of the kernel and remove all the sign conversion we do in the interface layers. Errors for conversion (and comparison) found via searches like: $ git grep " E" fs/xfs $ git grep "return E" fs/xfs $ git grep " E[A-Z].*;$" fs/xfs Negation points found via searches like: $ git grep "= -[a-z,A-Z]" fs/xfs $ git grep "return -[a-z,A-D,F-Z]" fs/xfs $ git grep " -[a-z].*;" fs/xfs [ with some bits I missed from Brian Foster ] Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
由 Dave Chinner 提交于
Move all the source files that are shared with userspace into libxfs/. This is done as one big chunk simpy to get it done quickly Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
- 22 6月, 2014 1 次提交
-
-
由 Eric Sandeen 提交于
XFS_ERROR was designed long ago to trap return values, but it's not runtime configurable, it's not consistently used, and we can do similar error trapping with ftrace scripts and triggers from userspace. Just nuke XFS_ERROR and associated bits. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
- 06 6月, 2014 1 次提交
-
-
由 Dave Chinner 提交于
Most of the callers are just calling ASSERT(!xfs_buf_geterror()) which means they are checking for bp->b_error == 0. If bp is null in this case, we will assert fail, and hence it's no different in result to oopsing because of a null bp. In some cases, errors have already been checked for or the function returning the buffer can't return a buffer with an error, so it's just a redundant assert. Either way, the assert can either be removed. The other two non-assert callers can just test for a buffer and error properly. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
- 20 5月, 2014 2 次提交
-
-
由 Roger Willcocks 提交于
xfs_ialloc.h:102: error: expected ',' or '...' before 'delete' Simple parameter rename, no changes to behaviour. Signed-off-by: NRoger Willcocks <roger@filmlight.ltd.uk> Reviewed-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
由 Dave Chinner 提交于
mkfs has turned on the XFS_SB_VERSION_NLINKBIT feature bit by default since November 2007. It's about time we simply made the kernel code turn it on by default and so always convert v1 inodes to v2 inodes when reading them in from disk or allocating them. This This removes needless version checks and modification when bumping link counts on inodes, and will take code out of a few common code paths. text data bss dec hex filename 783251 100867 616 884734 d7ffe fs/xfs/xfs.o.orig 782664 100867 616 884147 d7db3 fs/xfs/xfs.o.patched Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
- 24 4月, 2014 6 次提交
-
-
由 Brian Foster 提交于
An inode free operation can have several effects on the finobt. If all inodes have been freed and the chunk deallocated, we remove the finobt record. If the inode chunk was previously full, we must insert a new record based on the existing inobt record. Otherwise, we modify the record in place. Create the xfs_difree_finobt() function to identify the potential scenarios and update the finobt appropriately. Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
由 Brian Foster 提交于
Refactor xfs_difree() in preparation for the finobt. xfs_difree() performs the validity checks against the ag and reads the agi header. The work of physically updating the inode allocation btree is pushed down into the new xfs_difree_inobt() helper. Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
由 Brian Foster 提交于
Replace xfs_dialloc_ag() with an implementation that looks for a record in the finobt. The finobt only tracks records with at least one free inode. This eliminates the need for the intra-ag scan in the original algorithm. Once the inode is allocated, update the finobt appropriately (possibly removing the record) as well as the inobt. Move the original xfs_dialloc_ag() algorithm to xfs_dialloc_ag_inobt() and fall back as such if finobt support is not enabled. Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
由 Brian Foster 提交于
A newly allocated inode chunk, by definition, has at least one free inode, so a record is always inserted into the finobt. Create the xfs_inobt_insert() helper from existing code to insert a record in an inobt based on the provided BTNUM. Update xfs_ialloc_ag_alloc() to invoke the helper for the existing XFS_BTNUM_INO tree and XFS_BTNUM_FINO tree, if enabled. Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
由 Brian Foster 提交于
Define the AGI fields for the finobt root/level and add magic numbers. Update the btree code to add support for the new XFS_BTNUM_FINOBT inode btree. The finobt root block is reserved immediately following the inobt root block in the AG. Update XFS_PREALLOC_BLOCKS() to determine the starting AG data block based on whether finobt support is enabled. Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
由 Brian Foster 提交于
The introduction of the free inode btree (finobt) requires that xfs_ialloc_btree.c handle multiple trees. Refactor xfs_ialloc_btree.c so the caller specifies the btree type on cursor initialization to prepare for addition of the finobt. Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NDave Chinner <david@fromorbit.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
- 07 3月, 2014 1 次提交
-
-
由 Brian Foster 提交于
The inode chunk allocation path can lead to deadlock conditions if a transaction is dirtied with an AGF (to fix up the freelist) for an AG that cannot satisfy the actual allocation request. This code path is written to try and avoid this scenario, but it can be reproduced by running xfstests generic/270 in a loop on a 512b fs. An example situation is: - process A attempts an inode allocation on AG 3, modifies the freelist, fails the allocation and ultimately moves on to AG 0 with the AG 3 AGF held - process B is doing a free space operation (i.e., truncate) and acquires the AG 0 AGF, waits on the AG 3 AGF - process A acquires the AG 0 AGI, waits on the AG 0 AGF (deadlock) The problem here is that process A acquired the AG 3 AGF while moving on to AG 0 (and releasing the AG 3 AGI with the AG 3 AGF held). xfs_dialloc() makes one pass through each of the AGs when attempting to allocate an inode chunk. The expectation is a clean transaction if a particular AG cannot satisfy the allocation request. xfs_ialloc_ag_alloc() is written to support this through use of the minalignslop allocation args field. When using the agi->agi_newino optimization, we attempt an exact bno allocation request based on the location of the previously allocated chunk. minalignslop is set to inform the allocator that we will require alignment on this chunk, and thus to not allow the request for this AG if the extra space is not available. Suppose that the AG in question has just enough space for this request, but not at the requested bno. xfs_alloc_fix_freelist() will proceed as normal as it determines the request should succeed, and thus it is allowed to modify the agf. xfs_alloc_ag_vextent() ultimately fails because the requested bno is not available. In response, the caller moves on to a NEAR_BNO allocation request for the same AG. The alignment is set, but the minalignslop field is never reset. This increases the overall requirement of the request from the first attempt. If this delta is the difference between allocation success and failure for the AG, xfs_alloc_fix_freelist() rejects this request outright the second time around and causes the allocation request to unnecessarily fail for this AG. To address this situation, reset the minalignslop field immediately after use and prevent it from leaking into subsequent requests. Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NMark Tinguely <tinguely@sgi.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
- 27 2月, 2014 4 次提交
-
-
由 Eric Sandeen 提交于
Modify all read & write verifiers to differentiate between CRC errors and other inconsistencies. This sets the appropriate error number on bp->b_error, and then calls xfs_verifier_error() if something went wrong. That function will issue the appropriate message to the user. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
由 Eric Sandeen 提交于
Many/most callers of xfs_update_cksum() pass bp->b_addr and BBTOB(bp->b_length) as the first 2 args. Add a helper which can just accept the bp and the crc offset, and work it out on its own, for brevity. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
由 Eric Sandeen 提交于
Many/most callers of xfs_verify_cksum() pass bp->b_addr and BBTOB(bp->b_length) as the first 2 args. Add a helper which can just accept the bp and the crc offset, and work it out on its own, for brevity. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
由 Eric Sandeen 提交于
Some calls to crc functions used useful #defines, others used awkward offsetof() constructs. Switch them all to #define to make things a bit cleaner. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
- 13 12月, 2013 5 次提交
-
-
由 Jie Liu 提交于
Use xfs_icluster_size_fsb() in xfs_imap(). Signed-off-by: NJie Liu <jeff.liu@oracle.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
由 Jie Liu 提交于
Use xfs_icluster_size_fsb() in xfs_ialloc_inode_init(), rename variable ninodes to inodes_per_cluster, the latter is more meaningful. Signed-off-by: NJie Liu <jeff.liu@oracle.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
由 Jie Liu 提交于
Get rid of XFS_IALLOC_BLOCKS() marcos, use mp->m_ialloc_blks directly. Signed-off-by: NJie Liu <jeff.liu@oracle.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
由 Jie Liu 提交于
Get rid of XFS_INODE_CLUSTER_SIZE() macros, use mp->m_inode_cluster_size directly. Signed-off-by: NJie Liu <jeff.liu@oracle.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
由 Jie Liu 提交于
Get rid of XFS_IALLOC_INODES() marcos, use mp->m_ialloc_inos directly. Signed-off-by: NJie Liu <jeff.liu@oracle.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
- 12 12月, 2013 1 次提交
-
-
由 Ben Myers 提交于
rec.ir_startino is an agino rather than an ino. Use the correct macro when dealing with it in xfs_difree. Signed-off-by: NBen Myers <bpm@sgi.com> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
- 07 11月, 2013 1 次提交
-
-
由 Dave Chinner 提交于
To help track down AGI/AGF lock ordering issues, I added these tracepoints to tell us when an AGI or AGF is read and locked. With these we can now determine if the lock ordering goes wrong from tracing captures. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NBen Myers <bpm@sgi.com>
-
- 24 10月, 2013 3 次提交
-
-
由 Dave Chinner 提交于
Currently the xfs_inode.h header has a dependency on the definition of the BMAP btree records as the inode fork includes an array of xfs_bmbt_rec_host_t objects in it's definition. Move all the btree format definitions from xfs_btree.h, xfs_bmap_btree.h, xfs_alloc_btree.h and xfs_ialloc_btree.h to xfs_format.h to continue the process of centralising the on-disk format definitions. With this done, the xfs inode definitions are no longer dependent on btree header files. The enables a massive culling of unnecessary includes, with close to 200 #include directives removed from the XFS kernel code base. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBen Myers <bpm@sgi.com> Signed-off-by: NBen Myers <bpm@sgi.com>
-
由 Dave Chinner 提交于
xfs_trans.h has a dependency on xfs_log.h for a couple of structures. Most code that does transactions doesn't need to know anything about the log, but this dependency means that they have to include xfs_log.h. Decouple the xfs_trans.h and xfs_log.h header files and clean up the includes to be in dependency order. In doing this, remove the direct include of xfs_trans_reserve.h from xfs_trans.h so that we remove the dependency between xfs_trans.h and xfs_mount.h. Hence the xfs_trans.h include can be moved to the indicate the actual dependencies other header files have on it. Note that these are kernel only header files, so this does not translate to any userspace changes at all. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBen Myers <bpm@sgi.com> Signed-off-by: NBen Myers <bpm@sgi.com>
-
由 Dave Chinner 提交于
All of the buffer operations structures are needed to be exported for xfs_db, so move them all to a common location rather than spreading them all over the place. They are verifying the on-disk format, so while xfs_format.h might be a good place, it is not part of the on disk format. Hence we need to create a new header file that we centralise these related definitions. Start by moving the bffer operations structures, and then also move all the other definitions that have crept into xfs_log_format.h and xfs_format.h as there was no other shared header file to put them in. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NBen Myers <bpm@sgi.com>
-
- 31 8月, 2013 1 次提交
-
-
由 Brian Foster 提交于
The call to xfs_inobt_get_rec() in xfs_dialloc_ag() passes 'j' as the output status variable. The immediately following XFS_WANT_CORRUPTED_GOTO() checks the value of 'i,' which is from the previous lookup call and has already been checked. Fix the corruption check to use 'j.' Signed-off-by: NBrian Foster <bfoster@redhat.com> Reviewed-by: NEric Sandeen <sandeen@redhat.com> Reviewed-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NBen Myers <bpm@sgi.com>
-
- 21 8月, 2013 1 次提交
-
-
由 Zhi Yong Wu 提交于
Signed-off-by: NZhi Yong Wu <wuzhy@linux.vnet.ibm.com> Reviewed-by: NMark Tinguely <tinguely@sgi.com> Signed-off-by: NBen Myers <bpm@sgi.com>
-