- 01 12月, 2008 5 次提交
-
-
由 Dave Chinner 提交于
XFS gets the sign of the error wrong in several places when gathering the error from generic linux functions. These functions return negative error values, while the core XFS code returns positive error values. Hence when XFS inverts the error to be returned to the VFS, it can incorrectly invert a negative error and this error will be ignored by the syscall return. Fix all the problems related to calling filemap_* functions. Problem initially identified by Nick Piggin in xfs_fsync(). Signed-off-by: NDave Chinner <david@fromorbit.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NNiv Sardi <xaiki@sgi.com>
-
由 Christoph Hellwig 提交于
Some recent gcc warnings don't like passing string variables to printf-like functions without using at least a "%s" format string. Change the two occurances of that in xfs to please gcc. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NEric Sandeen <sandeen@sandeen.net> Signed-off-by: NNiv Sardi <xaiki@sgi.com>
-
由 Christoph Hellwig 提交于
Now that we've stopped using the Linux inode cache when can trivally support the inode64 mount option on 32bit architectures. As far as the kernel and most userspace is concerned this works perfectly, but applications still using really old stat and readdir interfaces will get an EOVERFLOW error when hitting an inode number not fitting into 32 bits (that problem of course also exists when using these applications on a 64bit kernel). Note that because inode64 is simply a mount option we can currently mount a filesystem having > 32 bit inode numbers and cause a variety of problems, all this is solved but this patch which enables XFS_BIG_INUMS, even when inode64 is not used. (First sent on October 18th) Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDave Chinner <david@fromorbit.com> Signed-off-by: NNiv Sardi <xaiki@sgi.com>
-
由 Christoph Hellwig 提交于
Currently there's no ->open method set for directories on XFS. That means we don't perform any check for opening too large directories without O_LARGEFILE, we don't check for shut down filesystems, and we don't actually do the readahead for the first block in the directory. Instead of just setting the directories open routine to xfs_file_open we merge the shutdown check directly into xfs_file_open and create a new xfs_dir_open that first calls xfs_file_open and then performs the readahead for block 0. (First sent on September 29th) Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDave Chinner <david@fromorbit.com> Signed-off-by: NNiv Sardi <xaiki@sgi.com>
-
由 Christoph Hellwig 提交于
xfs_log_force_umount may be called very early during log recovery where If we fail a buffer read in xlog_recover_do_inode_trans we abort the mount. But at that point log recovery has started delayed writeback of inode buffers. As part of the aborted mount we try to flush out all delwri buffers, but at that point we have already freed the superblock, and set mp->m_sb_bp to NULL, and xfs_log_force_umount which gets called after the inode buffer writeback trips over it. Make xfs_log_force_umount a little more careful when accessing mp->m_sb_bp to avoid this. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NEric Sandeen <sandeen@sandeen.net> Signed-off-by: NNiv Sardi <xaiki@sgi.com>
-
- 17 11月, 2008 1 次提交
-
-
由 Dave Chinner 提交于
When an I/O error occurs during an intermediate commit on a rolling transaction, xfs_trans_commit() will free the transaction structure and the related ticket. However, the duplicate transaction that gets used as the transaction continues still contains a pointer to the ticket. Hence when the duplicate transaction is cancelled and freed, we free the ticket a second time. Add reference counting to the ticket so that we hold an extra reference to the ticket over the transaction commit. We drop the extra reference once we have checked that the transaction commit did not return an error, thus avoiding a double free on commit error. Credit to Nick Piggin for tripping over the problem. SGI-PV: 989741 Signed-off-by: NDave Chinner <david@fromorbit.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
- 10 11月, 2008 8 次提交
-
-
由 David Chinner 提交于
When we are about to add a new item to a transaction in recovery, we need to check that it is valid first. Currently we just assert that header magic number matches, but in production systems that is not present and we add a corrupted transaction to the list to be processed. This results in a kernel oops later when processing the corrupted transaction. Instead, if we detect a corrupted transaction, abort recovery and leave the user to clean up the mess that has occurred. SGI-PV: 988145 SGI-Modid: xfs-linux-melb:xfs-kern:32356a Signed-off-by: NDavid Chinner <david@fromorbit.com> Signed-off-by: NTim Shimmin <tes@sgi.com> Signed-off-by: NEric Sandeen <sandeen@sandeen.net> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 Dave Chinner 提交于
When there is no memory left in the system, xfs_buf_get_noaddr() can fail. If this happens at mount time during xlog_alloc_log() we fail to catch the error and oops. Catch the error from xfs_buf_get_noaddr(), and allow other memory allocations to fail and catch those errors too. Report the error to the console and fail the mount with ENOMEM. Tested by manually injecting errors into xfs_buf_get_noaddr() and xlog_alloc_log(). Version 2: o remove unnecessary casts of the returned pointer from kmem_zalloc() SGI-PV: 987246 Signed-off-by: NDave Chinner <david@fromorbit.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 David Chinner 提交于
When we create a directory, we reserve a number of blocks for the maximum possible expansion of of the directory due to various btree splits, freespace allocation, etc. Unfortunately, each allocation is not reflected in the total number of blocks still available to the transaction, so the maximal reservation is used over and over again. This leads to problems where an allocation group has only enough blocks for *some* of the allocations required for the directory modification. After the first N allocations, the remaining blocks in the allocation group drops below the total reservation, and subsequent allocations fail because the allocator will not allow the allocation to proceed if the AG does not have the enough blocks available for the entire allocation total. This results in an ENOSPC occurring after an allocation has already occurred. This results in aborting the directory operation (leaving the directory in an inconsistent state) and cancelling a dirty transaction, which results in a filesystem shutdown. Avoid the problem by reflecting the number of blocks allocated in any directory expansion in the total number of blocks available to the modification in progress. This prevents a directory modification from being aborted part way through with an ENOSPC. SGI-PV: 988144 SGI-Modid: xfs-linux-melb:xfs-kern:32340a Signed-off-by: NDavid Chinner <david@fromorbit.com> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 Lachlan McIlroy 提交于
It's possible to have outstanding xfs_ioend_t's queued when the file size is zero. This can happen in the direct I/O path when a direct I/O write fails due to ENOSPC. In this case the xfs_ioend_t will still be queued (ie xfs_end_io_direct() does not know that the I/O failed so can't force the xfs_ioend_t to be flushed synchronously). When we truncate a file on unlink we don't know to wait for these xfs_ioend_ts and we can have a use-after-free situation if the inode is reclaimed before the xfs_ioend_t is finally processed. As was suggested by Dave Chinner lets wait for all I/Os to complete when truncating the file size to zero. SGI-PV: 981668 SGI-Modid: xfs-linux-melb:xfs-kern:32216a Signed-off-by: NLachlan McIlroy <lachlan@sgi.com> Signed-off-by: NChristoph Hellwig <hch@infradead.org>
-
由 Lachlan McIlroy 提交于
Destroying the quota stuff on unmount can access the log - ie XFS_QM_DONE() ends up in xfs_dqunlock() which calls xfs_trans_unlocked_item() and then xfs_log_move_tail(). By this time the log has already been destroyed. Just move the cleanup of the quota code earlier in xfs_unmountfs() before the call to xfs_log_unmount(). Moving XFS_QM_DONE() up near XFS_QM_DQPURGEALL() seems like a good spot. SGI-PV: 987086 SGI-Modid: xfs-linux-melb:xfs-kern:32148a Signed-off-by: NLachlan McIlroy <lachlan@sgi.com> Signed-off-by: NChristoph Hellwig <hch@infradead.org> Signed-off-by: NPeter Leckie <pleckie@sgi.com>
-
由 Dave Chinner 提交于
The radix tree walks in xfs_sync_inodes_ag and xfs_qm_dqrele_all_inodes() can find inodes that are still undergoing initialisation. Avoid them by checking for the the XFS_INEW() flag once we have a reference on the inode. This flag is cleared once the inode is properly initialised. SGI-PV: 987246 Signed-off-by: NDave Chinner <david@fromorbit.com> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 Dave Chinner 提交于
gcc on ARM warns about an using an uninitialised variable in xfs_qm_dqrele_all_inodes(). This is a real bug, but gcc on x86_64 is not reporting this warning so it went unnoticed. Fix the bug by bring the inode radix tree walk code up to date with xfs_sync_inodes_ag(). SGI-PV: 987246 Signed-off-by: NDave Chinner <david@fromorbit.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 Dave Chinner 提交于
When there is no memory left in the system, xfs_buf_get_noaddr() can fail. If this happens at mount time during xlog_alloc_log() we fail to catch the error and oops. Catch the error from xfs_buf_get_noaddr(), and allow other memory allocations to fail and catch those errors too. Report the error to the console and fail the mount with ENOMEM. Tested by manually injecting errors into xfs_buf_get_noaddr() and xlog_alloc_log(). Version 2: o remove unnecessary casts of the returned pointer from kmem_zalloc() SGI-PV: 987246 Signed-off-by: NDave Chinner <david@fromorbit.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
- 31 10月, 2008 1 次提交
-
-
由 David Howells 提交于
Wrap access to task credentials so that they can be separated more easily from the task_struct during the introduction of COW creds. Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id(). Change some task->e?[ug]id to task_e?[ug]id(). In some places it makes more sense to use RCU directly rather than a convenient wrapper; these will be addressed by later patches. Signed-off-by: NDavid Howells <dhowells@redhat.com> Reviewed-by: NJames Morris <jmorris@namei.org> Acked-by: NSerge Hallyn <serue@us.ibm.com>
-
- 30 10月, 2008 25 次提交
-
-
由 David Chinner 提交于
If we get a race looking up a reclaimable inode, we can end up with the winner proceeding to use the inode before it has been completely re-initialised. This is a Bad Thing. Fix the race by checking whether we are still initialising the inod eonce we have a reference to it, and if so wait for the initialisation to complete before continuing. While there, fix a leaked reference count in the same code when encountering an unlinked inode and we are not doing a lookup for a create operation. SGI-PV: 987246 SGI-Modid: xfs-linux-melb:xfs-kern:32429a Signed-off-by: NDavid Chinner <david@fromorbit.com> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 Tim Shimmin 提交于
On Linux all filesystems are supposed to be operating under Posix' restricted chown. Restricted chown means it restricts chown to the owner unless you have CAP_FOWNER. NOTE: that 2 files outside of fs/xfs have been modified too for this change. Reviewed-by: NDave Chinner <david@fromorbit.com> SGI-PV: 988919 SGI-Modid: xfs-linux-melb:xfs-kern:32413a Signed-off-by: NTim Shimmin <tes@sgi.com> Signed-off-by: NChristoph Hellwig <hch@infradead.org> Signed-off-by: NDavid Chinner <david@fromorbit.com> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 Christoph Hellwig 提交于
capable_cred has been unused for a while so we can kill it and sys_cred. That also means the cred argument to xfs_setattr and xfs_change_file_space can be removed now. SGI-PV: 988918 SGI-Modid: xfs-linux-melb:xfs-kern:32412a Signed-off-by: NChristoph Hellwig <hch@infradead.org> Signed-off-by: NTim Shimmin <tes@sgi.com> Signed-off-by: NDavid Chinner <david@fromorbit.com> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 David Chinner 提交于
Under heavy metadata load we are seeing log hangs. The AIL has items in it ready to be pushed, and they are within the push target window. However, we are not pushing them when the last pushed LSN is less than the LSN of the first log item on the AIL. This is a regression introduced by the AIL push cursor modifications. SGI-PV: 987246 SGI-Modid: xfs-linux-melb:xfs-kern:32409a Signed-off-by: NDavid Chinner <david@fromorbit.com> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com> Signed-off-by: NTim Shimmin <tes@sgi.com>
-
由 Christoph Hellwig 提交于
To make sure we free the security data inodes need to be freed using the proper VFS helper (which we also need to export for this). We mark these inodes bad so we can skip the flush path for them. SGI-PV: 987246 SGI-Modid: xfs-linux-melb:xfs-kern:32398a Signed-off-by: NChristoph Hellwig <hch@infradead.org> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com> Signed-off-by: NDavid Chinner <david@fromorbit.com>
-
由 Christoph Hellwig 提交于
xfs_bulkstat only wants the dinode, offset and buffer from a given inode number. Instead of using xfs_itobp on a fake inode which is complicated and currently leads to leaks of the security data just use xfs_inotobp which is designed to do exactly the kind of lookup xfs_bulkstat wants. The only thing that's missing in xfs_inotobp is a flags paramter that let's us pass down XFS_IMAP_BULKSTAT, but that can easily added. SGI-PV: 987246 SGI-Modid: xfs-linux-melb:xfs-kern:32397a Signed-off-by: NChristoph Hellwig <hch@infradead.org> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com> Signed-off-by: NDavid Chinner <david@fromorbit.com>
-
由 David Chinner 提交于
If we are syncing data in xfs_sync_inodes_ag(), the VFS inode must still be referencable as the dirty data state is carried on the VFS inode. hence if we can't get a reference via igrab(), the inode must be in reclaim which implies that it has no dirty data attached. Leave such inodes to the reclaim code to flush the dirty inode state to disk and so avoid attempting to access the VFS inode when it may not exist in xfs_sync_inodes_ag(). Version 4: o don't reference linux inode until after igrab() succeeds Version 3: o converted unlock/rele to an xfs_iput() call. Version 2: o change igrab logic to be more linear o remove initial reclaimable inode check now that we are using igrab() failure to find reclaimable inodes o assert that igrab failure occurs only on reclaimable inodes o clean up inode locking - only grab the iolock if we are doing a SYNC_DELWRI call and we have a dirty inode. SGI-PV: 987246 SGI-Modid: xfs-linux-melb:xfs-kern:32391a Signed-off-by: NDavid Chinner <david@fromorbit.com> Signed-off-by: NChristoph Hellwig <hch@infradead.org> Signed-off-by: NPeter Leckie <pleckie@sgi.com> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 David Chinner 提交于
When we are inside a radix tree preload region, we cannot sleep. Recently we moved the inode locking inside the preload region for the inode radix tree. Fix that, and fix a missed unlock in another error path in the same code at the same time. SGI-PV: 987246 SGI-Modid: xfs-linux-melb:xfs-kern:32385a Signed-off-by: NDavid Chinner <david@fromorbit.com> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com> Signed-off-by: NChristoph Hellwig <hch@infradead.org>
-
由 Christoph Hellwig 提交于
The dp to ip comment should be for the unconditional xfs_droplink call, and the "." link obviously only exists for directories, so it should be in the is_dir conditional. SGI-PV: 987246 SGI-Modid: xfs-linux-melb:xfs-kern:32374a Signed-off-by: NChristoph Hellwig <hch@infradead.org> Signed-off-by: NDonald Douwsma <donaldd@sgi.com> Signed-off-by: NDavid Chinner <david@fromorbit.com> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 Christoph Hellwig 提交于
iosizelog shouldn't be the same as iosize but the logarithm of it. Then again the current biosize option doesn't make much sense to me as it doesn't set the preferred I/O size as mentioned in the comment next to it but rather the allocation size and thus is identical to the allocsize option (except for the missing logarithm). It's also not documented in Documentation/filesystems/xfs.txt or the mount manpage. SGI-PV: 987246 SGI-Modid: xfs-linux-melb:xfs-kern:32373a Signed-off-by: NChristoph Hellwig <hch@infradead.org> Signed-off-by: NDonald Douwsma <donaldd@sgi.com> Signed-off-by: NDavid Chinner <david@fromorbit.com> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 Christoph Hellwig 提交于
Noquota should clear all mount options, and not just user and group quota. Probably doesn't matter very much in real life. SGI-PV: 987246 SGI-Modid: xfs-linux-melb:xfs-kern:32372a Signed-off-by: NChristoph Hellwig <hch@infradead.org> Signed-off-by: NDonald Douwsma <donaldd@sgi.com> Signed-off-by: NDavid Chinner <david@fromorbit.com> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 Christoph Hellwig 提交于
No need to parse the mount option into a structure before applying it to struct xfs_mount. The content of xfs_start_flags gets merged into xfs_parseargs. Calls inbetween don't care and can use mount members instead of the args struct. This patch uncovered that the mount option for shared filesystems wasn't ever exposed on Linux. The code to handle it is #if 0'ed in this patch pending a decision on this feature. I'll send a writeup about it to the list soon. SGI-PV: 987246 SGI-Modid: xfs-linux-melb:xfs-kern:32371a Signed-off-by: NChristoph Hellwig <hch@infradead.org> Signed-off-by: NDonald Douwsma <donaldd@sgi.com> Signed-off-by: NDavid Chinner <david@fromorbit.com> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 David Chinner 提交于
When we are about to add a new item to a transaction in recovery, we need to check that it is valid first. Currently we just assert that header magic number matches, but in production systems that is not present and we add a corrupted transaction to the list to be processed. This results in a kernel oops later when processing the corrupted transaction. Instead, if we detect a corrupted transaction, abort recovery and leave the user to clean up the mess that has occurred. SGI-PV: 988145 SGI-Modid: xfs-linux-melb:xfs-kern:32356a Signed-off-by: NDavid Chinner <david@fromorbit.com> Signed-off-by: NTim Shimmin <tes@sgi.com> Signed-off-by: NEric Sandeen <sandeen@sandeen.net> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 David Chinner 提交于
Change all the remaining AIL API functions that are passed struct xfs_mount pointers to pass pointers directly to the struct xfs_ail being used. With this conversion, all external access to the AIL is via the struct xfs_ail. Hence the operation and referencing of the AIL is almost entirely independent of the xfs_mount that is using it - it is now much more tightly tied to the log and the items it is tracking in the log than it is tied to the xfs_mount. SGI-PV: 988143 SGI-Modid: xfs-linux-melb:xfs-kern:32353a Signed-off-by: NDavid Chinner <david@fromorbit.com> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com> Signed-off-by: NChristoph Hellwig <hch@infradead.org>
-
由 David Chinner 提交于
Add an xfs_ail pointer to log items so that the log items can reference the AIL directly during callbacks without needed a struct xfs_mount. SGI-PV: 988143 SGI-Modid: xfs-linux-melb:xfs-kern:32352a Signed-off-by: NDavid Chinner <david@fromorbit.com> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com> Signed-off-by: NChristoph Hellwig <hch@infradead.org>
-
由 David Chinner 提交于
When we need to go from the log to the AIL, we have to go via the xfs_mount. Add a xfs_ail pointer to the log so we can go directly to the AIL associated with the log. SGI-PV: 988143 SGI-Modid: xfs-linux-melb:xfs-kern:32351a Signed-off-by: NDavid Chinner <david@fromorbit.com> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com> Signed-off-by: NChristoph Hellwig <hch@infradead.org>
-
由 David Chinner 提交于
Bring the ail lock inside the struct xfs_ail. This means the AIL can be entirely manipulated via the struct xfs_ail rather than needing both the struct xfs_mount and the struct xfs_ail. SGI-PV: 988143 SGI-Modid: xfs-linux-melb:xfs-kern:32350a Signed-off-by: NDavid Chinner <david@fromorbit.com> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com> Signed-off-by: NChristoph Hellwig <hch@infradead.org>
-
由 David Chinner 提交于
When copying lsn's from the log item to the inode or dquot flush lsn, we currently grab the AIL lock. We do this because the LSN is a 64 bit quantity and it needs to be read atomically. The lock is used to guarantee atomicity for 32 bit platforms. Make the LSN copying a small function, and make the function used conditional on BITS_PER_LONG so that 64 bit machines don't need to take the AIL lock in these places. SGI-PV: 988143 SGI-Modid: xfs-linux-melb:xfs-kern:32349a Signed-off-by: NDavid Chinner <david@fromorbit.com> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com> Signed-off-by: NChristoph Hellwig <hch@infradead.org>
-
由 David Chinner 提交于
With the new cursor interface, it makes sense to make all the traversing code use the cursor interface and make the old one go away. This means more of the AIL interfacing is done by passing struct xfs_ail pointers around the place instead of struct xfs_mount pointers. We can replace the use of xfs_trans_first_ail() in xfs_log_need_covered() as it is only checking if the AIL is empty. We can do that with a call to xfs_trans_ail_tail() instead, where a zero LSN returned indicates and empty AIL... SGI-PV: 988143 SGI-Modid: xfs-linux-melb:xfs-kern:32348a Signed-off-by: NDavid Chinner <david@fromorbit.com> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com> Signed-off-by: NChristoph Hellwig <hch@infradead.org>
-
由 David Chinner 提交于
To replace the current generation number ensuring sanity of the AIL traversal, replace it with an external cursor that is linked to the AIL. Basically, we store the next item in the cursor whenever we want to drop the AIL lock to do something to the current item. When we regain the lock. the current item may already be free, so we can't reference it, but the next item in the traversal is already held in the cursor. When we move or delete an object, we search all the active cursors and if there is an item match we clear the cursor(s) that point to the object. This forces the traversal to restart transparently. We don't invalidate the cursor on insert because the cursor still points to a valid item. If the intem is inserted between the current item and the cursor it does not matter; the traversal is considered to be past the insertion point so it will be picked up in the next traversal. Hence traversal restarts pretty much disappear altogether with this method of traversal, which should substantially reduce the overhead of pushing on a busy AIL. Version 2 o add restart logic o comment cursor interface o minor cleanups SGI-PV: 988143 SGI-Modid: xfs-linux-melb:xfs-kern:32347a Signed-off-by: NDavid Chinner <david@fromorbit.com> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com> Signed-off-by: NChristoph Hellwig <hch@infradead.org>
-
由 David Chinner 提交于
Rather than embedding the struct xfs_ail in the struct xfs_mount, allocate it during AIL initialisation. Add a back pointer to the struct xfs_ail so that we can pass around the xfs_ail and still be able to access the xfs_mount if need be. This is th first step involved in isolating the AIL implementation from the surrounding filesystem code. SGI-PV: 988143 SGI-Modid: xfs-linux-melb:xfs-kern:32346a Signed-off-by: NDavid Chinner <david@fromorbit.com> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com> Signed-off-by: NChristoph Hellwig <hch@infradead.org>
-
由 David Chinner 提交于
When we create a directory, we reserve a number of blocks for the maximum possible expansion of of the directory due to various btree splits, freespace allocation, etc. Unfortunately, each allocation is not reflected in the total number of blocks still available to the transaction, so the maximal reservation is used over and over again. This leads to problems where an allocation group has only enough blocks for *some* of the allocations required for the directory modification. After the first N allocations, the remaining blocks in the allocation group drops below the total reservation, and subsequent allocations fail because the allocator will not allow the allocation to proceed if the AG does not have the enough blocks available for the entire allocation total. This results in an ENOSPC occurring after an allocation has already occurred. This results in aborting the directory operation (leaving the directory in an inconsistent state) and cancelling a dirty transaction, which results in a filesystem shutdown. Avoid the problem by reflecting the number of blocks allocated in any directory expansion in the total number of blocks available to the modification in progress. This prevents a directory modification from being aborted part way through with an ENOSPC. SGI-PV: 988144 SGI-Modid: xfs-linux-melb:xfs-kern:32340a Signed-off-by: NDavid Chinner <david@fromorbit.com> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
-
由 David Chinner 提交于
If the last block of the AG has inodes in it and the AG is an exactly power-of-2 size then the last inode in the AG points to the last block in the AG. If we try to find the next inode in the AG by adding one to the inode number, we increment the inode number past the size of the AG. The result is that the macro XFS_INO_TO_AGINO() will strip the AG portion of the inode number and return an inode number of zero. That is, instead of terminating the lookup loop because we hit the inode number went outside the valid range for the AG, the search index returns to zero and we start traversing the radix tree from the start again. This results in an endless loop in xfs_sync_inodes_ag(). Fix it be detecting if the new search index decreases as a result of incrementing the current inode number. That indicate an overflow and hence that we have finished processing the AG so we can terminate the loop. SGI-PV: 988142 SGI-Modid: xfs-linux-melb:xfs-kern:32335a Signed-off-by: NDavid Chinner <david@fromorbit.com> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com> Signed-off-by: NChristoph Hellwig <hch@infradead.org>
-
由 David Chinner 提交于
Now that the deleted inodes list is unused, kill it. This also removes the i_reclaim list head from the xfs_inode, shrinking it by two pointers. SGI-PV: 988142 SGI-Modid: xfs-linux-melb:xfs-kern:32334a Signed-off-by: NDavid Chinner <david@fromorbit.com> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com> Signed-off-by: NChristoph Hellwig <hch@infradead.org>
-
由 David Chinner 提交于
Use the reclaim tag to walk the radix tree and find the inodes under reclaim. This was the only user of the deleted inode list. SGI-PV: 988142 SGI-Modid: xfs-linux-melb:xfs-kern:32333a Signed-off-by: NDavid Chinner <david@fromorbit.com> Signed-off-by: NLachlan McIlroy <lachlan@sgi.com> Signed-off-by: NChristoph Hellwig <hch@infradead.org>
-