- 27 4月, 2007 23 次提交
-
-
由 Mark Fasheh 提交于
Return an optional extent flags field from our lookup functions and wire up callers to treat unwritten regions as holes for the purpose of returning zeros to the user. Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Mark Fasheh 提交于
Due to the size of our group bitmaps, we'll never have a leaf node extent record with more than 16 bits worth of clusters. Split e_clusters up so that leaf nodes can get a flags field where we can mark unwritten extents. Interior nodes whose length references all the child nodes beneath it can't split their e_clusters field, so we use a union to preserve sizing there. Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Mark Fasheh 提交于
We need to fill holes during a splice write. Provide our own splice write actor which can call ocfs2_file_buffered_write() with a splice-specific callback. Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Mark Fasheh 提交于
Do this instead of filemap_fdatawrite() - this way we sync only the range between i_size and the cluster boundary. Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Mark Fasheh 提交于
do_sync_file_range() accepts a file * from which it takes an address_space to sync. Abstract out the bulk of the function into do_sync_mapping_range() which takes the address_space directly. This way callers who want to sync an address_space directly can take advantage of the functionality provided. do_sync_file_range() is preserved as a small wrapper around do_sync_mapping_range(). Ocfs2 in particular would like to use this to initiate a sync of a specific inode range during truncate, where a file * may not be available. Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Mark Fasheh 提交于
Since we don't zero on extend anymore, truncate needs to be fixed up to zero the part of a file between i_size and and end of it's cluster. Otherwise a subsequent extend could expose bad data. This introduced a new helper, which can be used in ocfs2_write(). Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Mark Fasheh 提交于
ocfs2_get_block() didn't understand sparse files, fix that. Also remove some code that isn't really useful anymore. We can fix up ocfs2_direct_IO_get_blocks() at the same time. Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Mark Fasheh 提交于
These are no longer used, and can't handle file systems with sparse file allocation. Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Mark Fasheh 提交于
Unfortunately, ocfs2 can no longer make use of generic_file_aio_write_nlock() because allocating writes will require zeroing of pages adjacent to the I/O for cluster sizes greater than page size. Implement a custom file write here, which can order page locks for zeroing. This also has the advantage that cluster locks can easily be ordered outside of the page locks. Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Mark Fasheh 提交于
This will be turned back on once we can do allocation in ->page_mkwrite(). Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Mark Fasheh 提交于
Right now, file allocation for ocfs2 is done within ocfs2_extend_file(), which is either called from ->setattr() (for an i_size change), or at the top of ocfs2_file_aio_write(). Inodes on file systems with sparse file support will want to do their allocation during the actual write call. In either case the cluster locking decisions are the same. We abstract out that code into a new function, ocfs2_lock_allocators() which will be used by a later patch to enable writing to sparse files. This also provides a nice cleanup of ocfs2_extend_allocation(). Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Mark Fasheh 提交于
For ocfs2_truncate_file(), we eliminate the "simple" truncate case which no longer exists since i_size is not tied to i_clusters. In ocfs2_extend_file(), we skip the allocation / page zeroing code for file systems which understand sparse files. The core truncate code is changed to do a bottom up tree traversal. This gets abstracted out into it's own function. To make things more readable, most of the special case handling for in-inode extents from ocfs2_do_truncate() is also removed. Though write support for sparse files comes in a later patch, we at least update ocfs2_prepare_inode_for_write() to skip allocation for sparse files. Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Mark Fasheh 提交于
The code in extent_map.c is not prepared to deal with a subtree being rotated between lookups. This can happen when filling holes in sparse files. Instead of a lengthy patch to update the code (which would likely lose the benefit of caching subtree roots), we remove most of the algorithms and implement a simple path based lookup. A less ambitious extent caching scheme will be added in a later patch. Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Mark Fasheh 提交于
Introduce tree rotations into the b-tree code. This will allow ocfs2 to support sparse files. Much of the added code is designed to be generic (in the ocfs2 sense) so that it can later be re-used to implement large extended attributes. This patch only adds the rotation code and does minimal updates to callers of the extent api. Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Mark Fasheh 提交于
There are two checks in there (one for inode newness, one for other mounted nodes) which are unnecessary, so remove them. The DLM will allow the trylock in either case without any messaging overhead. Removing these makes ocfs2_request_delete() a one liner function, so just move the trylock out one level into ocfs2_query_inode_wipe(). Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Tiger Yang 提交于
Remove node messaging code that becomes unused with the delete inode vote removal. [Removed even more cruft which I spotted during review --Mark] Signed-off-by: NTiger Yang <tiger.yang@oracle.com> Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Tiger Yang 提交于
Ocfs2 currently does cluster-wide node messaging to check the open state of an inode during delete. This patch removes that mechanism in favor of an inode cluster lock which is taken at shared read when an inode is first read and dropped in clear_inode(). This allows a deleting node to test the liveness of an inode by attempting to take an exclusive lock. Signed-off-by: NTiger Yang <tiger.yang@oracle.com> Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Mark Fasheh 提交于
We don't want to print anything at all in ocfs2_lookup() when getting an error from ocfs2_iget() - it could be something as innocuous as a signal being detected in the dlm. ocfs2_permission() should filter on -ENOENT which ocfs2_meta_lock() can return if the inode was deleted on another node. Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Sunil Mushran 提交于
We have noticed panic() hanging leading us to a situation in which the node, while otherwise dead, is still disk heartbeating. This leads to a hung cluster as the other nodes are waiting for this node to stop disk heartbeating. This situation is only resolved by power resetting the box. Signed-off-by: NSunil Mushran <sunil.mushran@oracle.com> Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Sunil Mushran 提交于
Signed-off-by: NSunil Mushran <sunil.mushran@oracle.com> Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Mark Fasheh 提交于
We don't want the extent map and uptodate cache destruction in ocfs2_meta_lock_update() on a local mount, so skip that. This fixes several bugs with uptodate being cleared on buffers and extent maps being corrupted. Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Sunil Mushran 提交于
In dlm_migrate_all_locks(), we currently call cond_resched_lock() after processing each lockres in a hash bucket. Move it outside the loop so as to call it only after the entire hash bucket has been processed. Signed-off-by: NSunil Mushran <sunil.mushran@oracle.com> Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
由 Srinivas Eeda 提交于
There is a possibility that dlm_remaster_locks could overwride node->state with DLM_RECO_NODE_DATA_REQUESTED after dlm_reco_data_done_handler sets the node->state to DLM_RECO_NODE_DATA_DONE. This could lead to recovery getting stuck and requires a cluster reboot. Synchronize with dlm_reco_state_lock spinlock. Signed-off-by: NSrinivas Eeda <srinivas.eeda@oracle.com> Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
-
- 24 4月, 2007 2 次提交
-
-
由 Jeff Mahoney 提交于
The listxattr() and getxattr() operations are only protected by a read lock. As a result, if either of these operations run in parallel, a race condition exists where the xattr_root will end up being cached twice, which results in the leaking of a reference and a BUG() on umount. This patch refactors get_xa_root(), __get_xa_root(), and create_xa_root(), into one get_xa_root() function that takes the appropriate locking around the entire critical section. Reported, diagnosed and tested by Andrea Righi <a.righi@cineca.it> Signed-off-by: NJeff Mahoney <jeffm@suse.com> Cc: Andrea Righi <a.righi@cineca.it> Cc: "Vladimir V. Saveliev" <vs@namesys.com> Cc: Edward Shishkin <edward@namesys.com> Cc: Alex Zarochentsev <zam@namesys.com> Cc: <stable@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Latchesar Ionkov 提交于
v9fs_insert uses v9fs_fid_lookup (which also locks the fid) to get the primary fid associated with the dentry and destroys the v9fs_fid struct after removing the file. If another process called v9fs_fid_lookup on the same dentry, it may wait undefinitely for the fid's lock (as the struct is freed). This patch changes v9fs_remove to use a cloned fid, so the primary fid is not locked and freed. Signed-off-by: NLatchesar Ionkov <lucho@ionkov.net> Cc: Eric Van Hensbergen <ericvh@hera.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 21 4月, 2007 4 次提交
-
-
由 Trond Myklebust 提交于
Protect nfs_set_page_dirty() against races with nfs_inode_add_request. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Trond Myklebust 提交于
Redirtying a request that is already marked for commit will screw up the accounting for NR_UNSTABLE_NFS as well as nfs_i.ncommit. Ensure that all requests on the commit queue are labelled with the PG_NEED_COMMIT flag, and avoid moving them onto the dirty list inside nfs_page_mark_flush(). Also inline nfs_mark_request_dirty() into nfs_page_mark_flush() for atomicity reasons. Avoid dropping the spinlock until we're done marking the request in the radix tree and have added it to the ->dirty list. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Trond Myklebust 提交于
Ensure that we don't release the PG_writeback lock until after the page has either been redirtied, or queued on the nfs_inode 'commit' list. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Trond Myklebust 提交于
Get rid of the inlined #ifdefs. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 18 4月, 2007 2 次提交
-
-
由 Evgeniy Dushistov 提交于
This patch should fix or partly fix this bug: http://bugzilla.kernel.org/show_bug.cgi?id=8276 The problem is: - if we see "zero link case" during reading inode operation, we call ufs_error(which remount fs readonly), but not "mark" inode as bad (1) - in readonly case we do not fill some data structures, which are used in read and write case (2) - VFS call ufs_delete_inode if link count is zero (3) so (1)->(3)->(2) cause oops, this patch should fix such scenario Signed-off-by: NEvgeniy Dushistov <dushistov@mail.ru> Cc: Jim Paris <jim@jtan.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Alan Cox 提交于
The patch checks for "|" in the pattern not the output and doesn't nail a pid on to a piped name (as it is a program name not a file) Also fixes a very very obscure security corner case. If you happen to have decided on a core pattern that starts with the program name then the user can run a program called "|myevilhack" as it stands. I doubt anyone does this. Signed-off-by: NAlan Cox <alan@redhat.com> Confirmed-by: NChristopher S. Aker <caker@theshore.net> Cc: <stable@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 16 4月, 2007 1 次提交
-
-
由 Trond Myklebust 提交于
We must remove the request from whatever list it is currently on before we can add it to the dirty list. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 15 4月, 2007 3 次提交
-
-
由 Trond Myklebust 提交于
If the writebacks are cancelled via nfs_cancel_dirty_list, or due to the memory allocation failing in nfs_flush_one/nfs_flush_multi, then we must ensure that the PG_writeback flag is cleared. Also ensure that we actually own the PG_writeback flag whenever we schedule a new writeback by making nfs_set_page_writeback() return the value of test_set_page_writeback(). The PG_writeback page flag ends up replacing the functionality of the PG_FLUSHING nfs_page flag, so we rip that out too. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Trond Myklebust 提交于
Do not flag an error if the COMMIT call fails and we decide to resend the writes. Let the resend flag the error if it fails. If a write has failed, then nfs_direct_write_result should not attempt to send a commit. It should just exit asap and return the error to the user. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com> Cc: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Trond Myklebust 提交于
It looks like nfs_setattr() and nfs_rename() also need to test whether the target is a regular file before calling nfs_wb_all()... Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 13 4月, 2007 2 次提交
-
-
由 Jeff Mahoney 提交于
Commit f50b6f86 introduced a race in autofs4 between autofs_lookup_unhashed() and autofs_dentry_release(). autofs_dentry_release() ends up clearing the ->dentry and ->inode members of autofs_info before removing it from the rehash list. The list is protected by the rehash lock in both functions, but since autofs_dentry_release() starts tearing the autofs_info struct down before removing it from the list, autofs_lookup_unhashed() can get a autofs_info with a NULL dentry. This patch moves the clearing of ->dentry and ->inode after the removal from the rehash list. Signed-off-by: NJeff Mahoney <jeffm@suse.com> Acked-by: NIan Kent <raven@themaw.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Vladimir Saveliev 提交于
This patch fixes a bug in function decrementing a key of stat data item. Offset of reiserfs keys are compared as signed values. To set key offset to maximal possible value maximal signed value has to be used. This bug is responsible for severe reiserfs filesystem corruption which shows itself as warning vs-13060. reiserfsck fixes this corruption by filesystem tree rebuilding. Signed-off-by: NVladimir Saveliev <vs@namesys.com> Cc: <reiserfs-dev@namesys.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 09 4月, 2007 1 次提交
-
-
由 Timo Savola 提交于
If rootmode isn't valid, we hit the BUG() in fuse_init_inode. Now EINVAL is returned. Signed-off-by: NTimo Savola <tsavola@movial.fi> Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 05 4月, 2007 1 次提交
-
-
由 Andrew Morton 提交于
Revert all this. It can cause device-mapper to receive a different major from earlier kernels and it turns out that the Amanda backup program (via GNU tar, apparently) checks major numbers on files when performing incremental backups. Which is a bit broken of Amanda (or tar), but this feature isn't important enough to justify the churn. Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 03 4月, 2007 1 次提交
-
-
由 Andrew Morton 提交于
Revert b46be050. Same reasoning as for ext3. Cc: Kirill Korotaev <dev@openvz.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Ken Chen <kenneth.w.chen@intel.com> Cc: Andrey Savochkin <saw@sw.ru> Cc: <linux-ext4@vger.kernel.org> Cc: Dmitriy Monakhov <dmonakhov@openvz.org> Cc: <stable@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-