- 25 Jul 2011, 5 commits
-
-
Committed by Sunil Mushran
dlm_finish_local_lockres_recovery() needed a facelift. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
-
Committed by Sunil Mushran
o2cb messages needed a facelift. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
-
Committed by Sunil Mushran
o2dlm messages needed a facelift. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
-
Committed by Sunil Mushran
o2net messages needed a facelift. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
-
Committed by Sunil Mushran
Currently if the heartbeat device is hard-ro, the o2hb thread keeps chugging along and dumping errors along the way, and the user needs to manually stop the heartbeat. The patch addresses this shortcoming by adding a limit to the number of times the hb thread will iterate in an unsteady state. If the hb thread does not reach steady state within that many iterations, the start is aborted. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
-
- 21 Jul 2011, 5 commits
-
-
Committed by Josef Bacik
Btrfs needs to be able to control how filemap_write_and_wait_range() is called in fsync to make it less of a painful operation, so push the taking of i_mutex and the call to filemap_write_and_wait() down into the ->fsync() handlers. Some file systems can drop taking i_mutex altogether, it seems, like ext3 and ocfs2. For correctness' sake I just pushed everything down in all cases to make sure that we keep the current behavior the same for everybody, and then each individual fs maintainer can make up their mind about what to do from there. Thanks, Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
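For illustration, a minimal sketch (not taken from the patch) of what an individual ->fsync() handler might look like once the data flush and i_mutex are its own responsibility; "examplefs" and its metadata-flush step are hypothetical, and the (file, start, end, datasync) prototype is only assumed from the mention of filemap_write_and_wait_range():

/*
 * Minimal sketch of a post-pushdown ->fsync() handler.  The filesystem
 * name and its journal-flush step are hypothetical.
 */
static int examplefs_fsync(struct file *file, loff_t start, loff_t end,
                           int datasync)
{
        struct inode *inode = file->f_mapping->host;
        int ret;

        /* Previously the VFS did this before calling ->fsync(). */
        ret = filemap_write_and_wait_range(file->f_mapping, start, end);
        if (ret)
                return ret;

        mutex_lock(&inode->i_mutex);
        /* ... flush this filesystem's own metadata / commit its journal ... */
        mutex_unlock(&inode->i_mutex);

        return ret;
}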
-
Committed by Christoph Hellwig
For filesystems that delay their end_io processing we should keep our i_dio_count until the processing is done. Enable this by moving the inode_dio_done call to the end_io handler if one exists. Note that the actual move to the workqueue for ext4 and XFS is not done in this patch yet, but left to the filesystem maintainers. At least for XFS it's not needed yet either, as XFS has an internal equivalent to i_dio_count. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Christoph Hellwig
Maintain i_dio_count for all filesystems, not just those using DIO_LOCKING. This allows these filesystems to also protect truncate against direct I/O requests by using the common code. Right now the only non-DIO_LOCKING filesystem that appears to do so is XFS, which uses an opencoded variant of the i_dio_count scheme. Behaviour doesn't change for filesystems never calling inode_dio_wait. For ext4 behaviour changes when using the dioread_nolock option, which previously was missing any protection between truncate and direct I/O reads. For ocfs2 the handcrafted i_dio_count manipulations are replaced with the common code now enabled. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Christoph Hellwig
Let filesystems handle waiting for direct I/O requests themselves instead of doing it beforehand. This means filesystem-specific locks to prevent new dio references from appearing can be held. This is important to allow generalizing i_dio_count to non-DIO_LOCKING filesystems. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
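A rough sketch of where that waiting might now live, for a hypothetical filesystem's truncate path; only inode_dio_wait() and truncate_setsize() are real kernel helpers here, the rest is illustrative:

/*
 * Hypothetical truncate path: the filesystem itself waits for in-flight
 * direct I/O while holding the lock that keeps new dio from starting.
 */
static int examplefs_truncate(struct inode *inode, loff_t newsize)
{
        /* caller (the VFS setattr path) already holds i_mutex, which is
         * what prevents new direct I/O requests from starting */
        inode_dio_wait(inode);          /* wait for i_dio_count to drain */

        truncate_setsize(inode, newsize);
        /* ... release the truncated blocks ... */
        return 0;
}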
-
Committed by Christoph Hellwig
i_alloc_sem is a rather special rw_semaphore. It's the last one that may be released by a non-owner, and its write side is always mirrored by real exclusion. Its intended use is to wait for all pending direct I/O requests to finish before starting a truncate. Replace it with a hand-grown construct:
- exclusion for truncates is already guaranteed by i_mutex, so it can simply fall away
- the reader side is replaced by an i_dio_count member in struct inode that counts the number of pending direct I/O requests. Truncate can't proceed as long as it's non-zero
- when i_dio_count reaches zero we wake up a pending truncate using wake_up_bit on a new bit in i_flags
- new references to i_dio_count can't appear while we are waiting for it to read zero, because the direct I/O code always needs i_mutex (or an equivalent like XFS's i_iolock) for starting a new operation.
This scheme is much simpler, and saves the space of a spinlock_t and a struct list_head in struct inode (typically 160 bits on a non-debug 64-bit system). Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
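The general shape of that construct, as a sketch rather than the kernel's actual implementation; the dedicated wait queue below is hypothetical, since the real code uses wake_up_bit() on a flag bit precisely to avoid carrying one:

/*
 * Sketch of the counter-plus-wakeup scheme.  dio_wq is a hypothetical
 * stand-in for the flag-bit wakeup described in the message.
 */
struct example_inode {
        atomic_t                dio_count;      /* pending direct I/O requests */
        wait_queue_head_t       dio_wq;         /* init_waitqueue_head() at setup */
};

static inline void example_dio_begin(struct example_inode *ei)
{
        atomic_inc(&ei->dio_count);     /* caller holds i_mutex or equivalent */
}

static inline void example_dio_done(struct example_inode *ei)
{
        if (atomic_dec_and_test(&ei->dio_count))
                wake_up(&ei->dio_wq);   /* let a pending truncate continue */
}

/* Called by truncate with i_mutex held, so no new requests can begin. */
static inline void example_dio_wait(struct example_inode *ei)
{
        wait_event(ei->dio_wq, atomic_read(&ei->dio_count) == 0);
}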
-
- 20 Jul 2011, 6 commits
-
-
Committed by Al Viro
A combination of kern_path_parent() and lookup_create(). Does *not* expose struct nameidata to the caller. Syscalls converted to that... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Al Viro
Not used by the instances anymore. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Al Viro
Redundant; all callers get it duplicated in mask & MAY_NOT_BLOCK and none of them removes that bit. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Al Viro
Not used in the instances anymore. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Al Viro
Its value depends only on the inode and does not change; we might as well store it in ->i_op->check_acl and be done with that. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
- 04 Jun 2011, 1 commit
-
-
Committed by Al Viro
Caching "we have already removed suid/caps" was overenthusiastic as merged. On network filesystems we might have had suid/caps set on another client, silently picked up by this client on revalidate, all of that *without* clearing the S_NOSEC flag. AFAICS, the only reasonably sane way to deal with that is:
* a new superblock flag; unless it is set, S_NOSEC is not going to be set.
* local block filesystems set it in their ->mount() (more accurately, mount_bdev() does, so does btrfs ->mount(); users of mount_bdev() other than local block ones clear it)
* if any network filesystem (or a cluster one) wants to use S_NOSEC, it'll need to set MS_NOSEC in sb->s_flags *AND* take care to clear S_NOSEC when inode attribute changes are picked up from other clients.
It's not an earth-shattering hole (anybody that can set suid on another client will almost certainly be able to write to the file before doing that anyway), but it's a bug that needs fixing. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
- 27 May 2011, 3 commits
-
-
Committed by Tristan Ye
Though the goal_to_be_moved will be validated again during the following move, it's still a good idea to validate it right after the adjustment at the very beginning, instead of validating it before the adjustment. Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
-
Committed by Tristan Ye
It's not wise to do a 64-bit division anywhere on the kernel side; replace it with a decent helper or proper shifts. Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
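For illustration only (the snippet is not from the patch, though the field names follow ocfs2 convention), the two usual alternatives to a bare 64-bit division:

/*
 * A plain '/' on a u64 pulls in __udivdi3 and breaks the link on 32-bit
 * kernels, so either use the division helpers or, for a power-of-two
 * divisor such as the cluster size, shift instead.
 */
static u64 example_bytes_to_clusters(struct ocfs2_super *osb, u64 nr_bytes)
{
        return div_u64(nr_bytes, osb->s_clustersize);   /* generic helper */
        /* equivalently, since the cluster size is a power of two:
         * return nr_bytes >> osb->s_clustersize_bits;
         */
}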
-
Committed by Dan Magenheimer
This eighth patch of eight in this cleancache series "opts in" cleancache for ocfs2. Clustered filesystems must explicitly enable cleancache by calling cleancache_init_shared_fs anytime an instance of the filesystem is mounted. Ocfs2 is currently the only user of the clustered filesystem interface, but nevertheless the cleancache hooks in the VFS layer are sufficient for ocfs2, including the matching cleancache_flush_fs hook which must be called on unmount. Details and a FAQ can be found in Documentation/vm/cleancache.txt [v8: trivial merge conflict update] [v5: jeremy@goop.org: simplify init hook and any future fs init changes] Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com> Reviewed-by: Jeremy Fitzhardinge <jeremy@goop.org> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Rik Van Riel <riel@redhat.com> Cc: Jan Beulich <JBeulich@novell.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Andreas Dilger <adilger@sun.com> Cc: Ted Tso <tytso@mit.edu> Cc: Nitin Gupta <ngupta@vflare.org>
-
- 26 May 2011, 6 commits
-
-
Committed by Sage Weil
Ocfs2 has no issues with lingering references to unlinked directory inodes. CC: Mark Fasheh <mfasheh@suse.com> CC: ocfs2-devel@oss.oracle.com Acked-by: Joel Becker <jlbec@evilplan.org> Signed-off-by: Sage Weil <sage@newdream.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Sage Weil
Only a few file systems need this. Start by pushing it down into each rename method (except gfs2 and xfs) so that it can be dealt with on a per-fs basis. Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sage Weil <sage@newdream.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Sage Weil
Only a few file systems need this. Start by pushing it down into each fs rmdir method (except gfs2 and xfs) so it can be dealt with on a per-fs basis. This does not change behavior for any in-tree file systems. Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sage Weil <sage@newdream.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Tristan Ye
Oops, the local-mount case ('ocfs2_fops_no_plocks') is just missing support for unwritten extents/punching holes because no function pointer was correctly assigned to the '.fallocate' field. Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
-
Committed by Sunil Mushran
During dlm domain shutdown, o2dlm has to free all the lock resources. Ones that have no locks and references are freed. Ones that have locks and/or references are migrated to another node. The first task in migration is finding a target. Currently we scan the lock resource and find one node that either has a lock or a reference. This is not very efficient in a parallel umount case, as we might end up migrating the lock resource to a node which itself may have to migrate it to a third node. The patch scans the dlm->exit_domain_map to ensure the target node is not leaving the domain. If no valid target node is found, o2dlm does not migrate the resource but instead waits for the unlock and deref messages that will allow it to free the resource. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <jlbec@evilplan.org>
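An illustrative sketch, not the patch itself, of the shape of that target selection; the real code also walks the lock lists, while this simplified helper only checks the refmap, and its name is made up:

static u8 example_pick_migration_target(struct dlm_ctxt *dlm,
                                        struct dlm_lock_resource *res)
{
        unsigned int node;

        for (node = 0; node < O2NM_MAX_NODES; node++) {
                if (node == dlm->node_num)
                        continue;
                /* node already sent DLM_BEGIN_EXIT_DOMAIN_MSG, skip it */
                if (test_bit(node, dlm->exit_domain_map))
                        continue;
                if (test_bit(node, res->refmap))
                        return node;    /* node holds a reference */
        }
        return O2NM_MAX_NODES;  /* no target; wait for unlock/deref instead */
}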
-
Committed by Sunil Mushran
This patch adds a new dlm message DLM_BEGIN_EXIT_DOMAIN_MSG and ups the dlm protocol to 1.2. o2dlm sends this new message in dlm_unregister_domain() to mark the beginning of the exit domain. This message is sent to all nodes in the domain. Currently o2dlm has no way of informing other nodes of its impending exit. This information is useful as the other nodes could disregard the exiting node in certain operations, for example in resource migration. If two or more nodes were umounting in parallel, it would be more efficient if o2dlm were to choose a non-exiting node to be the new master node rather than an exiting one. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Reviewed-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <jlbec@evilplan.org>
-
- 25 May 2011, 14 commits
-
-
Committed by Tristan Ye
The threshold should be greater than the clustersize and less than i_size. Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
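A tiny sketch of that constraint (the helper and variable names are illustrative, not the patch's interface):

static int example_validate_threshold(struct ocfs2_super *osb,
                                      struct inode *inode, u64 threshold)
{
        if (threshold <= osb->s_clustersize)
                return -EINVAL;         /* must exceed the cluster size */
        if (threshold >= i_size_read(inode))
                return -EINVAL;         /* and stay below the file size */
        return 0;
}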
-
Committed by Tristan Ye
We're going to support partial extent moving, which may split an entire extent movement into pieces to cope with insufficient allocations. It eases the 'ENOSPC' pain and makes the whole move much less likely to fail; the downside is that it may leave the fs even more fragmented than before the move, so just let userspace make the trade-off here. Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
-
Committed by Tristan Ye
The basic logic of moving extents for a file is pretty much like the punching-hole sequence: walk the extents within the range the user specified, calculate an appropriate length to defrag/move, then let ocfs2_defrag/move_extent() do the actual moving. This func ends up reporting 'OCFS2_MOVE_EXT_FL_COMPLETE' to userspace if the operation gets done successfully. Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
-
Committed by Tristan Ye
The helper calculates the defrag length in one run according to a threshold; it will proceed with defragmentation until the threshold is met, and skip a LARGE extent if any. Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
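A hedged sketch of the idea; the helper and its arguments are hypothetical rather than the patch's interface:

/*
 * Pick how many clusters of the current extent to defragment, stopping
 * once 'thresh' clusters have been handled in this pass and leaving
 * already-large extents alone.
 */
static u32 example_defrag_len(u32 extent_clusters, u32 done, u32 thresh)
{
        if (done >= thresh)
                return 0;               /* threshold already met */
        if (extent_clusters >= thresh)
                return 0;               /* skip a LARGE extent */
        return min(extent_clusters, thresh - done);
}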
-
Committed by Tristan Ye
ocfs2_move_extent() logic will validate the goal_offset_in_block where extents are to be moved; what's more, it also compromises a bit by probing an appropriate region around the given goal_offset when the original goal is not able to fit the movement. Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
-
Committed by Tristan Ye
These helpers were actually borrowed from alloc.c and may be publicized later. Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
-
Committed by Tristan Ye
Before doing the movement of extents, we'd better probe the alloc group from 'goal_blk' to search for a contiguous region that fits the wanted movement; we will even make a best-effort try by compromising to a threshold around the given goal. Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
-
Committed by Tristan Ye
First best-effort attempt to validate and adjust the goal (physical address in blocks), though it can't guarantee later operations will succeed all the time, since the global_bitmap may change a bit over time. Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
-
Committed by Tristan Ye
This function tries to locate the right alloc group where a given physical block resides; by passing the block number, it returns to the caller a buffer_head of the victim group descriptor and also the offset of the block within this group. Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
-
Committed by Tristan Ye
It's a relatively complete function to accomplish defragmentation for an entire or partial extent; one journal handle is kept during the operation, and it logically does one more thing than ocfs2_move_extent() actually does, yes, it claims the new clusters itself;-) Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
-
Committed by Tristan Ye
The moving range of __ocfs2_move_extent() is always within one extent. It consists of the following parts:
1. Duplicate the clusters in pages to new_blkoffset, where the extent is to be moved.
2. Split the original extent with the new extent, and coalesce the nearby extents if possible.
3. Append the old clusters to the truncate log, or decrease_refcount if the extent was refcounted.
Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
-
Committed by Tristan Ye
Ocfs2/move_extents: lock allocators and reserve metadata blocks and data clusters for extent moving. ocfs2_lock_allocators_move_extents(), like the common ocfs2_lock_allocators(), locks the metadata and data allocators during extent moving, reserves appropriate metadata blocks and data clusters, and also performs a best-effort calculation of the credits for the journal transaction in one run of the movement. Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
-
Committed by Tristan Ye
Add new files move_extents.[c|h] and fill them with nothing but a context structure. Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
-
Committed by Tristan Ye
The patch also adds a manipulative structure for this ioctl. Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
-