- 25 1月, 2008 3 次提交
-
-
由 Steven Whitehouse 提交于
Previously we were doing (write data, wait for data, write metadata, wait for metadata). After this patch we so (write metadata, write data, wait for data, wait for metadata) which should be more efficient. Also I noticed that the drop_bh and xmote_bh functions were almost identical. In fact the only difference was a single test, and that test is such that in the drop_bh case, it would always evaluate to the correct result. As such we can use the xmote_bh functions in all the places where we were using the drop_bh function and remove the drop_bh functions. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
This call to reclaim glocks is not needed, and in particular we don't want it in the fast path for locking glocks. The limit was entirely arbitrary anyway and we can't expect users to adjust things like this, the remaining code will do the right thing on its own. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Wendy Cheng 提交于
Fix a race condition where multiple glock demote requests are sent to a node back-to-back. This patch does a check inside handle_callback() to see whether a demote request is in progress. If true, it sets a flag to make sure run_queue() will loop again to handle the new request, instead of erronously setting gl_demote_state to a different state. Signed-off-by: NS. Wendy Cheng <wcheng@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 10 10月, 2007 10 次提交
-
-
由 Wendy Cheng 提交于
Move inode deletion code out of blocking_cb handle_callback route to avoid racy conditions that end up blocking lock_dlm1 thread. Fix bugzilla 286821. Signed-off-by: NWendy Cheng <wcheng@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Abhijith Das 提交于
This patch adds a new flag to the gfs2_holder structure GL_FLOCK. It is set on holders of glocks representing flocks. This flag is checked in add_to_queue() and a process is permitted to queue more than one holder onto a glock if it is set. This solves the issue of a process not being able to do multiple flocks on the same file. Through a single descriptor, a process can now promote and demote flocks. Through multiple descriptors a process can now queue multiple flocks on the same file. There's still the problem of a process deadlocking itself (because gfs2 blocking locks are not interruptible) by queueing incompatible deadlock. Signed-off-by: NAbhijith Das <adas@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Benjamin Marzinski 提交于
When a lot of IO, with some distributed mmap IO, is run on a GFS2 filesystem in a cluster, it will deadlock. The reason is that do_no_page() will repeatedly call gfs2_sharewrite_nopage(), because each node keeps giving up the glock too early, and is forced to call unmap_mapping_range(). This bumps the mapping->truncate_count sequence count, forcing do_no_page() to retry. This patch institutes a minimum glock hold time a tenth a second. This insures that even in heavy contention cases, the node has enough time to get some useful work done before it gives up the glock. A second issue is that when gfs2_glock_dq() is called from within a page fault to demote a lock, and the associated page needs to be written out, it will try to acqire a lock on it, but it has already been locked at a higher level. This patch puts makes gfs2_glock_dq() use the work queue as well, to avoid this issue. This is the same patch as Steve Whitehouse originally proposed to fix this issue, execpt that gfs2_glock_dq() now grabs a reference to the glock before it queues up the work on it. Signed-off-by: NBenjamin E. Marzinski <bmarzins@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Abhijith Das 提交于
With this patch, gfs2 glockdump through the debugfs filesystem will only dump glocks for the specified filesystem instead of all glocks. Also, to aid debugging, the glock number is dumped in hex instead of decimal. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com> Signed-off-by: NS. Wendy Cheng <wcheng@redhat.com> Signed-off-by: NAbhijith Das <adas@redhat.com>
-
由 Steven Whitehouse 提交于
We only need a single gfs2_scand process rather than the one per filesystem which we had previously. As a result the parameter determining the frequency of gfs2_scand runs becomes a module parameter rather than a mount parameter as it was before. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Denis Cheng 提交于
these struct *_operations are all method tables, thus should be const. Signed-off-by: NDenis Cheng <crquan@gmail.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
This fixes an oops which was occurring during glock dumping due to the seq file code not taking a reference to the glock. Also this fixes a memory leak which occurred in certain cases, in turn preventing the filesystem from unmounting. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Jesper Juhl 提交于
This patch cleans up duplicate includes in fs/gfs2/ Signed-off-by: NJesper Juhl <jesper.juhl@gmail.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Josef Whiter 提交于
If a glock is in the exclusive state and a request for demote to deferred has been received, then further requests for demote to shared are being ignored. This patch fixes that by ensuring that we demote to unlocked in that case. Signed-off-by: NJosef Whiter <jwhiter@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
One of the races relates to referencing a variable while not holding its protecting spinlock. The patch simply moves the test inside the spin lock. The other races occurs when a demote to unlocked request occurs during the time a demote to shared request is already running. This of course only happens in the case that the lock was in the exclusive mode to start with. The patch adds a check to see if another demote request has occurred in the mean time and if it has, then it performs a second demote. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 09 7月, 2007 4 次提交
-
-
由 Steven Whitehouse 提交于
There is a bug in the code which acquires multiple glocks where if the initial out-of-order attempt fails part way though we can land up trying to acquire the wrong number of glocks. This is part of the fix for red hat bz #239737. The other part of the bz doesn't apply to upstream kernels since it was fixed by: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=d3717bdf8f08a0e1039158c8bab2c24d20f492b6 Since the out-of-order code doesn't appear to add anything to the performance of GFS2, this patch just removed it rather than trying to fix it. It should be much easier to see whats going on here now. In addition, we don't allocate any memory unless we are using a lot of glocks (which is a relatively uncommon case). Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Abhijith Das 提交于
There were two issues during deallocation of unlinked inodes. The first was relating to the use of a "try" lock which in the case of the inode lock wasn't trying hard enough to deallocate in all circumstances (now changed to a normal glock) and in the case of the iopen lock didn't wait for the demotion of the shared lock before attempting to get the exclusive lock, and thereby sometimes (timing dependent) not completing the deallocation when it should have done. The second issue related to the lack of a way to invalidate dcache entries on remote nodes (now fixed by this patch) which meant that unlinks were taking a long time to return disk space to the fs. By adding some code to invalidate the dcache entries across the cluster for unlinked inodes, that is now fixed. This patch was written jointly by Abhijith Das and Steven Whitehouse. Signed-off-by: NAbhijith Das <adas@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
This patch cleans up the inode number handling code. The main difference is that instead of looking up the inodes using a struct gfs2_inum_host we now use just the no_addr member of this structure. The tests relating to no_formal_ino can then be done by the calling code. This has advantages in that we want to do different things in different code paths if the no_formal_ino doesn't match. In the NFS patch we want to return -ESTALE, but in the ->lookup() path, its a bug in the fs if the no_formal_ino doesn't match and thus we can withdraw in this case. In order to later fix bz #201012, we need to be able to look up an inode without knowing no_formal_ino, as the only information that is known to us is the on-disk location of the inode in question. This patch will also help us to fix bz #236099 at a later date by cleaning up a lot of the code in that area. There are no user visible changes as a result of this patch and there are no changes to the on-disk format either. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Robert Peterson 提交于
This addendum patch 2 corrects three things: 1. It fixes a stupid mistake in the previous addendum that broke gfs2. Ref: https://www.redhat.com/archives/cluster-devel/2007-May/msg00162.html 2. It fixes a problem that Dave Teigland pointed out regarding the external declarations in ops_address.h being in the wrong place. 3. It recasts a couple more %llu printks to (unsigned long long) as requested by Steve Whitehouse. I would have loved to put this all in one revised patch, but there was a rush to get some patches for RHEL5. Therefore, the previous patches were applied to the git tree "as is" and therefore, I'm posting another addendum. Sorry. Signed-off-by: NBob Peterson <rpeterso@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 01 5月, 2007 8 次提交
-
-
由 Steven Whitehouse 提交于
Now that the patch from -mm has gone upstream, we can uncomment the code in GFS2 which uses sprintf_symbol. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com> Cc: Robert Peterson <rpeterso@redhat.com>
-
由 Robert Peterson 提交于
The patch below consists of the following changes (in code order): 1. I fixed a minor compiler warning regarding the printing of a kernel symbol address. 2. I implemented a suggestion from Dave Teigland that moves the debugfs information for gfs2 into a subdirectory so we can easily expand our use of debugfs in the future. The current code keeps the glock information in: /debug/gfs2/<fs> With the patch, the new code keeps the glock information in: /debug/gfs2/<fs>/glock That will allow us to create more debugfs files in the future. 3. This fixes a bug whereby a failed mount attempt causes the debugfs file to not be deleted. Failed mount attempts should always clean up after themselves, including deleting the debugfs file and/or directory. Signed-off-by: NBob Peterson <rpeterso@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Robert Peterson 提交于
This is for Bugzilla Bug 236008: Kernel gpf doing cat /debugfs/gfs2/xxx (lock dump) seen at the "gfs2 summit". This also fixes the bug that caused garbage to be printed by the "initialized at" field. I apologize for the kludge, but that code will all be ripped out anyway when the official sprint_symbol function becomes available in the Linux kernel. I also changed some formatting so that spaces are replaced by proper tabs. Signed-off-by: NRobert Peterson <rpeterso@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Robert Peterson 提交于
In Testing the previously posted and accepted patch for https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=228540 I uncovered some gfs2 badness. It turns out that the current gfs2 code saves off a process pointer when glocks is taken in both the glock and glock holder structures. Those structures will persist in memory long after the process has ended; pointers to poisoned memory. This problem isn't caused by the 228540 fix; the new capability introduced by the fix just uncovered the problem. I wrote this patch that avoids saving process pointers and instead saves off the process pid. Rather than referencing the bad pointers, it now does process lookups. There is special code that makes the output nicer for printing holder information for processes that have ended. This patch also adds a stub for the new "sprint_symbol" function that exists in Andrew Morton's -mm patch set, but won't go into the base kernel until 2.6.22, since it adds functionality but doesn't fix a bug. Signed-off-by: NBob Peterson <rpeterso@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
Since gcc didn't evaluate the last two terms of the expression in glock.c:1881 as a constant expression, it resulted in an error on i386 due to the lack of a 64bit divide instruction. This adds some brackets to fix the problem. This was reported by Andrew Morton. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org>
-
由 Steven Whitehouse 提交于
This patch prevents the printing of a warning message in cases where the fs is functioning normally by handing off responsibility for unlinked, but still open inodes, to another node for eventual deallocation. Also, there is now an improved system for ensuring that such requests to other nodes do not get lost. The callback on the iopen lock is only ever called when i_nlink == 0 and when a node is unable to deallocate it due to it still being in use on another node. When a node receives the callback therefore, it knows that i_nlink must be zero, so we mark it as such (in gfs2_drop_inode) in order that it will then attempt deallocation of the inode itself. As an additional benefit, queuing a demote request no longer requires a memory allocation. This simplifies the code for dealing with gfs2_holders as it removes one special case. There are two new fields in struct gfs2_glock. gl_demote_state is the state which the remote node has requested and gl_demote_time is the time when the request came in. Both fields are only valid when the GLF_DEMOTE flag is set in gl_flags. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Josef Whiter 提交于
If you specify an invalid mount option when trying to mount a gfs2 filesystem, gfs2 will oops. The attached patch resolves this problem. Signed-off-by: NJosef Whiter <jwhiter@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Robert Peterson 提交于
The attached patch resolves bz 228540. This adds the capability for gfs2 to dump gfs2 locks through the debugfs file system. This used to exist in gfs1 as "gfs_tool lockdump" but it's missing from gfs2 because all the ioctls were stripped out. Please see the bugzilla for more history about the fix. This patch is also attached to the bugzilla record. The patch is against Steve Whitehouse's latest nmw git tree kernel (2.6.21-rc1) and has been tested on system trin-10. Signed-off-by: NRobert Peterson <rpeterso@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 08 3月, 2007 2 次提交
-
-
fs/gfs2/glock.c:2198: error: 'THIS_MODULE' undeclared here (not in a function) Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Steven Whitehouse 提交于
The ->go_drop_bh function is never used, so this removes it and the single caller, Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 06 2月, 2007 10 次提交
-
-
由 Steven Whitehouse 提交于
Dave Teigland fixed this bug a while back, but I managed to mistakenly remove the semaphore during later development. It is required to avoid the list of inodes changing during an invalidate_inodes call. I have made it an rwsem since the read side will be taken frequently during normal filesystem operation. The write site will only happen during umount of the file system. Also the bug only triggers when using the DLM lock manager and only then under certain conditions as its timing related. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com> Cc: David Teigland <teigland@redhat.com>
-
由 Steven Whitehouse 提交于
This is a one letter typo fix in glock.c, spotted by Rob Kenna. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
This one liner got missed from the previous patch. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
This function is not longer required since we do not do recursive locking in the glock layer. As a result all its callers can be replaceed with list_empty() calls. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
This patch doesn't make any changes to the ordering of the various operations related to glocking, but it does tidy up the calls to the glops.c functions to make the structure more obvious. The two functions: gfs2_glock_xmote_th() and gfs2_glock_drop_th() can be made static within glock.c since they are called by every set of glock operations. The xmote_th and drop_th glock operations are then made conditional upon those two routines existing and called from the previously mentioned functions in glock.c respectively. Also it can be seen that the go_sync operation isn't needed since it can easily be replaced by calls to xmote_bh and drop_bh respectively. This results in no longer (confusingly) calling back into routines in glock.c from glops.c and also reducing the glock operations by one member. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
Here is a patch for GFS2 to remove the local exclusive flag. In the places it was used, mutex's are always held earlier in the call path, so it appears redundant in the LM_ST_SHARED case. Also, the GFS2 holders were setting local exclusive in any case where the requested lock was LM_ST_EXCLUSIVE. So the other places in the glock code where the flag was tested have been replaced with tests for the lock state being LM_ST_EXCLUSIVE in order to ensure the logic is the same as before (i.e. LM_ST_EXCLUSIVE is always locally exclusive as well as globally exclusive). Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
This is never used, so we might as well remove it. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
The "greedy" code was an attempt to retain glocks for a minimum length of time when they relate to mmap()ed files. The current implementation of this feature is not, however, ideal in that it required allocating memory in order to do this and its overly complicated. It also misses the mark by ignoring the other I/O operations which are just as likely to suffer from the same problem. So the plan is to remove this now and then add the functionality back as part of the glock state machine at a later date (and thus take into account all the possible users of this feature) Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
Here is something I spotted (while looking for something entirely different) the other day. Rather than using a completion in each and every struct gfs2_holder, this removes it in favour of hashed wait queues, thus saving a considerable amount of memory both on the stack (where a number of gfs2_holder structures are allocated) and in particular in the gfs2_inode which has 8 gfs2_holder structures embedded within it. As a result on x86_64 the gfs2_inode shrinks from 2488 bytes to 1912 bytes, a saving of 576 bytes per inode (no thats not a typo!). In actual practice we get a much better result than that since now that a gfs2_inode is under the 2048 byte barrier, we get two per 4k slab page effectively halving the amount of memory required to store gfs2_inodes. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
This removes the extra filldir callback which gfs2 was using to enclose an attempt at readahead for inodes during readdir. The code was too complicated and also hurts performance badly in the case that the getdents64/readdir call isn't being followed by stat() and it wasn't even getting it right all the time when it was. As a result, on my test box an "ls" of a directory containing 250000 files fell from about 7mins (freshly mounted, so nothing cached) to between about 15 to 25 seconds. When the directory content was cached, the time taken fell from about 3mins to about 4 or 5 seconds. Interestingly in the cached case, running "ls -l" once reduced the time taken for subsequent runs of "ls" to about 6 secs even without this patch. Now it turns out that there was a special case of glocks being used for prefetching the metadata, but because of the timeouts for these locks (set to 10 secs) the metadata was being timed out before it was being used and this the prefetch code was constantly trying to prefetch the same data over and over. Calling "ls -l" meant that the inodes were brought into memory and once the inodes are cached, the glocks are not disposed of until the inodes are pushed out of the cache, thus extending the lifetime of the glocks, and thus bringing down the time for subsequent runs of "ls" considerably. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 30 11月, 2006 3 次提交
-
-
由 Randy Dunlap 提交于
Fix function parameter typing: fs/gfs2/glock.c:100: warning: function declaration isn't a prototype Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
This fixes a bug which resulted in poor performance due to flushing the journal too often. The code path in question was via the inode_go_sync() function in glops.c. The solution is not to flush the journal immediately when inodes are ejected from memory, but batch up the work for glockd to deal with later on. This means that glocks may now live on beyond the end of the lifetime of their inodes (but not very much longer in the normal case). Also fixed in this patch is a bug (which was hidden by the bug mentioned above) in calculation of the number of free journal blocks. The gfs2_logd process has been altered to be more responsive to the journal filling up. We now wake it up when the number of uncommitted journal blocks has reached the threshold level rather than trying to flush directly at the end of each transaction. This again means doing fewer, but larger, log flushes in general. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
The go_sync callback took two flags, but one of them was set on every call, so this patch removes once of the flags and makes the previously conditional operations (on this flag), unconditional. The go_inval callback took three flags, each of which was set on every call to it. This patch removes the flags and makes the operations unconditional, which makes the logic rather more obvious. Two now unused flags are also removed from incore.h. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-