- 20 8月, 2013 1 次提交
-
-
由 Steven Whitehouse 提交于
Since gfs2_sync_meta() is only called from a single file, lets move it to lops.c where it is used, and mark it static. At the same time, we can clean up the meta_io.h header too. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 19 6月, 2013 1 次提交
-
-
由 Benjamin Marzinski 提交于
This patch looks at all the outstanding blocks in all the transactions on the log, and moves the completed ones to the ail2 list. Then it issues revokes for these blocks. This will hopefully speed things up in situations where there is a lot of contention for glocks, especially if they are acquired serially. revoke_lo_before_commit will issue at most one log block's full of these preemptive revokes. The amount of reserved log space that gfs2_log_reserve() ignores has been incremented to allow for this extra block. This patch also consolidates the common revoke instructions into one function, gfs2_add_revoke(). Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 05 6月, 2013 2 次提交
-
-
由 Bob Peterson 提交于
With recent changes to the transactions, it appears that we are no longer using the "log ops" for resource groups. Since the log commit code processes the array of log ops, eliminating this should be marginally better for performance. Therefore this patch eliminates it. Signed-off-by: NBob Peterson <rpeterso@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Benjamin Marzinski 提交于
This patch simply sort the data and metadata buffer lists by their inplace block number. This makes gfs2_log_flush issue the inplace IO in sequential order, which will hopefully speed up writing the IO out to disk. Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 03 6月, 2013 1 次提交
-
-
由 Bob Peterson 提交于
This patch sets the log descriptor type according to whether the journal commit is for (journaled) data or metadata. This was recently broken when the functions to process data and metadata log ops were combined. Signed-off-by: NBob Peterson <rpeterso@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 24 5月, 2013 1 次提交
-
-
由 Steven Whitehouse 提交于
There was a missing _all in this loop iterator Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 08 4月, 2013 1 次提交
-
-
由 Benjamin Marzinski 提交于
In order to allow transactions and log flushes to happen at the same time, gfs2 needs to move the transaction accounting and active items list code into the gfs2_trans structure. As a first step toward this, this patch removes the gfs2_ail structure, and handles the active items list in the gfs_trans structure. This keeps gfs2 from allocating an ail structure on log flushes, and gives us a struture that can later be used to store the transaction accounting outside of the gfs2 superblock structure. With this patch, at the end of a transaction, gfs2 will add the gfs2_trans structure to the superblock if there is not one already. This structure now has the active items fields that were previously in gfs2_ail. This is not necessary in the case where the transaction was simply used to add revokes, since these are never written outside of the journal, and thus, don't need an active items list. Also, in order to make sure that the transaction structure is not removed while it's still in use by gfs2_trans_end, unlocking the sd_log_flush_lock has to happen slightly later in ending the transaction. Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 24 3月, 2013 1 次提交
-
-
由 Kent Overstreet 提交于
Just a little convenience macro - main reason to add it now is preparing for immutable bio vecs, it'll reduce the size of the patch that puts bi_sector/bi_size/bi_idx into a struct bvec_iter. Signed-off-by: NKent Overstreet <koverstreet@google.com> CC: Jens Axboe <axboe@kernel.dk> CC: Lars Ellenberg <drbd-dev@lists.linbit.com> CC: Jiri Kosina <jkosina@suse.cz> CC: Alasdair Kergon <agk@redhat.com> CC: dm-devel@redhat.com CC: Neil Brown <neilb@suse.de> CC: Martin Schwidefsky <schwidefsky@de.ibm.com> CC: Heiko Carstens <heiko.carstens@de.ibm.com> CC: linux-s390@vger.kernel.org CC: Chris Mason <chris.mason@fusionio.com> CC: Steven Whitehouse <swhiteho@redhat.com> Acked-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 29 1月, 2013 2 次提交
-
-
由 Steven Whitehouse 提交于
This patch copies the body of gfs2_trans_add_bh into the two newly added gfs2_trans_add_data and gfs2_trans_add_meta functions. We can then move the .lo_add functions from lops.c into trans.c and call them directly. As a result of this, we no longer need to use the .lo_add functions at all, so that is removed from the log operations structure. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
This moves the lo_add function for revokes into trans.c, removing a function call and making the code easier to read. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 07 11月, 2012 2 次提交
-
-
由 Benjamin Marzinski 提交于
In gfs2_trans_add_bh(), gfs2 was testing if a there was a bd attached to the buffer without having the gfs2_log_lock held. It was then assuming it would stay attached for the rest of the function. However, without either the log lock being held of the buffer locked, __gfs2_ail_flush() could detach bd at any time. This patch moves the locking before the test. If there isn't a bd already attached, gfs2 can safely allocate one and attach it before locking. There is no way that the newly allocated bd could be on the ail list, and thus no way for __gfs2_ail_flush() to detach it. Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Andrew Price 提交于
Cleans up two cases where variables were assigned values but then never used again. Signed-off-by: NAndrew Price <anprice@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 06 6月, 2012 1 次提交
-
-
由 Steven Whitehouse 提交于
When we read an invalid block from the journal, we should not call withdraw, but simply print a message and return an error. It is up to the caller to then handle that error. In the case of mount that means a failed mount, rather than a withdraw (requiring a reboot). In the case of recovering another nodes journal then we return an error via the uevent. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 02 5月, 2012 1 次提交
-
-
由 Bob Peterson 提交于
This patch eliminates the gfs2_log_element data structure and rolls its two components into the gfs2_bufdata. This makes the code easier to understand and makes it easier to migrate to a rbtree to keep the list sorted. Signed-off-by: NBob Peterson <rpeterso@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 24 4月, 2012 5 次提交
-
-
由 Steven Whitehouse 提交于
This patch removes a log lock from around atomic operation where it is not needed, removes an unused variable, and also changes a void pointer used incorrectly to a struct page pointer. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
This is another clean up in the logging code. This per-transaction list was largely unused. Its main function was to ensure that the number of buffers in a transaction was correct, however that counter was only used to check the number of buffers in the bd_list_tr, plus an assert at the end of each transaction. With the assert now changed to use the calculated buffer counts, we can remove both bd_list_tr and its associated counter. This should make the code easier to understand as well as shrinking a couple of structures. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
The main part of this patch merges the two functions used to write metadata and data buffers to the log. Most of the code is common between the two functions, so this provides a nice clean up, and makes the code more readable. The gfs2_get_log_desc() function is also extended to take two more arguments, and thus avoid having to set the length and data1 fields of this strucuture as a separate operation. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
Prior to this patch, we have two ways of sending i/o to the log. One of those is used when we need to allocate both the data to be written itself and also a buffer head to submit it. This is done via sb_getblk and friends. This is used mostly for writing log headers. The other method is used when writing blocks which have some in-place counterpart. This is the case for all the metadata blocks which are journalled, and when journaled data is in use, for unescaped journalled data blocks. This patch replaces both of those two methods, and about half a dozen separate i/o submission points with a single i/o submission function. We also go direct to bio rather than using buffer heads, since this allows us to build i/o requests of the maximum size for the block device in question. It also reduces the memory required for flushing the log, which can be very useful in low memory situations. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
Since we always write the buffer directly after this function returns, we might as well merge it into here. This is a clean up in preparation for some further updates to the log code which are coming soon. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 20 3月, 2012 1 次提交
-
-
由 Cong Wang 提交于
Signed-off-by: NCong Wang <amwang@redhat.com>
-
- 08 3月, 2012 1 次提交
-
-
由 Steven Whitehouse 提交于
In order to ensure that we've got enough buffer heads for flushing the journal, the orignal code used __GFP_NOFAIL when performing this allocation. Here we dispense with that in favour of using a mempool. This should improve efficiency in low memory conditions since flushing the journal is a good way to get memory back, we don't want to be spinning, waiting on memory allocations. The buffers which are allocated via this mempool are fairly short lived, so that we'll recycle them pretty quickly. Although there are other memory allocations which occur during the journal flush process, this is the one which can potentially require the most memory, so the most important one to fix. The amount of memory reserved is a fixed amount, and we should not need to scale it when there are a greater number of filesystems in use. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 29 2月, 2012 2 次提交
-
-
由 Steven Whitehouse 提交于
The FITRIM ioctl provides an alternative way to send discard requests to the underlying device. Using the discard mount option results in every freed block generating a discard request to the block device. This can be slow, since many block devices can only process discard requests of larger sizes, and also such operations can be time consuming. Rather than using the discard mount option, FITRIM allows a sweep of the filesystem on an occasional basis, and also to optionally avoid sending down discard requests for smaller regions. In GFS2 FITRIM will work at resource group granularity. There is a flag for each resource group which keeps track of which resource groups have been trimmed. This flag is reset whenever a deallocation occurs in the resource group, and set whenever a successful FITRIM of that resource group has taken place. This helps to reduce repeated discard requests for the same block ranges, again improving performance. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
gfs2_log_get_buf() and gfs2_log_fake_buf() are both used only in lops.c, so move them next to their callers and they can then become static. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 21 10月, 2011 2 次提交
-
-
由 Steven Whitehouse 提交于
Some items picked up through automated code analysis. A few bits of unreachable code and two unchecked return values. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Bob Peterson 提交于
Here is an update of Bob's original rbtree patch which, in addition, also resolves the rather strange ref counting that was being done relating to the bitmap blocks. Originally we had a dual system for journaling resource groups. The metadata blocks were journaled and also the rgrp itself was added to a list. The reason for adding the rgrp to the list in the journal was so that the "repolish clones" code could be run to update the free space, and potentially send any discard requests when the log was flushed. This was done by comparing the "cloned" bitmap with what had been written back on disk during the transaction commit. Due to this, there was a requirement to hang on to the rgrps' bitmap buffers until the journal had been flushed. For that reason, there was a rather complicated set up in the ->go_lock ->go_unlock functions for rgrps involving both a mutex and a spinlock (the ->sd_rindex_spin) to maintain a reference count on the buffers. However, the journal maintains a reference count on the buffers anyway, since they are being journaled as metadata buffers. So by moving the code which deals with the post-journal accounting for bitmap blocks to the metadata journaling code, we can entirely dispense with the rather strange buffer ref counting scheme and also the requirement to journal the rgrps. The net result of all this is that the ->sd_rindex_spin is left to do exactly one job, and that is to look after the rbtree or rgrps. This patch is designed to be a stepping stone towards using RCU for the rbtree of resource groups, however the reduction in the number of uses of the ->sd_rindex_spin is likely to have benefits for multi-threaded workloads, anyway. The patch retains ->go_lock and ->go_unlock for rgrps, however these maybe also be removed in future in favour of calling the functions directly where required in the code. That will allow locking of resource groups without needing to actually read them in - something that could be useful in speeding up statfs. In the mean time though it is valid to dereference ->bi_bh only when the rgrp is locked. This is basically the same rule as before, modulo the references not being valid until the following journal flush. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com> Signed-off-by: NBob Peterson <rpeterso@redhat.com> Cc: Benjamin Marzinski <bmarzins@redhat.com>
-
- 20 4月, 2011 2 次提交
-
-
由 Steven Whitehouse 提交于
The GLF_LRU flag introduced in the previous patch can be used to check if a glock is on the lru list when a new holder is queued and if so remove it, without having first to get the lru_lock. The main purpose of this patch however is to optimise the glocks left over when an inode at end of life is being evicted. Previously such glocks were left with the GLF_LFLUSH flag set, so that when reclaimed, each one required a log flush. This patch resets the GLF_LFLUSH flag when there is nothing left to flush thus preventing later log flushes as glocks are reused or demoted. In order to do this, we need to keep track of the number of revokes which are outstanding, and also to clear the GLF_LFLUSH bit after a log commit when only revokes have been processed. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
Rather than allowing the glocks to be scheduled for possible reclaim as soon as they have exited the journal, this patch delays their entry to the list until the glocks in question are no longer in use. This means that we will rely on the vm for writeback of all dirty data and metadata from now on. When glocks are added to the lru list they should be freeable much faster since all the I/O required to free them should have already been completed. This should lead to much better I/O patterns under low memory conditions. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 14 3月, 2011 1 次提交
-
-
由 Steven Whitehouse 提交于
The previous patch missed a couple of places where the AIL list needed locking, so this fixes up those places, plus a comment is corrected too. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com> Cc: Dave Chinner <dchinner@redhat.com>
-
- 11 3月, 2011 1 次提交
-
-
由 Dave Chinner 提交于
The log lock is currently used to protect the AIL lists and the movements of buffers into and out of them. The lists are self contained and no log specific items outside the lists are accessed when starting or emptying the AIL lists. Hence the operation of the AIL does not require the protection of the log lock so split them out into a new AIL specific lock to reduce the amount of traffic on the log lock. This will also reduce the amount of serialisation that occurs when the gfs2_logd pushes on the AIL to move it forward. This reduces the impact of log pushing on sequential write throughput. Signed-off-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 10 3月, 2011 1 次提交
-
-
由 Jens Axboe 提交于
With the plugging now being explicitly controlled by the submitter, callers need not pass down unplugging hints to the block layer. If they want to unplug, it's because they manually plugged on their own - in which case, they should just unplug at will. Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 21 1月, 2011 1 次提交
-
-
由 Steven Whitehouse 提交于
This has a number of advantages: - Reduces contention on the hash table lock - Makes the code smaller and simpler - Should speed up glock dumps when under load - Removes ref count changing in examine_bucket - No longer need hash chain lock in glock_put() in common case There are some further changes which this enables and which we may do in the future. One is to look at using SLAB_RCU, and another is to look at using a per-cpu counter for the per-sb glock counter, since that is touched twice in the lifetime of each glock (but only used at umount time). Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
-
- 05 5月, 2010 1 次提交
-
-
由 Benjamin Marzinski 提交于
This patch contains various tweaks to how log flushes and active item writeback work. gfs2_logd is now managed by a waitqueue, and gfs2_log_reseve now waits for gfs2_logd to do the log flushing. Multiple functions were rewritten to remove the need to call gfs2_log_lock(). Instead of using one test to see if gfs2_logd had work to do, there are now seperate tests to check if there are two many buffers in the incore log or if there are two many items on the active items list. This patch is a port of a patch Steve Whitehouse wrote about a year ago, with some minor changes. Since gfs2_ail1_start always submits all the active items, it no longer needs to keep track of the first ai submitted, so this has been removed. In gfs2_log_reserve(), the order of the calls to prepare_to_wait_exclusive() and wake_up() when firing off the logd thread has been switched. If it called wake_up first there was a small window for a race, where logd could run and return before gfs2_log_reserve was ready to get woken up. If gfs2_logd ran, but did not free up enough blocks, gfs2_log_reserve() would be left waiting for gfs2_logd to eventualy run because it timed out. Finally, gt_logd_secs, which controls how long to wait before gfs2_logd times out, and flushes the log, can now be set on mount with ar_commit. Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 01 3月, 2010 1 次提交
-
-
由 Dave Chinner 提交于
When we queue data buffers for ordered write, the buffers are added to the head of the ordered write list. When the log needs to push these buffers to disk, it also walks the list from the head. The result is that the the ordered buffers are submitted to disk in reverse order. For large writes, this means that whenever the log flushes large streams of reverse sequential order buffers are pushed down into the block layers. The elevators don't handle this particularly well, so IO rates tend to be significantly lower than if the IO was issued in ascending block order. Queue new ordered buffers to the tail of the ordered buffer list to ensure that IO is dispatched in the order it was submitted. This should significantly improve large sequential write speeds. On a disk capable of 85MB/s, speeds increase from 50MB/s to 65MB/s for noop and from 38MB/s to 50MB/s for cfq. Signed-off-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 03 12月, 2009 1 次提交
-
-
由 Steven Whitehouse 提交于
There are two spare field in the header common to all GFS2 metadata. One is just the right size to fit a journal id in it, and this patch updates the journal code so that each time a metadata block is modified, we tag it with the journal id of the node which is performing the modification. The reason for this is that it should make it much easier to debug issues which arise if we can tell which node was the last to modify a particular metadata block. Since the field is updated before the block is written into the journal, each journal should only contain metadata which is tagged with its own journal id. The one exception to this is the journal header block, which might have a different node's id in it, if that journal was recovered by another node in the cluster. Thus each journal will contain a record of which nodes recovered it, via the journal header. The other field in the metadata header could potentially be used to hold information about what kind of operation was performed, but for the time being we just zero it on each transaction so that if we use it for that in future, we'll know that the information (where it exists) is reliable. I did consider using the other field to hold the journal sequence number, however since in GFS2's journaling we write the modified data into the journal and not the original data, this gives no information as to what action caused the modification, so I think we can probably come up with a better use for those 64 bits in the future. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 12 6月, 2009 1 次提交
-
-
由 Steven Whitehouse 提交于
This patch adds the ability to trace various aspects of the GFS2 filesystem. The trace points are divided into three groups, glocks, logging and bmap. These points have been chosen because they allow inspection of the major internal functions of GFS2 and they are also generic enough that they are unlikely to need any major changes as the filesystem evolves. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 11 5月, 2009 1 次提交
-
-
由 Steven Whitehouse 提交于
After Jens recent updates: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=a1f242524c3c1f5d40f1c9c343427e34d1aadd6e et al. this is a patch to bring gfs2 uptodate with the core code. Also I've managed to squash another call to ll_rw_block() along the way. There is still one part of the GFS2 I/O paths which are not correctly annotated and that is due to the sharing of the writeback code between the data and metadata address spaces. I would like to change that too, but this patch is still worth doing on its own, I think. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 24 3月, 2009 1 次提交
-
-
由 Steven Whitehouse 提交于
This is the big patch that I've been working on for some time now. There are many reasons for wanting to make this change such as: o Reducing overhead by eliminating duplicated fields between structures o Simplifcation of the code (reduces the code size by a fair bit) o The locking interface is now the DLM interface itself as proposed some time ago. o Fewer lookups of glocks when processing replies from the DLM o Fewer memory allocations/deallocations for each glock o Scope to do further optimisations in the future (but this patch is more than big enough for now!) Please note that (a) this patch relates to the lock_dlm module and not the DLM itself, that is still a separate module; and (b) that we retain the ability to build GFS2 as a standalone single node filesystem with out requiring the DLM. This patch needs a lot of testing, hence my keeping it I restarted my -git tree after the last merge window. That way, this has the maximum exposure before its merged. This is (modulo a few minor bug fixes) the same patch that I've been posting on and off the the last three months and its passed a number of different tests so far. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 31 3月, 2008 2 次提交
-
-
由 Bob Peterson 提交于
Signed-off-by: NBob Peterson <rpeterso@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Bob Peterson 提交于
This patch is performance related. When we're doing a log flush, I noticed we were calling buf_lo_incore_commit twice: once for data bufs and once for metadata bufs. Since this is the same function and does the same thing in both cases, there should be no reason to call it twice. Since we only need to call it once, we can also make it faster by removing it from the generic "lops" code and making it a stand-along static function. Signed-off-by: NBob Peterson <rpeterso@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 25 1月, 2008 1 次提交
-
-
由 Steven Whitehouse 提交于
The only reason for adding glocks to the journal was to keep track of which locks required a log flush prior to release. We add a flag to the glock to allow this check to be made in a simpler way. This reduces the size of a glock (by 12 bytes on i386, 24 on x86_64) and means that we can avoid extra work during the journal flush. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-