- 21 10月, 2011 4 次提交
-
-
由 Bob Peterson 提交于
This patch improves the performance of delete/unlink operations in a GFS2 file system where the files are large by adding a layer of metadata read-ahead for indirect blocks. Mileage will vary, but on my system, deleting an 8.6G file dropped from 22 seconds to about 4.5 seconds. Signed-off-by: NBob Peterson <rpeterso@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
Each block which is deallocated, requires a call to gfs2_rlist_add() and each of those calls was calling gfs2_blk2rgrpd() in order to figure out which rgrp the block belonged in. This can be speeded up by making use of the rgrp cached in the inode. We also reset this cached rgrp in case the block has changed rgrp. This should provide a big reduction in gfs2_blk2rgrpd() calls during deallocation. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
The recursive_scan() function only ever takes a single "bc" argument, so we might as well just call do_strip() directly from resource_scan() rather than pass it in as an argument. Also the "data" argument is always a struct strip_mine, so we can pass that in, rather than using a void pointer. This also moves do_strip() ahead of recursive_scan() so that we don't need to add a prototype. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
Since we have ruled out supporting online filesystem shrink, it is possible to make the resource group list append only during the life of a super block. This gives several benefits: Firstly, we only need to read new rindex elements as they are added rather than needing to reread the whole rindex file each time one element is added. Secondly, the rindex glock can be held for much shorter periods of time, and is completely removed from the fast path for allocations. The lock is taken in shared mode only when updating the resource groups when the first allocation occurs, and after a grow has taken place. Thirdly, this results in a reduction in code size, and everything gets a lot simpler to understand in this area. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 21 7月, 2011 1 次提交
-
-
由 Christoph Hellwig 提交于
Let filesystems handle waiting for direct I/O requests themselves instead of doing it beforehand. This means filesystem-specific locks to prevent new dio referenes from appearing can be held. This is important to allow generalizing i_dio_count to non-DIO_LOCKING filesystems. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 15 7月, 2011 1 次提交
-
-
由 Eric Sandeen 提交于
__gfs2_free_data and __gfs2_free_meta are almost identical, and can be trivially combined. [This is as per Eric's original patch minus gfs2_free_data() which had no callers left and plus the conversion of the bmap.c calls to these functions. All in all, a nice clean up] Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 21 5月, 2011 1 次提交
-
-
由 Steven Whitehouse 提交于
The deallocation code for directories in GFS2 is largely divided into two parts. The first part deallocates any directory leaf blocks and marks the directory as being a regular file when that is complete. The second stage was identical to deallocating regular files. Regular files have their data blocks in a different address space to directories, and thus what would have been normal data blocks in a regular file (the hash table in a GFS2 directory) were deallocated correctly. However, a reference to these blocks was left in the journal (assuming of course that some previous activity had resulted in those blocks being in the journal or ail list). This patch uses the i_depth as a test of whether the inode is an exhash directory (we cannot test the inode type as that has already been changed to a regular file at this stage in deallocation) The original issue was reported by Chris Hertel as an issue he encountered running bonnie++ Reported-by: NChristopher R. Hertel <crh@samba.org> Cc: Abhijith Das <adas@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 31 3月, 2011 1 次提交
-
-
由 Lucas De Marchi 提交于
Fixes generated by 'codespell' and manually reviewed. Signed-off-by: NLucas De Marchi <lucas.demarchi@profusion.mobi>
-
- 24 2月, 2011 1 次提交
-
-
由 Bob Peterson 提交于
This patch is a performance improvement to GFS2's dealloc code. Rather than update the quota file and statfs file for every single block that's stripped off in unlink function do_strip, this patch keeps track and updates them once for every layer that's stripped. This is done entirely inside the existing transaction, so there should be no risk of corruption. The other functions that deallocate blocks will be unaffected because they are using wrapper functions that do the same thing that they do today. I tested this code on my roth cluster by creating 200 files in a directory, each of which is 100MB, then on four nodes, I simultaneously deleted the files, thus competing for GFS2 resources (but different files). The commands I used were: [root@roth-01]# time for i in `seq 1 4 200` ; do rm /mnt/gfs2/bigdir/gfs2.$i; done [root@roth-02]# time for i in `seq 2 4 200` ; do rm /mnt/gfs2/bigdir/gfs2.$i; done [root@roth-03]# time for i in `seq 3 4 200` ; do rm /mnt/gfs2/bigdir/gfs2.$i; done [root@roth-05]# time for i in `seq 4 4 200` ; do rm /mnt/gfs2/bigdir/gfs2.$i; done The performance increase was significant: roth-01 roth-02 roth-03 roth-05 --------- --------- --------- --------- old: real 0m34.027 0m25.021s 0m23.906s 0m35.646s new: real 0m22.379s 0m24.362s 0m24.133s 0m18.562s Total time spent deleting: old: 118.6s new: 89.4 For this particular case, this showed a 25% performance increase for GFS2 unlinks. Signed-off-by: NBob Peterson <rpeterso@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 30 11月, 2010 2 次提交
-
-
由 Steven Whitehouse 提交于
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Benjamin Marzinski 提交于
When you truncate the rindex file, you need to avoid calling gfs2_rindex_hold, since you already hold it. However, if you haven't already read in the resource groups, you need to do that. Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 28 9月, 2010 1 次提交
-
-
由 Benjamin Marzinski 提交于
Some of the functions in GFS2 were not reserving space in the transaction for the resource group header and the resource groups bitblocks that get added when you do allocation. GFS2 now makes sure to reserve space for the resource group header and either all the bitblocks in the resource group, or one for each block that it may allocate, whichever is smaller using the new gfs2_rg_blocks() inline function. Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 20 9月, 2010 2 次提交
-
-
由 Steven Whitehouse 提交于
With the update of the truncate code, ip->i_disksize and inode->i_size are merely copies of each other. This means we can remove ip->i_disksize and use inode->i_size exclusively reducing the size of a GFS2 inode by 8 bytes. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
This updates GFS2's truncate code to use the new truncate sequence correctly. This is a stepping stone to being able to remove ip->i_disksize in favour of using i_size everywhere now that the two sizes are always identical. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Christoph Hellwig <hch@lst.de>
-
- 30 7月, 2010 1 次提交
-
-
由 Abhijith Das 提交于
trunc_start() in bmap.c incorrectly uses sizeof(struct gfs2_inode) instead of sizeof(struct gfs2_dinode). Signed-off-by: NAbhi Das <adas@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 29 7月, 2010 1 次提交
-
-
由 Bob Peterson 提交于
Function gfs2_write_alloc_required always returned zero as its return code. Therefore, it doesn't need to return a return code at all. Given that, we can use the return value to return whether or not the dinode needs block allocations rather than passing that value in, which in turn simplifies a bunch of error checking. Signed-off-by: NBob Peterson <rpeterso@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 15 7月, 2010 1 次提交
-
-
由 Bob Peterson 提交于
This patch replaces a statement that got dropped out by accident. Without the patch, truncates on stuffed (very small) files cause those files to have an unpredictable size. Signed-off-by: NBob Peterson <rpeterso@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 30 3月, 2010 1 次提交
-
-
由 Tejun Heo 提交于
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: NTejun Heo <tj@kernel.org> Guess-its-ok-by: NChristoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
-
- 29 3月, 2010 1 次提交
-
-
由 Steven Whitehouse 提交于
If the inode size was corrupt for stuffed files, it was possible for the copying of data to overrun the block and/or page. This patch checks for that condition so that this is no longer possible. This is also preparation for the new truncate sequence patch which requires the ability to have stuffed files with larger sizes than (disk block size - sizeof(on disk inode)) with the restriction that only the initial part of the file may be non-zero. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 12 2月, 2010 1 次提交
-
-
由 Steven Whitehouse 提交于
This patch solves a corner case during allocation which occurs if both metadata (indirect) and data blocks are required but there is an obstacle in the filesystem (e.g. a resource group header or another allocated block) such that when the allocation is requested only enough blocks for the metadata are returned. By changing the exit condition of this loop, we ensure that a minimum of one data block will always be returned. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 12 6月, 2009 1 次提交
-
-
由 Steven Whitehouse 提交于
This patch adds the ability to trace various aspects of the GFS2 filesystem. The trace points are divided into three groups, glocks, logging and bmap. These points have been chosen because they allow inspection of the major internal functions of GFS2 and they are also generic enough that they are unlikely to need any major changes as the filesystem evolves. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 10 6月, 2009 1 次提交
-
-
由 Steven Whitehouse 提交于
If a page was partially zeroed as the result of a truncate, then it was not being correctly marked dirty. This resulted in the deleted data reappearing if the file was read back via direct I/O. Reported-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 22 5月, 2009 1 次提交
-
-
由 Steven Whitehouse 提交于
This patch renames the ops_*.c files which have no counterpart without the ops_ prefix in order to shorten the name and make it more readable. In addition, ops_address.h (which was very small) is moved into inode.h and inode.h is cleaned up by adding extern where required. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 20 5月, 2009 1 次提交
-
-
由 Steven Whitehouse 提交于
This patch improves the error handling in the case where we discover that the summary information in the resource group doesn't match the bitmap information while in the process of allocating blocks. Originally this resulted in a kernel bug, but this patch changes that so that we return -EIO and print some messages explaining what went wrong, and how to fix it. We also remember locally not to try and allocate from the same rgrp again, so that a subsequent allocation in a different rgrp should succeed. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 24 3月, 2009 1 次提交
-
-
由 Steven Whitehouse 提交于
This is the big patch that I've been working on for some time now. There are many reasons for wanting to make this change such as: o Reducing overhead by eliminating duplicated fields between structures o Simplifcation of the code (reduces the code size by a fair bit) o The locking interface is now the DLM interface itself as proposed some time ago. o Fewer lookups of glocks when processing replies from the DLM o Fewer memory allocations/deallocations for each glock o Scope to do further optimisations in the future (but this patch is more than big enough for now!) Please note that (a) this patch relates to the lock_dlm module and not the DLM itself, that is still a separate module; and (b) that we retain the ability to build GFS2 as a standalone single node filesystem with out requiring the DLM. This patch needs a lot of testing, hence my keeping it I restarted my -git tree after the last merge window. That way, this has the maximum exposure before its merged. This is (modulo a few minor bug fixes) the same patch that I've been posting on and off the the last three months and its passed a number of different tests so far. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 05 1月, 2009 3 次提交
-
-
由 Steven Whitehouse 提交于
This patch removes some unused code, and make the calculation of the number of blocks required conditional in order to reduce the number of times this (potentially expensive) calculation is done. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
The final field in gfs2_dinode_host was the i_flags field. Thats renamed to i_diskflags in order to avoid confusion with the existing inode flags, and moved into the inode proper at a suitable location to avoid creating a "hole". At that point struct gfs2_dinode_host is no longer needed and as promised (quite some time ago!) it can now be removed completely. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
This patch moved the i_size field from the gfs2_dinode_host and following the ext3 convention renames it i_disksize. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 25 6月, 2008 1 次提交
-
-
由 Benjamin Marzinski 提交于
This patch fixes bz 450641. This patch changes the computation for zero_metapath_length(), which it renames to metapath_branch_start(). When you are extending the metadata tree, The indirect blocks that point to the new data block must either diverge from the existing tree either at the inode, or at the first indirect block. They can diverge at the first indirect block because the inode has room for 483 pointers while the indirect blocks have room for 509 pointers, so when the tree is grown, there is some free space in the first indirect block. What metapath_branch_start() now computes is the height where the first indirect block for the new data block is located. It can either be 1 (if the indirect block diverges from the inode) or 2 (if it diverges from the first indirect block). Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
- 31 3月, 2008 11 次提交
-
-
由 Steven Whitehouse 提交于
This patch streamlines the quota checking in the "no quota" case by making the check inline in the calling function, thus reducing the number of function calls. Eventually we might be able to remove the checks from the gfs2_quota_lock() and gfs2_quota_check() functions, but currently we can't as there are a very few places in the code which need to call these functions directly still. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com> Cc: Abhijith Das <adas@redhat.com>
-
由 Cyrill Gorcunov 提交于
gfs2_alloc_get may fail so we have to check it to prevent NULL pointer dereference. Signed-off-by: NCyrill Gorcunov <gorcunov@gamil.com> Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
We've supported mapping of extents when no block allocation is required for some time. This patch extends that to mapping of extents when an allocation has been requested. In that case we try to allocate as many blocks as are requested, but we might return fewer in case there is something preventing us from returning the complete amount (e.g. an already allocated block is in the way). Currently the only code path which can actually request multiple data blocks in a single bmap call is the page_mkwrite path and even then it only happens if there are multiple blocks per page. What this patch does do however, is merge the allocation requests for metadata (growing the metadata tree in either height or depth) with the allocation of the data blocks in the case that both are needed. This results in lower overheads even in the single block allocation case. The one thing which we can't handle here at the moment is unstuffing. I would like to be able to do that, but the problem which arises is that in order to unstuff one has to get a locked page from the page cache which results in locking problems in the (usual) case that the caller is holding the page lock on the page it wishes to map. So that case will have to be addressed in future patches. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
In the case that we needed to grow the height of the metadata tree we were looking up the inode buffer and then brelse()ing it despite the fact that it is needed later in the block map process. This patch ensures that we look up the inode's buffer once and only once during the block map process. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
The blocks counter is almost a duplicate of the i_blocks field in the VFS inode. The only difference is that i_blocks can be only 32bits long for 32bit arch without large single file support. Since GFS2 doesn't handle the non-large single file case (for 32 bit anyway) this adds a new config dependency on 64BIT || LSF. This has always been the case, however we've never explicitly said so before. Even if we do add support for the non-LSF case, we will still not require this field to be duplicated since we will not be able to access oversized files anyway. So the net result of all this is that we shave 8 bytes from a gfs2_inode and get our config deps correct. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
This adds a function (currently the only use is during mapping of already allocated blocks, but watch this space) which iterates over a number of pointers in a block and returns the extent length. If the initial pointer is 0 (i.e. unallocated) it will return the number of unallocated blocks in the extent. If the initial pointer is allocated, then it returns the number of contiguously allocated blocks in the extent. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
A dereference was forgotten. This adds it back correctly. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
Rather than having to allocate a single block at a time, this patch allows the block allocator to allocate an extent. Since there is no difference (so far as the block allocator is concerned) between data blocks and indirect blocks, it is posible to allocate a single extent and for the caller to unrevoke just the blocks required for indirect blocks. Currently the only bit of GFS2 to make use of this feature is the build height function. The intention is that gfs2_block_map will be changed to make use of this feature in future patches. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
Thanks to the preceeding patches, the only difference between these two functions is their name. We can thus merge them and call the new function gfs2_alloc_block to reflect the fact that it can allocate either kind of block. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
By adding an extra argument to gfs2_trans_add_unrevoke we can now specify an extent length of blocks to unrevoke. This means that we only need to make one pass through the list for each extent rather than each block. Currently the only extent length which is used is 1, but that will change in the future. Also gfs2_trans_add_unrevoke is removed from gfs2_alloc_meta since its the only difference between this and gfs2_alloc_data which is left. This will allow a future patch to merge these two functions into one (i.e. one call to allocate both data and metadata in a single extent in the future). Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-
由 Steven Whitehouse 提交于
There were three fields being used to keep track of the location of the most recently allocated block for each inode. These have been merged into a single field in order to better keep the data and metadata for an inode close on disk, and also to reduce the space required for storage. Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
-