- 10 9月, 2011 7 次提交
-
-
由 Aditya Kali 提交于
With bigalloc changes, the i_blocks value was not correctly set (it was still set to number of blocks being used, but in case of bigalloc, we want i_blocks to represent the number of clusters being used). Since the quota subsystem sets the i_blocks value, this patch fixes the quota accounting and makes sure that the i_blocks value is set correctly. Signed-off-by: NAditya Kali <adityakali@google.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
The default group preallocation size had been previously set to 512 blocks/clusters, regardless of the block/cluster size. This is probably too big for large cluster sizes. So adjust the default so that it is 2 megabytes or 32 clusters, whichever is larger. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Convert the free_blocks to be free_clusters to make the final revised bigalloc changes easier to read/understand. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Convert the percpu counters s_dirtyblocks_counter and s_freeblocks_counter in struct ext4_super_info to be s_dirtyclusters_counter and s_freeclusters_counter. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
The ext4_free_blocks() function now has two new flags that indicate whether a partial cluster at the beginning or the end of the block extents should be freed or not. That will be up the caller (i.e., truncate), who can figure out whether partial clusters at the beginning or the end of a block range can be freed. We also have to update the ext4_mb_free_metadata() and release_blocks_on_commit() machinery to be cluster-based, since it is used by ext4_free_blocks(). Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
In most of mballoc.c, we do everything in units of clusters, since the block allocation bitmaps and buddy bitmaps are all denominated in clusters. The one place where we do deal with absolute block numbers is in the code that handles the preallocation regions, since in the case of inode-based preallocation regions, the start of the preallocation region can't be relative to the beginning of the group. So this adds a bit of complexity, where pa_pstart and pa_lstart are block numbers, while pa_free, pa_len, and fe_len are denominated in units of clusters. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Change the places in fs/ext4/mballoc.c where EXT4_BLOCKS_PER_GROUP are used to indicate the number of bits in a block bitmap (which is really a cluster allocation bitmap in bigalloc file systems). There are still some places in the ext4 codebase where usage of EXT4_BLOCKS_PER_GROUP needs to be audited/fixed, in code paths that aren't used given the initial restricted assumptions for bigalloc. These will need to be fixed before we can relax those restrictions. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 02 8月, 2011 3 次提交
-
-
由 Yu Jian 提交于
In ext4_mb_init(), if the s_locality_group allocation fails it will currently cause the allocations made in ext4_mb_init_backend() to be leaked. Moving the ext4_mb_init_backend() allocation after the s_locality_group allocation avoids that problem. Signed-off-by: NYu Jian <yujian@whamcloud.com> Signed-off-by: NAndreas Dilger <adilger@whamcloud.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Yu Jian 提交于
Signed-off-by: NYu Jian <yujian@whamcloud.com> Signed-off-by: NAndreas Dilger <adilger@whamcloud.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 01 8月, 2011 1 次提交
-
-
由 Theodore Ts'o 提交于
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 27 7月, 2011 4 次提交
-
-
由 Yongqiang Yang 提交于
Rename mb_set_bits() to ext4_set_bits() and make it a global function so that setup_new_group_blocks() can use it. Signed-off-by: NYongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Yongqiang Yang 提交于
If ext4_group_add_blocks() is called with 0 block, make it return 0 without doing any extra work. Signed-off-by: NYongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Yongqiang Yang 提交于
This patch lets ext4_group_add_blocks() return an error code if it fails, so that upper functions can handle error correctly. Signed-off-by: NYongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Yongqiang Yang 提交于
Signed-off-by: NYongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 24 7月, 2011 2 次提交
-
-
由 Tao Ma 提交于
In ext4_mb_release, we use s_mb_buddies_generated++. Although the output is OK, but I don't think we need this extra ++. Signed-off-by: NTao Ma <boyu.mt@taobao.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Tao Ma 提交于
ext4_mb_load_buddy() calls ext4_get_group_info() for setting both "grp" and "e4b->bd_info", but it could do "e4b->bd_info = grp". Reported-by: NAndreas Dilger <adilger@whamcloud.com> Signed-off-by: NTao Ma <boyu.mt@taobao.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 18 7月, 2011 1 次提交
-
-
由 Dan Ehrenberg 提交于
Previously, if a stripe width was provided, then it would be used as the preallocation granularity, with no santiy checking and no way to override this. Now, mb_prealloc_size defaults to the smallest multiple of stripe size that is greater than or equal to the old default mb_prealloc_size, and this can be overridden with the sysfs interface. Signed-off-by: NDan Ehrenberg <dehrenberg@google.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 12 7月, 2011 2 次提交
-
-
由 Tao Ma 提交于
If we meet with an error in ext4_mb_add_groupinfo, we kfree sbi->s_group_info[group >> EXT4_DESC_PER_BLOCK_BITS(sb)], but fail to reset it to NULL. So the caller ext4_mb_init_backend will try to kfree it again and causes a double free. So fix it by resetting it to NULL. Some typo in comments of mballoc.c are also changed. Signed-off-by: NTao Ma <boyu.mt@taobao.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Tao Ma 提交于
In ext4_groupinfo_create_slab, we create ext4_groupinfo_caches within ext4_grpinfo_slab_create_mutex, but set it outside the lock, and there does exist some case that we may create it twice and causes a memory leak. So set it before we call mutex_unlock. Signed-off-by: NTao Ma <boyu.mt@taobao.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 11 7月, 2011 6 次提交
-
-
由 Tao Ma 提交于
at ext4_trim_all_free() comment, there is no longer an @e4b parameter, instead it is @group. Reported-by: NAndreas Dilger <adilger@dilger.ca> Signed-off-by: NTao Ma <boyu.mt@taobao.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Tao Ma 提交于
In ext4, when FITRIM is called every time, we iterate all the groups and do trim one by one. It is a bit time wasting if the group has been trimmed and there is no change since the last trim. So this patch adds a new flag in ext4_group_info->bb_state to indicate that the group has been trimmed, and it will be cleared if some blocks is freed(in release_blocks_on_commit). Another trim_minlen is added in ext4_sb_info to record the last minlen we use to trim the volume, so that if the caller provide a small one, we will go on the trim regardless of the bb_state. A simple test with my intel x25m ssd: df -h shows: /dev/sdb1 40G 21G 17G 56% /mnt/ext4 Block size: 4096 run the FITRIM with the following parameter: range.start = 0; range.len = UINT64_MAX; range.minlen = 1048576; without the patch: [root@boyu-tm linux-2.6]# time ./ftrim /mnt/ext4/a real 0m5.505s user 0m0.000s sys 0m1.224s [root@boyu-tm linux-2.6]# time ./ftrim /mnt/ext4/a real 0m5.359s user 0m0.000s sys 0m1.178s [root@boyu-tm linux-2.6]# time ./ftrim /mnt/ext4/a real 0m5.228s user 0m0.000s sys 0m1.151s with the patch: [root@boyu-tm linux-2.6]# time ./ftrim /mnt/ext4/a real 0m5.625s user 0m0.000s sys 0m1.269s [root@boyu-tm linux-2.6]# time ./ftrim /mnt/ext4/a real 0m0.002s user 0m0.000s sys 0m0.001s [root@boyu-tm linux-2.6]# time ./ftrim /mnt/ext4/a real 0m0.002s user 0m0.000s sys 0m0.001s A big improvement for the 2nd and 3rd run. Even after I delete some big image files, it is still much faster than iterating the whole disk. [root@boyu-tm test]# time ./ftrim /mnt/ext4/a real 0m1.217s user 0m0.000s sys 0m0.196s Cc: Lukas Czerner <lczerner@redhat.com> Reviewed-by: NAndreas Dilger <adilger.kernel@dilger.ca> Signed-off-by: NTao Ma <boyu.mt@taobao.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Tao Ma 提交于
Add ext4_trim_extent and ext4_trim_all_free. Reviewed-by: NLukas Czerner <lczerner@redhat.com> Signed-off-by: NTao Ma <boyu.mt@taobao.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Tao Ma 提交于
When we trim some free blocks in a group of ext4, we need to calculate the free blocks properly and check whether there are enough freed blocks left for us to trim. Current solution will only calculate free spaces if they are large for a trim which isn't appropriate. Let us see a small example: a group has 1.5M free which are 300k, 300k, 300k, 300k, 300k. And minblocks is 1M. With current solution, we have to iterate the whole group since these 300k will never be subtracted from 1.5M. But actually we should exit after we find the first 2 free spaces since the left 3 chunks only sum up to 900K if we subtract the first 600K although they can't be trimed. Reviewed-by: NAndreas Dilger <adilger@dilger.ca> Signed-off-by: NTao Ma <boyu.mt@taobao.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Tao Ma 提交于
In 0f0a25bf, we adjust 'len' with s_first_data_block - start, but it could underflow in case blocksize=1K, fstrim_range.len=512 and fstrim_range.start = 0. In this case, when we run the code: len -= first_data_blk - start; len will be underflow to -1ULL. In the end, although we are safe that last_group check later will limit the trim to the whole volume, but that isn't what the user really want. So this patch fix it. It also adds the check for 'start' like ext3 so that we can break immediately if the start is invalid. Cc: Lukas Czerner <lczerner@redhat.com> Signed-off-by: NTao Ma <boyu.mt@taobao.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Maxim Patlasov 提交于
The current implementation of ext4_free_blocks() always calls dquot_free_block This looks quite sensible in the most cases: blocks to be freed are associated with inode and were accounted in quota and i_blocks some time ago. However, there is a case when blocks to free were not accounted by the time calling ext4_free_blocks() yet: 1. delalloc is on, write_begin pre-allocated some space in quota 2. write-back happens, ext4 allocates some blocks in ext4_ext_map_blocks() 3. then ext4_ext_map_blocks() gets an error (e.g. ENOSPC) from ext4_ext_insert_extent() and calls ext4_free_blocks(). In this scenario, ext4_free_blocks() calls dquot_free_block() who, in turn, decrements i_blocks for blocks which were not accounted yet (due to delalloc) After clean umount, e2fsck reports something like: > Inode 21, i_blocks is 5080, should be 5128. Fix<y>? because i_blocks was erroneously decremented as explained above. The patch fixes the problem by passing the new flag EXT4_FREE_BLOCKS_NO_QUOT_UPDATE to ext4_free_blocks(), to request that the dquot_free_block() call be skipped. Signed-off-by: NMaxim Patlasov <maxim.patlasov@gmail.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org
-
- 28 6月, 2011 1 次提交
-
-
由 Yongqiang Yang 提交于
Unused variables was deleted. Signed-off-by: NYongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 06 6月, 2011 2 次提交
-
-
由 Lukas Czerner 提交于
While creating fixed tracepoints for ext3, basically by porting them from ext4, I found a lot of useless retyping, wrong type usage, useless variable passing and other inconsistencies in the ext4 fixed tracepoint code. This patch cleans the fixed tracepoint code for ext4 and also simplify some of them. Signed-off-by: NLukas Czerner <lczerner@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Yongqiang Yang 提交于
metadata is not parameter of ext4_free_blocks() any more. Signed-off-by: NYongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 25 5月, 2011 3 次提交
-
-
由 Allison Henderson 提交于
This patch adds an allocation request flag to the ext4_has_free_blocks function which enables the use of reserved blocks. This will allow a punch hole to proceed even if the disk is full. Punching a hole may require additional blocks to first split the extents. Because ext4_has_free_blocks is a low level function, the flag needs to be passed down through several functions listed below: ext4_ext_insert_extent ext4_ext_create_new_leaf ext4_ext_grow_indepth ext4_ext_split ext4_ext_new_meta_block ext4_mb_new_blocks ext4_claim_free_blocks ext4_has_free_blocks [ext4 punch hole patch series 1/5 v7] Signed-off-by: NAllison Henderson <achender@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Reviewed-by: NMingming Cao <cmm@us.ibm.com>
-
由 Lukas Czerner 提交于
We should protect reading bd_info->bb_first_free with the group lock because otherwise we might miss some free blocks. This is not a big deal at all, but the change to do right thing is really simple, so lets do that. Signed-off-by: NLukas Czerner <lczerner@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Lukas Czerner 提交于
Currently we are loading buddy ext4_mb_load_buddy() for every block group we are going through in ext4_trim_fs() in many cases just to find out that there is not enough space to be bothered with. As Amir Goldstein suggested we can use bb_free information directly from ext4_group_info. This commit removes ext4_mb_load_buddy() from ext4_trim_fs() and rather get the ext4_group_info via ext4_get_group_info() and use the bb_free information directly from that. This avoids unnecessary call to load buddy in the case the group does not have enough free space to trim. Loading buddy is now moved to ext4_trim_all_free(). Tested by me with xfstests 251. Signed-off-by: NLukas Czerner <lczerner@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 10 5月, 2011 4 次提交
-
-
由 Amir Goldstein 提交于
After taking care of all group init races, all that remains is to remove alloc_semp from ext4_allocation_context and ext4_buddy structs. Signed-off-by: NAmir Goldstein <amir73il@users.sf.net> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Amir Goldstein 提交于
After online resize which adds new groups, some of the groups in a buddy page may be initialized and uptodate, while other (new ones) may be uninitialized. The indication for init of new block groups is when ext4_mb_init_cache() is called with an uptodate buddy page. In this case, initialized groups on that buddy page must be skipped when initializing the buddy cache. Signed-off-by: NAmir Goldstein <amir73il@users.sf.net> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Amir Goldstein 提交于
The old routines ext4_mb_[get|put]_buddy_cache_lock(), which used to take grp->alloc_sem for all groups on the buddy page have been replaced with the routines ext4_mb_[get|put]_buddy_page_lock(). The new routines take both buddy and bitmap page locks to protect against concurrent init of groups on the same buddy page. The GROUP_NEED_INIT flag is tested again under page lock to check if the group was initialized by another caller. Signed-off-by: NAmir Goldstein <amir73il@users.sf.net> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Amir Goldstein 提交于
The old imlementation used to take grp->alloc_sem and set the GROUP_NEED_INIT flag, so that the buddy cache would be reloaded. The new implementation updates the buddy cache by freeing the added blocks and making them available for use, so there is no need to reload the buddy cache and there is no need to take grp->alloc_sem. Signed-off-by: NAmir Goldstein <amir73il@users.sf.net> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 09 5月, 2011 2 次提交
-
-
由 Theodore Ts'o 提交于
The block allocation code used to use jbd2_journal_get_undo_access as a way to make changes that wouldn't show up until the commit took place. The new multi-block allocation code has a its own way of preventing newly freed blocks from getting reused until the commit takes place (it avoids updating the buddy bitmaps until the commit is done), so we don't need to use jbd2_journal_get_undo_access(), which has extra overhead compared to jbd2_journal_get_write_access(). There was one last vestigal use of ext4_journal_get_undo_access() in ext4_add_groupblocks(); change it to use ext4_journal_get_write_access() and then remove the ext4_journal_get_undo_access() support. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Amir Goldstein 提交于
In preparation for the next patch, the function ext4_add_groupblocks() is moved to mballoc.c, where it could use some static functions. Signed-off-by: NAmir Goldstein <amir73il@users.sf.net> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 01 5月, 2011 1 次提交
-
-
由 Theodore Ts'o 提交于
This is an effective revert of commit a30eec2a: "ext4: stop issuing discards if not supported by device". The problem is that there are some devices that may return errors in response to a discard request some times but not others. (One example would be a hybrid dm device which concatenates an SSD and an HDD device). By this logic, I also removed the error checking from ext4's FITRIM code; so that an error from a discard will not stop the FITRIM from trying to trim the rest of the file system. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 17 4月, 2011 1 次提交
-
-
由 Yang Ruirui 提交于
Add missing page_cache_release in the error path of ext4_mb_load_buddy Signed-off-by: NYang Ruirui <ruirui.r.yang@tieto.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org
-