- 17 10月, 2008 1 次提交
-
-
由 Theodore Ts'o 提交于
The multiblock allocator needs to be able to release blocks (and issue a blkdev discard request) when the transaction which freed those blocks is committed. Previously this was done via a polling mechanism when blocks are allocated or freed. A much better way of doing things is to create a jbd2 callback function and attaching the list of blocks to be freed directly to the transaction structure. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 18 10月, 2008 1 次提交
-
-
由 Manish Katiyar 提交于
Signed-off-by: NManish Katiyar <mkatiyar@gmail.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 16 10月, 2008 2 次提交
-
-
由 Theodore Ts'o 提交于
Let the block device know when unused blocks can be discarded, using the new sb_issue_discard() interface. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Aneesh Kumar K.V 提交于
With this patch we track the block freed during a transaction using red-black tree. We also make sure contiguous blocks freed are collected in one node in the tree. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 14 10月, 2008 1 次提交
-
-
由 Aneesh Kumar K.V 提交于
We should use kmem_cache_free to free memory allocated via kmem_cache_alloc Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 10 10月, 2008 2 次提交
-
-
由 Frederic Bohe 提交于
This fixes a bug which caused on-line resizing of filesystems with a 1k blocksize to fail. The root cause of this bug was the fact that if an uninitalized bitmap block gets read in by userspace (which e2fsprogs does try to avoid, but can happen when the blocksize is less than the pagesize and an adjacent blocks is read into memory) ext4_read_block_bitmap() was erroneously depending on the buffer uptodate flag to decide whether it needed to initialize the bitmap block in memory --- i.e., to set the standard set of blocks in use by a block group (superblock, bitmaps, inode table, etc.). Essentially, ext4_read_block_bitmap() assumed it was the only routine that might try to read a block containing a block bitmap, which is simply not true. To fix this, ext4_read_block_bitmap() and ext4_read_inode_bitmap() must always initialize uninitialized bitmap blocks. Once a block or inode is allocated out of that bitmap, it will be marked as initialized in the block group descriptor, so in general this won't result any extra unnecessary work. Signed-off-by: NFrederic Bohe <frederic.bohe@bull.net> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 24 9月, 2008 1 次提交
-
-
由 Theodore Ts'o 提交于
Previously mballoc created a separate set of functions for each proc file. This combines the tunables into a single set of functions which gets used for all of the per-superblock proc files, saving approximately 2k of compiled object code. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 23 9月, 2008 1 次提交
-
-
由 Theodore Ts'o 提交于
...and into the core setup/teardown code in fs/ext4/super.c so that other parts of ext4 can define tuning parameters. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 14 9月, 2008 2 次提交
-
-
由 Eric Sandeen 提交于
lg_prealloc_list seems to cry out for a per-cpu data structure; on a large smp system I think this should be better. I've lightly tested this change on a 4-cpu system. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Acked-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Alexey Dobriyan 提交于
ext4 creates per-suberblock directory in /proc/ext4/ . Name used as basis is taken from bdevname, which, surprise, can contain slash. However, proc while allowing to use proc_create("a/b", parent) form of PDE creation, assumes that parent/a was already created. bdevname in question is 'cciss/c0d0p9', directory is not created and all this stuff goes directly into /proc (which is real bug). Warning comes when _second_ partition is mounted. http://bugzilla.kernel.org/show_bug.cgi?id=11321Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 10 10月, 2008 1 次提交
-
-
由 Aneesh Kumar K.V 提交于
This patch adds dirty block accounting using percpu_counters. Delayed allocation block reservation is now done by updating dirty block counter. In a later patch we switch to non delalloc mode if the filesystem free blocks is greater than 150% of total filesystem dirty blocks Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao<cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 09 9月, 2008 1 次提交
-
-
由 Aneesh Kumar K.V 提交于
During block reservation if we don't have enough blocks left, retry block reservation with smaller block counts. This makes sure we try fallocate and DIO with smaller request size and don't fail early. The delayed allocation reservation cannot try with smaller block count. So retry block reservation to handle temporary disk full conditions. Also print free blocks details if we fail block allocation during writepages. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 09 10月, 2008 1 次提交
-
-
由 Aneesh Kumar K.V 提交于
With delayed allocation we need to make sure block are reserved before we attempt to allocate them. Otherwise we get block allocation failure (ENOSPC) during writepages which cannot be handled. This would mean silent data loss (We do a printk stating data will be lost). This patch updates the DIO and fallocate code path to do block reservation before block allocation. This is needed to make sure parallel DIO and fallocate request doesn't take block out of delayed reserve space. When free blocks count go below a threshold we switch to a slow patch which looks at other CPU's accumulated percpu counter values. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 09 9月, 2008 1 次提交
-
-
由 Theodore Ts'o 提交于
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 19 8月, 2008 1 次提交
-
-
由 Aneesh Kumar K.V 提交于
For small file block allocations, mballoc uses per cpu prealloc space. Use goal block when searching for the right prealloc space. Also make sure ext4_da_writepages tries to write all the pages for small files in single attempt Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 24 7月, 2008 3 次提交
-
-
由 Aneesh Kumar K.V 提交于
Currently, the locality group prealloc list is freed only when there is a block allocation failure. This can result in large number of entries in the preallocation list making ext4_mb_use_preallocated() expensive. To fix this, we convert the locality group prealloc list to a hash list. The hash index is the order of number of blocks in the prealloc space with a max order of 9. When adding prealloc space to the list we make sure total entries for each order does not exceed 8. If it is more than 8 we discard few entries and make sure the we have only <= 5 entries. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Aneesh Kumar K.V 提交于
NR_CPUS can be really large. We should be using nr_cpu_ids instead. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Aneesh Kumar K.V 提交于
Don't call BUG_ON on file system failures. Instead use ext4_error and also handle the continue case properly. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 03 8月, 2008 1 次提交
-
-
由 Eric Sandeen 提交于
I noticed when filling a 1T filesystem with 4 threads using the fs_mark benchmark: fs_mark -d /mnt/test -D 256 -n 100000 -t 4 -s 20480 -F -S 0 that I occasionally got checksum mismatch errors: EXT4-fs error (device sdb): ext4_init_inode_bitmap: Checksum bad for group 6935 etc. I'd reliably get 4-5 of them during the run. It appears that the problem is likely a race to init the bg's when the uninit_bg feature is enabled. With the patch below, which adds sb_bgl_locking around initialization, I was able to complete several runs with no errors or warnings. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 15 7月, 2008 1 次提交
-
-
由 Mingming Cao 提交于
This patch does block reservation for delayed allocation, to avoid ENOSPC later at page flush time. Blocks(data and metadata) are reserved at da_write_begin() time, the freeblocks counter is updated by then, and the number of reserved blocks is store in per inode counter. At the writepage time, the unused reserved meta blocks are returned back. At unlink/truncate time, reserved blocks are properly released. Updated fix from Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> to fix the oldallocator block reservation accounting with delalloc, added lock to guard the counters and also fix the reservation for meta blocks. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 12 7月, 2008 5 次提交
-
-
由 Frederic Bohe 提交于
Update group infos when updating a group's descriptor. Add group infos when adding a group's descriptor. Refresh cache pages used by mb_alloc when changes occur. This will probably need modifications when META_BG resizing will be allowed. Signed-off-by: NFrederic Bohe <frederic.bohe@bull.net> Signed-off-by: NMingming Cao <cmm@us.ibm.com>
-
由 Mingming Cao 提交于
mballoc allocation missed check for blocks reserved for root users. Add ext4_has_free_blocks() check before allocation. Also modified ext4_has_free_blocks() to support multiple block allocation request. Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Aneesh Kumar K.V 提交于
Move the code related to block allocation to a single function and add helper funtions to differient allocation for data and meta data blocks Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Shen Feng 提交于
Quota allocation is not removed when ext4_mb_new_blocks calls kmem_cache_alloc failed. Also make sure the allocation context is freed on the error path. Signed-off-by: NShen Feng <shen@cn.fujitsu.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Jose R. Santos 提交于
This patch mostly controls the way inode are allocated in order to make ialloc aware of flex_bg block group grouping. It achieves this by bypassing the Orlov allocator when block group meta-data are packed toghether through mke2fs. Since the impact on the block allocator is minimal, this patch should have little or no effect on other block allocation algorithms. By controlling the inode allocation, it can basically control where the initial search for new block begins and thus indirectly manipulate the block allocator. This allocator favors data and meta-data locality so the disk will gradually be filled from block group zero upward. This helps improve performance by reducing seek time. Since the group of inode tables within one flex_bg are treated as one giant inode table, uninitialized block groups would not need to partially initialize as many inode table as with Orlov which would help fsck time as the filesystem usage goes up. Signed-off-by: NJose R. Santos <jrs@us.ibm.com> Signed-off-by: NValerie Clement <valerie.clement@bull.net> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 14 7月, 2008 2 次提交
-
-
由 Shen Feng 提交于
The error processing of the return value of mb_free_blocks is meanless because it only returns 0. This fix includes - make mb_free_blocks return void - remove the error processing part in callers - unlock group before calling ext4_error in mb_free_blocks Signed-off-by: NShen Feng <shen@cn.fujitsu.com> Cc: Mingming Cao <cmm@us.ibm.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Shen Feng 提交于
When the directory fs/ext4 is not correctly created under proc, the entry under this directory should not be created. Signed-off-by: NShen Feng <shen@cn.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 12 7月, 2008 7 次提交
-
-
由 Theodore Ts'o 提交于
Since this a non-static function, make it be ext4 specific to avoid conflicts with potentially other filesystems. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Shen Feng 提交于
ext4_mb_seq_history_open(): check if sbi->s_mb_history is NULL ext4_mb_history_init(): replace kmalloc and memset with kzalloc ext4_mb_init_backend(): remove memset since kzalloc is used ext4_mb_init(): the return value of ext4_mb_init_backend is int, but i is unsigned, replace it with a new int variable. Signed-off-by: NShen Feng <shen@cn.fujitsu.com> Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Shen Feng 提交于
Add error processing for ext4_mb_load_buddy when it calls ext4_mb_init_cache. Signed-off-by: NShen Feng <shen@cn.fujitsu.com> Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Mingming Cao 提交于
ext4_mb_init_cache() incorrectly always return EIO on success. This causes the caller of ext4_mb_init_cache() fail when it checks the return value. Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Alexey Dobriyan 提交于
Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Cc: Mingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Aneesh Kumar K.V 提交于
With mballoc we search for the best extent using different criteria. We should always use the goal group when we are starting with a new criteria. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Aneesh Kumar K.V 提交于
Some architectures implement ext4_find_next_bit and ext4_find_next_zero_bit in such a way that they return greater than max for some input values. Make sure mb_find_next_bit and mb_find_next_zero_bit return the right values. On 2.6.25 we have include/asm-x86/bitops_32.h static inline unsigned find_first_bit(const unsigned long *addr, unsigned size) { unsigned x = 0; while (x < size) { unsigned long val = *addr++; if (val) return __ffs(val) + x; x += (sizeof(*addr)<<3); } return x; } This can return value greater than size. Reported and fixed here for lustre https://bugzilla.lustre.org/show_bug.cgi?id=15932 https://bugzilla.lustre.org/attachment.cgi?id=17205Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 06 6月, 2008 1 次提交
-
-
由 Aneesh Kumar K.V 提交于
Fix use of uninitialized data with debug enabled. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 16 5月, 2008 1 次提交
-
-
由 Aneesh Kumar K.V 提交于
If the block allocator gets blocks out of system zone ext4 calls ext4_error. But if the file system is mounted with errors=continue retry block allocation. We need to mark the system zone blocks as in use to make sure retry don't pick them again System zone is the block range mapping block bitmap, inode bitmap and inode table. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 14 5月, 2008 1 次提交
-
-
由 Valerie Clement 提交于
In case of inode preallocation, the number of blocks to allocate depends on the file size and it is calculated in ext4_mb_normalize_request(). Each group in the filesystem is then checked to find one that can be used for allocation; this is done in ext4_mb_good_group(). When a file bigger than 4MB is created, the requested number of blocks to preallocate, calculated by ext4_mb_normalize_request is 4096. However for a filesystem with 1KB block size, the maximum size of the block buddies used by the multiblock allocator is 2048, so none of groups in the filesystem satisfies the search criteria in ext4_mb_good_group(). Scanning all the filesystem groups impacts performance. This was demonstrated by using a freshly created, 70GB, 1k block filesystem, with caches dropped write before the test via /proc/sys/vm/drop_caches, and with the filesystem mounted with nodelalloc and nodealloc,nomballoc. The time to write an 8 megabyte file using "dd if=/dev/zero of=/mnt/test/fo bs=8k count=1k conv=fsync" took 35.5091 seconds (236kB/s) with nodellaloc, and 0.233754 seconds (35.9 MB/s) with the nodelloc,nomballoc options. With a 1TB partition, it took several minutes to write 8MB! This patch modifies the algorithm in ext4_mb_normalize_group_request to calculate the number of blocks to allocate by taking into account the maximum size of free blocks chunks handled by the multiblock allocator. It has also been tested for filesystems with 2KB and 4KB block sizes to ensure that those cases don't regress. Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NValerie Clement <valerie.clement@bull.net> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 13 5月, 2008 1 次提交
-
-
由 Jean Delvare 提交于
bdevname() fills the buffer that it is given as a parameter, so calling strcpy() or snprintf() on the returned value is redundant (and probably not guaranteed to work - I don't think strcpy and snprintf support overlapping buffers.) Signed-off-by: NJean Delvare <khali@linux-fr.org> Cc: Stephen Tweedie <sct@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 30 4月, 2008 1 次提交
-
-
由 Roel Kluin 提交于
In ext4_mb_init_backend() 'i' is of type ext4_group_t. Since unsigned, i >= 0 is always true, so fix hot spins after err_freebuddy: and -meta: and prevent decrements when zero. Signed-off-by: NRoel Kluin <12o3l@tiscali.nl> Signed-off-by: NMingming Cao <cmm@us.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-