- 23 4月, 2009 2 次提交
-
-
由 Theodore Ts'o 提交于
If the Orlov allocator is having trouble finding an appropriate block group, the fallback code could loop forever, causing a soft lockup warning in find_group_orlov(): BUG: soft lockup - CPU#0 stuck for 61s! [cp:11728] ... Pid: 11728, comm: cp Not tainted (2.6.30-rc1-dirty #77) Lenovo EIP: 0060:[<c021650e>] EFLAGS: 00000246 CPU: 0 EIP is at ext4_get_group_desc+0x54/0x9d ... Call Trace: [<c0218021>] find_group_orlov+0x2ee/0x334 [<c0120a5f>] ? sched_clock+0x8/0xb [<c02188e3>] ext4_new_inode+0x2cf/0xb1a Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Instead of just checking that the extent block number is greater or equal than s_first_data_block, make sure it it is not pointing into the block group descriptors, since that is clearly wrong. This helps prevent filesystem from getting very badly corrupted in case an extent block is corrupted. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 15 4月, 2009 1 次提交
-
-
由 Nikanth Karthikesan 提交于
Remove code handling bio_alloc failure with __GFP_WAIT. GFP_NOIO implies __GFP_WAIT. Signed-off-by: NNikanth Karthikesan <knikanth@suse.de> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
- 14 4月, 2009 1 次提交
-
-
由 Chuck Ebbert 提交于
Missing braces caused the warning to print more than once. Signed-Off-By: NChuck Ebbert <cebbert@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 08 4月, 2009 1 次提交
-
-
由 From: Thiemo Nagel 提交于
Signed-off-by: NThiemo Nagel <thiemo.nagel@ph.tum.de> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 05 4月, 2009 1 次提交
-
-
由 Thiemo Nagel 提交于
Signed-off-by: NThiemo Nagel <thiemo.nagel@ph.tum.de> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 08 4月, 2009 1 次提交
-
-
由 Thiemo Nagel 提交于
Commit fe2c8191 introduced a regression on big-endian system, because the checks to make sure block references in non-extent inodes are valid failed to use le32_to_cpu(). Reported-by: NAlexander Beregalov <a.beregalov@gmail.com> Signed-off-by: NThiemo Nagel <thiemo.nagel@ph.tum.de> Tested-by: NAlexander Beregalov <a.beregalov@gmail.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org
-
- 01 4月, 2009 2 次提交
-
-
由 Nick Piggin 提交于
Change the page_mkwrite prototype to take a struct vm_fault, and return VM_FAULT_xxx flags. There should be no functional change. This makes it possible to return much more detailed error information to the VM (and also can provide more information eg. virtual_address to the driver, which might be important in some special cases). This is required for a subsequent fix. And will also make it easier to merge page_mkwrite() with fault() in future. Signed-off-by: NNick Piggin <npiggin@suse.de> Cc: Chris Mason <chris.mason@oracle.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <joel.becker@oracle.com> Cc: Artem Bityutskiy <dedekind@infradead.org> Cc: Felix Blyakher <felixb@sgi.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Al Viro 提交于
current->fs->umask is what most of fs_struct users are doing. Put that into a helper function. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 30 3月, 2009 1 次提交
-
-
由 Matt LaPlante 提交于
Signed-off-by: NMatt LaPlante <kernel1@cyberdogtech.com> Acked-by: NRandy Dunlap <randy.dunlap@oracle.com> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-
- 28 3月, 2009 4 次提交
-
-
由 Theodore Ts'o 提交于
Add support for using the mount options "barrier" and "nobarrier", and "auto_da_alloc" and "noauto_da_alloc", which is more consistent than "barrier=<0|1>" or "auto_da_alloc=<0|1>". Most other ext3/ext4 mount options use the foo/nofoo naming convention. We allow the old forms of these mount options for backwards compatibility. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Smatch (http://repo.or.cz/w/smatch.git/) complains about the locking in ext4_mb_add_n_trim() from fs/ext4/mballoc.c 4438 list_for_each_entry_rcu(tmp_pa, &lg->lg_prealloc_list[order], 4439 pa_inode_list) { 4440 spin_lock(&tmp_pa->pa_lock); 4441 if (tmp_pa->pa_deleted) { 4442 spin_unlock(&pa->pa_lock); 4443 continue; 4444 } Brown paper bag time... Reported-by: NDan Carpenter <error27@gmail.com> Reviewed-by: NEric Sandeen <sandeen@redhat.com> Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@gmail.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org
-
由 Dan Carpenter 提交于
This was found by smatch (http://repo.or.cz/w/smatch.git/) Signed-off-by: NDan Carpenter <error27@gmail.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org
-
由 Aneesh Kumar K.V 提交于
Impact: code cleanup This patch rename pa_linear to pa_type and add MB_INODE_PA and MB_GROUP_PA to indicate inode and group prealloc space. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Reviewed-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 31 3月, 2009 1 次提交
-
-
由 Thiemo Nagel 提交于
Check block references in the inode and indorect blocks for non-extent inodes to make sure they are valid, and flag an error if they are invalid. Signed-off-by: NThiemo Nagel <thiemo.nagel@ph.tum.de> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 26 3月, 2009 4 次提交
-
-
由 Theodore Ts'o 提交于
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Jan Kara 提交于
Use lowercase names of quota functions instead of old uppercase ones. Signed-off-by: NJan Kara <jack@suse.cz> Acked-by: NMingming Cao <cmm@us.ibm.com> CC: linux-ext4@vger.kernel.org
-
由 Mingming Cao 提交于
Uses quota reservation/claim/release to handle quota properly for delayed allocation in the three steps: 1) quotas are reserved when data being copied to cache when block allocation is defered 2) when new blocks are allocated. reserved quotas are converted to the real allocated quota, 2) over-booked quotas for metadata blocks are released back. Signed-off-by: NMingming Cao <cmm@us.ibm.com> Acked-by: N"Theodore Ts'o" <tytso@mit.edu> Signed-off-by: NJan Kara <jack@suse.cz>
-
由 Jan Kara 提交于
ext4_dquot_initialize() and ext4_dquot_drop() is no longer needed because of modified quota locking. Signed-off-by: NJan Kara <jack@suse.cz>
-
- 17 3月, 2009 2 次提交
-
-
由 Eric Sandeen 提交于
This is for Red Hat bug 490026: EXT4 panic, list corruption in ext4_mb_new_inode_pa ext4_lock_group(sb, group) is supposed to protect this list for each group, and a common code flow to remove an album is like this: ext4_get_group_no_and_offset(sb, pa->pa_pstart, &grp, NULL); ext4_lock_group(sb, grp); list_del(&pa->pa_group_list); ext4_unlock_group(sb, grp); so it's critical that we get the right group number back for this prealloc context, to lock the right group (the one associated with this pa) and prevent concurrent list manipulation. however, ext4_mb_put_pa() passes in (pa->pa_pstart - 1) with a comment, "-1 is to protect from crossing allocation group". This makes sense for the group_pa, where pa_pstart is advanced by the length which has been used (in ext4_mb_release_context()), and when the entire length has been used, pa_pstart has been advanced to the first block of the next group. However, for inode_pa, pa_pstart is never advanced; it's just set once to the first block in the group and not moved after that. So in this case, if we subtract one in ext4_mb_put_pa(), we are actually locking the *previous* group, and opening the race with the other threads which do not subtract off the extra block. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Add a mount option which allows the user to disable automatic allocation of blocks whose allocation by delayed allocation when the file was originally truncated or when the file is renamed over an existing file. This feature is intended to save users from the effects of naive application writers, but it reduces the effectiveness of the delayed allocation code. This mount option disables this safety feature, which may be desirable for prodcutions systems where the risk of unclean shutdowns or unexpected system crashes is low. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 14 3月, 2009 1 次提交
-
-
由 Eric Sandeen 提交于
Thiemo Nagel reported that: # dd if=/dev/zero of=image.ext4 bs=1M count=2 # mkfs.ext4 -v -F -b 1024 -m 0 -g 512 -G 4 -I 128 -N 1 \ -O large_file,dir_index,flex_bg,extent,sparse_super image.ext4 # mount -o loop image.ext4 mnt/ # dd if=/dev/zero of=mnt/file oopsed, with a BUG_ON in ext4_mb_normalize_request because size == EXT4_BLOCKS_PER_GROUP It appears to me (esp. after talking to Andreas) that the BUG_ON is bogus; a request of exactly EXT4_BLOCKS_PER_GROUP should be allowed, though larger sizes do indicate a problem. Fix that an another (apparently rare) codepath with a similar check. Reported-by: NThiemo Nagel <thiemo.nagel@ph.tum.de> Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 13 3月, 2009 1 次提交
-
-
由 Theodore Ts'o 提交于
This is a short-term warning, and even printk_ratelimit() can result in too much noise in system logs. So only print it once as a warning. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 11 3月, 2009 1 次提交
-
-
由 Eric Sandeen 提交于
The ext4_ext_search_right() function is confusing; it uses a "depth" variable which is 0 at the root and maximum at the leaves, but the on-disk metadata uses a "depth" (actually eh_depth) which is opposite: maximum at the root, and 0 at the leaves. The ext4_ext_check_header() function is given a depth and checks the header agaisnt that depth; it expects the on-disk semantics, but we are giving it the opposite in the while loop in this function. We should be giving it the on-disk notion of "depth" which we can get from (p_depth - depth) - and if you look, the last (more commonly hit) call to ext4_ext_check_header() does just this. Sending in the wrong depth results in (incorrect) messages about corruption: EXT4-fs error (device sdb1): ext4_ext_search_right: bad header in inode #2621457: unexpected eh_depth - magic f30a, entries 340, max 340(0), depth 1(2) http://bugzilla.kernel.org/show_bug.cgi?id=12821Reported-by: NDavid Dindorp <ddi@dubex.dk> Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 05 3月, 2009 2 次提交
-
-
由 Theodore Ts'o 提交于
Instead of looping over all of the block groups in a flex group summing their summary statistics, start tracking used_dirs in struct flex_groups, and use struct flex_groups instead. This should save a bit of CPU for mkdir-heavy workloads. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Reduce pressure on the sb_bgl_lock family of locks by using atomic_t's to track the number of free blocks and inodes in each flex_group. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 31 3月, 2009 2 次提交
-
-
由 Theodore Ts'o 提交于
Remove tuning knobs in /proc/fs/ext4/<dev/* since they have been replaced by knobs in sysfs at /sys/fs/ext4/<dev>/*. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Add basic sysfs support so that information about the mounted filesystem and various tuning parameters can be accessed via /sys/fs/ext4/<dev>/*. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 05 3月, 2009 1 次提交
-
-
由 Eric Sandeen 提交于
I was seeing fsck errors on inode bitmaps after a 4 thread dbench run on a 4 cpu machine: Inode bitmap differences: -50736 -(50752--50753) etc... I believe that this is because ext4_free_inode() uses atomic bitops, and although ext4_new_inode() *used* to also use atomic bitops for synchronization, commit 39341867 changed this to use the sb_bgl_lock, so that we could also synchronize against read_inode_bitmap and initialization of uninit inode tables. However, that change left ext4_free_inode using atomic bitops, which I think leaves no synchronization between setting & unsetting bits in the inode table. The below patch fixes it for me, although I wonder if we're getting at all heavy-handed with this spinlock... Signed-off-by: NEric Sandeen <sandeen@redhat.com> Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 01 3月, 2009 1 次提交
-
-
由 Theodore Ts'o 提交于
Add a new superblock value which tracks the lifetime amount of writes to the filesystem. This is useful in estimating the amount of wear on solid state drives (SSD's) caused by writes to the filesystem. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 28 3月, 2009 1 次提交
-
-
由 Aneesh Kumar K.V 提交于
With delayed allocation we should not/cannot discard inode prealloc space during file close. We would still have dirty pages for which we haven't allocated blocks yet. With this fix after each get_blocks request we check whether we have zero reserved blocks and if yes and we don't have any writers on the file we discard inode prealloc space. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 26 2月, 2009 1 次提交
-
-
由 Eric Sandeen 提交于
Running without a journal, I oopsed when I ran out of space, because we called jbd2_journal_force_commit_nested() from ext4_should_retry_alloc() without a journal. This should take care of it, I think. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 28 2月, 2009 1 次提交
-
-
由 Theodore Ts'o 提交于
Commit c4be0c1d added error checking to ext4_freeze() when calling ext4_commit_super(). Unfortunately the patch failed to remove the original call to ext4_commit_super(), with the net result that when freezing the filesystem, the superblock gets written twice, the first time without error checking. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 24 2月, 2009 2 次提交
-
-
由 Theodore Ts'o 提交于
When renaming a file such that a link to another inode is overwritten, force any delay allocated blocks that to be allocated so that if the filesystem is mounted with data=ordered, the data blocks will be pushed out to disk along with the journal commit. Many application programs expect this, so we do this to avoid zero length files if the system crashes unexpectedly. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
When closing a file that had been previously truncated, force any delay allocated blocks that to be allocated so that if the filesystem is mounted with data=ordered, the data blocks will be pushed out to disk along with the journal commit. Many application programs expect this, so we do this to avoid zero length files if the system crashes unexpectedly. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 26 2月, 2009 1 次提交
-
-
由 Theodore Ts'o 提交于
Add an ioctl which forces all of the delay allocated blocks to be allocated. This also provides a function ext4_alloc_da_blocks() which will be used by the following commits to force files to be fully allocated to preserve application-expected ext3 behaviour. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 24 2月, 2009 1 次提交
-
-
由 Theodore Ts'o 提交于
The mpage_da_writepages() function is only used in one place, so inline it to simplify the call stack and make the code easier to understand. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 23 2月, 2009 2 次提交
-
-
由 Theodore Ts'o 提交于
Struct mpage_da_data and mpage_add_bh_to_extent() use a fake struct buffer_head which is 104 bytes on an x86_64 system, but only use 24 bytes of the structure. On systems that use a spinlock for atomic_t, the stack savings will be even greater. It turns out that using a fake struct buffer_head doesn't even save that much code, and it makes the code more confusing since it's not used as a "real" buffer head. So just store pass b_size and b_state in mpage_add_bh_to_extent(), and store b_size, b_state, and b_block_nr in the mpage_da_data structure. Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
This parameter was always set to ext4_da_get_block_write(). Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-
- 28 3月, 2009 1 次提交
-
-
由 Aneesh Kumar K.V 提交于
Make sure we validate extent details only when read from the disk. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NThiemo Nagel <thiemo.nagel@ph.tum.de> Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
-