Commits · 9f24e4208f7ee2748f157368b63287dc903fcf60 · bug2833 / cloud-kernel

05 Mar, 2009 1 commit

ext4: Use atomic_t's in struct flex_groups · 9f24e420

Committed by Theodore Ts'o on Mar 04, 2009

Reduce pressure on the sb_bgl_lock family of locks by using atomic_t's
to track the number of free blocks and inodes in each flex_group.
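
As a rough illustration of the idea, the sketch below keeps per-flex_group free counts in atomic variables so that updates need no spinlock. It uses C11 <stdatomic.h> in user space rather than the kernel's atomic_t, and the struct and function names are hypothetical, not taken from the patch.

```c
#include <stdatomic.h>

/* Hypothetical user-space stand-in for the per-flex_group counters. */
struct flex_group_counters {
	atomic_int free_blocks;
	atomic_int free_inodes;
};

/* Account for `count` blocks being allocated from the group; no lock taken. */
static inline void fg_alloc_blocks(struct flex_group_counters *fg, int count)
{
	atomic_fetch_sub(&fg->free_blocks, count);
}

/* Account for one inode being freed back to the group. */
static inline void fg_free_inode(struct flex_group_counters *fg)
{
	atomic_fetch_add(&fg->free_inodes, 1);
}

/* Read the free block count; it may be stale by the time it is used. */
static inline int fg_free_blocks(struct flex_group_counters *fg)
{
	return atomic_load(&fg->free_blocks);
}
```

Each update is a single atomic read-modify-write, so concurrent allocators do not serialize on a shared lock just to adjust a counter.
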
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

9f24e420

31 Mar, 2009 2 commits

ext4: remove /proc tuning knobs · b713a5ec

Committed by Theodore Ts'o on Mar 31, 2009

Remove tuning knobs in /proc/fs/ext4/<dev>/* since they have been
replaced by knobs in sysfs at /sys/fs/ext4/<dev>/*.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

b713a5ec

ext4: Add sysfs support · 3197ebdb

Committed by Theodore Ts'o on Mar 31, 2009

Add basic sysfs support so that information about the mounted
filesystem and various tuning parameters can be accessed via
/sys/fs/ext4/<dev>/*.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

3197ebdb

01 Mar, 2009 1 commit

ext4: Track lifetime disk writes · afc32f7e

Committed by Theodore Ts'o on Feb 28, 2009

Add a new superblock value which tracks the lifetime amount of writes
to the filesystem. This is useful in estimating the amount of wear on
solid state drives (SSD's) caused by writes to the filesystem.
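
One way such a running total could be computed is sketched below. It assumes the block layer exposes a count of 512-byte sectors written and that the superblock stores kilobytes; the function and parameter names are hypothetical and are not taken from the patch.

```c
#include <stdint.h>

/*
 * Lifetime KiB written = value stored in the superblock at mount time
 * plus the sectors written since mount, converted from 512-byte
 * sectors to KiB (two sectors per KiB).
 */
static inline uint64_t lifetime_kbytes_written(uint64_t kbytes_at_mount,
					       uint64_t sectors_at_mount,
					       uint64_t sectors_now)
{
	return kbytes_at_mount + ((sectors_now - sectors_at_mount) >> 1);
}
```
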
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

afc32f7e

28 Mar, 2009 1 commit

ext4: Fix discard of inode prealloc space with delayed allocation. · d6014301

Committed by Aneesh Kumar K.V on Mar 27, 2009

With delayed allocation we should not/cannot discard inode prealloc
space during file close. We would still have dirty pages for which we
haven't allocated blocks yet. With this fix, after each get_blocks
request we check whether we have zero reserved blocks; if so, and
there are no writers on the file, we discard the inode prealloc space.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

d6014301

24 Feb, 2009 2 commits

ext4: Automatically allocate delay allocated blocks on rename · 8750c6d5

Committed by Theodore Ts'o on Feb 23, 2009

When renaming a file such that a link to another inode is overwritten,
force any delay allocated blocks to be allocated so that if the
filesystem is mounted with data=ordered, the data blocks will be
pushed out to disk along with the journal commit.  Many application
programs expect this, so we do this to avoid zero length files if the
system crashes unexpectedly.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

8750c6d5

ext4: Automatically allocate delay allocated blocks on close · 7d8f9f7d

Committed by Theodore Ts'o on Feb 24, 2009

When closing a file that had been previously truncated, force any
delay allocated blocks to be allocated so that if the filesystem
is mounted with data=ordered, the data blocks will be pushed out to
disk along with the journal commit.  Many application programs expect
this, so we do this to avoid zero length files if the system crashes
unexpectedly.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

7d8f9f7d

26 Feb, 2009 1 commit

ext4: add EXT4_IOC_ALLOC_DA_BLKS ioctl · ccd2506b

Committed by Theodore Ts'o on Feb 26, 2009

Add an ioctl which forces all of the delay allocated blocks to be
allocated.  This also provides a function ext4_alloc_da_blocks() which
will be used by the following commits to force files to be fully
allocated to preserve application-expected ext3 behaviour.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ccd2506b

24 Feb, 2009 1 commit

ext4: Simplify delalloc code by removing mpage_da_writepages() · f63e6005

Committed by Theodore Ts'o on Feb 23, 2009

The mpage_da_writepages() function is only used in one place, so
inline it to simplify the call stack and make the code easier to
understand.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

f63e6005

23 Feb, 2009 2 commits

ext4: Save stack space by removing fake buffer heads · 8dc207c0

Committed by Theodore Ts'o on Feb 23, 2009

Struct mpage_da_data and mpage_add_bh_to_extent() use a fake struct
buffer_head which is 104 bytes on an x86_64 system, but only use 24
bytes of the structure.  On systems that use a spinlock for atomic_t,
the stack savings will be even greater.

It turns out that using a fake struct buffer_head doesn't even save
that much code, and it makes the code more confusing since it's not
used as a "real" buffer head.  So just pass b_size and b_state
to mpage_add_bh_to_extent(), and store b_size, b_state, and b_block_nr
in the mpage_da_data structure.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

8dc207c0

ext4: Simplify delalloc implementation by removing mpd.get_block · ed5bde0b

Committed by Theodore Ts'o on Feb 23, 2009

This parameter was always set to ext4_da_get_block_write().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ed5bde0b

28 Mar, 2009 1 commit

ext4: Validate extent details only when read from the disk · 7a262f7c

Committed by Aneesh Kumar K.V on Mar 27, 2009

Make sure we validate extent details only when read from the disk.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Thiemo Nagel <thiemo.nagel@ph.tum.de>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

7a262f7c

12 Mar, 2009 1 commit

ext4: Add checks to validate extent entries. · 56b19868

Committed by Aneesh Kumar K.V on Mar 12, 2009

This patch adds checks to validate the extent entries along with extent
headers, to avoid crashes caused by corrupt filesystems.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

56b19868

23 Feb, 2009 1 commit

ext4: return -EIO not -ESTALE on directory traversal through deleted inode · e6f009b0

Committed by Bryan Donlan on Feb 22, 2009

ext4_iget() returns -ESTALE if invoked on a deleted inode, in order to
report errors to NFS properly.  However, in ext4_lookup(), this
-ESTALE can be propagated to userspace if the filesystem is corrupted
such that a directory entry references a deleted inode.  This leads to
a misleading error message - "Stale NFS file handle" - and confusion
on the part of the admin.

The bug can be easily reproduced by creating a new filesystem, making
a link to an unused inode using debugfs, then mounting and attempting
to ls -l said link.

This patch thus changes ext4_lookup to return -EIO if it receives
-ESTALE from ext4_iget(), as ext4 does for other filesystem metadata
corruption; and also invokes the appropriate ext*_error functions when
this case is detected.
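
The error translation described above can be sketched in user space as follows. The helper name is hypothetical, and the sketch leaves out the error logging the patch also performs; it only shows the mapping of a stale-inode result to an I/O error on the lookup path.

```c
#include <errno.h>

/*
 * Hypothetical helper: translate the result of fetching an inode on
 * behalf of a directory lookup. A directory entry that points at a
 * deleted inode indicates on-disk corruption, so report -EIO instead
 * of letting -ESTALE reach userspace.
 */
static inline int lookup_error_from_iget(int iget_err)
{
	if (iget_err == -ESTALE)
		return -EIO;
	return iget_err;
}
```

Other callers, such as NFS file handle decoding, would keep seeing -ESTALE because only the lookup path applies this mapping.
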
Signed-off-by: Bryan Donlan <bdonlan@gmail.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

e6f009b0

13 Mar, 2009 1 commit

ext4: New inode/block allocation algorithms for flex_bg filesystems · a4912123

Committed by Theodore Ts'o on Mar 12, 2009

The find_group_flex() inode allocator is now only used if the
filesystem is mounted using the "oldalloc" mount option.  It is
replaced with the original Orlov allocator that has been updated for
flex_bg filesystems (it should behave the same way if flex_bg is
disabled).  The inode allocator now functions by taking into account
each flex_bg group, instead of each block group, when deciding whether
or not it's time to allocate a new directory into a fresh flex_bg.

The block allocator has also been changed so that the first block
group in each flex_bg is preferred for use for storing directory
blocks.  This keeps directory blocks close together, which is good for
speeding up e2fsck since large directories are more likely to look
like this:

debugfs:  stat /home/tytso/Maildir/cur
Inode: 1844562   Type: directory    Mode:  0700   Flags: 0x81000
Generation: 1132745781    Version: 0x00000000:0000ad71
User: 15806   Group: 15806   Size: 1060864
File ACL: 0    Directory ACL: 0
Links: 2   Blockcount: 2072
Fragment:  Address: 0    Number: 0    Size: 0
 ctime: 0x499c0ff4:164961f4 -- Wed Feb 18 08:41:08 2009
 atime: 0x499c0ff4:00000000 -- Wed Feb 18 08:41:08 2009
 mtime: 0x49957f51:00000000 -- Fri Feb 13 09:10:25 2009
crtime: 0x499c0f57:00d51440 -- Wed Feb 18 08:38:31 2009
Size of extra inode fields: 28
BLOCKS:
(0):7348651, (1-258):7348654-7348911
TOTAL: 259
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

a4912123

16 Feb, 2009 3 commits

ext4: tighten restrictions on inode flags · 2dc6b0d4

Committed by Duane Griffin on Feb 15, 2009

At the moment there are few restrictions on which flags may be set on
which inodes.  Specifically DIRSYNC may only be set on directories and
IMMUTABLE and APPEND may not be set on links.  Tighten that to disallow
TOPDIR being set on non-directories, and to allow only NODUMP and NOATIME
on inodes that are neither regular files nor directories.

Introduce a flags masking function which masks flags based on mode, and
use it during inode creation and when flags are set via the ioctl to
facilitate future consistency.
Signed-off-by: Duane Griffin <duaneg@dghda.com>
Acked-by: Andreas Dilger <adilger@sun.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

2dc6b0d4

ext4: don't inherit inappropriate inode flags from parent · 8fa43a81

Committed by Duane Griffin on Feb 15, 2009

At present INDEX and EXTENTS are the only flags that new ext4 inodes do
NOT inherit from their parent.  In addition prevent the flags DIRTY,
ECOMPR, IMAGIC, TOPDIR, HUGE_FILE and EXT_MIGRATE from being inherited. 
List inheritable flags explicitly to prevent future flags from
accidentally being inherited.

This fixes the TOPDIR flag inheritance bug reported at
http://bugzilla.kernel.org/show_bug.cgi?id=9866.
Signed-off-by: Duane Griffin <duaneg@dghda.com>
Acked-by: Andreas Dilger <adilger@sun.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

8fa43a81

ext4: allocate ->s_blockgroup_lock separately · 705895b6

Committed by Pekka Enberg on Feb 15, 2009

As spotted by kmemtrace, struct ext4_sb_info is 17664 bytes on 64-bit
which makes it a very bad fit for SLAB allocators.  The culprit of the
wasted memory is ->s_blockgroup_lock which can be as big as 16 KB when
NR_CPUS >= 32.

To fix that, allocate ->s_blockgroup_lock, which fits nicely in an order-2
page in the worst case, separately.  This shrinks down struct ext4_sb_info
enough to fit a 2 KB slab cache so now we allocate 16 KB + 2 KB instead of
32 KB saving 14 KB of memory.
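
The arithmetic can be checked with a small sketch. It assumes the allocator rounds each request up to a power-of-two size class, and that the struct without the lock array is roughly 17664 - 16384 bytes plus one pointer; both are assumptions made for illustration, and the function names are hypothetical.

```c
#include <stddef.h>

/* Round a request up to the next power of two (assumed size classes). */
static inline size_t size_class(size_t bytes)
{
	size_t cls = 1;

	while (cls < bytes)
		cls <<= 1;
	return cls;
}

/*
 * Bytes saved by allocating the lock array separately and leaving only
 * a pointer to it in the main structure.
 */
static inline size_t saved_bytes(size_t struct_size, size_t lock_size)
{
	size_t before = size_class(struct_size);
	size_t after = size_class(lock_size) +
		       size_class(struct_size - lock_size + sizeof(void *));

	return before - after;
}
```

With the figures from the message, saved_bytes(17664, 16384) works out to 32768 - (16384 + 2048) = 14336 bytes, which is the 14 KB quoted.
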
Acked-by: Andreas Dilger <adilger@sun.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

705895b6

15 Feb, 2009 2 commits

ext4: New rec_len encoding for very large blocksizes · 3d0518f4

Committed by Wei Yongjun on Feb 14, 2009

The rec_len field in the directory entry is 16 bits, so encoding
blocksizes larger than 64k becomes problematic. This patch allows us
to support block sizes up to 256k, by using the low 2 bits to extend
the range of rec_len to 2**18-1 (since valid rec_len sizes must be a
multiple of 4). We use the convention that a rec_len of 0 or 65535
means the filesystem block size, for compatibility with older kernels.
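
A user-space sketch of such an encoding is shown below. The sk_ names are hypothetical, the on-disk little-endian conversion is left out (values are handled in host byte order), and the encoder assumes the caller has already ensured len <= blocksize <= 256k and that len is a multiple of 4.

```c
#include <stdint.h>

#define SK_MAX_REC_LEN 65535u	/* (1 << 16) - 1 */

/* Decode a 16-bit stored rec_len into a byte length. */
static inline unsigned int sk_rec_len_from_disk(uint16_t dlen,
						unsigned int blocksize)
{
	unsigned int len = dlen;

	/* Both 0 and 65535 mean "the whole block". */
	if (len == SK_MAX_REC_LEN || len == 0)
		return blocksize;
	/* Low 2 bits carry bits 16..17 of the length. */
	return (len & 65532u) | ((len & 3u) << 16);
}

/* Encode a byte length into the 16-bit stored form. */
static inline uint16_t sk_rec_len_to_disk(unsigned int len,
					  unsigned int blocksize)
{
	if (len < 65536u)
		return (uint16_t)len;
	if (len == blocksize)
		return (uint16_t)(blocksize == 65536u ? SK_MAX_REC_LEN : 0u);
	return (uint16_t)((len & 65532u) | ((len >> 16) & 3u));
}
```

Because every valid length is a multiple of 4, the low 2 bits of the stored value are otherwise always zero, which is what frees them to hold the extra high bits.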

It's unlikely we'll see VM pages of up to 256k, but at some point we
might find that the Linux VM has been enhanced to support filesystem
block sizes greater than the VM page size, at which point it might be useful
for some applications to allow very large filesystem block sizes.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

3d0518f4

ext4: Use unsigned int for blocksize in dx_make_map() and dx_pack_dirents() · 8bad4597

Committed by Theodore Ts'o on Feb 14, 2009

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

8bad4597

07 Feb, 2009 2 commits

ext4: remove call to ext4_group_desc() in ext4_group_used_meta_blocks() · e187c658

Committed by Theodore Ts'o on Feb 06, 2009

The static function ext4_group_used_meta_blocks() only has one caller,
who already has access to the block group's group descriptor. So it's
better to have ext4_init_block_bitmap() pass the group descriptor to
ext4_group_used_meta_blocks(), so it doesn't need to call
ext4_group_desc(). Previously this function did not check if
ext4_group_desc() returned NULL due to an error, potentially causing a
kernel OOPS report. This avoids the issue entirely.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

e187c658

ext4: Remove stale block allocator references from ext4.h · 074ca442

Committed by Mike Snitzer on Feb 06, 2009

Remove some leftovers from when the old block allocator was removed
(c2ea3fde).  ext4_sb_info is now a bit lighter.  Also remove a dangling
read_block_bitmap() prototype.
Signed-off-by: Mike Snitzer <snitzer@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

074ca442

26 Mar, 2009 3 commits

ext4: Use lowercase names of quota functions · a269eb18

Committed by Jan Kara on Jan 26, 2009

Use lowercase names of quota functions instead of old uppercase ones.
Signed-off-by: Jan Kara <jack@suse.cz>
Acked-by: Mingming Cao <cmm@us.ibm.com>
CC: linux-ext4@vger.kernel.org

a269eb18

ext4: quota reservation for delayed allocation · 60e58e0f

Committed by Mingming Cao on Jan 22, 2009

Uses quota reservation/claim/release to handle quota properly for delayed
allocation in three steps: 1) quotas are reserved when data is being copied
to the cache and block allocation is deferred; 2) when new blocks are
allocated, reserved quotas are converted to the real allocated quota;
3) over-booked quotas for metadata blocks are released back.
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Acked-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Jan Kara <jack@suse.cz>

60e58e0f

ext4: Remove unnecessary quota functions · edf72453

Committed by Jan Kara on Jan 12, 2009

ext4_dquot_initialize() and ext4_dquot_drop() are no longer
needed because of modified quota locking.
Signed-off-by: Jan Kara <jack@suse.cz>

edf72453

17 Mar, 2009 1 commit

ext4: fix bb_prealloc_list corruption due to wrong group locking · d33a1976

Committed by Eric Sandeen on Mar 16, 2009

This is for Red Hat bug 490026: EXT4 panic, list corruption in
ext4_mb_new_inode_pa

ext4_lock_group(sb, group) is supposed to protect this list for
each group, and a common code flow to remove a pa from it is like
this:

    ext4_get_group_no_and_offset(sb, pa->pa_pstart, &grp, NULL);
    ext4_lock_group(sb, grp);
    list_del(&pa->pa_group_list);
    ext4_unlock_group(sb, grp);

so it's critical that we get the right group number back for
this prealloc context, to lock the right group (the one 
associated with this pa) and prevent concurrent list manipulation.

However, ext4_mb_put_pa() passes in (pa->pa_pstart - 1) with a
comment, "-1 is to protect from crossing allocation group".

This makes sense for the group_pa, where pa_pstart is advanced
by the length which has been used (in ext4_mb_release_context()),
and when the entire length has been used, pa_pstart has been
advanced to the first block of the next group.

However, for inode_pa, pa_pstart is never advanced; it's just
set once to the first block in the group and not moved after
that.  So in this case, if we subtract one in ext4_mb_put_pa(),
we are actually locking the *previous* group, and opening the
race with the other threads which do not subtract off the extra
block.
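
The off-by-one can be seen with a little arithmetic. The helper below is a hypothetical user-space stand-in for the block-to-group mapping, assuming groups are laid out back to back after the first data block.

```c
#include <stdint.h>

/* Hypothetical geometry helper: which block group holds `block`? */
static inline uint32_t sk_group_of_block(uint64_t block,
					 uint64_t first_data_block,
					 uint32_t blocks_per_group)
{
	return (uint32_t)((block - first_data_block) / blocks_per_group);
}
```

With 32768 blocks per group, an inode pa whose pa_pstart is 98304 (the first block of group 3) maps to group 3, while 98304 - 1 maps to group 2, so the subtraction ends up taking the neighbouring group's lock. For a fully consumed group pa whose pa_pstart has advanced to 131072, subtracting one correctly lands back in group 3.
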
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

d33a1976

14 Mar, 2009 1 commit

ext4: fix bogus BUG_ONs in mballoc code · 8d03c7a0

Committed by Eric Sandeen on Mar 14, 2009

Thiemo Nagel reported that:

# dd if=/dev/zero of=image.ext4 bs=1M count=2
# mkfs.ext4 -v -F -b 1024 -m 0 -g 512 -G 4 -I 128 -N 1 \
  -O large_file,dir_index,flex_bg,extent,sparse_super image.ext4
# mount -o loop image.ext4 mnt/
# dd if=/dev/zero of=mnt/file

oopsed, with a BUG_ON in ext4_mb_normalize_request because
size == EXT4_BLOCKS_PER_GROUP

It appears to me (esp. after talking to Andreas) that the BUG_ON
is bogus; a request of exactly EXT4_BLOCKS_PER_GROUP should
be allowed, though larger sizes do indicate a problem.
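
Expressed as a predicate, the corrected bound would look roughly like the sketch below. The function name is hypothetical; the point is only that the upper limit is inclusive.

```c
/*
 * A normalized request is acceptable when it is non-empty and no larger
 * than one full block group. A request of exactly blocks_per_group is
 * valid; only anything beyond that indicates a problem.
 */
static inline int sk_request_size_ok(unsigned int size,
				     unsigned int blocks_per_group)
{
	return size > 0 && size <= blocks_per_group;
}
```

In the reproducer above, the filesystem is created with 512 blocks per group, so a request of 512 blocks has to pass the check.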

Fix that and another (apparently rare) codepath with a similar check.
Reported-by: Thiemo Nagel <thiemo.nagel@ph.tum.de>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

8d03c7a0

13 Mar, 2009 1 commit

ext4: Print the find_group_flex() warning only once · 2842c3b5

Committed by Theodore Ts'o on Mar 12, 2009

This is a short-term warning, and even printk_ratelimit() can result
in too much noise in system logs. So only print it once as a warning.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

2842c3b5

11 Mar, 2009 1 commit

ext4: fix header check in ext4_ext_search_right() for deep extent trees. · 395a87bf

Committed by Eric Sandeen on Mar 10, 2009

The ext4_ext_search_right() function is confusing; it uses a
"depth" variable which is 0 at the root and maximum at the leaves, 
but the on-disk metadata uses a "depth" (actually eh_depth) which
is opposite: maximum at the root, and 0 at the leaves.

The ext4_ext_check_header() function is given a depth and checks
the header against that depth; it expects the on-disk semantics,
but we are giving it the opposite in the while loop in this 
function.  We should be giving it the on-disk notion of "depth"
which we can get from (p_depth - depth) - and if you look, the last
(more commonly hit) call to ext4_ext_check_header() does just this.
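
The conversion between the two conventions is a single subtraction, sketched here with a hypothetical helper name:

```c
/*
 * Convert a position in the lookup path (0 at the root, tree_depth at
 * the leaves) into the depth value stored on disk in the extent header
 * (tree_depth at the root, 0 at the leaves).
 */
static inline int sk_expected_eh_depth(int tree_depth, int path_index)
{
	return tree_depth - path_index;
}
```

For a tree of depth 2, the root header should carry depth 2, the intermediate level depth 1, and the leaves depth 0; passing the path index itself would invert those expectations.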

Sending in the wrong depth results in (incorrect) messages
about corruption:

EXT4-fs error (device sdb1): ext4_ext_search_right: bad header
in inode #2621457: unexpected eh_depth - magic f30a, entries 340,
max 340(0), depth 1(2)

http://bugzilla.kernel.org/show_bug.cgi?id=12821

Reported-by: David Dindorp <ddi@dubex.dk>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

395a87bf

05 Mar, 2009 1 commit

ext4: fix ext4_free_inode() vs. ext4_claim_inode() race · 7ce9d5d1

Committed by Eric Sandeen on Mar 04, 2009

I was seeing fsck errors on inode bitmaps after a 4 thread
dbench run on a 4 cpu machine:

Inode bitmap differences: -50736 -(50752--50753) etc...

I believe that this is because ext4_free_inode() uses atomic
bitops, and although ext4_new_inode() *used* to also use atomic
bitops for synchronization, commit 39341867 changed this to use
the sb_bgl_lock, so that we could also synchronize against
read_inode_bitmap and initialization of uninit inode tables.

However, that change left ext4_free_inode using atomic bitops,
which I think leaves no synchronization between setting & 
unsetting bits in the inode table.

The below patch fixes it for me, although I wonder if we're 
getting at all heavy-handed with this spinlock...
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

7ce9d5d1

26 Feb, 2009 1 commit

ext4: don't call jbd2_journal_force_commit_nested without journal · 8f64b32e

Committed by Eric Sandeen on Feb 26, 2009

Running without a journal, I oopsed when I ran out of space,
because we called jbd2_journal_force_commit_nested() from
ext4_should_retry_alloc() without a journal.

This should take care of it, I think.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

8f64b32e

28 Feb, 2009 1 commit

ext4: Remove duplicate call to ext4_commit_super() in ext4_freeze() · 8b1a8ff8

Committed by Theodore Ts'o on Feb 28, 2009

Commit c4be0c1d added error checking to ext4_freeze() when calling
ext4_commit_super().  Unfortunately the patch failed to remove the
original call to ext4_commit_super(), with the net result that when
freezing the filesystem, the superblock gets written twice, the first
time without error checking.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

8b1a8ff8

23 Feb, 2009 1 commit

ext4: Fix deadlock in ext4_write_begin() and ext4_da_write_begin() · ebd3610b

Committed by Jan Kara on Feb 22, 2009

Functions ext4_write_begin() and ext4_da_write_begin() call
grab_cache_page_write_begin() without AOP_FLAG_NOFS. Thus it
can happen that page reclaim is triggered in that function
and it recurses back into the filesystem (or some other filesystem).
But this can lead to various problems as a transaction is already
started at that point. Add the necessary flag.

http://bugzilla.kernel.org/show_bug.cgi?id=11688

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ebd3610b

22 Feb, 2009 1 commit

ext4: Add fallback for find_group_flex · 05bf9e83

Committed by Theodore Ts'o on Feb 21, 2009

This is a workaround for find_group_flex() which badly needs to be
replaced.  One of its problems (besides ignoring the Orlov algorithm)
is that it is a bit hyperactive about returning failure under
suspicious circumstances.  This can lead to spurious ENOSPC failures
even when there are inodes still available.

Work around this for now by retrying the search using
find_group_other() if find_group_flex() returns -1.  If
find_group_other() succeeds when find_group_flex() has failed, log a
warning message.

A better block/inode allocator that will fix this problem for real has
been queued up for the next merge window.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

05bf9e83

16 Feb, 2009 1 commit

ext4: Fix NULL dereference in ext4_ext_migrate()'s error handling · 09054264

Committed by Dan Carpenter on Feb 15, 2009

This was found through a code checker (http://repo.or.cz/w/smatch.git/). 
It looks like you might be able to trigger the error by trying to migrate 
a readonly file system.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

09054264

14 Feb, 2009 2 commits

ext4: Implement range_cyclic in ext4_da_writepages instead of write_cache_pages · 2acf2c26

Committed by Aneesh Kumar K.V on Feb 14, 2009

With delayed allocation we lock the page in write_cache_pages() and
try to build an in-memory extent of contiguous blocks. This is needed
so that we can issue large contiguous block requests. If range_cyclic
mode is enabled, write_cache_pages() will loop back to the 0 index if
no I/O has been done yet, and try to start writing from the beginning
of the range. That causes an attempt to take the page lock of a lower
index page while holding the page lock of a higher index page, which can
cause a deadlock with another writeback thread.

The solution is to implement the range_cyclic behavior in
ext4_da_writepages() instead.

http://bugzilla.kernel.org/show_bug.cgi?id=12579

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

2acf2c26

ext4: Initialize preallocation list_head's properly · d794bf8e

Committed by Aneesh Kumar K.V on Feb 14, 2009

When creating a new ext4_prealloc_space structure, we have to
initialize its list_head pointers before we add them to any prealloc
lists.  Otherwise, with list debug enabled, we will get list
corruption warnings.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

d794bf8e

11 Feb, 2009 1 commit

ext4: Fix lockdep warning · ba443916

Committed by Aneesh Kumar K.V on Feb 10, 2009

We should not call ext4_mb_add_n_trim while holding alloc_semp.

    =============================================
    [ INFO: possible recursive locking detected ]
    2.6.29-rc4-git1-dirty #124
    ---------------------------------------------
    ffsb/3116 is trying to acquire lock:
     (&meta_group_info[i]->alloc_sem){----}, at: [<ffffffff8035a6e8>]
     ext4_mb_load_buddy+0xd2/0x343

    but task is already holding lock:
     (&meta_group_info[i]->alloc_sem){----}, at: [<ffffffff8035a6e8>]
     ext4_mb_load_buddy+0xd2/0x343

http://bugzilla.kernel.org/show_bug.cgi?id=12672

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ba443916

10 Feb, 2009 1 commit

ext4: Fix to read empty directory blocks correctly in 64k · 7be2baaa

Committed by Wei Yongjun on Feb 10, 2009

The rec_len field in the directory entry is 16 bits, so there was a
problem representing rec_len for filesystems with a 64k block size in
the case where the directory entry takes the entire 64k block.
Unfortunately, there were two schemes that were proposed; one where
all zeros meant 65536 and one where all ones (65535) meant 65536.
E2fsprogs used 0, whereas the kernel used 65535.  Oops.  Fortunately
this case happens extremely rarely, with the most common case being
the lost+found directory, created by mke2fs.

So we will be liberal in what we accept, and accept both encodings,
but we will continue to encode 65536 as 65535.  This will require a
change in e2fsprogs, but fortunately ext4 filesystems normally
have the dir_index feature enabled, which precludes having a
completely empty directory block.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

7be2baaa

11 Feb, 2009 1 commit

jbd2: Avoid possible NULL dereference in jbd2_journal_begin_ordered_truncate() · 7f5aa215

Committed by Jan Kara on Feb 10, 2009

If we race with commit code setting i_transaction to NULL, we could
possibly dereference it.  Proper locking requires the journal pointer
(to access journal->j_list_lock), which we don't have.  So we have to
change the prototype of the function so that filesystem passes us the
journal pointer.  Also add a more detailed comment about why the
function jbd2_journal_begin_ordered_truncate() does what it does and
how it should be used.

Thanks to Dan Carpenter <error27@gmail.com> for pointing to the
suspicious code.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Acked-by: Joel Becker <joel.becker@oracle.com>
CC: linux-ext4@vger.kernel.org
CC: ocfs2-devel@oss.oracle.com
CC: mfasheh@suse.de
CC: Dan Carpenter <error27@gmail.com>

7f5aa215
