Commits · ba869023eac8354b17acdcff82b851ea8e7b1809 · openeuler / raspberrypi-kernel

05 Mar, 2010 1 commit

ext4: correctly calculate number of blocks for fiemap · aca92ff6

Authored by Leonard Michlmayr on Mar 04, 2010

ext4_fiemap() rounds the length of the requested range down to
blocksize, which is not the true number of blocks that cover the
requested region.  This problem is especially impressive if the user
requests only the first byte of a file: not a single extent will be
reported.

We fix this by calculating the last block of the region and then
subtracting to find the number of blocks in the extents.

Signed-off-by: Leonard Michlmayr <leonard.michlmayr@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

aca92ff6
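
The rounding problem described in the fiemap commit above can be illustrated in plain C. This is a minimal sketch, not the kernel code: the helper names `demo_blocks_rounded_down` and `demo_blocks_covering` are hypothetical, and the usage note assumes a 4096-byte block size (`blkbits = 12`).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the naive approach shifts the byte length down to
 * a block count, so any partial block is lost. */
static uint64_t demo_blocks_rounded_down(uint64_t len, unsigned int blkbits)
{
    return len >> blkbits;
}

/* Hypothetical helper: find the first and last block touched by the byte
 * range [start, start + len) and subtract. Assumes len > 0 and that
 * start + len does not overflow. */
static uint64_t demo_blocks_covering(uint64_t start, uint64_t len,
                                     unsigned int blkbits)
{
    uint64_t first_blk, last_blk;

    assert(len > 0);
    first_blk = start >> blkbits;
    last_blk = (start + len - 1) >> blkbits;
    return last_blk - first_blk + 1;
}
```

With `blkbits = 12`, a request for only the first byte of a file gives zero blocks under the rounded-down calculation and one block under the last-block calculation. A two-byte request that straddles a block boundary covers two blocks.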

16 Feb, 2010 2 commits

ext4: add missing error checking to ext4_expand_extra_isize_ea() · 9aaab058

Authored by Roel Kluin on Feb 15, 2010

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>

9aaab058

ext4: move __func__ into a macro for ext4_warning, ext4_error · 12062ddd

Authored by Eric Sandeen on Feb 15, 2010

Just a pet peeve of mine; we had a mishmash of calls with either __func__
or "function_name" and the latter tends to get out of sync.

I think it's easier to just hide the __func__ in a macro, and it'll
be consistent from then on.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

12062ddd
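
The idea of hiding `__func__` behind a macro can be sketched in userspace C. The names `demo_format_warning`, `demo_warning` and `demo_caller` are hypothetical and only illustrate the pattern; they are not the ext4_warning()/ext4_error() definitions.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical backend: formats "<function>: <message>" into buf. */
static int demo_format_warning(char *buf, size_t size, const char *func,
                               const char *fmt, ...)
{
    va_list ap;
    int n = snprintf(buf, size, "%s: ", func);

    if (n < 0 || (size_t)n >= size)
        return n;
    va_start(ap, fmt);
    n += vsnprintf(buf + n, size - (size_t)n, fmt, ap);
    va_end(ap);
    return n;
}

/* The macro supplies __func__ itself, so a caller cannot pass a stale
 * or misspelled function name. */
#define demo_warning(buf, size, ...) \
    demo_format_warning((buf), (size), __func__, __VA_ARGS__)

static int demo_caller(char *buf, size_t size)
{
    return demo_warning(buf, size, "bad block %d", 42);
}
```

Because the function name comes from the compiler, renaming `demo_caller` updates the message prefix automatically.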

25 Jan, 2010 2 commits

ext4: Reserve INCOMPAT_EA_INODE and INCOMPAT_DIRDATA feature codepoints · f710b4b9

Authored by Theodore Ts'o on Jan 25, 2010

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

f710b4b9

ext4: Use bitops to read/modify EXT4_I(inode)->i_state · 19f5fb7a

Authored by Theodore Ts'o on Jan 24, 2010

At several places we modify EXT4_I(inode)->i_state without holding
i_mutex (ext4_release_file, ext4_bmap, ext4_journalled_writepage,
ext4_do_update_inode, ...). These modifications are racy and we can
lose updates to i_state. So convert handling of i_state to use bitops
which are atomic.

Cc: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

19f5fb7a
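
A plain `state |= FLAG` is a read-modify-write sequence, so two unlocked writers can overwrite each other's update. The kernel's set_bit()/clear_bit()/test_bit() avoid this. The sketch below uses C11 atomics as a userspace stand-in for those bitops; the type and flag names are hypothetical and do not match the real i_state bits.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical state bits, standing in for the i_state flags. */
enum demo_state_bit {
    DEMO_STATE_NEW = 0,
    DEMO_STATE_XATTR = 1,
    DEMO_STATE_DA_ALLOC_CLOSE = 2,
};

struct demo_inode {
    atomic_ulong state;     /* updated without any lock held */
};

/* Atomically set one bit; concurrent updates to other bits are kept. */
static void demo_set_state(struct demo_inode *inode, enum demo_state_bit bit)
{
    atomic_fetch_or(&inode->state, 1UL << bit);
}

/* Atomically clear one bit. */
static void demo_clear_state(struct demo_inode *inode, enum demo_state_bit bit)
{
    atomic_fetch_and(&inode->state, ~(1UL << bit));
}

/* Read one bit. */
static bool demo_test_state(struct demo_inode *inode, enum demo_state_bit bit)
{
    return (atomic_load(&inode->state) >> bit) & 1UL;
}
```

Each helper is a single atomic operation on the whole word, which is what makes it safe to call without holding a lock such as i_mutex.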

01 Jan, 2010 1 commit

ext4: Add new tracepoints to debug delayed allocation space functions · f8ec9d68

Authored by Theodore Ts'o on Jan 01, 2010

Add tracepoints for ext4_da_reserve_space(),
ext4_da_update_reserve_space(), and ext4_da_release_space().

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

f8ec9d68

23 Jan, 2010 1 commit

ext4: Add block validity check when truncating indirect block mapped inodes · 1f2acb60

Authored by Theodore Ts'o on Jan 22, 2010

Add checks to ext4_free_branches() to make sure a block number found
in an indirect block is valid before trying to free it.  If a bad
block number is found, stop freeing the indirect block immediately,
since the file system is corrupt and we will need to run fsck anyway.
This also avoids spamming the logs, and specifically avoids
driver-level "attempt to access beyond end of device" errors that
obscure what is really going on.

If you get *really*, *really*, *really* unlucky, without this patch, a
supposed indirect block containing garbage might contain a reference
to a primary block group descriptor, in which case
ext4_free_branches() could end up zero'ing out a block group
descriptor block.  If one of the block bitmaps for a block group
described by that bg descriptor block is then not in memory, it is
read in by ext4_read_block_bitmap().  This function calls
ext4_valid_block_bitmap(), which assumes that bg_inode_table() was
validated at mount time and hasn't been modified since.  Since this
assumption is no longer valid, it's possible for the value
(ext4_inode_table(sb, desc) - group_first_block) to go negative, which
will cause ext4_find_next_zero_bit() to trigger a kernel GPF.

Addresses-Google-Bug: #2220436
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

1f2acb60
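
The "validate before freeing, and stop at the first bad entry" behaviour described above might look roughly like the following. This is a simplified sketch under stated assumptions: `struct demo_fs`, `demo_block_range_valid` and `demo_free_branch` are hypothetical, and the check covers only the range of the device, whereas the real validity check also has to consider file system metadata regions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, minimal description of a file system's block range. */
struct demo_fs {
    uint64_t first_data_block;
    uint64_t blocks_count;
};

/* True if [start, start + count) lies entirely inside the file system. */
static bool demo_block_range_valid(const struct demo_fs *fs,
                                   uint64_t start, uint64_t count)
{
    if (count == 0 || start < fs->first_data_block)
        return false;
    if (start >= fs->blocks_count || count > fs->blocks_count - start)
        return false;
    return true;
}

/* Walk the entries of a (possibly corrupt) indirect block. Holes (0) are
 * skipped; the walk stops at the first invalid block number instead of
 * trying to free it. Returns how many entries would have been freed. */
static size_t demo_free_branch(const struct demo_fs *fs,
                               const uint64_t *entries, size_t n)
{
    size_t i, freed = 0;

    for (i = 0; i < n; i++) {
        if (entries[i] == 0)
            continue;
        if (!demo_block_range_valid(fs, entries[i], 1))
            break;      /* corrupt: stop immediately, leave it to fsck */
        freed++;
    }
    return freed;
}
```

Stopping at the first bad entry, rather than skipping it, matches the commit's reasoning that the file system is already known to be corrupt at that point.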

16 Feb, 2010 1 commit

ext4: Fix optional-arg mount options · 15121c18

Authored by Eric Sandeen on Feb 15, 2010

We have 2 mount options, "barrier" and "auto_da_alloc", which may or
may not take a 1/0 argument.  This causes the ext4 superblock mount
code to subtract uninitialized pointers and pass the result to
kmalloc, which results in very noisy failures.

Per Ted's suggestion, initialize the args struct so that
we know whether match_token() found an argument for the
option, and skip match_int() if not.

Also, return error (0) from parse_options if we thought
we found an argument, but match_int() fails.

Reported-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

15121c18
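
The pattern in the commit above (initialise the "argument" pointer, parse an integer only if an argument was really found, and fail if that parse fails) can be sketched in userspace C. `demo_parse_optional_int` is a hypothetical helper that uses strtol() in place of the kernel's match_token()/match_int(); like parse_options, it returns 0 on error.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parser for an option such as "barrier" or "barrier=0".
 * Returns 1 on success and stores the value in *value, 0 on error. */
static int demo_parse_optional_int(const char *opt, const char *name,
                                   int *value)
{
    size_t len;
    const char *arg = NULL;     /* initialised: NULL means "no argument" */
    char *end;
    long parsed;

    assert(opt && name && value);
    len = strlen(name);
    if (strncmp(opt, name, len) != 0)
        return 0;
    if (opt[len] == '=')
        arg = opt + len + 1;
    else if (opt[len] != '\0')
        return 0;               /* e.g. "barrierx" is a different option */

    if (arg == NULL) {
        *value = 1;             /* bare option means "enabled" */
        return 1;
    }
    parsed = strtol(arg, &end, 10);
    if (end == arg || *end != '\0')
        return 0;               /* argument present but not an integer */
    *value = (int)parsed;
    return 1;
}
```

The important detail is that `arg` has a defined value on every path, so the code never computes a length from uninitialised pointers.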

05 Feb, 2010 1 commit

ext4: fix async i/o writes beyond 4GB to a sparse file · a1de02dc

Authored by Eric Sandeen on Feb 04, 2010

The "offset" member in ext4_io_end holds bytes, not blocks, so
ext4_lblk_t is wrong - and too small (u32).

This caused the async i/o writes to sparse files beyond 4GB to fail
when they wrapped around to 0.

Also fix up the type of arguments to ext4_convert_unwritten_extents();
it gets ssize_t from ext4_end_aio_dio_nolock() and
ext4_ext_direct_IO().

Reported-by: Giel de Nijs <giel@vectorwise.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>

a1de02dc
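
The wrap-around at 4GB follows directly from storing a byte offset in a 32-bit type. A small sketch, where `demo_lblk_t` is a hypothetical stand-in for a u32 block-number type and the helper names are made up for illustration:

```c
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit logical block number type. */
typedef uint32_t demo_lblk_t;

/* Storing a byte offset in the 32-bit type silently drops the high bits. */
static uint64_t demo_offset_via_lblk(uint64_t byte_offset)
{
    demo_lblk_t truncated = (demo_lblk_t)byte_offset;

    return truncated;
}

/* A 64-bit signed offset keeps the whole value. */
static uint64_t demo_offset_via_64bit(uint64_t byte_offset)
{
    int64_t full = (int64_t)byte_offset;

    return (uint64_t)full;
}
```

An offset of exactly 4 GiB (4294967296 bytes) becomes 0 after the 32-bit round trip, which corresponds to the "wrapped around to 0" symptom in the commit message.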

15 Jan, 2010 1 commit

ext4: Drop EXT4_GET_BLOCKS_UPDATE_RESERVE_SPACE flag · 1296cc85

Authored by Aneesh Kumar K.V on Jan 15, 2010

We should update the reserved space if the buffer is a delalloc buffer,
which is indicated by the EXT4_GET_BLOCKS_DELALLOC_RESERVE flag.
So use EXT4_GET_BLOCKS_DELALLOC_RESERVE in place of
EXT4_GET_BLOCKS_UPDATE_RESERVE_SPACE.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

1296cc85

25 Jan, 2010 1 commit

ext4: Fix quota accounting error with fallocate · 5f634d06

Authored by Aneesh Kumar K.V on Jan 25, 2010

When we fallocate a region of the file which we had recently written,
and which is still in the page cache marked as delayed allocated blocks,
we need to make sure we don't do the quota update on the writepage path.
This is because the needed quota update would have already been done
by fallocate.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

5f634d06

23 Jan, 2010 1 commit

ext4: Handle -EDQUOT error on write · 1db91382

Authored by Aneesh Kumar K.V on Jan 22, 2010

We need to release the journal before we do a write_inode.  Otherwise
we could deadlock.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

1db91382

01 Jan, 2010 2 commits

ext4: Calculate metadata requirements more accurately · 9d0be502

Authored by Theodore Ts'o on Jan 01, 2010

In the past, ext4_calc_metadata_amount(), and its sub-functions
ext4_ext_calc_metadata_amount() and ext4_indirect_calc_metadata_amount()
badly over-estimated the number of metadata blocks that might be
required for delayed allocation blocks. This didn't matter as much
when functions which managed the reserved metadata blocks were more
aggressive about dropping reserved metadata blocks as delayed
allocation blocks were written, but unfortunately they were too
aggressive. This was fixed in commit 0637c6f4, but as a result the
over-estimation by ext4_calc_metadata_amount() would lead to reserving
2-3 times the number of pending delayed allocation blocks as
potentially required metadata blocks. So if there is 1 megabyte of
blocks which have not yet been allocated, up to 3 megabytes of
space would get reserved out of the user's quota and from the file
system free space pool until all of the inode's data blocks have been
allocated.

This commit addresses this problem by much more accurately estimating
the number of metadata blocks that will be required. It will still
somewhat over-estimate the number of blocks needed, since it must make
a worst case estimate not knowing which physical blocks will be
needed, but it is much more accurate than before.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

9d0be502

ext4: Fix accounting of reserved metadata blocks · ee5f4d9c

Authored by Theodore Ts'o on Jan 01, 2010

Commit 0637c6f4 had a typo which caused the reserved metadata blocks to
not be released correctly.  Fix this.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ee5f4d9c

31 Dec, 2009 1 commit

ext4: Patch up how we claim metadata blocks for quota purposes · 0637c6f4

Authored by Theodore Ts'o on Dec 30, 2009

As reported in Kernel Bugzilla #14936, commit d21cd8f1 triggered a BUG
in the function ext4_da_update_reserve_space() found in
fs/ext4/inode.c.  The root cause of this BUG() was the fact
that ext4_calc_metadata_amount() can severely over-estimate how many
metadata blocks will be needed, especially when using direct
block-mapped files.

In addition, it can also badly *under* estimate how much space is
needed, since ext4_calc_metadata_amount() assumes that the blocks are
contiguous, and this is not always true.  If the application is
writing blocks to a sparse file, the number of metadata blocks
necessary can be severely underestimated by the functions
ext4_da_reserve_space(), ext4_da_update_reserve_space() and
ext4_da_release_space().  This was the cause of the dq_claim_space
reports found on kerneloops.org.

Unfortunately, doing this right means that we need to massively
over-estimate the amount of free space needed.  So in some cases we
may need to force the inode to be written to disk asynchronously in
order to avoid spurious quota failures.

http://bugzilla.kernel.org/show_bug.cgi?id=14936

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

0637c6f4

30 Dec, 2009 1 commit

ext4: Ensure zeroout blocks have no dirty metadata · 515f41c3

Authored by Aneesh Kumar K.V on Dec 29, 2009

This fixes a bug (found by Curt Wohlgemuth) in which new blocks
returned from an extent created with ext4_ext_zeroout() can have dirty
metadata still associated with them.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Curt Wohlgemuth <curtw@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

515f41c3

26 Dec, 2009 1 commit

ext4: return correct wbc.nr_to_write in ext4_da_writepages · 2faf2e19

Authored by Richard Kennedy on Dec 25, 2009

When ext4_da_writepages increases the nr_to_write in writeback_control
then it must always re-base the return value.  Originally there was a
(misguided) attempt to prevent wbc.nr_to_write from going negative.  In
fact, it's necessary to allow nr_to_write to be negative so that
wb_writeback() can correctly calculate how many pages were actually
written.

Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

2faf2e19
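
The arithmetic behind "re-basing" can be sketched as follows. This is an illustration under assumptions, not the kernel code: it assumes the caller derives the number of pages written as (requested budget minus remaining budget), and all function and parameter names are hypothetical.

```c
/* The caller asked for `requested` pages, the writer bumped the budget
 * by `bump`, and `written` pages were written. Removing the bump again
 * on return (re-basing) may leave a negative remainder. */
static long demo_rebased_remaining(long requested, long bump, long written)
{
    long remaining = requested + bump - written;

    return remaining - bump;
}

/* The misguided variant: clamp the remainder so it never goes negative. */
static long demo_clamped_remaining(long requested, long bump, long written)
{
    long r = demo_rebased_remaining(requested, bump, written);

    return r < 0 ? 0 : r;
}

/* How a caller would derive the pages written from the remainder. */
static long demo_pages_written(long requested, long remaining)
{
    return requested - remaining;
}
```

With a requested budget of 1024, a bump of 7168 and 3000 pages written, the re-based remainder is -1976 and the caller recovers 3000 pages. With clamping, the remainder is 0 and the caller would under-count the work as 1024 pages.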

23 Dec, 2009 7 commits

ext4: flush delalloc blocks when space is low · c8afb446

Authored by Eric Sandeen on Dec 23, 2009

Creating many small files in rapid succession on a small
filesystem can lead to spurious ENOSPC; on a 104MB filesystem:

for i in `seq 1 22500`; do
    echo -n > $SCRATCH_MNT/$i
    echo XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX > $SCRATCH_MNT/$i
done

leads to ENOSPC even though after a sync, 40% of the fs is free
again.

This is because we reserve worst-case metadata for delalloc writes,
and when data is allocated that worst-case reservation is not
usually needed.

When freespace is low, kicking off an async writeback will start
converting that worst-case space usage into something more realistic,
almost always freeing up space to continue.

This resolves the testcase for me, and survives all 4 generic
ENOSPC tests in xfstests.

We'll still need a hard synchronous sync to squeeze out the last bit,
but this fixes things up to a large degree.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

c8afb446

ext4: Eliminate potential double free on error path · d3533d72

Authored by Julia Lawall on Dec 23, 2009

b_entry_name and buffer are initially NULL, are initialized within a loop
to the result of calling kmalloc, and are freed at the bottom of this loop.
The loop contains gotos to cleanup, which also frees b_entry_name and
buffer.  Some of these gotos are before the reinitializations of
b_entry_name and buffer.  To maintain the invariant that b_entry_name and
buffer are NULL at the top of the loop, and thus acceptable arguments to
kfree, these variables are now set to NULL after the kfrees.

This seems to be the simplest solution.  A more complicated solution
would be to introduce more labels in the error handling code at the end of
the function.

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@r@
identifier E;
expression E1;
iterator I;
statement S;
@@

*kfree(E);
... when != E = E1
    when != I(E,...) S
    when != &E
*kfree(E);
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

d3533d72
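
The invariant described in the double-free commit (pointers are NULL at the top of every loop iteration, so the shared cleanup label can free them unconditionally) can be sketched in userspace C. malloc()/free() stand in for kmalloc()/kfree(); the counters and the `demo_` names are hypothetical and exist only so the behaviour can be checked.

```c
#include <stdlib.h>

static int demo_allocs;     /* successful allocations */
static int demo_frees;      /* frees of non-NULL pointers */

static void *demo_alloc(size_t n)
{
    void *p = malloc(n);

    if (p)
        demo_allocs++;
    return p;
}

static void demo_free(void *p)
{
    if (p)
        demo_frees++;
    free(p);                /* free(NULL) is a no-op, like kfree(NULL) */
}

/* Runs `rounds` iterations and bails out at the top of iteration
 * `fail_at`, before the buffer is re-allocated, which mimics an early
 * goto to the cleanup label. */
static int demo_loop(int rounds, int fail_at)
{
    char *buffer = NULL;
    int i, err = 0;

    for (i = 0; i < rounds; i++) {
        if (i == fail_at) {
            err = -1;
            goto cleanup;   /* buffer must be NULL here */
        }
        buffer = demo_alloc(16);
        if (!buffer) {
            err = -1;
            goto cleanup;
        }
        demo_free(buffer);
        buffer = NULL;      /* restore the invariant for the next pass */
    }
cleanup:
    demo_free(buffer);
    return err;
}
```

Without the `buffer = NULL` assignment, an early exit in a later iteration would pass the already-freed pointer to the cleanup path a second time.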

ext4: fix unsigned long long printk warning in super.c · a6b43e38

Authored by Andrew Morton on Dec 23, 2009

sparc64 allmodconfig:

fs/ext4/super.c: In function `lifetime_write_kbytes_show':
fs/ext4/super.c:2174: warning: long long unsigned int format, long unsigned int arg (arg 4)
fs/ext4/super.c:2174: warning: long long unsigned int format, long unsigned int arg (arg 4)

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

a6b43e38
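
One common way to silence this class of warning is to cast the argument to the type the format string expects. The sketch below shows that general technique with snprintf(); it is an assumption that the actual patch took this form, and `demo_format_kbytes` is a hypothetical helper.

```c
#include <stdio.h>

/* The value is an unsigned long, but the format uses %llu. The explicit
 * cast makes the argument type match the format on every architecture,
 * including those where long and long long differ only in name. */
static int demo_format_kbytes(char *buf, size_t size, unsigned long kbytes)
{
    return snprintf(buf, size, "%llu\n", (unsigned long long)kbytes);
}
```

The cast is a widening (or same-width) conversion, so the printed value does not change.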

ext4: fix sleep inside spinlock issue with quota and dealloc (#14739) · 39bc680a

Authored by Dmitry Monakhov on Dec 10, 2009

Unlock i_block_reservation_lock before vfs_dq_reserve_block().
This patch fixes http://bugzilla.kernel.org/show_bug.cgi?id=14739

CC: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Jan Kara <jack@suse.cz>

39bc680a

ext4: Fix potential quota deadlock · d21cd8f1

Authored by Dmitry Monakhov on Dec 10, 2009

We have to delay vfs_dq_claim_space() until allocation context destruction.
Currently we have the following call trace:
ext4_mb_new_blocks()
  /* task is already holding ac->alloc_semp */
 ->ext4_mb_mark_diskspace_used
    ->vfs_dq_claim_space()  /*  acquire dqptr_sem here. Possible deadlock */
 ->ext4_mb_release_context() /* drop ac->alloc_semp here */

Let's move quota claiming to ext4_da_update_reserve_space()

 =======================================================
 [ INFO: possible circular locking dependency detected ]
 2.6.32-rc7 #18
 -------------------------------------------------------
 write-truncate-/3465 is trying to acquire lock:
  (&s->s_dquot.dqptr_sem){++++..}, at: [<c025e73b>] dquot_claim_space+0x3b/0x1b0

 but task is already holding lock:
  (&meta_group_info[i]->alloc_sem){++++..}, at: [<c02ce962>] ext4_mb_load_buddy+0xb2/0x370

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #3 (&meta_group_info[i]->alloc_sem){++++..}:
        [<c017d04b>] __lock_acquire+0xd7b/0x1260
        [<c017d5ea>] lock_acquire+0xba/0xd0
        [<c0527191>] down_read+0x51/0x90
        [<c02ce962>] ext4_mb_load_buddy+0xb2/0x370
        [<c02d0c1c>] ext4_mb_free_blocks+0x46c/0x870
        [<c029c9d3>] ext4_free_blocks+0x73/0x130
        [<c02c8cfc>] ext4_ext_truncate+0x76c/0x8d0
        [<c02a8087>] ext4_truncate+0x187/0x5e0
        [<c01e0f7b>] vmtruncate+0x6b/0x70
        [<c022ec02>] inode_setattr+0x62/0x190
        [<c02a2d7a>] ext4_setattr+0x25a/0x370
        [<c022ee81>] notify_change+0x151/0x340
        [<c021349d>] do_truncate+0x6d/0xa0
        [<c0221034>] may_open+0x1d4/0x200
        [<c022412b>] do_filp_open+0x1eb/0x910
        [<c021244d>] do_sys_open+0x6d/0x140
        [<c021258e>] sys_open+0x2e/0x40
        [<c0103100>] sysenter_do_call+0x12/0x32

 -> #2 (&ei->i_data_sem){++++..}:
        [<c017d04b>] __lock_acquire+0xd7b/0x1260
        [<c017d5ea>] lock_acquire+0xba/0xd0
        [<c0527191>] down_read+0x51/0x90
        [<c02a5787>] ext4_get_blocks+0x47/0x450
        [<c02a74c1>] ext4_getblk+0x61/0x1d0
        [<c02a7a7f>] ext4_bread+0x1f/0xa0
        [<c02bcddc>] ext4_quota_write+0x12c/0x310
        [<c0262d23>] qtree_write_dquot+0x93/0x120
        [<c0261708>] v2_write_dquot+0x28/0x30
        [<c025d3fb>] dquot_commit+0xab/0xf0
        [<c02be977>] ext4_write_dquot+0x77/0x90
        [<c02be9bf>] ext4_mark_dquot_dirty+0x2f/0x50
        [<c025e321>] dquot_alloc_inode+0x101/0x180
        [<c029fec2>] ext4_new_inode+0x602/0xf00
        [<c02ad789>] ext4_create+0x89/0x150
        [<c0221ff2>] vfs_create+0xa2/0xc0
        [<c02246e7>] do_filp_open+0x7a7/0x910
        [<c021244d>] do_sys_open+0x6d/0x140
        [<c021258e>] sys_open+0x2e/0x40
        [<c0103100>] sysenter_do_call+0x12/0x32

 -> #1 (&sb->s_type->i_mutex_key#7/4){+.+...}:
        [<c017d04b>] __lock_acquire+0xd7b/0x1260
        [<c017d5ea>] lock_acquire+0xba/0xd0
        [<c0526505>] mutex_lock_nested+0x65/0x2d0
        [<c0260c9d>] vfs_load_quota_inode+0x4bd/0x5a0
        [<c02610af>] vfs_quota_on_path+0x5f/0x70
        [<c02bc812>] ext4_quota_on+0x112/0x190
        [<c026345a>] sys_quotactl+0x44a/0x8a0
        [<c0103100>] sysenter_do_call+0x12/0x32

 -> #0 (&s->s_dquot.dqptr_sem){++++..}:
        [<c017d361>] __lock_acquire+0x1091/0x1260
        [<c017d5ea>] lock_acquire+0xba/0xd0
        [<c0527191>] down_read+0x51/0x90
        [<c025e73b>] dquot_claim_space+0x3b/0x1b0
        [<c02cb95f>] ext4_mb_mark_diskspace_used+0x36f/0x380
        [<c02d210a>] ext4_mb_new_blocks+0x34a/0x530
        [<c02c83fb>] ext4_ext_get_blocks+0x122b/0x13c0
        [<c02a5966>] ext4_get_blocks+0x226/0x450
        [<c02a5ff3>] mpage_da_map_blocks+0xc3/0xaa0
        [<c02a6ed6>] ext4_da_writepages+0x506/0x790
        [<c01de272>] do_writepages+0x22/0x50
        [<c01d766d>] __filemap_fdatawrite_range+0x6d/0x80
        [<c01d7b9b>] filemap_flush+0x2b/0x30
        [<c02a40ac>] ext4_alloc_da_blocks+0x5c/0x60
        [<c029e595>] ext4_release_file+0x75/0xb0
        [<c0216b59>] __fput+0xf9/0x210
        [<c0216c97>] fput+0x27/0x30
        [<c02122dc>] filp_close+0x4c/0x80
        [<c014510e>] put_files_struct+0x6e/0xd0
        [<c01451b7>] exit_files+0x47/0x60
        [<c0146a24>] do_exit+0x144/0x710
        [<c0147028>] do_group_exit+0x38/0xa0
        [<c0159abc>] get_signal_to_deliver+0x2ac/0x410
        [<c0102849>] do_notify_resume+0xb9/0x890
        [<c01032d2>] work_notifysig+0x13/0x21

 other info that might help us debug this:

 3 locks held by write-truncate-/3465:
  #0:  (jbd2_handle){+.+...}, at: [<c02e1f8f>] start_this_handle+0x38f/0x5c0
  #1:  (&ei->i_data_sem){++++..}, at: [<c02a57f6>] ext4_get_blocks+0xb6/0x450
  #2:  (&meta_group_info[i]->alloc_sem){++++..}, at: [<c02ce962>] ext4_mb_load_buddy+0xb2/0x370

 stack backtrace:
 Pid: 3465, comm: write-truncate- Not tainted 2.6.32-rc7 #18
 Call Trace:
  [<c0524cb3>] ? printk+0x1d/0x22
  [<c017ac9a>] print_circular_bug+0xca/0xd0
  [<c017d361>] __lock_acquire+0x1091/0x1260
  [<c016bca2>] ? sched_clock_local+0xd2/0x170
  [<c0178fd0>] ? trace_hardirqs_off_caller+0x20/0xd0
  [<c017d5ea>] lock_acquire+0xba/0xd0
  [<c025e73b>] ? dquot_claim_space+0x3b/0x1b0
  [<c0527191>] down_read+0x51/0x90
  [<c025e73b>] ? dquot_claim_space+0x3b/0x1b0
  [<c025e73b>] dquot_claim_space+0x3b/0x1b0
  [<c02cb95f>] ext4_mb_mark_diskspace_used+0x36f/0x380
  [<c02d210a>] ext4_mb_new_blocks+0x34a/0x530
  [<c02c601d>] ? ext4_ext_find_extent+0x25d/0x280
  [<c02c83fb>] ext4_ext_get_blocks+0x122b/0x13c0
  [<c016bca2>] ? sched_clock_local+0xd2/0x170
  [<c016be60>] ? sched_clock_cpu+0x120/0x160
  [<c016beef>] ? cpu_clock+0x4f/0x60
  [<c0178fd0>] ? trace_hardirqs_off_caller+0x20/0xd0
  [<c052712c>] ? down_write+0x8c/0xa0
  [<c02a5966>] ext4_get_blocks+0x226/0x450
  [<c016be60>] ? sched_clock_cpu+0x120/0x160
  [<c016beef>] ? cpu_clock+0x4f/0x60
  [<c017908b>] ? trace_hardirqs_off+0xb/0x10
  [<c02a5ff3>] mpage_da_map_blocks+0xc3/0xaa0
  [<c01d69cc>] ? find_get_pages_tag+0x16c/0x180
  [<c01d6860>] ? find_get_pages_tag+0x0/0x180
  [<c02a73bd>] ? __mpage_da_writepage+0x16d/0x1a0
  [<c01dfc4e>] ? pagevec_lookup_tag+0x2e/0x40
  [<c01ddf1b>] ? write_cache_pages+0xdb/0x3d0
  [<c02a7250>] ? __mpage_da_writepage+0x0/0x1a0
  [<c02a6ed6>] ext4_da_writepages+0x506/0x790
  [<c016beef>] ? cpu_clock+0x4f/0x60
  [<c016bca2>] ? sched_clock_local+0xd2/0x170
  [<c016be60>] ? sched_clock_cpu+0x120/0x160
  [<c016be60>] ? sched_clock_cpu+0x120/0x160
  [<c02a69d0>] ? ext4_da_writepages+0x0/0x790
  [<c01de272>] do_writepages+0x22/0x50
  [<c01d766d>] __filemap_fdatawrite_range+0x6d/0x80
  [<c01d7b9b>] filemap_flush+0x2b/0x30
  [<c02a40ac>] ext4_alloc_da_blocks+0x5c/0x60
  [<c029e595>] ext4_release_file+0x75/0xb0
  [<c0216b59>] __fput+0xf9/0x210
  [<c0216c97>] fput+0x27/0x30
  [<c02122dc>] filp_close+0x4c/0x80
  [<c014510e>] put_files_struct+0x6e/0xd0
  [<c01451b7>] exit_files+0x47/0x60
  [<c0146a24>] do_exit+0x144/0x710
  [<c017b163>] ? lock_release_holdtime+0x33/0x210
  [<c0528137>] ? _spin_unlock_irq+0x27/0x30
  [<c0147028>] do_group_exit+0x38/0xa0
  [<c017babb>] ? trace_hardirqs_on+0xb/0x10
  [<c0159abc>] get_signal_to_deliver+0x2ac/0x410
  [<c0102849>] do_notify_resume+0xb9/0x890
  [<c0178fd0>] ? trace_hardirqs_off_caller+0x20/0xd0
  [<c017b163>] ? lock_release_holdtime+0x33/0x210
  [<c0165b50>] ? autoremove_wake_function+0x0/0x50
  [<c017ba54>] ? trace_hardirqs_on_caller+0x134/0x190
  [<c017babb>] ? trace_hardirqs_on+0xb/0x10
  [<c0300ba4>] ? security_file_permission+0x14/0x20
  [<c0215761>] ? vfs_write+0x131/0x190
  [<c0214f50>] ? do_sync_write+0x0/0x120
  [<c0103115>] ? sysenter_do_call+0x27/0x32
  [<c01032d2>] work_notifysig+0x13/0x21

CC: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Jan Kara <jack@suse.cz>

d21cd8f1

ext4: Convert to generic reserved quota's space management. · a9e7f447

Authored by Dmitry Monakhov on Dec 14, 2009

This patch also fixes a write vs chown race condition.

Acked-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Jan Kara <jack@suse.cz>

a9e7f447

ext4, jbd2: Add barriers for file systems with exernal journals · cc3e1bea

Authored by Theodore Ts'o on Dec 23, 2009

This is a bit complicated because we are trying to optimize when we
send barriers to the fs data disk. We could just throw in an extra
barrier to the data disk whenever we send a barrier to the journal
disk, but that's not always strictly necessary.

We only need to send a barrier during a commit when there are data
blocks which must be written out due to an inode written in
ordered mode, or if fsync() depends on the commit to force data blocks
to disk. Finally, before we drop transactions from the beginning of
the journal during a checkpoint operation, we need to guarantee that
any blocks that were flushed out to the data disk are firmly on the
rust platter before we drop the transaction from the journal.

Thanks to Oleg Drokin for pointing out this flaw in ext3/ext4.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

cc3e1bea

18 Dec, 2009 1 commit

Revert "task_struct: make journal_info conditional" · b6e3224f

Authored by Linus Torvalds on Dec 17, 2009

This reverts commit e4c570c4, as
requested by Alexey:

 "I think I gave a good enough arguments to not merge it.
  To iterate:
   * patch makes impossible to start using ext3 on EXT3_FS=n kernels
     without reboot.
   * this is done only for one pointer on task_struct"

  None of config options which define task_struct are tristate directly
  or effectively."

Requested-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

b6e3224f

17 Dec, 2009 1 commit

sanitize xattr handler prototypes · 431547b3

Authored by Christoph Hellwig on Nov 13, 2009

Add a flags argument to struct xattr_handler and pass it to all xattr
handler methods.  This allows using the same methods for multiple
handlers, e.g. for the ACL methods which perform exactly the same action
for the access and default ACLs, just using a different underlying
attribute.  With a little more groundwork it'll also allow sharing the
methods for the regular user/trusted/secure handlers in extN, ocfs2 and
jffs2 like it's already done for xfs in this patch.

Also change the inode argument to the handlers to a dentry to allow
using the handlers mechanism for filesystems that require it later,
e.g. cifs.

[with GFS2 bits updated by Steven Whitehouse <swhiteho@redhat.com>]

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

431547b3

16 Dec, 2009 2 commits

tree-wide: convert open calls to remove spaces to skip_spaces() lib function · e7d2860b

Authored by André Goddard Rosa on Dec 14, 2009

Makes use of skip_spaces() defined in lib/string.c for removing leading
spaces from strings all over the tree.

It decreases lib.a code size by 47 bytes and reuses the function tree-wide:
   text    data     bss     dec     hex filename
  64688     584     592   65864   10148 (TOTALS-BEFORE)
  64641     584     592   65817   10119 (TOTALS-AFTER)

Also, while at it, if we see (*str && isspace(*str)), we can be sure to
remove the first condition (*str) as the second one (isspace(*str)) also
evaluates to 0 whenever *str == 0, making it redundant. In other words,
"a char equals zero is never a space".

Julia Lawall tried the semantic patch (http://coccinelle.lip6.fr) below,
and found occurrences of this pattern on 3 more files:
    drivers/leds/led-class.c
    drivers/leds/ledtrig-timer.c
    drivers/video/output.c

@@
expression str;
@@

( // ignore skip_spaces cases
while (*str &&  isspace(*str)) { \(str++;\|++str;\) }
|
- *str &&
isspace(*str)
)

Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Cc: Julia Lawall <julia@diku.dk>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Neil Brown <neilb@suse.de>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: David Howells <dhowells@redhat.com>
Cc: <linux-ext4@vger.kernel.org>
Cc: Samuel Ortiz <samuel@sortiz.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

e7d2860b
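
The redundancy argument in the skip_spaces() commit ("a char equal to zero is never a space") can be checked with a small userspace sketch. `demo_skip_spaces` is a hypothetical helper written for illustration; it is not a copy of the lib/string.c function.

```c
#include <ctype.h>

/* Userspace sketch of a skip_spaces()-style helper. No "*str &&" guard
 * is needed: isspace('\0') is 0, so the loop stops at the terminator. */
static const char *demo_skip_spaces(const char *str)
{
    while (isspace((unsigned char)*str))
        ++str;
    return str;
}
```

The cast to `unsigned char` is there because passing a negative `char` value to isspace() is undefined in standard C.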

task_struct: make journal_info conditional · e4c570c4

Authored by Hiroshi Shimamoto on Dec 14, 2009

journal_info in task_struct is used in journaling file systems only.  So
introduce CONFIG_FS_JOURNAL_INFO and make it conditional.

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

e4c570c4

14 Dec, 2009 1 commit

ext4: replace BUG() with return -EIO in ext4_ext_get_blocks · 034fb4c9

Authored by Surbhi Palande on Dec 14, 2009

This patch fixes the Kernel BZ #14286. When the address of an extent
corresponding to a valid block is corrupted, a -EIO should be reported
instead of a BUG(). This situation should not normally occur
except in the case of a corrupted filesystem. If however it does,
then the system should not panic directly but, depending on the mount
time options, appropriate action should be taken. If the mount options
so permit, the I/O should be gracefully aborted by returning a -EIO.

http://bugzilla.kernel.org/show_bug.cgi?id=14286

Signed-off-by: Surbhi Palande <surbhi.palande@canonical.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

034fb4c9

21 Dec, 2009 2 commits

ext4: add module aliases for ext2 and ext3 · 51b7e3c9

Authored by Theodore Ts'o on Dec 21, 2009

Add module aliases for ext2 and ext3 when CONFIG_EXT4_USE_FOR_EXT23 is
set.  This makes existing user-space tools like mkinitrd work
as is.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

51b7e3c9

ext4: Don't ask about supporting ext2/3 in ext4 if ext4 is not configured · 84c66473

Authored by David Howells on Dec 21, 2009

Don't offer to build ext2/3 support into ext4 if ext4 itself is not
configured on.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

84c66473

14 Dec, 2009 1 commit

ext4: remove unused #include <linux/version.h> · 149feb00

Authored by Huang Weiyi on Dec 14, 2009

Remove unused #include <linux/version.h>('s) in
  fs/ext4/block_validity.c
  fs/ext4/mballoc.h

Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

149feb00

10 Dec, 2009 3 commits

ext4: Support for 64-bit quota format · 5a20bdfc

Authored by Jan Kara on Nov 30, 2009

Add support for the new 64-bit quota format. It is enough to add proper
mount options handling. The rest is done by the generic code.

Signed-off-by: Jan Kara <jack@suse.cz>

5a20bdfc

ext4: Fix potential fiemap deadlock (mmap_sem vs. i_data_sem) · fab3a549

Authored by Theodore Ts'o on Dec 09, 2009

Fix the following potential circular locking dependency between
mm->mmap_sem and ei->i_data_sem:

    =======================================================
    [ INFO: possible circular locking dependency detected ]
    2.6.32-04115-gec044c5 #37
    -------------------------------------------------------
    ureadahead/1855 is trying to acquire lock:
     (&mm->mmap_sem){++++++}, at: [<ffffffff81107224>] might_fault+0x5c/0xac

    but task is already holding lock:
     (&ei->i_data_sem){++++..}, at: [<ffffffff811be1fd>] ext4_fiemap+0x11b/0x159

    which lock already depends on the new lock.

    the existing dependency chain (in reverse order) is:

    -> #1 (&ei->i_data_sem){++++..}:
           [<ffffffff81099bfa>] __lock_acquire+0xb67/0xd0f
           [<ffffffff81099e7e>] lock_acquire+0xdc/0x102
           [<ffffffff81516633>] down_read+0x51/0x84
           [<ffffffff811a2414>] ext4_get_blocks+0x50/0x2a5
           [<ffffffff811a3453>] ext4_get_block+0xab/0xef
           [<ffffffff81154f39>] do_mpage_readpage+0x198/0x48d
           [<ffffffff81155360>] mpage_readpages+0xd0/0x114
           [<ffffffff811a104b>] ext4_readpages+0x1d/0x1f
           [<ffffffff810f8644>] __do_page_cache_readahead+0x12f/0x1bc
           [<ffffffff810f86f2>] ra_submit+0x21/0x25
           [<ffffffff810f0cfd>] filemap_fault+0x19f/0x32c
           [<ffffffff81107b97>] __do_fault+0x55/0x3a2
           [<ffffffff81109db0>] handle_mm_fault+0x327/0x734
           [<ffffffff8151aaa9>] do_page_fault+0x292/0x2aa
           [<ffffffff81518205>] page_fault+0x25/0x30
           [<ffffffff812a34d8>] clear_user+0x38/0x3c
           [<ffffffff81167e16>] padzero+0x20/0x31
           [<ffffffff81168b47>] load_elf_binary+0x8bc/0x17ed
           [<ffffffff81130e95>] search_binary_handler+0xc2/0x259
           [<ffffffff81166d64>] load_script+0x1b8/0x1cc
           [<ffffffff81130e95>] search_binary_handler+0xc2/0x259
           [<ffffffff8113255f>] do_execve+0x1ce/0x2cf
           [<ffffffff81027494>] sys_execve+0x43/0x5a
           [<ffffffff8102918a>] stub_execve+0x6a/0xc0

    -> #0 (&mm->mmap_sem){++++++}:
           [<ffffffff81099aa4>] __lock_acquire+0xa11/0xd0f
           [<ffffffff81099e7e>] lock_acquire+0xdc/0x102
           [<ffffffff81107251>] might_fault+0x89/0xac
           [<ffffffff81139382>] fiemap_fill_next_extent+0x95/0xda
           [<ffffffff811bcb43>] ext4_ext_fiemap_cb+0x138/0x157
           [<ffffffff811be069>] ext4_ext_walk_space+0x178/0x1f1
           [<ffffffff811be21e>] ext4_fiemap+0x13c/0x159
           [<ffffffff811390e6>] do_vfs_ioctl+0x348/0x4d6
           [<ffffffff811392ca>] sys_ioctl+0x56/0x79
           [<ffffffff81028cb2>] system_call_fastpath+0x16/0x1b

    other info that might help us debug this:

    1 lock held by ureadahead/1855:
     #0:  (&ei->i_data_sem){++++..}, at: [<ffffffff811be1fd>] ext4_fiemap+0x11b/0x159

    stack backtrace:
    Pid: 1855, comm: ureadahead Not tainted 2.6.32-04115-gec044c5 #37
    Call Trace:
     [<ffffffff81098c70>] print_circular_bug+0xa8/0xb7
     [<ffffffff81099aa4>] __lock_acquire+0xa11/0xd0f
     [<ffffffff8102f229>] ? sched_clock+0x9/0xd
     [<ffffffff81099e7e>] lock_acquire+0xdc/0x102
     [<ffffffff81107224>] ? might_fault+0x5c/0xac
     [<ffffffff81107251>] might_fault+0x89/0xac
     [<ffffffff81107224>] ? might_fault+0x5c/0xac
     [<ffffffff81124b44>] ? __kmalloc+0x13b/0x18c
     [<ffffffff81139382>] fiemap_fill_next_extent+0x95/0xda
     [<ffffffff811bcb43>] ext4_ext_fiemap_cb+0x138/0x157
     [<ffffffff811bca0b>] ? ext4_ext_fiemap_cb+0x0/0x157
     [<ffffffff811be069>] ext4_ext_walk_space+0x178/0x1f1
     [<ffffffff811be21e>] ext4_fiemap+0x13c/0x159
     [<ffffffff81107224>] ? might_fault+0x5c/0xac
     [<ffffffff811390e6>] do_vfs_ioctl+0x348/0x4d6
     [<ffffffff8129f6d0>] ? __up_read+0x8d/0x95
     [<ffffffff81517fb5>] ? retint_swapgs+0x13/0x1b
     [<ffffffff811392ca>] sys_ioctl+0x56/0x79
     [<ffffffff81028cb2>] system_call_fastpath+0x16/0x1b
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

fab3a549

ext4: Do not override ext2 or ext3 if built they are built as modules · a214238d

Authored by Theodore Ts'o on Dec 09, 2009

The CONFIG_EXT4_USE_FOR_EXT23 option must not try to take over the
ext2 or ext3 file systems if those file system drivers are
configured to be built as modules.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

a214238d

07 Dec, 2009 1 commit

ext4: Fix insufficient checks in EXT4_IOC_MOVE_EXT · 4a58579b

Authored by Akira Fujita on Dec 06, 2009

This patch fixes three problems in the handling of the
EXT4_IOC_MOVE_EXT ioctl:

1. In current EXT4_IOC_MOVE_EXT, there are read access mode checks for
original and donor files, but they allow illegal write access to the
donor file, since the donor file is overwritten by original file data.
To fix this problem, change access mode checks of original (r->r/w) and
donor (r->w) files.

2.  Disallow the use of donor files that have the setuid or setgid bit
set.

3.  Call mnt_want_write() and mnt_drop_write() before and after
calling ext4_move_extents() to get write access to a mount.

Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

4a58579b

09 Dec, 2009 4 commits

ext4: Wait for proper transaction commit on fsync · b436b9be

Authored by Jan Kara on Dec 08, 2009

We cannot rely on buffer dirty bits during fsync because pdflush can come
before fsync is called and clear dirty bits without forcing a transaction
commit. What we do is track which transaction last changed
the inode and which transaction last changed allocation, and force it to
disk on fsync.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

b436b9be

ext4: fix incorrect block reservation on quota transfer. · 194074ac

Authored by Dmitry Monakhov on Dec 08, 2009

Inside a ->setattr() call both ATTR_UID and ATTR_GID may be valid.
This means that we may end up transferring all quotas. And
we have to reserve QUOTA_DEL_BLOCKS for all quotas, as we do in
the case of QUOTA_INIT_BLOCKS.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Reviewed-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

194074ac

ext4: quota macros cleanup · 5aca07eb

Authored by Dmitry Monakhov on Dec 08, 2009

Currently all quota block reservation macros contain a hard-coded "2",
aka the MAXQUOTAS value. This is no good because in some places it is not
obvious what this digit represents. Let's introduce
a new macro with a self-descriptive name.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Acked-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

5aca07eb

ext4: ext4_get_reserved_space() must return bytes instead of blocks · 8aa6790f

Authored by Dmitry Monakhov on Dec 08, 2009

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Acked-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

8aa6790f
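
The commit above has no message body, but its title describes a unit conversion. A minimal sketch of converting a reserved block count to bytes, assuming the block size is given as a power-of-two shift (`blkbits`); `demo_reserved_bytes` is a hypothetical helper, not the kernel function.

```c
#include <stdint.h>

/* Convert a count of reserved blocks to bytes. Assumes blkbits is small
 * enough that the shift does not overflow 64 bits. */
static uint64_t demo_reserved_bytes(uint64_t reserved_blocks,
                                    unsigned int blkbits)
{
    return reserved_blocks << blkbits;
}
```

With 4096-byte blocks (`blkbits = 12`), three reserved blocks correspond to 12288 bytes.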