Commits · 6d1ab10e69ff5f3cb63920ba965ec0f1f0bdaf8d · openeuler / raspberrypi-kernel

27 Sep, 2012 15 commits

Committed by Carlos Maiolino on Sep 27, 2012

When ext4_bread() returns NULL and err is set to zero, this means
there is no physical block mapped to the specified logical block
number.  (Prior to commit 90b0a973, err was uninitialized in this
case, which caused other problems.)

The directory handling routines use ext4_bread() in many places.  The
fact that ext4_bread() now returns NULL with err set to zero could
cause problems, since a number of these functions will simply return
the value of err if the result of ext4_bread() was the NULL pointer,
causing the caller of the function to think that the function was
successful.

Since directories should never contain holes, this case can only
happen if the file system is corrupted.  This commit audits all of the
callers of ext4_bread(), and makes sure they do the right thing if a
hole in a directory is found by ext4_bread().

Some ext4_bread() callers did not need any changes because they
already had their own hole detection paths.

Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

6d1ab10e
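
The convention audited in commit 6d1ab10e can be illustrated with a small userspace model. This is only a sketch: `struct buf`, `model_bread()` and `read_dir_block()` are hypothetical names, and the choice of -EIO for a directory hole is an assumption made for illustration, not a statement about the kernel code.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer head; not the kernel structure. */
struct buf { int data; };

/*
 * Model of the convention described above: a NULL result with *err == 0
 * means "no block is mapped here" (a hole), not success.
 */
struct buf *model_bread(struct buf *mapped, int io_error, int *err)
{
    *err = io_error;            /* 0 for a hole, negative errno on failure */
    return io_error ? NULL : mapped;
}

/*
 * Directory-style caller: a hole must be reported as an error (-EIO is an
 * assumption of this sketch) instead of leaking err == 0 to the caller.
 */
int read_dir_block(struct buf *mapped, int io_error)
{
    int err = 0;
    struct buf *bh = model_bread(mapped, io_error, &err);

    if (!bh) {
        if (err == 0)
            err = -EIO;         /* hole in a directory: treat as corruption */
        return err;
    }
    return 0;
}
```

A caller that simply did `return err;` on a NULL result would report success for the hole case, which is the failure mode the commit message describes.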

fs: reserve fallocate flag codepoint · bbdd6808

Committed by Theodore Ts'o on Sep 27, 2012

As discussed at the Plumber's Conference, reserve the flag bit 0x04 in
fallocate() to prevent collisions with a commonly used out-of-tree
patch which implements the no-hide-stale feature.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

bbdd6808
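
Reserving a codepoint, as commit bbdd6808 does, means the bit has a defined meaning but is not accepted by in-tree code. The sketch below shows one way such a mode check could look. The `MODEL_` macro names and `check_fallocate_mode()` are hypothetical; the values 0x01 and 0x02 match the existing keep-size and punch-hole flags, and naming 0x04 after no-hide-stale follows the commit message.

```c
#include <errno.h>

/* Flag values for illustration; the MODEL_ names are local to this sketch. */
#define MODEL_FALLOC_FL_KEEP_SIZE     0x01
#define MODEL_FALLOC_FL_PUNCH_HOLE    0x02
#define MODEL_FALLOC_FL_NO_HIDE_STALE 0x04  /* reserved codepoint, not implemented */

#define MODEL_SUPPORTED (MODEL_FALLOC_FL_KEEP_SIZE | MODEL_FALLOC_FL_PUNCH_HOLE)

/* Reject any mode bit outside the supported set, including the reserved one. */
int check_fallocate_mode(int mode)
{
    if (mode & ~MODEL_SUPPORTED)
        return -EOPNOTSUPP;
    return 0;
}
```

Because the bit is reserved rather than reused, a later in-tree flag cannot silently collide with kernels carrying the out-of-tree patch.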

ext4: remove redundant offset check in mext_check_arguments() · cbb4ee83

Committed by Wang Sheng-Hui on Sep 27, 2012

In the check code above, if orig_start != donor_start, we would
return -EINVAL. So here, orig_start must be equal to donor_start.
Remove the redundant check.

Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

cbb4ee83

ext4: don't clear orphan list on ro mount with errors · c25f9bc6

Committed by Eric Sandeen on Sep 26, 2012

If the file system contains errors and it is being mounted read-only,
don't clear the orphan list.  We should minimize changes to the file
system if it is mounted read-only.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

c25f9bc6

jbd2: fix assertion failure in commit code due to lacking transaction credits · b794e7a6

Committed by Jan Kara on Sep 26, 2012

ext4 users of data=journal mode with blocksize < pagesize were
occasionally hitting an assertion failure in
jbd2_journal_commit_transaction() that checks whether the transaction
has at least as many credits reserved as buffers attached.  The core of
the problem is that when a file gets truncated, buffers that still need
checkpointing or that are attached to the committing transaction are
left with buffer_mapped set. When this happens to buffers beyond i_size
attached to a page straddling i_size, a subsequent write extending the
file will see these buffers, and as they are mapped (but the underlying
blocks were freed), things go awry from here.

The assertion failure just coincidentally (and in this case luckily, as
we would otherwise start corrupting the filesystem) triggers due to
journal_head not being properly cleaned up as well.

We fix the problem by unmapping buffers if possible (in lots of cases we
just need a buffer attached to a transaction as a place holder but it
must not be written out anyway).  And in one case, we just have to bite
the bullet and wait for transaction commit to finish.

CC: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Jan Kara <jack@suse.cz>

b794e7a6

ext4: release donor reference when EXT4_IOC_MOVE_EXT ioctl fails · 9b687332

Committed by Djalal Harouni on Sep 26, 2012

When the EXT4_IOC_MOVE_EXT ioctl() fails on bigalloc file systems, we
should jump to the 'mext_out' label to release the donor file reference.

Signed-off-by: Djalal Harouni <tixxdz@opendz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

9b687332

ext4: enable FITRIM ioctl on bigalloc file system · aaf7d73e

Committed by Lukas Czerner on Sep 26, 2012

With minor tweaks to the minimum extent size to discard and to the
reporting of discarded bytes, FITRIM can be enabled on bigalloc file
systems and works without any problem.

This patch fixes minlen handling and discarded bytes reporting to
take bigalloc-enabled file systems into account, and finally
removes the restriction so that FITRIM can be used on file systems with
the bigalloc feature enabled.

Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

aaf7d73e

ext4: fix fdatasync() for files with only i_size changes · b71fc079

Committed by Jan Kara on Sep 26, 2012

The code that tracks when a transaction needs to be committed on
fdatasync(2) fails to handle the case where only the inode's i_size has
changed. In that case fdatasync(2) doesn't force the transaction with
the new i_size to disk, which can result in a wrong i_size after a crash.

Fix the issue by updating the inode's i_datasync_tid whenever its size is
updated.

CC: <stable@vger.kernel.org> # >= 2.6.32
Reported-by: Kristian Nielsen <knielsen@knielsen-hq.org>
Signed-off-by: Jan Kara <jack@suse.cz>

b71fc079
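
The idea behind commit b71fc079 can be modelled in userspace as follows. This is a sketch under stated assumptions: `struct model_inode` and both functions are hypothetical, and only the field name i_datasync_tid comes from the commit message. The wrap-safe comparison is a common idiom for transaction IDs, used here for illustration.

```c
#include <stdbool.h>

/* Hypothetical model of the per-inode transaction bookkeeping. */
struct model_inode {
    long long i_size;
    unsigned int i_sync_tid;      /* last transaction that touched the inode */
    unsigned int i_datasync_tid;  /* last transaction fdatasync must wait for */
};

/* A size update made inside transaction `tid` must be visible to fdatasync. */
void model_update_size(struct model_inode *inode, long long new_size,
                       unsigned int tid)
{
    inode->i_size = new_size;
    inode->i_sync_tid = tid;
    inode->i_datasync_tid = tid;  /* the step the fix adds */
}

/* fdatasync needs a commit if the recorded tid is newer than the last commit. */
bool model_fdatasync_needs_commit(const struct model_inode *inode,
                                  unsigned int committed_tid)
{
    return (int)(inode->i_datasync_tid - committed_tid) > 0;
}
```

Without the assignment to i_datasync_tid, a size-only change would leave fdatasync believing there was nothing to commit.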

ext4: always set i_op in ext4_mknod() · 6a08f447

Committed by Bernd Schubert on Sep 26, 2012

ext4_special_inode_operations has its own ifdef CONFIG_EXT4_FS_XATTR
to mask those methods, and ext4_iget() always sets i_op, so setting it
only conditionally in ext4_mknod() is inconsistent.

Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org

6a08f447

ext4: remove unused function ext4_ext_check_cache · 63fedaf1

Committed by Lukas Czerner on Sep 26, 2012

Remove the unused function ext4_ext_check_cache() and merge its code
back into ext4_ext_in_cache().

Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

63fedaf1

ext4: use kmem_cache_zalloc instead of kmem_cache_alloc/memset · 85556c9a

Committed by Wei Yongjun on Sep 26, 2012

Use kmem_cache_zalloc() instead of kmem_cache_alloc() and memset().

spatch with a semantic match was used to find this problem.
(http://coccinelle.lip6.fr/)

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

85556c9a
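
The transformation in commit 85556c9a has a close userspace analogue: replacing malloc() plus memset() with calloc(). The sketch below shows that analogue only; it does not use the kernel slab API, and `struct item` and both function names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical object type used only for this illustration. */
struct item { int id; char name[16]; };

/* Before: allocate, then clear the memory in a separate step. */
struct item *item_alloc_memset(void)
{
    struct item *it = malloc(sizeof(*it));

    if (it)
        memset(it, 0, sizeof(*it));
    return it;
}

/* After: a single call that returns zeroed memory. */
struct item *item_alloc_zeroed(void)
{
    return calloc(1, sizeof(struct item));
}
```

Both functions return the same zero-filled object; the second form is shorter and leaves no window in which the memory is allocated but not yet cleared.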

ext4: reimplement uninit extent optimization for move_extent_per_page() · 8c854473

Committed by Dmitry Monakhov on Sep 26, 2012

An uninitialized extent may become initialized (by a parallel writeback
task) at any moment after we drop i_data_sem, so we have to recheck the
extent's state after we hold the page's lock and i_data_sem.

If we are about to change a page's mapping we must hold the page's lock
in order to serialize against other users.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

8c854473

ext4: clean up online defrag bugs in move_extent_per_page() · bb557488

Committed by Dmitry Monakhov on Sep 26, 2012

Non-exhaustive list of bugs:
1) The uninitialized extent optimization does not hold the page's lock
   and simply replaces branches; after that the writeback code goes
   crazy because the block mapping changed under its feet:
   kernel BUG at fs/ext4/inode.c:1434!  ( 288'th xfstress)

2) An uninitialized extent may become initialized right after we
   drop i_data_sem, so the extent state must be rechecked

3) Locked pages go uptodate via the following sequence:
   ->readpage(page); lock_page(page); use_that_page(page)
   But after readpage() one may invalidate the page because it is
   uptodate and unlocked (the reclaimer does that)
   As a result: kernel bug at include/linux/buffer_head.c:133!

4) We call write_begin() with an already opened transaction, which
   results in the following deadlock:
->move_extent_per_page()
  ->ext4_journal_start()-> hold journal transaction
  ->write_begin()
    ->ext4_da_write_begin()
      ->ext4_nonda_switch()
        ->writeback_inodes_sb_if_idle()  --> will wait for journal_stop()

5) try_to_release_page() may fail, and it does fail if one of the page's
   bh's was pinned by the journal

6) If we are about to change a page's mapping we MUST hold its lock during
   the entire remapping procedure; this is true for both pages (the
   original and the donor one)

Fixes:

- Avoid (1) and (2) simply by temporarily dropping the uninitialized extent
  handling optimization; this will be reimplemented later.

- Fix (3) by manually forcing the page to uptodate state w/o dropping its lock

- Fix (4) by rearranging the existing locking:
  from: journal_start(); ->write_begin
  to: write_begin(); journal_extend()
- Fix (5) simply by checking the return value
- Fix (6) by locking both pages (the original and the donor one) during the
  extent swap with the help of mext_page_double_lock()

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

bb557488

ext4: online defrag is not supported for journaled files · f066055a

Committed by Dmitry Monakhov on Sep 26, 2012

Proper block swap for inodes with full journaling enabled is
a truly non-obvious task. In order to be on the safe side, let's
explicitly disable it for now.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org

f066055a

ext4: move_extent code cleanup · 03bd8b9b

Committed by Dmitry Monakhov on Sep 26, 2012

- Remove useless checks, because it is too late to check that inode != NULL
  once it has already been dereferenced several times.
- The double lock routines look very ugly and their lock ordering relies on
  the order of i_ino, while other kernel code relies on the order of pointers.
  Let's make them simple and clean.
- Check that the inodes belong to the same SB as soon as possible.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org

03bd8b9b
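
The pointer-ordered double locking mentioned in commit 03bd8b9b can be sketched as below. This is a single-threaded model for illustration only: `struct model_inode`, `model_lock()` and `model_double_lock()` are hypothetical names, and the "lock" is just a flag plus a sequence number that records the order of acquisition.

```c
#include <stdint.h>

/* Hypothetical inode model; lock_seq records when the lock was taken. */
struct model_inode {
    int locked;
    int lock_seq;
};

static int lock_counter;

static void model_lock(struct model_inode *inode)
{
    inode->locked = 1;
    inode->lock_seq = ++lock_counter;
}

/* Take both locks in address order so every caller agrees on the order. */
void model_double_lock(struct model_inode *a, struct model_inode *b)
{
    if ((uintptr_t)a > (uintptr_t)b) {
        struct model_inode *tmp = a;

        a = b;
        b = tmp;
    }
    model_lock(a);
    if (b != a)
        model_lock(b);
}
```

Whatever order the arguments are passed in, the lower-addressed object is locked first, which is what rules out an ABBA deadlock between two callers locking the same pair.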

26 Sep, 2012 2 commits

ext4: don't call update_backups() multiple times for the same bg · 0acdb887

Committed by Tao Ma on Sep 26, 2012

When performing an online resize, we add a bunch of groups at one time
in ext4_flex_group_add(), so in most cases many group descriptors
will be in the same group descriptor block. But at the end of this
function, update_backups() is called for every group descriptor, and the
same block is copied and journalled again and again.  It is really a
waste.

Fix things so we only update a particular bg descriptor block once and
skip subsequent updates of the same block.

Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

0acdb887
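
The deduplication described in commit 0acdb887 can be sketched as a loop that remembers the last descriptor block it handled. The function below is a hypothetical model, not the kernel code: it only counts how many backup updates would be issued when `count` consecutive groups are added and each descriptor block holds `desc_per_block` descriptors (which must be non-zero).

```c
/*
 * Count the backup updates needed for groups first_group ..
 * first_group + count - 1, updating each descriptor block only once.
 */
unsigned int count_backup_updates(unsigned int first_group, unsigned int count,
                                  unsigned int desc_per_block)
{
    unsigned int updates = 0;
    unsigned int last_block = 0;
    int have_last = 0;

    for (unsigned int g = first_group; g < first_group + count; g++) {
        unsigned int block = g / desc_per_block;

        if (have_last && block == last_block)
            continue;           /* same descriptor block: already backed up */
        last_block = block;
        have_last = 1;
        updates++;
    }
    return updates;
}
```

For example, adding 16 groups whose descriptors all live in one block yields a single update instead of 16.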

ext4: fix double unlock buffer mess during fs-resize · 7f1468d1

Committed by Dmitry Monakhov on Sep 25, 2012

bh_submit_read() is responsible for unlocking the bh on endio.  In
addition, we need to use bh_uptodate_or_lock() to avoid races.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

7f1468d1

24 Sep, 2012 3 commits

ext4: check free inode count before allocating an inode · f2a09af6

Committed by Yongqiang Yang on Sep 23, 2012

Recently, I encountered some corrupted filesystems in which some
groups' free inode counts were 65535; it seemed that the free inode
count had overflowed.  This patch teaches ext4 to check the free inode
count before allocating an inode.

Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

f2a09af6
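
A pre-allocation check in the spirit of commit f2a09af6 might look like the sketch below. `group_can_allocate_inode()` is a hypothetical helper. The zero test follows from the commit message; the upper-bound test against the per-group inode count is an extra assumption of this sketch and is not something the commit message states.

```c
#include <stdbool.h>

/* A group is a candidate only if its free-inode counter is non-zero and plausible. */
bool group_can_allocate_inode(unsigned int free_inodes,
                              unsigned int inodes_per_group)
{
    if (free_inodes == 0)
        return false;   /* nothing to allocate: try the next group */
    if (free_inodes > inodes_per_group)
        return false;   /* implausible counter (assumed extra check) */
    return true;
}
```

Skipping a group whose counter is already zero avoids decrementing it past zero, which on a 16-bit counter would wrap to 65535.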

ext4: check free block counters in ext4_mb_find_by_goal · 838cd0cf

Committed by Yongqiang Yang on Sep 23, 2012

Free block counters should be checked before doing allocation.

Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

838cd0cf

ext4: fix crash when accessing /proc/mounts concurrently · 50df9fd5

Committed by Herton Ronaldo Krzesinski on Sep 23, 2012

The crash was caused by a variable being erroneously declared static in
token2str().

In addition to /proc/mounts, the problem can also be easily replicated
by accessing /proc/fs/ext4/<partition>/options in parallel:

$ cat /proc/fs/ext4/<partition>/options > options.txt

... and then running the following command in two different terminals:

$ while diff /proc/fs/ext4/<partition>/options options.txt; do true; done

This is also the cause of a crash while running xfstests
#234, as reported in the following bug reports:

	https://bugs.launchpad.net/bugs/1053019
	https://bugzilla.kernel.org/show_bug.cgi?id=47731

Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Brad Figg <brad.figg@canonical.com>
Cc: stable@vger.kernel.org

50df9fd5
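
Commit 50df9fd5 fixes a variable that should not have been static. The general hazard of function-local static state can be shown with the sketch below; it is not the token2str() code, and `label_shared()` and `label_private()` are hypothetical names. A static local is shared by every caller, so concurrent readers can overwrite each other's state.

```c
#include <stdio.h>

/* Hazardous pattern: every caller shares one buffer. */
const char *label_shared(int token)
{
    static char buf[16];

    snprintf(buf, sizeof(buf), "opt%d", token);
    return buf;
}

/* Safer pattern: the caller supplies the storage. */
const char *label_private(int token, char *buf, size_t len)
{
    snprintf(buf, len, "opt%d", token);
    return buf;
}
```

Even without threads the sharing is visible: two calls to label_shared() return the same pointer, so the first result is overwritten by the second. With two tasks reading the options file in parallel, the same sharing becomes a race.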

20 Sep, 2012 3 commits

ext4: remove erroneous ext4_superblock_csum_set() in update_backups() · bef53b01

Committed by Tao Ma on Sep 20, 2012

The update_backups() function is used to back up all the metadata
blocks, so we should not take it for granted that 'data' points to
a super block and use ext4_superblock_csum_set() to calculate the
checksum there.  In the case where the data is a group descriptor block,
it will corrupt the last group descriptor, and then e2fsck will
complain about it.

As all the metadata checksums should already be OK when we do the
backup, remove the wrong ext4_superblock_csum_set() call and it should be
just fine.

Reported-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org

bef53b01

ext4: fix potential deadlock in ext4_nonda_switch() · 00d4e736

Committed by Theodore Ts'o on Sep 19, 2012

In ext4_nonda_switch(), if the file system is getting full we used to
call writeback_inodes_sb_if_idle().  The problem is that we can be
holding i_mutex already, and this causes a potential deadlock when
writeback_inodes_sb_if_idle() tries to take s_umount.  (See the
lockdep output below.)

As it turns out we don't need to hold s_umount; the fact that we
are in the middle of the write(2) system call will keep the superblock
pinned.  Unfortunately writeback_inodes_sb() checks to make sure
s_umount is taken, and the VFS uses a different mechanism for making
sure the file system doesn't get unmounted out from under us.  The
simplest way of dealing with this is to grab s_umount
using a trylock, and skip kicking the writeback flusher thread in the
very unlikely case that we can't take a read lock on s_umount without
blocking.

Also, we now check the criteria for kicking the writeback thread
before we decide whether to fall back to non-delayed writeback, so
if there are any outstanding delayed allocation writes, we try to get
them resolved as soon as possible.

   [ INFO: possible circular locking dependency detected ]
   3.6.0-rc1-00042-gce894ca #367 Not tainted
   -------------------------------------------------------
   dd/8298 is trying to acquire lock:
    (&type->s_umount_key#18){++++..}, at: [<c02277d4>] writeback_inodes_sb_if_idle+0x28/0x46

   but task is already holding lock:
    (&sb->s_type->i_mutex_key#8){+.+...}, at: [<c01ddcce>] generic_file_aio_write+0x5f/0xd3

   which lock already depends on the new lock.

   2 locks held by dd/8298:
    #0:  (sb_writers#2){.+.+.+}, at: [<c01ddcc5>] generic_file_aio_write+0x56/0xd3
    #1:  (&sb->s_type->i_mutex_key#8){+.+...}, at: [<c01ddcce>] generic_file_aio_write+0x5f/0xd3

   stack backtrace:
   Pid: 8298, comm: dd Not tainted 3.6.0-rc1-00042-gce894ca #367
   Call Trace:
    [<c015b79c>] ? console_unlock+0x345/0x372
    [<c06d62a1>] print_circular_bug+0x190/0x19d
    [<c019906c>] __lock_acquire+0x86d/0xb6c
    [<c01999db>] ? mark_held_locks+0x5c/0x7b
    [<c0199724>] lock_acquire+0x66/0xb9
    [<c02277d4>] ? writeback_inodes_sb_if_idle+0x28/0x46
    [<c06db935>] down_read+0x28/0x58
    [<c02277d4>] ? writeback_inodes_sb_if_idle+0x28/0x46
    [<c02277d4>] writeback_inodes_sb_if_idle+0x28/0x46
    [<c026f3b2>] ext4_nonda_switch+0xe1/0xf4
    [<c0271ece>] ext4_da_write_begin+0x27/0x193
    [<c01dcdb0>] generic_file_buffered_write+0xc8/0x1bb
    [<c01ddc47>] __generic_file_aio_write+0x1dd/0x205
    [<c01ddce7>] generic_file_aio_write+0x78/0xd3
    [<c026d336>] ext4_file_write+0x480/0x4a6
    [<c0198c1d>] ? __lock_acquire+0x41e/0xb6c
    [<c0180944>] ? sched_clock_cpu+0x11a/0x13e
    [<c01967e9>] ? trace_hardirqs_off+0xb/0xd
    [<c018099f>] ? local_clock+0x37/0x4e
    [<c0209f2c>] do_sync_write+0x67/0x9d
    [<c0209ec5>] ? wait_on_retry_sync_kiocb+0x44/0x44
    [<c020a7b9>] vfs_write+0x7b/0xe6
    [<c020a9a6>] sys_write+0x3b/0x64
    [<c06dd4bd>] syscall_call+0x7/0xb
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org

00d4e736
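
The trylock approach from commit 00d4e736 can be modelled in a few lines. This is a single-threaded sketch with hypothetical names (`struct model_rwsem`, `model_down_read_trylock()`, `model_try_kick_writeback()`); it imitates the shape of a non-blocking read lock, not the kernel's rwsem implementation.

```c
#include <stdbool.h>

/* Hypothetical reader/writer semaphore model. */
struct model_rwsem {
    int writer;     /* non-zero while a writer holds the lock */
    int readers;    /* number of readers currently holding it */
};

/* Non-blocking read lock: fails instead of sleeping when a writer holds it. */
bool model_down_read_trylock(struct model_rwsem *sem)
{
    if (sem->writer)
        return false;
    sem->readers++;
    return true;
}

void model_up_read(struct model_rwsem *sem)
{
    sem->readers--;
}

/* Returns true if writeback was kicked; skipping the kick is harmless. */
bool model_try_kick_writeback(struct model_rwsem *s_umount, int *kicks)
{
    if (!model_down_read_trylock(s_umount))
        return false;       /* would block: skip rather than risk a deadlock */
    (*kicks)++;             /* stands in for starting writeback */
    model_up_read(s_umount);
    return true;
}
```

Because the caller never sleeps on s_umount while holding i_mutex, the circular dependency reported by lockdep cannot form; the cost is an occasional skipped writeback kick.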

ext4: speed up truncate/unlink by not using bforget() unless needed · 18888cf0

Committed by Andrey Sidorov on Sep 19, 2012

Do not iterate over data blocks scanning for bh's to forget, as they
never exist. This reduces the time taken by the unlink / truncate syscalls.
Tested by continuously truncating a file that is being written by dd.
Another test is rm -rf of a linux tree while tar unpacks it. With
ordered data mode the condition unlikely(!tbh) was always met in
ext4_free_blocks. With journal data mode tbh was found only a few times,
so the optimisation is also possible.

Unlinking a fallocated 60G file after doing sync && echo 3 >
/proc/sys/vm/drop_caches && time rm --help

X86 before (linux 3.6-rc4):
# time rm -f test1
real    0m2.710s
user    0m0.000s
sys     0m1.530s

X86 after:
# time rm -f test1
real    0m0.644s
user    0m0.003s
sys     0m0.060s

MIPS before (linux 2.6.37):
# time rm -f test1
real    0m 4.93s
user    0m 0.00s
sys     0m 4.61s

MIPS after:
# time rm -f test1
real    0m 0.16s
user    0m 0.00s
sys     0m 0.06s
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrey Sidorov <qrxd43@motorola.com>

18888cf0

19 Sep, 2012 3 commits

ext4: fix online resizing when the # of block groups is constant · 59e31c15

Committed by Theodore Ts'o on Sep 19, 2012

Commit 1c6bd717 introduced a regression where an online resize
operation which did not change the number of block groups would fail,
for example:

	mke2fs -t /dev/vdc 60000
	mount /dev/vdc
	resize2fs /dev/vdc 60001

This was due to a bug in the logic regarding when to try converting
the filesystem to use meta_bg.

Also fix up a number of other minor issues with the online resizing
code: (a) Fix a sparse warning; (b) only check to make sure the device
is large enough once, instead of multiple times through the resize
loop.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

59e31c15

ext4: make orphan functions be no-op in no-journal mode · c9b92530

Committed by Anatol Pomozov on Sep 18, 2012

Instead of checking whether the handle is valid, we check if the journal
is enabled. This avoids taking the s_orphan_lock mutex in all cases
when there is no journal in use, including the error paths where
ext4_orphan_del() is called with a handle set to NULL.

Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

c9b92530

ext4: re-enable -o discard functionality in no-journal mode · b5e2368b

Committed by Theodore Ts'o on Sep 18, 2012

This is a revert of commit b56ff9d3, which removed the call to
ext4_issue_discard() to fix a BUG reported because
ext4_issue_discard() was being called from inside a block group
spinlock.  As it turns out this bug had already been fixed by Lukas
Czerner in commit 53fdcf99 by the simple expedient of moving the
call to ext4_issue_discard() outside the spinlock.

So it should be safe to re-enable this functionality, which I tested
by putting a BUG_ON(in_atomic) just after the restored callsite to
ext4_issue_discard().

Addresses-Google-Bug: #6750518
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Anatol Pomozov <anatol.pomozov@gmail.com>

b5e2368b

18 Sep, 2012 2 commits

ext4: fix possible non-initialized variable in htree_dirblock_to_tree() · 90b0a973

Committed by Carlos Maiolino on Sep 17, 2012

htree_dirblock_to_tree() declares an uninitialized 'err' variable,
which is passed by reference to other functions that are expected to
set it to their error codes.

It is passed to ext4_bread(), which then passes it to ext4_getblk(). If
ext4_map_blocks() returns 0 due to a lookup failure, leaving the
ext4_getblk() buffer_head uninitialized, ext4_getblk() returns to
ext4_bread() without initializing 'err', and ext4_bread() returns to
htree_dirblock_to_tree() with this variable still uninitialized.
htree_dirblock_to_tree() then passes the garbage value back to
ext4_htree_fill_tree(), which expects the number of directory entries
added to the rb-tree.  That function, in turn, might return a bogus
non-zero value because of the garbage left in 'err', leading the kernel
to an Oops in ext4_dx_readdir(): it expects a filled rb-tree node but
gets a NULL one, causing an invalid page request when trying to get a
fname struct from this NULL rb-tree node in this line:

fname = rb_entry(info->curr_node, struct fname, rb_hash);

The patch initializes the err variable in htree_dirblock_to_tree() to
guard against misuse by the called functions, and also fixes
ext4_getblk() to return an initialized 'err' variable when
ext4_map_blocks() fails a lookup.

Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

90b0a973

ext4: do not enable delalloc by default for ext2 · bc0b75f7

Committed by Theodore Ts'o on Sep 17, 2012

Signed-off-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

bc0b75f7

14 Sep, 2012 1 commit

ext4: advertise the fact that the kernel supports meta_bg resizing · 5e7bbef1

Committed by Theodore Ts'o on Sep 13, 2012

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

5e7bbef1

13 Sep, 2012 3 commits

ext4: log a resize update to the console every 10 seconds · 4da4a56e

Committed by Theodore Ts'o on Sep 13, 2012

For very long online resizes, a periodic update to the console log is
helpful for debugging and for progress reporting.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

4da4a56e

ext4: convert file system to meta_bg if needed during resizing · 1c6bd717

Committed by Theodore Ts'o on Sep 13, 2012

If we have run out of reserved gdt blocks, then clear the resize_inode
feature and enable the meta_bg feature, so that we can continue
resizing the file system seamlessly.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

1c6bd717

ext4: set bg_itable_unused when resizing · 93f90526

Committed by Theodore Ts'o on Sep 12, 2012

Set bg_itable_unused for file systems that have uninit_bg enabled.
This will speed up the first e2fsck run after the file system is
resized.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

93f90526

05 Sep, 2012 7 commits

ext4: add online resizing support for meta_bg and 64-bit file systems · 01f795f9

Committed by Yongqiang Yang on Sep 05, 2012

This patch adds support for resizing file systems with the meta_bg and
64bit features.

[ Added a fix by tytso to fix a divide by zero when resizing a
  filesystem from 14 TB to 18 TB.  Also fixed overhead accounting for
  meta_bg file systems.]

Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

01f795f9

ext4: grow the s_group_info array as needed · 28623c2f

Committed by Theodore Ts'o on Sep 05, 2012

Previously we allocated the s_group_info array with enough space for
any future possible growth of the file system via online resize.  This
is unfortunate because it wastes memory, and it doesn't work for the
meta_bg scheme, since there is no limit based on the number of
reserved gdt blocks.  So add the code to grow the s_group_info array
as needed.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

28623c2f

ext4: grow the s_flex_groups array as needed when resizing · 117fff10

Committed by Theodore Ts'o on Sep 05, 2012

Previously, we allocated the s_flex_groups array to the maximum size
that the file system could be resized.  There were two problems with
this approach.  First, it wasted memory in the common case where the
file system was not resized.  Secondly, once we start allowing online
resizing using the meta_bg scheme, there is no maximum size that the
file system can be resized.  So instead, we need to grow the
s_flex_groups array at online resize time.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

117fff10
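
Growing an array on demand, as commit 117fff10 describes for s_flex_groups, can be sketched in userspace as follows. `struct flex_array`, `round_up_pow2()` and `flex_array_grow()` are hypothetical names, and rounding the capacity up to a power of two is an assumption of this sketch, chosen so that repeated small resizes do not each trigger a reallocation.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable array of per-flex-group counters. */
struct flex_array {
    unsigned int *counts;
    size_t capacity;      /* number of allocated entries */
};

/* Round n (n >= 1) up to the next power of two. */
static size_t round_up_pow2(size_t n)
{
    size_t p = 1;

    while (p < n)
        p <<= 1;
    return p;
}

/* Ensure room for `needed` entries; new entries are zeroed. Returns 0 or -1. */
int flex_array_grow(struct flex_array *fa, size_t needed)
{
    size_t new_cap;
    unsigned int *p;

    if (needed <= fa->capacity)
        return 0;                       /* already large enough */
    new_cap = round_up_pow2(needed);
    p = realloc(fa->counts, new_cap * sizeof(*p));
    if (!p)
        return -1;                      /* old array is left untouched */
    memset(p + fa->capacity, 0, (new_cap - fa->capacity) * sizeof(*p));
    fa->counts = p;
    fa->capacity = new_cap;
    return 0;
}
```

Starting from an empty array, a request for 5 entries allocates 8, and later requests up to 8 entries cost nothing; existing values are preserved across each growth.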

ext4: avoid duplicate writes of the backup bg descriptor blocks · 2ebd1704

Committed by Yongqiang Yang on Sep 05, 2012

The resize code was needlessly writing the backup block group
descriptor blocks multiple times (once per block group) during an
online resize.

Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org

2ebd1704

ext4: don't copy non-existent gdt blocks when resizing · 6df935ad

Committed by Yongqiang Yang on Sep 05, 2012

The resize code was copying blocks at the beginning of each block
group in order to copy the superblock and block group descriptor table
(gdt) blocks.  This was, unfortunately, being done even for block
groups that did not have super blocks or gdt blocks.  This is a
complete waste of perfectly good I/O bandwidth, so skip writing those
blocks for sparse bg's.

Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org

6df935ad

ext4: report the original old blocks count in a debug message when resizing · d7574ad0

Committed by Yongqiang Yang on Sep 05, 2012

Avoid changing o_blocks_count, since it is used later when reporting
the old blocks count in debug mode.

Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

d7574ad0

ext4: ignore last group w/o enough space when resizing instead of BUG'ing · 03c1c290

Committed by Yongqiang Yang on Sep 05, 2012

If the last group does not have enough space for group tables, ignore
it instead of calling BUG_ON().

Reported-by: Daniel Drake <dsd@laptop.org>
Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org

03c1c290

20 Aug, 2012 1 commit

ext4: remove duplicated declarations in inode.c · 8a2f8460

Committed by Zheng Liu on Aug 19, 2012

In patch cb20d518, ext4_set_bh_endio() and ext4_end_io_buffer_write()
are declared at the beginning of inode.c, and again later on in the
middle of the file.  Remove the second set of duplicated function
declarations.

Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

8a2f8460