Commits · 722887ddc8982ff40e40b650fbca9ae1e56259bc · openanolis / cloud-kernel

09 Feb, 2013 2 commits

ext4: move the jbd2 wrapper functions out of super.c · 722887dd

Committed by Theodore Ts'o on Feb 08, 2013

Move the jbd2 wrapper functions which start and stop handles out of
super.c, where they don't really logically belong, and into
ext4_jbd2.c.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

722887dd

jbd2: add tracepoints which provide per-handle statistics · 343d9c28

Committed by Theodore Ts'o on Feb 08, 2013

Handles which stay open a long time are problematic when it comes time
to close down a transaction so it can be committed. These tracepoints
will help us determine which ones are the problematic ones, and to
validate whether changes make things better or worse.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

343d9c28

07 Feb, 2013 2 commits

jbd2: revert "jbd2: add COW fields to struct jbd2_journal_handle" · 078d5039

Committed by Theodore Ts'o on Feb 07, 2013

This reverts commit 93737456.

The cow-snapshots effort is no longer active, so remove these extra
fields to shrink down the handle structure.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>

078d5039

jbd2: track request delay statistics · 9fff24aa

Committed by Theodore Ts'o on Feb 06, 2013

Track the delay between when we first request that the commit begin
and when it actually begins, so we can see how much of a gap exists.
In theory, this should just be the remaining scheduling quantum of
the thread which requested the commit (assuming it was not a
synchronous operation which triggered the commit request) plus
scheduling overhead; however, it's possible that real time processes
might keep the kjournald thread from executing.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
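
The measurement described above can be illustrated with a minimal user-space sketch. It is not the jbd2 code: `journal_sketch`, `request_commit()` and `begin_commit()` are hypothetical names, and plain tick counts stand in for whatever clock the kernel uses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified journal state (not the jbd2 structures). */
struct journal_sketch {
    int pending;                /* 1 while a commit request is outstanding */
    unsigned long requested_at; /* tick count of the first request */
    unsigned long max_delay;    /* largest request delay seen so far */
};

/* Record the first commit request; repeated requests keep the earlier timestamp. */
void request_commit(struct journal_sketch *j, unsigned long now)
{
    assert(j != NULL);
    if (!j->pending) {
        j->pending = 1;
        j->requested_at = now;
    }
}

/* Called when the commit actually begins; returns the request delay in ticks. */
unsigned long begin_commit(struct journal_sketch *j, unsigned long now)
{
    unsigned long delay = 0;

    assert(j != NULL);
    if (j->pending) {
        delay = now - j->requested_at;
        j->pending = 0;
    }
    if (delay > j->max_delay)
        j->max_delay = delay;
    return delay;
}
```

A commit that begins without a prior request reports a delay of zero in this sketch.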

9fff24aa

05 Feb, 2013 1 commit

ext4: optimize mballoc for large allocations · 40ae3487

Committed by Theodore Ts'o on Feb 04, 2013

The ext4 block allocator only maintains buddy bitmaps for chunks which
are less than or equal to one quarter of a block group.  That is, for
a file system with a 1k blocksize, and where the number of blocks in a
block group is 8192 blocks, the largest chunk size tracked by buddy
bitmaps is 2048 blocks.

For a file system with a 4k blocksize, and where the number of blocks
in a block group is 32768 blocks, the largest chunk size tracked by
buddy bitmaps is 8192 blocks.

To work around this limitation, mballoc.c before this commit would truncate
allocation requests to the number of blocks in a block group minus 10.
Why 10?  Aside from being a completely arbitrary number, it keeps the
block allocation from being a power of two larger than 25% of the block
group.  If you try to explicitly fallocate 50% of the block group
size, this will demonstrate the problem; the block allocation code
will scan all of the blocks in the file system with cr==0 (since
the request is for a natural power of two), but then completely fail
for all block groups, since the buddy bitmaps don't track chunk sizes
of 50% of the block group.

To fix this, in these cases we use ext4_mb_complex_scan_group() instead of
ext4_mb_simple_scan_group().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger@dilger.ca>
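
The chunk-size arithmetic in this commit message can be checked with a small sketch. It only models the numbers quoted above; the helper names are hypothetical and the real decision logic lives in mballoc.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Largest chunk tracked by buddy bitmaps: one quarter of a block group. */
unsigned long largest_buddy_chunk(unsigned long blocks_per_group)
{
    return blocks_per_group / 4;
}

bool is_power_of_two(unsigned long n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/*
 * True for the problematic case from the commit message: a power-of-two
 * request that is larger than anything the buddy bitmaps track.
 * (Hypothetical helper, named for illustration only.)
 */
bool needs_complex_scan(unsigned long request, unsigned long blocks_per_group)
{
    return is_power_of_two(request) &&
           request > largest_buddy_chunk(blocks_per_group);
}
```

For a 4k-blocksize file system with 32768 blocks per group, a request for 16384 blocks (50% of the group) hits the problematic case, while the old truncated size of 32768 - 10 blocks does not, because it is not a power of two.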

40ae3487

03 Feb, 2013 4 commits

ext4: check incompatible mount options while mounting ext2/3 · 8dc0aa8c

Committed by Theodore Ts'o on Feb 02, 2013

Check for incompatible mount options when using the ext4 file system
driver to mount ext2 or ext3 file systems.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

8dc0aa8c

ext4: print error when argument of inode_readahead_blk is invalid · e33e60ea

Committed by Jan Kara on Feb 02, 2013

If the argument of inode_readahead_blk is too big, we just bail out
without printing any error. Fix this since it could confuse users.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

e33e60ea

ext4: make mount option parsing loop more logical · 5f3633e3

Committed by Jan Kara on Feb 02, 2013

The loop looking for the correct mount option entry is more logical if it is
rewritten as an empty loop looking for the correct option entry and then
code handling the option. It also saves one level of indentation for a lot of
code so we can join a couple of split lines.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

5f3633e3

ext4: move several mount options to standard handling loop · 0efb3b23

Committed by Jan Kara on Feb 02, 2013

Several mount options (resuid, resgid, journal_dev, journal_ioprio) are
currently handled before we enter the standard option handling loop. I don't
see a reason for this so move them to the normal handling loop to make things
more regular.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

0efb3b23

02 Feb, 2013 4 commits

ext4: reduce one "if" comparison in ext4_dirhash() · 0e79537d

Committed by Cong Ding on Feb 01, 2013

It is unnecessary to check i<4 after the loop; just do it before the
break.
Signed-off-by: Cong Ding <dinggnu@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

0e79537d

ext4: fix race in ext4_mb_add_n_trim() · f1167009

Committed by Niu Yawei on Feb 01, 2013

In ext4_mb_add_n_trim(), lg_prealloc_lock should be taken when
changing the lg_prealloc_list.
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org

f1167009

ext4: fix smatch warning in move_extent.c's mext_replace_branches() · 87e69873

Committed by Akria Fujita on Feb 01, 2013

Commit 2147b1a6 resulted in a new smatch warning:

> fs/ext4/move_extent.c:693 mext_replace_branches()
> 	 warn: variable dereferenced before check 'dext' (see line 683)

Fix this by adding a check to make sure dext is non-NULL before we
dereference it.
Signed-off-by: Akria Fujita <a-fujita@rs.jp.nec.com>
[ modified by tytso to make sure an ext4_error is called ]
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
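
The "dereferenced before check" pattern that smatch reports, and its fix, can be sketched as follows. This is an illustration only: `extent_sketch` and `extent_len_checked()` are hypothetical and do not reproduce mext_replace_branches().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an extent; not struct ext4_extent. */
struct extent_sketch {
    unsigned int len;
};

/*
 * Check the pointer first and only then dereference it.
 * Returns 0 on success or -EIO when no extent was found.
 */
int extent_len_checked(const struct extent_sketch *dext, unsigned int *len_out)
{
    assert(len_out != NULL);
    if (dext == NULL)
        return -EIO;       /* report the error instead of crashing */
    *len_out = dext->len;  /* safe: dext is known to be non-NULL here */
    return 0;
}
```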

87e69873

ext4: use WARN in ext4_alloc_blocks · 524c19eb

Committed by Julia Lawall on Feb 01, 2013

Use WARN rather than printk followed by WARN_ON(1), for conciseness.

A simplified version of the semantic patch that makes this transformation
is as follows: (http://coccinelle.lip6.fr/)

// <smpl>
@@
expression list es;
@@

-printk(
+WARN(1,
  es);
-WARN_ON(1);
// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

524c19eb

30 Jan, 2013 2 commits

jbd2: don't wake kjournald unnecessarily · e7b04ac0

Committed by Eric Sandeen on Jan 30, 2013

Don't send an extra wakeup to kjournald in the case where we
already have the proper target in j_commit_request, i.e. that
transaction has already been requested for commit.

commit deeeaf13 "jbd2: fix fsync() tid wraparound bug" changed
the logic leading to a wakeup, but it caused some extra wakeups
which were found to lead to a measurable performance regression.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
[tytso@mit.edu: reworked check to make it clearer]
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
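
The early-return idea can be sketched in user space as below. This is a simplified assumption of the check, not the jbd2 function: the struct, the `wakeups` counter and `start_commit()` are hypothetical, and the real code has additional conditions around the running transaction.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int tid_sketch_t; /* hypothetical transaction ID type */

struct journal_sketch {
    tid_sketch_t commit_request; /* transaction most recently requested for commit */
    unsigned int wakeups;        /* wakeups sent to the commit thread */
};

/* Returns 1 if a wakeup was sent, 0 if the target had already been requested. */
int start_commit(struct journal_sketch *j, tid_sketch_t target)
{
    assert(j != NULL);
    if (j->commit_request == target)
        return 0;               /* already requested: skip the extra wakeup */
    j->commit_request = target;
    j->wakeups++;               /* stands in for waking the commit thread */
    return 1;
}
```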

e7b04ac0

ext4: fix possible use-after-free with AIO · 091e26df

Committed by Jan Kara on Jan 29, 2013

A running AIO pins the inode in memory through a file reference. Once the AIO
is completed using aio_complete(), the file reference is put and the inode can
be freed from memory. So we have to be sure that calling aio_complete()
is the last thing we do with the inode.

CC: stable@vger.kernel.org
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

091e26df

29 Jan, 2013 10 commits

ext4: remove unnecessary NULL pointer check · b1deefc9

Committed by Guo Chao on Jan 28, 2013

brelse() and ext4_journal_force_commit() are both inlined and able
to handle NULL.
Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

b1deefc9

ext4: remove useless assignment in dx_probe() · 41be871f

Committed by Guo Chao on Jan 28, 2013

Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

41be871f

ext4: remove unused variable in add_dirent_to_buf() · 2bbbee2a

Committed by Guo Chao on Jan 28, 2013

After commit 978fef91 (create __ext4_insert_dentry for dir entry
insertion), 'reclen' is not used anymore.
Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

2bbbee2a

ext4: release buffer when checksum failed · d5ac7773

Committed by Guo Chao on Jan 28, 2013

Commit b0336e8d (ext4: calculate and verify checksums of directory
leaf blocks) and commit dbe89444 (ext4: Calculate and verify checksums
for htree nodes) forgot to release the buffer in some places when the
checksum failed.
Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
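
The class of bug fixed here, an error path that returns without dropping a reference, can be shown with a toy buffer type. Everything in this sketch is hypothetical (`buf_sketch`, the byte-sum checksum, the `live_buffers` counter); it only mirrors the shape of the fix.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy buffer with a stored checksum; not a struct buffer_head. */
struct buf_sketch {
    unsigned char data[16];
    unsigned char csum;
};

int live_buffers; /* outstanding buffers, so a leak is observable */

struct buf_sketch *buf_get(void)
{
    struct buf_sketch *b = calloc(1, sizeof(*b));
    if (b != NULL)
        live_buffers++;
    return b;
}

/* Like brelse(), releasing a NULL buffer is a no-op. */
void buf_release(struct buf_sketch *b)
{
    if (b == NULL)
        return;
    free(b);
    live_buffers--;
}

static unsigned char buf_csum(const struct buf_sketch *b)
{
    unsigned char sum = 0;
    for (size_t i = 0; i < sizeof(b->data); i++)
        sum += b->data[i];
    return sum;
}

/* Returns the buffer on success; on checksum mismatch releases it and returns NULL. */
struct buf_sketch *read_verified(unsigned char stored_csum)
{
    struct buf_sketch *b = buf_get();
    if (b == NULL)
        return NULL;
    b->csum = stored_csum;
    if (buf_csum(b) != b->csum) {
        buf_release(b); /* the release that the error path must not forget */
        return NULL;
    }
    return b;
}
```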

d5ac7773

ext4: remove explicit WARN_ON when ext4_map_blocks() fails · b06acd38

Committed by Lukas Czerner on Jan 28, 2013

In two places we call WARN_ON() before we print out the debug message,
however we agreed that the WARN_ON() is unnecessary at those places so
remove them.

Also use ext4_warning() instead of ext4_msg() and printk().
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

b06acd38

ext4: remove unused variable flags · cfa72754

Committed by Lukas Czerner on Jan 28, 2013

Remove unused variable flags from dump_completed_IO(). The code is
only exercised when EXT4FS_DEBUG is defined.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>

cfa72754

ext4: fix ext4_writepage() to achieve data=ordered guarantees · fe386132

Committed by Jan Kara on Jan 28, 2013

So far ext4_writepage() skipped writing pages that had any delayed or
unwritten buffers attached. When blocksize < pagesize this breaks
data=ordered mode guarantees as we can have a page with one freshly
allocated buffer whose allocation is part of the committing
transaction and another buffer in the page which is delayed or
unwritten. So fix this problem by calling ext4_bio_writepage()
anyway. It will submit mapped buffers and leave others alone.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

fe386132

ext4: Make ext4_bio_writepage() handle unprepared buffers · 8a850c3f

Committed by Jan Kara on Jan 28, 2013

So far ext4_bio_writepage() unconditionally cleared the dirty bit on all
buffers underlying the page. That implicitly assumes we can write all
buffers. So far that is true because callers of
ext4_bio_writepage() make sure all buffers in the page are mapped but:

a) it's a data corruption bug waiting to happen
b) in data=ordered mode when blocksize < pagesize we do need to write
   pages that may have only some of dirty buffers mapped.

So change ext4_bio_writepage() to skip buffers that cannot be written without
clearing their dirty bit.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
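
The behaviour described above, writing what can be written and leaving the rest dirty, can be sketched like this. The `bh_sketch` type is hypothetical and its single `mapped` flag is an assumption that collapses the real buffer states (delayed, unwritten, unmapped) into one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy per-buffer state; not a struct buffer_head. */
struct bh_sketch {
    bool dirty;     /* buffer holds data that must reach disk */
    bool mapped;    /* buffer has a block on disk and can be written now */
    bool submitted; /* buffer was handed to the I/O layer */
};

/*
 * Submit every dirty buffer that can be written. Buffers that cannot be
 * written are skipped and keep their dirty bit, so the data is not lost.
 * Returns the number of skipped buffers.
 */
size_t write_page_buffers(struct bh_sketch *bh, size_t n)
{
    size_t skipped = 0;

    assert(bh != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!bh[i].dirty)
            continue;
        if (!bh[i].mapped) {
            skipped++;          /* leave the dirty bit set */
            continue;
        }
        bh[i].submitted = true;
        bh[i].dirty = false;    /* cleared only once the buffer is submitted */
    }
    return skipped;
}
```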

8a850c3f

ext4: simplify mpage_add_bh_to_extent() · b6a8e62f

Committed by Jan Kara on Jan 28, 2013

The argument b_size of mpage_add_bh_to_extent() was bogus since it was
always == blocksize (which we can easily derive from inode->i_blkbits).
Also, the second branch of the condition:
	if (nrblocks >= EXT4_MAX_TRANS_DATA) {
	} else if ((nrblocks + (b_size >> mpd->inode->i_blkbits)) >
						EXT4_MAX_TRANS_DATA) {
	}
was never taken because (b_size >> mpd->inode->i_blkbits) == 1.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
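
The claim that the second branch was dead code follows from simple arithmetic: when b_size equals the block size, b_size >> i_blkbits is 1, and for integers nrblocks + 1 > max implies nrblocks >= max, which the first branch already caught. A small sketch of that reasoning, with hypothetical function names and `max` standing in for EXT4_MAX_TRANS_DATA:

```c
#include <assert.h>

/* Blocks covered by a buffer of b_size bytes when blocks are 2^blkbits bytes. */
unsigned int blocks_in_buffer(unsigned int b_size, unsigned int blkbits)
{
    return b_size >> blkbits;
}

/* Which branch of the quoted condition fires: 1 = first, 2 = second, 0 = neither. */
int branch_taken(unsigned int nrblocks, unsigned int b_size,
                 unsigned int blkbits, unsigned int max)
{
    if (nrblocks >= max)
        return 1;
    else if (nrblocks + blocks_in_buffer(b_size, blkbits) > max)
        return 2;
    return 0;
}
```

With b_size fixed at one block, no value of nrblocks makes `branch_taken()` return 2.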

b6a8e62f

ext4: dirty page has always buffers attached · f8bec370

Committed by Jan Kara on Jan 28, 2013

ext4_writepage(), write_cache_pages_da(), and mpage_da_submit_io()
don't have to deal with the case when a page doesn't have buffers. We
attach buffers to a page in ->write_begin() and ->page_mkwrite() which
covers all places where a page can become dirty.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

f8bec370

28 Jan, 2013 6 commits

ext4: simplify list handling in ext4_do_flush_completed_IO() · 002bd7fa

Committed by Jan Kara on Jan 28, 2013

The function splices i_completed_io_list to its private list
first.  From that moment on we don't need any lock for working with
io_end structures because all io_end structures on the list are only
our own. So we can remove the other two lists in the function and free
io_end immediately after we are done with it.

CC: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

002bd7fa

ext4: move work from io_end to inode · 84c17543

Committed by Jan Kara on Jan 28, 2013

It does not make much sense to have struct work in ext4_io_end_t
because we always use it for only one ext4_io_end_t per inode (the
first one in the i_completed_io list). So just move the structure to
inode itself.  This also allows for a small simplification in
processing io_end structures.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

84c17543

ext4: remove __ext4_journalled_writepage() from mpage_da_submit_io() · fe089c77

Committed by Jan Kara on Jan 28, 2013

We don't support delayed allocation in data=journal mode. So checking for it in
mpage_da_submit_io() doesn't really make sense. If we ever decide to extend
delayed allocation support to data=journal mode, adding the
__ext4_journalled_writepage() call will be the least of the problems we have to
solve. Most likely we'd have to implement a separate writepages call anyway
because we don't have transaction credits for writing more than a single page
so mapping of page buffers would have to be done differently.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

fe089c77

ext4: use redirty_page_for_writepage() in ext4_bio_write_page() · 1ae48a63

Committed by Jan Kara on Jan 28, 2013

When we cannot write a page we should use redirty_page_for_writepage()
instead of plain set_page_dirty(). That tells writeback code we have
problems, redirties only the page (redirtying buffers is not needed),
and updates mm accounting of failed page writes.

Also move clearing of buffer dirty flag after io_submit_add_bh(). At that
moment we are sure buffer will be going to disk.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

1ae48a63

ext4: Always use ext4_bio_write_page() for writeout · 36ade451

Committed by Jan Kara on Jan 28, 2013

Currently we sometimes use block_write_full_page() and sometimes
ext4_bio_write_page() for writeback (depending on mount options and call
path). Let's always use ext4_bio_write_page() to simplify things a bit.
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

36ade451

ext4: add punching hole support for non-extent-mapped files · 8bad6fc8

Committed by Zheng Liu on Jan 28, 2013

This patch adds hole-punching support for indirect-mapped files.  It
is almost the same as ext4_ext_punch_hole.  First, we invalidate all
pages within this hole, and then we try to deallocate all blocks of
this hole.

A recursive function is used to handle deallocation of blocks.  In
this function, it iterates over the entries in the inode's i_blocks or
indirect blocks, and tries to free the block for each one of them.

After applying this patch, xfstest #255 will not pass w/o extent because
indirect-based files don't support unwritten extents.
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
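
The recursive deallocation described above can be illustrated with a toy in-memory block tree. This is only a sketch under strong assumptions: `blk_sketch`, `PTRS` and `free_branches()` are hypothetical, the tree is uniform in depth, and the range handling of a real hole punch is left out.

```c
#include <assert.h>
#include <stddef.h>

#define PTRS 4 /* entries per indirect block in this toy model */

/* Toy block: a data block, or an indirect block holding child pointers. */
struct blk_sketch {
    int allocated;
    struct blk_sketch *child[PTRS]; /* used only by indirect blocks */
};

/*
 * Walk the entries and free every block reachable from them.
 * depth == 0 means the entries point at data blocks; depth > 0 means
 * they point at indirect blocks whose children are freed first.
 * Returns the number of blocks freed.
 */
unsigned int free_branches(struct blk_sketch **entries, size_t n, int depth)
{
    unsigned int freed = 0;

    assert(entries != NULL);
    for (size_t i = 0; i < n; i++) {
        struct blk_sketch *b = entries[i];

        if (b == NULL || !b->allocated)
            continue;           /* unallocated entry: nothing to free */
        if (depth > 0)
            freed += free_branches(b->child, PTRS, depth - 1);
        b->allocated = 0;       /* "free" the block itself */
        entries[i] = NULL;      /* clear the pointer in the parent */
        freed++;
    }
    return freed;
}
```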

8bad6fc8

25 Jan, 2013 3 commits

ext4: fix memory leak when quota options are specified multiple times · 03dafb5f

Committed by Chen Gang on Jan 24, 2013

When usrjquota or grpjquota mount options are specified several times,
we leak memory storing the names. Free the memory correctly.
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
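
A leak of this kind typically comes from duplicating the option string and then returning early because a name is already stored. The sketch below shows one way to avoid it; it is an assumption about the shape of the problem, not the ext4 option parser, and `opts_sketch`, `set_quota_name_sketch()` and `live_names` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct opts_sketch {
    char *qf_name; /* stored quota file name, or NULL */
};

int live_names; /* outstanding name allocations, so a leak is observable */

/*
 * Store a copy of name the first time the option is seen. On a repeat,
 * free the fresh copy before returning: 0 if it matches the stored
 * name, -1 if it conflicts or on allocation failure.
 */
int set_quota_name_sketch(struct opts_sketch *o, const char *name)
{
    size_t len = strlen(name) + 1;
    char *copy = malloc(len);

    if (copy == NULL)
        return -1;
    memcpy(copy, name, len);
    live_names++;

    if (o->qf_name != NULL) {
        int ret = strcmp(o->qf_name, copy) == 0 ? 0 : -1;
        free(copy);     /* without this, every repeat leaks one copy */
        live_names--;
        return ret;
    }
    o->qf_name = copy;
    return 0;
}
```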

03dafb5f

quota: autoload the quota_v2 module for QFMT_VFS_V1 quota format · c3ad83d9

Committed by Theodore Ts'o on Jan 24, 2013

Otherwise, ext4 file systems with the quota feature enabled will get a
very confusing "No such process" error message if the quota code is
built as a module and the quota_v2 module has not been loaded.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Acked-by: Jan Kara <jack@suse.cz>
Cc: stable@vger.kernel.org

c3ad83d9

ext4: release sysfs kobject when failing to enable quotas on mount · 72ba7450

Committed by Theodore Ts'o on Jan 24, 2013

In addition, print the error returned from ext4_enable_quotas().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Cc: stable@vger.kernel.org

72ba7450

17 Jan, 2013 1 commit

ext4: add tracepoint in punching hole · aaddea81

Committed by Zheng Liu on Jan 16, 2013

This patch adds a tracepoint in ext4_punch_hole.

CC: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

aaddea81

13 Jan, 2013 4 commits

ext4: trigger the lazy inode table initialization after resize · 7f511862

Committed by Theodore Ts'o on Jan 13, 2013

After we have finished extending the file system, we need to trigger
the lazy inode table thread to zero out the inode tables.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

7f511862

ext4: check bh in ext4_read_block_bitmap() · 15b49132

Committed by Eryu Guan on Jan 12, 2013

Validate the bh pointer before using it, since
ext4_read_block_bitmap_nowait() might return NULL.

I've seen this in fsfuzz testing.

 EXT4-fs error (device loop0): ext4_read_block_bitmap_nowait:385: comm touch: Cannot get buffer for block bitmap - block_group = 0, block_bitmap = 3925999616
 BUG: unable to handle kernel NULL pointer dereference at           (null)
 IP: [<ffffffff8121de25>] ext4_wait_block_bitmap+0x25/0xe0
 ...
 Call Trace:
  [<ffffffff8121e1e5>] ext4_read_block_bitmap+0x35/0x60
  [<ffffffff8125e9c6>] ext4_free_blocks+0x236/0xb80
  [<ffffffff811d0d36>] ? __getblk+0x36/0x70
  [<ffffffff811d0a5f>] ? __find_get_block+0x8f/0x210
  [<ffffffff81191ef3>] ? kmem_cache_free+0x33/0x140
  [<ffffffff812678e5>] ext4_xattr_release_block+0x1b5/0x1d0
  [<ffffffff812679be>] ext4_xattr_delete_inode+0xbe/0x100
  [<ffffffff81222a7c>] ext4_free_inode+0x7c/0x4d0
  [<ffffffff812277b8>] ? ext4_mark_inode_dirty+0x88/0x230
  [<ffffffff8122993c>] ext4_evict_inode+0x32c/0x490
  [<ffffffff811b8cd7>] evict+0xa7/0x1c0
  [<ffffffff811b8ed3>] iput_final+0xe3/0x170
  [<ffffffff811b8f9e>] iput+0x3e/0x50
  [<ffffffff812316fd>] ext4_add_nondir+0x4d/0x90
  [<ffffffff81231d0b>] ext4_create+0xeb/0x170
  [<ffffffff811aae9c>] vfs_create+0xac/0xd0
  [<ffffffff811ac845>] lookup_open+0x185/0x1c0
  [<ffffffff8129e3b9>] ? selinux_inode_permission+0xa9/0x170
  [<ffffffff811acb54>] do_last+0x2d4/0x7a0
  [<ffffffff811af743>] path_openat+0xb3/0x480
  [<ffffffff8116a8a1>] ? handle_mm_fault+0x251/0x3b0
  [<ffffffff811afc49>] do_filp_open+0x49/0xa0
  [<ffffffff811bbaad>] ? __alloc_fd+0xdd/0x150
  [<ffffffff8119da28>] do_sys_open+0x108/0x1f0
  [<ffffffff8119db51>] sys_open+0x21/0x30
  [<ffffffff81618959>] system_call_fastpath+0x16/0x1b

Also fix comment for ext4_read_block_bitmap_nowait()
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org

15b49132

ext4: use unlikely to improve the efficiency of the kernel · aebf0243

Committed by Wang Shilong on Jan 12, 2013

Because the function 'sb_getblk' seldom fails and returns a NULL
value, it is better to use 'unlikely' to optimize the check.
Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
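
For readers unfamiliar with the hint, the sketch below shows the usual shape of such a check in user space. It assumes GCC or Clang, which provide `__builtin_expect`; the macro is named `unlikely_sketch` to make clear it is a local stand-in, and `use_block()` is a hypothetical caller.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Tells the compiler the condition is expected to be false (GCC/Clang builtin). */
#define unlikely_sketch(x) __builtin_expect(!!(x), 0)

/* Returns the value behind bh, or -ENOMEM on the rare NULL case. */
int use_block(const int *bh)
{
    if (unlikely_sketch(bh == NULL))
        return -ENOMEM; /* rare failure path, laid out off the hot path */
    return *bh;
}
```

The hint changes only code layout and branch prediction hints; the result of the check is the same with or without it.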

aebf0243

ext4: return ENOMEM if sb_getblk() fails · 860d21e2

Committed by Theodore Ts'o on Jan 12, 2013

The only reason for sb_getblk() failing is if it can't allocate the
buffer_head.  So ENOMEM is more appropriate than EIO.  In addition,
make sure that the file system is marked as being inconsistent if
sb_getblk() fails.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org

860d21e2

10 Jan, 2013 1 commit
  Linux 3.8-rc3 · 9931faca
  Committed by Linus Torvalds on Jan 09, 2013
  
  9931faca

openanolis / cloud-kernel synced successfully almost 2 years ago