Commits · 39db00f1c45e770856264bdb3ceca27980b01965 · openanolis / cloud-kernel

01 May, 2011 1 commit

ext4: don't set PageUptodate in ext4_end_bio() · 39db00f1

Authored by Curt Wohlgemuth on Apr 30, 2011

In the bio completion routine, we should not be setting
PageUptodate at all -- it's set at sys_write() time, and is
unaffected by success/failure of the write to disk.

This can cause a page corruption bug when the file system's
block size is less than the architecture's VM page size.

If we have only written a single block, we might end up
setting the page's PageUptodate flag, indicating that the page
is completely read into memory, which may not be true.
This could cause subsequent reads to get bad data.

This commit also takes the opportunity to clean up error
handling in ext4_end_bio(), and remove some extraneous code:

   - fix ext4_end_bio() to set AS_EIO in the
     page->mapping->flags on error, which was left out by
     mistake.  This is needed so that fsync() will
     return an error if there was an I/O error.
   - remove the clear_buffer_dirty() call on unmapped
     buffers for each page.
   - consolidate page/buffer error handling in a single
     section.

Signed-off-by: Curt Wohlgemuth <curtw@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reported-by: Jim Meyering <jim@meyering.net>
Reported-by: Hugh Dickins <hughd@google.com>
Cc: Mingming Cao <cmm@us.ibm.com>
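
The failure mode described in this commit is easier to see in a small userspace model. The sketch below is not kernel code: `struct model_page`, `MODEL_BLOCKS_PER_PAGE`, and the `model_*` functions are hypothetical names, and the layout of four 1 KiB blocks per 4 KiB page is an assumed configuration chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed layout: a 4 KiB page backed by four 1 KiB file system blocks. */
#define MODEL_BLOCKS_PER_PAGE 4

struct model_page {
    bool block_valid[MODEL_BLOCKS_PER_PAGE]; /* memory holds current data for this block */
    bool uptodate;                           /* stand-in for the PageUptodate flag */
};

/* A write of a single block makes only that block's memory valid. */
void model_write_block(struct model_page *page, size_t block)
{
    assert(block < MODEL_BLOCKS_PER_PAGE);
    page->block_valid[block] = true;
}

/* Old behaviour: the I/O completion handler marks the whole page up to date. */
void model_end_bio_buggy(struct model_page *page)
{
    page->uptodate = true;
}

/* Fixed behaviour: I/O completion leaves the flag alone. */
void model_end_bio_fixed(struct model_page *page)
{
    (void)page;
}

/* True when the flag claims the page is fully valid but some block is not. */
bool model_flag_overclaims(const struct model_page *page)
{
    if (!page->uptodate)
        return false;
    for (size_t i = 0; i < MODEL_BLOCKS_PER_PAGE; i++)
        if (!page->block_valid[i])
            return true;
    return false;
}
```

Writing block 0 and then calling `model_end_bio_buggy()` leaves `model_flag_overclaims()` true, which corresponds to a later read trusting three blocks that were never filled in. With `model_end_bio_fixed()` it stays false.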

19 Apr, 2011 1 commit

ext4: check for ext[23] file system features when mounting as ext[23] · 2035e776

Authored by Theodore Ts'o on Apr 18, 2011

When CONFIG_EXT4_USE_FOR_EXT23 is defined and ext4 is emulating the
ext[23] file system driver, provide better emulation by enforcing that
the file system does not have any file system features that are
unsupported as defined by ext[23].

This causes the file system type information in /proc/mounts to be
correct for the automatically mounted root file system.  This also
means that "mount -t ext2 /dev/sda /mnt" will fail if /dev/sda
contains an ext3 or ext4 file system, just as one would expect if the
original ext2 file system driver were in use.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
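
A feature check of this kind usually reduces to a bitmask test: refuse the mount if any on-disk feature bit falls outside the set the emulated driver supports. The sketch below shows only that idea for the "incompat" feature word. The bit values follow the ext2/3/4 superblock layout as far as I know, but the two `*_INCOMPAT_SUPP` masks and the function name `ext_features_allowed` are assumptions made for illustration, and the real check also covers the other feature words.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A subset of the "incompat" feature bits stored in the superblock. */
#define INCOMPAT_FILETYPE 0x0002u
#define INCOMPAT_RECOVER  0x0004u
#define INCOMPAT_META_BG  0x0010u
#define INCOMPAT_EXTENTS  0x0040u

/* Illustrative masks of what each emulated driver accepts (assumption). */
#define EXT2_INCOMPAT_SUPP (INCOMPAT_FILETYPE | INCOMPAT_META_BG)
#define EXT3_INCOMPAT_SUPP (INCOMPAT_FILETYPE | INCOMPAT_RECOVER | INCOMPAT_META_BG)

/* The mount may proceed only if no bit outside `supported` is set on disk. */
bool ext_features_allowed(uint32_t on_disk, uint32_t supported)
{
    return (on_disk & ~supported) == 0;
}

/* Example: a file system using extents is rejected in ext2 and ext3 mode. */
void ext_features_example(void)
{
    uint32_t ext4_fs = INCOMPAT_FILETYPE | INCOMPAT_EXTENTS;

    assert(!ext_features_allowed(ext4_fs, EXT2_INCOMPAT_SUPP));
    assert(!ext_features_allowed(ext4_fs, EXT3_INCOMPAT_SUPP));
}
```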

17 Apr, 2011 1 commit

ext4: release page cache in ext4_mb_load_buddy error path · 26626f11

Authored by Yang Ruirui on Apr 16, 2011

Add the missing page_cache_release() in the error path of
ext4_mb_load_buddy().

Signed-off-by: Yang Ruirui <ruirui.r.yang@tieto.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

11 Apr, 2011 4 commits

ext4: fix data corruption regression by reverting commit · c8205636

Authored by Theodore Ts'o on Apr 10, 2011

Revert commit 6de9843d, since it
caused a data corruption regression with BitTorrent downloads. Thanks
to Damien for discovering and bisecting to find the problem commit.

https://bugzilla.kernel.org/show_bug.cgi?id=32972

Reported-by: Damien Grassart <damien@grassart.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: Allow indirect-block file to grow the file size to max file size · f80da1e7

Authored by Kazuya Mio on Apr 10, 2011

We can create a 4402345721856-byte file with indirect block mapping.
However, if we grow an indirect-block file to that size with ftruncate(),
we see an ext4 warning. This patch fixes the problem.

How to reproduce:
# dd if=/dev/zero of=/mnt/mp1/hoge bs=1 count=0 seek=4402345721856
0+0 records in
0+0 records out
0 bytes (0 B) copied, 0.000221428 s, 0.0 kB/s
# tail -n 1 /var/log/messages
Nov 25 15:10:27 test kernel: EXT4-fs warning (device sda8): ext4_block_to_path:345: block 1074791436 > max in inode 12

Signed-off-by: Kazuya Mio <k-mio@sx.jp.nec.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
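
The two numbers in this report can be reproduced with a short calculation. The sketch below assumes the classic indirect layout (12 direct pointers plus single, double, and triple indirect blocks, with 4-byte block pointers) and a 4096-byte block size. The block size is an inference from the figures, not something the commit message states, and the function names are hypothetical. Other block sizes may be clamped by additional limits that this sketch ignores.

```c
#include <assert.h>
#include <stdint.h>

/* Blocks addressable through 12 direct pointers plus single, double,
 * and triple indirect blocks, with 4-byte block pointers. */
uint64_t indirect_max_blocks(uint64_t block_size)
{
    /* Require a power-of-two block size of at least 1 KiB. */
    assert(block_size >= 1024 && (block_size & (block_size - 1)) == 0);

    uint64_t ptrs = block_size / 4; /* pointers per indirect block */
    return 12 + ptrs + ptrs * ptrs + ptrs * ptrs * ptrs;
}

/* Largest file size, in bytes, that such a mapping can describe. */
uint64_t indirect_max_bytes(uint64_t block_size)
{
    return indirect_max_blocks(block_size) * block_size;
}
```

With a 4096-byte block size this gives 1074791436 blocks and 4402345721856 bytes. A file of exactly that size uses block indices 0 through 1074791435, so the block number 1074791436 in the warning is the first index past the end.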

ext4: allow an active handle to be started when freezing · be4f27d3

Authored by Yongqiang Yang on Apr 10, 2011

ext4_journal_start_sb() should not prevent an active handle from being
started due to s_frozen.  Otherwise a deadlock can easily occur, as in
the following scenario.

================================================
     freeze         |       truncate
================================================
                    |  ext4_ext_truncate()
    freeze_super()  |   starts a handle
    sets s_frozen   |
                    |  ext4_ext_truncate()
                    |  holds i_data_sem
  ext4_freeze()     |
  waits for updates |
                    |  ext4_free_blocks()
                    |  calls dquot_free_block()
                    |
                    |  dquot_free_block()
                    |  calls ext4_dirty_inode()
                    |
                    |  ext4_dirty_inode()
                    |  tries to start an active
                    |  handle
                    |
                    |  blocks due to s_frozen
================================================

Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reported-by: Amir Goldstein <amir73il@users.sf.net>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>

ext4: sync the directory inode in ext4_sync_parent() · 0893ed45

Authored by Curt Wohlgemuth on Apr 10, 2011

ext4 has taken the stance that, in the absence of a journal,
when an fsync/fdatasync of an inode is done, the parent
directory should be sync'ed if this inode entry is new.
ext4_sync_parent(), which implements this, does indeed sync
the dirent pages for parent directories, but it does not
sync the directory *inode*.  This patch fixes this.

Also now return error status from ext4_sync_parent().

I tested this using a power fail test, which panics a
machine running a file server getting requests from a
client.  Without this patch, on about every other test run,
the server is missing many, many files that had been synced.
With this patch, on > 6 runs, I see zero files being lost.

Google-Bug-Id: 4179519
Signed-off-by: Curt Wohlgemuth <curtw@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

06 Apr, 2011 1 commit

ext4: init timer earlier to avoid a kernel panic in __save_error_info · 04496411

Authored by Tao Ma on Apr 05, 2011

During mount, when we fail to open the journal inode or the root inode,
__save_error_info() calls mod_timer(). But s_err_report isn't
initialized yet at that point, so the kernel oopses. Details can
be found at https://bugzilla.kernel.org/show_bug.cgi?id=32082.

The best fix would be to check whether the s_err_report timer is
initialized. But include/linux/timer.h does not seem to offer a
good function to check the status of this timer, so this patch just
moves the initialization of s_err_report earlier so that we avoid
the kernel panic. The corresponding del_timer is also added in the
error path.

Reported-by: Sami Liedes <sliedes@cc.hut.fi>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

05 Apr, 2011 3 commits

ext4: fix a double free in ext4_register_li_request · 46e4690b

Authored by Tao Ma on Apr 04, 2011

In ext4_register_li_request, we allocate an ext4_li_request and
insert it into ext4_li_info->li_request_list. In case of any
error later, we free it at the end.  But if there is an error
in ext4_run_lazyinit_thread, the whole li_request_list is
dropped and freed there. So we would free this ext4_li_request twice.

This patch just sets elr to NULL after it is inserted into the list
so that the later kfree won't double free it.

Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org
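
The ownership hand-off described in this commit can be sketched in userspace C. This is a simplified model under stated assumptions, not the ext4 code: `struct li_req`, `struct li_list`, `li_list_drop_all()`, and `li_register()` are hypothetical names, and `fail_late` merely simulates an error that occurs after the request has been put on the list.

```c
#include <assert.h>
#include <stdlib.h>

struct li_req  { struct li_req *next; };
struct li_list { struct li_req *head; };

/* Free every request the list owns (what the failing path does). */
void li_list_drop_all(struct li_list *list)
{
    assert(list != NULL);
    while (list->head) {
        struct li_req *r = list->head;
        list->head = r->next;
        free(r);
    }
}

/* Returns 0 on success, -1 on failure. */
int li_register(struct li_list *list, int fail_late)
{
    int ret = 0;
    struct li_req *req = calloc(1, sizeof(*req));

    if (!req)
        return -1;

    req->next = list->head;
    list->head = req;
    req = NULL;                 /* ownership has moved to the list */

    if (fail_late) {
        li_list_drop_all(list); /* the list frees the request */
        ret = -1;
    }

    free(req);                  /* free(NULL) is a no-op: no double free */
    return ret;
}
```

Without the `req = NULL` line, the final `free(req)` on the failing path would release memory that `li_list_drop_all()` had already freed.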

ext4: fix credits computing for indirect mapped files · 5b41395f

Authored by Yongqiang Yang on Apr 04, 2011

When writing a contiguous set of blocks, two indirect blocks could be
needed depending on how the blocks are aligned, so we need to increase
the number of credits needed by one.

[ Also fixed another bug which could further underestimate the
  number of journal credits needed by 1; the code was using integer
  division instead of DIV_ROUND_UP() -- tytso]

Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org
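
Both effects mentioned in this commit, alignment and rounding, show up in a few lines of arithmetic. The sketch below is a simplified model and not the kernel's credit formula: it ignores the direct blocks and the higher indirection levels, and the function names are hypothetical. `ptrs` stands for the number of block pointers per indirect block.

```c
#include <assert.h>

/* Round-up division, the same idea as the kernel's DIV_ROUND_UP(). */
unsigned long div_round_up(unsigned long n, unsigned long d)
{
    assert(d > 0);
    return (n + d - 1) / d;
}

/* Distinct indirect blocks touched by `count` contiguous logical blocks
 * starting at index `first`. */
unsigned long indirect_blocks_spanned(unsigned long first, unsigned long count,
                                      unsigned long ptrs)
{
    assert(count > 0 && ptrs > 0);
    unsigned long last = first + count - 1;
    return last / ptrs - first / ptrs + 1;
}

/* Upper bound that holds for any alignment: round up, then add one. */
unsigned long indirect_blocks_bound(unsigned long count, unsigned long ptrs)
{
    return div_round_up(count, ptrs) + 1;
}
```

With `ptrs = 1024`, a run of 1024 blocks starting at index 0 touches one indirect block, but the same run starting at index 512 touches two. Plain integer division of 1025 by 1024 yields 1, while rounding up yields 2, which is the kind of underestimate the bracketed note refers to.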

ext4: remove unnecessary [cm]time update of quota file · 21f97697

Authored by Jan Kara on Apr 04, 2011

It is not necessary to update [cm]time of quota file on each quota
file write and it wastes journal space and IO throughput with inode
writes. So just remove the updating from ext4_quota_write() and only
update times when quotas are being turned off. Userspace cannot get
anything reliable from quota files while they are used by the kernel
anyway.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

31 Mar, 2011 1 commit

Fix common misspellings · 25985edc

Authored by Lucas De Marchi on Mar 30, 2011

Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>

24 Mar, 2011 5 commits

userns: rename is_owner_or_cap to inode_owner_or_capable · 2e149670

Authored by Serge E. Hallyn on Mar 23, 2011

And give it a kernel-doc comment.

[akpm@linux-foundation.org: btrfs changed in linux-next]

Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ext4: use little-endian bitops · 50e0168c

Authored by Akinobu Mita on Mar 23, 2011

In preparation for removing the ext2 non-atomic bit operations from
asm/bitops.h, convert the ext2 non-atomic bit operations to
little-endian bit operations.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
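
"Little-endian bit operations" here means that bit numbering follows byte order in memory, so an on-disk bitmap reads the same on any CPU. A minimal portable sketch of that numbering follows. The `le_*` names are hypothetical and are not the kernel's helpers; the point is only that bit `nr` lives in byte `nr / 8` at position `nr % 8`, independent of the host's word size or endianness.

```c
#include <assert.h>
#include <stddef.h>

/* Test bit `nr` of a bitmap laid out in little-endian bit order. */
int le_test_bit(size_t nr, const void *addr)
{
    const unsigned char *p = addr;

    assert(addr != NULL);
    return (p[nr / 8] >> (nr % 8)) & 1;
}

/* Set bit `nr` (non-atomic). */
void le_set_bit(size_t nr, void *addr)
{
    unsigned char *p = addr;

    assert(addr != NULL);
    p[nr / 8] |= (unsigned char)(1u << (nr % 8));
}

/* Clear bit `nr` (non-atomic). */
void le_clear_bit(size_t nr, void *addr)
{
    unsigned char *p = addr;

    assert(addr != NULL);
    p[nr / 8] &= (unsigned char)~(1u << (nr % 8));
}
```

Because the helpers address the bitmap one byte at a time, setting bit 9 always changes the second byte to 0x02, whether the host is big-endian or little-endian.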

ext4: fix a BUG in mb_mark_used during trim. · 0ba08517

Authored by Tao Ma on Mar 23, 2011

On a bs=4096 volume, calling FITRIM with the parameters
fstrim_range(start = 102400, len = 134144000, minlen = 10240)
triggers this BUG_ON:

BUG_ON(start + len > (e4b->bd_sb->s_blocksize << 3));

Mar 4 00:55:52 boyu-tm kernel: ------------[ cut here ]------------
Mar 4 00:55:52 boyu-tm kernel: kernel BUG at fs/ext4/mballoc.c:1506!
Mar 4 01:21:09 boyu-tm kernel: Code: d4 00 00 00 00 49 89 fe 8b 56 0c 44 8b 7e 04 89 55 c4 48 8b 4f 28 89 d6 44 01 fe 48 63 d6 48 8b 41 18 48 c1 e0 03 48 39 c2 76 04 <0f> 0b eb fe 48 8b 55 b0 8b 47 34 3b 42 08 74 04 0f 0b eb fe 48
Mar 4 01:21:09 boyu-tm kernel: RIP [<ffffffffa053eb42>] mb_mark_used+0x47/0x26c [ext4]
Mar 4 01:21:09 boyu-tm kernel: RSP <ffff880121e45c38>
Mar 4 01:21:09 boyu-tm kernel: ---[ end trace 9f461696f6a9dcf2 ]---

Fix this bug by doing the accounting correctly.

Cc: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: unused variables cleanup in fs/ext4/extents.c · 65922cb5

Authored by Sergey Senozhatsky on Mar 23, 2011

ext4 extents cleanup:

  . remove unused `*ex' from check_eofblocks_fl
  . remove unused `*eh' from ext4_ext_map_blocks

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: remove redundant set_buffer_mapped() in ext4_da_get_block_prep() · 6de9843d

Authored by Feng Tang on Mar 23, 2011

The map_bh() call will have already set the buffer_head to mapped.

Signed-off-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

22 Mar, 2011 3 commits

ext4: add more tracepoints and use dev_t in the trace buffer · 0562e0ba

Authored by Jiaying Zhang on Mar 21, 2011

- Add more ext4 tracepoints.
- Change ext4 tracepoints to use a dev_t field with MAJOR/MINOR macros
  so that we can save 4 bytes in the ring buffer on some platforms.
- Add sync_mode to ext4_da_writepages, ext4_da_write_pages, and
  ext4_da_writepages_result tracepoints. Also remove the for_reclaim
  field from ext4_da_writepages since it is usually not very useful.

Signed-off-by: Jiaying Zhang <jiayingz@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
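
The space saving comes from storing one packed device number and splitting it only when the event is printed. The sketch below mirrors the kernel-internal dev_t encoding as I understand it (a 32-bit value with the minor number in the low 20 bits). The `K_*` macros and `k_*` functions are hypothetical userspace stand-ins for the kernel's MKDEV/MAJOR/MINOR, not the macros themselves.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed encoding: 12-bit major in the high bits, 20-bit minor in the low bits. */
#define K_MINORBITS 20
#define K_MINORMASK ((1u << K_MINORBITS) - 1)

/* Pack a (major, minor) pair into one 32-bit value. */
uint32_t k_mkdev(uint32_t major, uint32_t minor)
{
    assert(major < (1u << 12) && minor <= K_MINORMASK);
    return (major << K_MINORBITS) | minor;
}

/* Recover the major number when formatting the trace event. */
uint32_t k_major(uint32_t dev)
{
    return dev >> K_MINORBITS;
}

/* Recover the minor number when formatting the trace event. */
uint32_t k_minor(uint32_t dev)
{
    return dev & K_MINORMASK;
}
```

A record that keeps a single 32-bit `dev` needs 4 bytes for the device, where two separate `int` fields for major and minor would need 8. For example, device 8:7 packs to 0x800007.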

ext4: don't kfree uninitialized s_group_info members · 4596fe07

Authored by Eric Sandeen on Mar 21, 2011

We can call kfree on uninitialized members of the s_group_info array
on the error path. We can avoid this by kzalloc'ing the array.

This doesn't entirely solve the oops on mount if we fail down this
path; failed_mount4: frees the sbi, for one, which gets referenced
later in the failed mount paths - I haven't worked that out yet.

https://bugzilla.kernel.org/show_bug.cgi?id=30872

Reported-by: Eugene A. Shatokhin <dame_eugene@mail.ru>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
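
The same pattern exists in userspace with calloc() and free(): if the pointer table starts out zeroed, the error path can free every slot without tracking how far initialization got. The sketch below is an analogy under stated assumptions, not the ext4 code. `alloc_group_table()`, `free_group_table()`, and the `fail_at` parameter (which simulates an allocation failure at a given slot) are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Release a fully or partially populated table. */
void free_group_table(void **table, size_t n)
{
    if (!table)
        return;
    for (size_t i = 0; i < n; i++)
        free(table[i]);   /* untouched slots are NULL; free(NULL) is a no-op */
    free(table);
}

/* Allocate n slots; pass fail_at == n for "no simulated failure".
 * Returns NULL after cleaning up if any allocation fails. */
void **alloc_group_table(size_t n, size_t fail_at)
{
    assert(n > 0);

    /* calloc zero-fills the table, so every slot starts out as NULL
     * (all-bits-zero is a null pointer on mainstream platforms). */
    void **table = calloc(n, sizeof(*table));
    if (!table)
        return NULL;

    for (size_t i = 0; i < n; i++) {
        table[i] = (i == fail_at) ? NULL : malloc(64);
        if (!table[i]) {
            free_group_table(table, n);
            return NULL;
        }
    }
    return table;
}
```

Had the table come from malloc() instead, the slots after the failing index would hold indeterminate values, and passing them to free() would be undefined behavior.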

ext4: add missing space in printk's in __ext4_grp_locked_error() · 21149d61

Authored by Robin Dong on Mar 21, 2011

While doing performance testing on an ext4 file system, we observed a
warning like this:

EXT4-fs error (device sda7): ext4_mb_generate_buddy:718: group 259825901 blocks in bitmap, 26057 in gd

Instead, it should be:

"group 2598, 25901 blocks in bitmap, 26057 in gd"

Reviewed-by: Coly Li <bosong.ly@taobao.com>
Cc: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Robin Dong <sanbai@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
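
The run-together number appears when a message is assembled from two pieces with no separator between them. The sketch below reproduces both strings with snprintf(). That the real message is built in two steps is presumed from the commit title; `format_group_error()` and its `sep` parameter are hypothetical and exist only to show the effect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Build the message in two steps, like two consecutive printk calls.
 * Returns the message length, or -1 if `buf` is too small. */
int format_group_error(char *buf, size_t size, const char *sep,
                       unsigned group, unsigned in_bitmap, unsigned in_gd)
{
    assert(buf != NULL && sep != NULL);

    /* First piece: the group number, followed by the separator (if any). */
    int n = snprintf(buf, size, "group %u%s", group, sep);
    if (n < 0 || (size_t)n >= size)
        return -1;

    /* Second piece: appended directly after the first one. */
    int m = snprintf(buf + n, size - (size_t)n,
                     "%u blocks in bitmap, %u in gd", in_bitmap, in_gd);
    if (m < 0 || (size_t)m >= size - (size_t)n)
        return -1;

    return n + m;
}
```

Calling it with an empty separator for group 2598 and 25901 blocks yields "group 259825901 blocks in bitmap, 26057 in gd"; with ", " as the separator it yields the intended "group 2598, 25901 blocks in bitmap, 26057 in gd".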

21 Mar, 2011 4 commits

ext4: add FITRIM to compat_ioctl. · a56e69c2

Authored by Tao Ma on Mar 20, 2011

FITRIM isn't handled in compat_ioctl, so a 32-bit program can't issue
it on a 64-bit platform. Add it to compat_ioctl.

Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: handle errors in ext4_clear_blocks() · d67d1218

Authored by Amir Goldstein on Mar 20, 2011

Checking return code from ext4_journal_get_write_access() is important
with snapshots, because this function invokes COW, so may return new
errors, such as ENOSPC.

ext4_clear_blocks() now returns < 0 for fatal errors, in which case,
ext4_free_data() is aborted.

Signed-off-by: Amir Goldstein <amir73il@users.sf.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: unify the ext4_handle_release_buffer() api · 537a0310

Authored by Amir Goldstein on Mar 20, 2011

There are two wrapper functions which do exactly the same thing:
ext4_journal_release_buffer(), and ext4_handle_release_buffer().  In
addition, ext4_xattr_block_set() calls jbd2_journal_release_buffer()
directly.

Unify all of the code to use ext4_handle_release_buffer(), and get rid
of ext4_journal_release_buffer().

Signed-off-by: Amir Goldstein <amir73il@users.sf.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: handle errors in ext4_rename · ef607893

Authored by Amir Goldstein on Mar 20, 2011

Checking return code from ext4_journal_get_write_access() is important
with snapshots, because this function invokes COW, so may return new
errors, such as ENOSPC.

We move the call to ext4_journal_get_write_access() earlier in the
function, to simplify error handling in the case that this function
returns an error.

Signed-off-by: Amir Goldstein <amir73il@users.sf.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

17 Mar, 2011 1 commit

ext4: Initialize fsync transaction ids in ext4_new_inode() · 688f869c

Authored by Theodore Ts'o on Mar 16, 2011

When allocating a new inode, we need to make sure i_sync_tid and
i_datasync_tid are initialized.  Otherwise, one or both of these two
values could be left initialized to zero, which could potentially
result in BUG_ON in jbd2_journal_commit_transaction.

(This could happen by having journal->commit_request getting set to
zero, which could wake up the kjournald process even though there is
no running transaction, which then causes a BUG_ON via the
J_ASSERT(j_running_transaction != NULL) statement.)

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

15 Mar, 2011 2 commits

ext4: Copy fs UUID to superblock · f2fa2ffc

Authored by Aneesh Kumar K.V on Jan 29, 2011

The file system UUID is made available to applications
via /proc/<pid>/mountinfo.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

fs: Remove i_nlink check from file system link callback · f17b6042

Authored by Aneesh Kumar K.V on Jan 29, 2011

Now that the VFS checks for inode->i_nlink == 0 and returns the proper
error, remove the similar check from the file system.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

10 Mar, 2011 2 commits

block: kill off REQ_UNPLUG · 721a9602

Authored by Jens Axboe on Mar 09, 2011

With the plugging now being explicitly controlled by the
submitter, callers need not pass down unplugging hints
to the block layer. If they want to unplug, it's because they
manually plugged on their own - in which case, they should just
unplug at will.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

block: remove per-queue plugging · 7eaceacc

Authored by Jens Axboe on Mar 10, 2011

Code has been converted over to the new explicit on-stack plugging,
and delay users have been converted to use the new API for that.
So let's kill off the old plugging along with aops->sync_page().

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

06 Mar, 2011 1 commit

ext4: Use single thread to perform DIO unwritten convertion · 198868f3

Authored by Mingming Cao on Mar 05, 2011

While running ext4 tests on multiple cores, we found that there are
per-CPU ext4-dio-unwritten threads processing the conversion from
unwritten extents to written for IOs completed from the async direct
IO patch.  One thread per filesystem is enough; we don't need per-CPU
threads to work on conversion.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>

01 Mar, 2011 1 commit

ext4: optimize ext4_bio_write_page() when no extent conversion is needed · b6168443

Authored by Theodore Ts'o on Feb 28, 2011

If no extent conversion is required, wake up any processes waiting for
the page's writeback to be complete and free the ext4_io_end structure
directly in ext4_end_bio() instead of dropping it on the linked list
(which requires taking a spinlock to queue and dequeue the io_end
structure), and waiting for the workqueue to do this work.

This removes an extra scheduling delay before a process waiting for an
fsync() to complete gets woken up, and it also reduces the CPU
overhead for a random write workload.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

28 Feb, 2011 6 commits

ext4: skip orphan cleanup if fs has unknown ROCOMPAT features · d39195c3

Authored by Amir Goldstein on Feb 28, 2011

Orphan cleanup is currently executed even if the file system has some
number of unknown ROCOMPAT features, which deletes inodes and frees
blocks, which could be very bad for some RO_COMPAT features,
especially the SNAPSHOT feature.

This patch skips the orphan cleanup if the file system contains
read-only compatible features not known to this ext4 implementation,
which would prevent the fs from being mounted (or remounted) read-write.

Signed-off-by: Amir Goldstein <amir73il@users.sf.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: use the nblocks arg to ext4_truncate_restart_trans() · 8e8eaabe

Authored by Amir Goldstein on Feb 27, 2011

nblocks is passed into ext4_truncate_restart_trans() from
ext4_ext_truncate_extend_restart() with a value different from the default
blocks_for_truncate(), but is being ignored.

The two other calls to ext4_truncate_restart_trans() already pass the
default value, which is then being recalculated inside the function.

Fix the problem by using the passed argument.

Signed-off-by: Amir Goldstein <amir73il@users.sf.net>

ext4: fix missing iput of root inode for some mount error paths · 32a9bb57

Authored by Manish Katiyar on Feb 27, 2011

This assures that the root inode is not leaked, and that sb->s_root is
NULL, which will prevent generic_shutdown_super() from doing extra
work, including calling sync_filesystem(), which ultimately results in
ext4_sync_fs() getting called with an uninitialized struct super,
which is the cause of the crash noted in Kernel Bugzilla #26752.

https://bugzilla.kernel.org/show_bug.cgi?id=26752

Signed-off-by: Manish Katiyar <mkatiyar@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: make FIEMAP and delayed allocation play well together · 6d9c85eb

Authored by Yongqiang Yang on Feb 27, 2011

Fix the FIEMAP ioctl so that it returns all of the page ranges which
are still subject to delayed allocation.  We were missing some cases
if the file was sparse.

Reported by Chris Mason <chris.mason@oracle.com>:
>We've had reports on btrfs that cp is giving us files full of zeros
>instead of actually copying them.  It was tracked down to a bug with
>the btrfs fiemap implementation where it was returning holes for
>delalloc ranges.
>
>Newer versions of cp are trusting fiemap to tell it where the holes
>are, which does seem like a pretty neat trick.
>
>I decided to give xfs and ext4 a shot with a few tests cases too, xfs
>passed with all the ones btrfs was getting wrong, and ext4 got the basic
>delalloc case right.
>$ mkfs.ext4 /dev/xxx
>$ mount /dev/xxx /mnt
>$ dd if=/dev/zero of=/mnt/foo bs=1M count=1
>$ fiemap-test foo
>ext:   0 logical: [       0..     255] phys:        0..     255
>flags: 0x007 tot: 256
>
>Horray!  But once we throw a hole in, things go bad:
>$ mkfs.ext4 /dev/xxx
>$ mount /dev/xxx /mnt
>$ dd if=/dev/zero of=/mnt/foo bs=1M count=1 seek=1
>$ fiemap-test foo
>< no output >
>
>We've got a delalloc extent after the hole and ext4 fiemap didn't find
>it.  If I run sync to kick the delalloc out:
>$sync
>$ fiemap-test foo
>ext:   0 logical: [     256..     511] phys:    34048..   34303
>flags: 0x001 tot: 256
>
>fiemap-test is sitting in my /usr/local/bin, and I have no idea how it
>got there.  It's full of pretty comments so I know it isn't mine, but
>you can grab it here:
>
>http://oss.oracle.com/~mason/fiemap-test.c
>
>xfsqa has a fiemap program too.

After the fix, test results are as follows:
ext:   0 logical: [     256..     511] phys:        0..     255
flags: 0x007 tot: 256
ext:   0 logical: [     256..     511] phys:    33280..   33535
flags: 0x001 tot: 256

$ mkfs.ext4 /dev/xxx
$ mount /dev/xxx /mnt
$ dd if=/dev/zero of=/mnt/foo bs=1M count=1 seek=1
$ sync
$ dd if=/dev/zero of=/mnt/foo bs=1M count=1 seek=3
$ dd if=/dev/zero of=/mnt/foo bs=1M count=1 seek=5
$ fiemap-test foo
ext:   0 logical: [     256..     511] phys:    33280..   33535
flags: 0x000 tot: 256
ext:   1 logical: [     768..    1023] phys:        0..     255
flags: 0x006 tot: 256
ext:   2 logical: [    1280..    1535] phys:        0..     255
flags: 0x007 tot: 256

Tested-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
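
The `flags:` values in the fiemap-test output above are bitmasks, and decoding them makes the before and after results easier to compare. The sketch below names only the three low bits. The values are intended to match the FIEMAP_EXTENT_* constants in <linux/fiemap.h>, but they are redefined locally under hypothetical `FE_*` names so that the example does not depend on kernel headers, and `fiemap_flags_str()` is likewise a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FE_LAST     0x00000001u /* last extent in the file */
#define FE_UNKNOWN  0x00000002u /* data location unknown */
#define FE_DELALLOC 0x00000004u /* location still pending (delayed allocation) */

/* Write a '|'-separated list of known flag names into buf and return buf. */
const char *fiemap_flags_str(uint32_t flags, char *buf, size_t size)
{
    static const struct { uint32_t bit; const char *name; } names[] = {
        { FE_LAST, "LAST" }, { FE_UNKNOWN, "UNKNOWN" }, { FE_DELALLOC, "DELALLOC" },
    };

    assert(buf != NULL && size > 0);
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
        if (!(flags & names[i].bit))
            continue;
        if (buf[0] != '\0')
            strncat(buf, "|", size - strlen(buf) - 1);
        strncat(buf, names[i].name, size - strlen(buf) - 1);
    }
    return buf;
}
```

Under this decoding, 0x007 is a delayed-allocation extent that is also the last one, 0x006 is a delayed-allocation extent in the middle of the file, and 0x001 is an ordinary allocated extent at the end.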

ext4: suppress verbose debugging information if malloc-debug is off · 4dd89fc6

Authored by Theodore Ts'o on Feb 27, 2011

If CONFIG_EXT4_DEBUG is enabled, then if a block allocation fails due
to disk being full, a verbose debugging message is printed, even if
the malloc-debug switch has not been enabled.  Suppress the debugging
message so that nothing is printed unless malloc-debug has been turned
on.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: don't leave PageWriteback set after memory failure · a54aa761

Authored by Theodore Ts'o on Feb 27, 2011

In ext4_bio_write_page(), if the memory allocation for the struct
ext4_io_page fails, it returns with the page's PageWriteback flag set.
This will end up causing the page not to skip writeback in
WB_SYNC_NONE mode, and in WB_SYNC_ALL mode (i.e., on a sync, fsync, or
umount) the writeback daemon will get stuck forever on the
wait_on_page_writeback() function in write_cache_pages_da().

Or, if journalling is enabled and the file gets deleted, the
journal thread can get stuck in the journal_finish_inode_data_buffers()
call to filemap_fdatawait().

Another place where things can get hung up is in
truncate_inode_pages(), called out of ext4_evict_inode().

Fix this by not setting PageWriteback until after we have successfully
allocated the struct ext4_io_page.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
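
The ordering fix can be modelled in a few lines: do everything that can fail before setting a flag that other code will wait on. This is a userspace sketch under stated assumptions, not the ext4 code. `struct wb_page`, `wb_submit_buggy()`, and `wb_submit_fixed()` are hypothetical, `alloc_ok == false` simulates the failed allocation, and the I/O is treated as completing immediately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct wb_page {
    bool writeback; /* stand-in for the PageWriteback flag */
};

/* Old ordering: the flag is set before the allocation that may fail. */
int wb_submit_buggy(struct wb_page *page, bool alloc_ok)
{
    assert(page != NULL);
    page->writeback = true;
    void *io = alloc_ok ? malloc(32) : NULL;
    if (!io)
        return -1;           /* returns with the flag still set */
    free(io);                /* pretend the I/O completed */
    page->writeback = false;
    return 0;
}

/* Fixed ordering: allocate first, set the flag only on success. */
int wb_submit_fixed(struct wb_page *page, bool alloc_ok)
{
    assert(page != NULL);
    void *io = alloc_ok ? malloc(32) : NULL;
    if (!io)
        return -1;           /* flag never set, so nobody can wait on it */
    page->writeback = true;
    free(io);                /* pretend the I/O completed */
    page->writeback = false;
    return 0;
}
```

After a failed `wb_submit_buggy()` call the flag remains set with no I/O in flight to clear it, which is the state that leaves waiters stuck. After a failed `wb_submit_fixed()` call the flag is clear.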

27 Feb, 2011 3 commits

ext4: move setup of the mpd structure to write_cache_pages_da() · 168fc022

Authored by Theodore Ts'o on Feb 26, 2011

Move the initialization of all of the fields of the mpd structure to
write_cache_pages_da().  This simplifies the code considerably.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: don't lock the next page in write_cache_pages if not needed · 78aaced3

Authored by Theodore Ts'o on Feb 26, 2011

If we have accumulated a contiguous region of memory to be written
out, and the next page cannot be added to this region, don't bother
locking (and then unlocking) the page before writing out the memory.
In the unlikely event that the next page was being written back by some
other CPU, we can also skip waiting for that page to finish writeback.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: remove page_skipped hackery in ext4_da_writepages() · ee6ecbcc

Authored by Theodore Ts'o on Feb 26, 2011

Because the ext4 page writeback codepath had been prematurely calling
clear_page_dirty_for_io(), if it turned out that a particular page
couldn't be written out during a particular pass of
write_cache_pages_da(), the page would have to get redirtied by
calling redirty_page_for_writepage().  Not only was this wasted work,
but redirty_page_for_writepage() would increment wbc->pages_skipped to
signal to writeback_sb_inodes() that buffers were locked, and that it
should skip this inode until later.

Since this signal was incorrect in ext4's case --- which was caused by
ext4's historically incorrect use of write_cache_pages() ---
ext4_da_writepages() saved and restored wbc->pages_skipped to avoid
confusing writeback_sb_inodes().

Now that we've fixed ext4 to call clear_page_dirty_for_io() right
before initiating the page I/O, we can nuke the page_skipped
save/restore hackery, and breathe a sigh of relief.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
