提交 · 4acdaf27ebe2034c342f3be57ef49aed1ad885ef · OpenHarmony / kernel_linux

04 1月, 2012 5 次提交

由 Al Viro 提交于 7月 26, 2011

vfs_create() ignores everything outside of 16bit subset of its
mode argument; switching it to umode_t is obviously equivalent
and it's the only caller of the method
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

4acdaf27

switch vfs_mkdir() and ->mkdir() to umode_t · 18bb1db3

由 Al Viro 提交于 7月 26, 2011

vfs_mkdir() gets int, but immediately drops everything that might not
fit into umode_t and that's the only caller of ->mkdir()...
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

18bb1db3

vfs: fix the stupidity with i_dentry in inode destructors · 6b520e05

由 Al Viro 提交于 12月 12, 2011

Seeing that just about every destructor got that INIT_LIST_HEAD() copied into
it, there is no point whatsoever keeping this INIT_LIST_HEAD in inode_init_once();
the cost of taking it into inode_init_always() will be negligible for pipes
and sockets and negative for everything else. Not to mention the removal of
boilerplate code from ->destroy_inode() instances...
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

6b520e05

vfs: mnt_drop_write_file() · 2a79f17e

由 Al Viro 提交于 12月 09, 2011

new helper (wrapper around mnt_drop_write()) to be used in pair with
mnt_want_write_file().
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

2a79f17e

switch a bunch of places to mnt_want_write_file() · a561be71

由 Al Viro 提交于 11月 23, 2011

it's both faster (in case when file has been opened for write) and cleaner.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

a561be71

14 12月, 2011 5 次提交

ext4: handle EOF correctly in ext4_bio_write_page() · 5a0dc736

由 Yongqiang Yang 提交于 12月 13, 2011

We need to zero out part of a page which beyond EOF before setting uptodate,
otherwise, mapread or write will see non-zero data beyond EOF.
Signed-off-by: NYongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

5a0dc736

ext4: remove a wrong BUG_ON in ext4_ext_convert_to_initialized · 5b5ffa49

由 Yongqiang Yang 提交于 12月 13, 2011

If a file is fallocated on a hole, map->m_lblk + map->m_len may be greater
than ee_block + ee_len.
Signed-off-by: NYongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

5b5ffa49

ext4: correctly handle pages w/o buffers in ext4_discard_partial_buffers() · 093e6e36

由 Yongqiang Yang 提交于 12月 13, 2011

If a page has been read into memory and never been written, it has no
buffers, but we should handle the page in truncate or punch hole.

VFS code of writing operations has handled holes correctly, so this
patch removes the code handling holes in writing operations.
Signed-off-by: NYongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

093e6e36

ext4: avoid potential hang in mpage_submit_io() when blocksize < pagesize · 13a79a47

由 Yongqiang Yang 提交于 12月 13, 2011

If there is an unwritten but clean buffer in a page and there is a
dirty buffer after the buffer, then mpage_submit_io does not write the
dirty buffer out.  As a result, da_writepages loops forever.

This patch fixes the problem by checking dirty flag.
Signed-off-by: NYongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

13a79a47

ext4: avoid hangs in ext4_da_should_update_i_disksize() · ea51d132

由 Andrea Arcangeli 提交于 12月 13, 2011

If the pte mapping in generic_perform_write() is unmapped between
iov_iter_fault_in_readable() and iov_iter_copy_from_user_atomic(), the
"copied" parameter to ->end_write can be zero. ext4 couldn't cope with
it with delayed allocations enabled. This skips the i_disksize
enlargement logic if copied is zero and no new data was appeneded to
the inode.

 gdb> bt
 #0  0xffffffff811afe80 in ext4_da_should_update_i_disksize (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x1\
 08000, len=0x1000, copied=0x0, page=0xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2467
 #1  ext4_da_write_end (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x108000, len=0x1000, copied=0x0, page=0\
 xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2512
 #2  0xffffffff810d97f1 in generic_perform_write (iocb=<value optimized out>, iov=<value optimized out>, nr_segs=<value o\
 ptimized out>, pos=0x108000, ppos=0xffff88001e26be40, count=<value optimized out>, written=0x0) at mm/filemap.c:2440
 #3  generic_file_buffered_write (iocb=<value optimized out>, iov=<value optimized out>, nr_segs=<value optimized out>, p\
 os=0x108000, ppos=0xffff88001e26be40, count=<value optimized out>, written=0x0) at mm/filemap.c:2482
 #4  0xffffffff810db5d1 in __generic_file_aio_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=0x1, ppos=0\
 xffff88001e26be40) at mm/filemap.c:2600
 #5  0xffffffff810db853 in generic_file_aio_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=<value optimi\
 zed out>, pos=<value optimized out>) at mm/filemap.c:2632
 #6  0xffffffff811a71aa in ext4_file_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=0x1, pos=0x108000) a\
 t fs/ext4/file.c:136
 #7  0xffffffff811375aa in do_sync_write (filp=0xffff88003f606a80, buf=<value optimized out>, len=<value optimized out>, \
 ppos=0xffff88001e26bf48) at fs/read_write.c:406
 #8  0xffffffff81137e56 in vfs_write (file=0xffff88003f606a80, buf=0x1ec2960 <Address 0x1ec2960 out of bounds>, count=0x4\
 000, pos=0xffff88001e26bf48) at fs/read_write.c:435
 #9  0xffffffff8113816c in sys_write (fd=<value optimized out>, buf=0x1ec2960 <Address 0x1ec2960 out of bounds>, count=0x\
 4000) at fs/read_write.c:487
 #10 <signal handler called>
 #11 0x00007f120077a390 in __brk_reservation_fn_dmi_alloc__ ()
 #12 0x0000000000000000 in ?? ()
 gdb> print offset
 $22 = 0xffffffffffffffff
 gdb> print idx
 $23 = 0xffffffff
 gdb> print inode->i_blkbits
 $24 = 0xc
 gdb> up
 #1  ext4_da_write_end (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x108000, len=0x1000, copied=0x0, page=0\
 xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2512
 2512                    if (ext4_da_should_update_i_disksize(page, end)) {
 gdb> print start
 $25 = 0x0
 gdb> print end
 $26 = 0xffffffffffffffff
 gdb> print pos
 $27 = 0x108000
 gdb> print new_i_size
 $28 = 0x108000
 gdb> print ((struct ext4_inode_info *)((char *)inode-((int)(&((struct ext4_inode_info *)0)->vfs_inode))))->i_disksize
 $29 = 0xd9000
 gdb> down
 2467            for (i = 0; i < idx; i++)
 gdb> print i
 $30 = 0xd44acbee

This is 100% reproducible with some autonuma development code tuned in
a very aggressive manner (not normal way even for knumad) which does
"exotic" changes to the ptes. It wouldn't normally trigger but I don't
see why it can't happen normally if the page is added to swap cache in
between the two faults leading to "copied" being zero (which then
hangs in ext4). So it should be fixed. Especially possible with lumpy
reclaim (albeit disabled if compaction is enabled) as that would
ignore the young bits in the ptes.
Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

ea51d132

13 12月, 2011 2 次提交

ext4: display the correct mount option in /proc/mounts for [no]init_itable · fc6cb1cd

由 Theodore Ts'o 提交于 12月 12, 2011

/proc/mounts was showing the mount option [no]init_inode_table when
the correct mount option that will be accepted by parse_options() is
[no]init_itable.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

fc6cb1cd

ext4: Fix crash due to getting bogus eh_depth value on big-endian systems · b4611abf

由 Paul Mackerras 提交于 12月 12, 2011

Commit 1939dd84 ("ext4: cleanup ext4_ext_grow_indepth code") added a
reference to ext4_extent_header.eh_depth, but forget to pass the value
read through le16_to_cpu.  The result is a crash on big-endian
machines, such as this crash on a POWER7 server:

attempt to access beyond end of device
sda8: rw=0, want=776392648163376, limit=168558560
Unable to handle kernel paging request for data at address 0x6b6b6b6b6b6b6bcb
Faulting instruction address: 0xc0000000001f5f38
cpu 0x14: Vector: 300 (Data Access) at [c000001bd1aaecf0]
    pc: c0000000001f5f38: .__brelse+0x18/0x60
    lr: c0000000002e07a4: .ext4_ext_drop_refs+0x44/0x80
    sp: c000001bd1aaef70
   msr: 9000000000009032
   dar: 6b6b6b6b6b6b6bcb
 dsisr: 40000000
  current = 0xc000001bd15b8010
  paca    = 0xc00000000ffe4600
    pid   = 19911, comm = flush-8:0
enter ? for help
[c000001bd1aaeff0] c0000000002e07a4 .ext4_ext_drop_refs+0x44/0x80
[c000001bd1aaf090] c0000000002e0c58 .ext4_ext_find_extent+0x408/0x4c0
[c000001bd1aaf180] c0000000002e145c .ext4_ext_insert_extent+0x2bc/0x14c0
[c000001bd1aaf2c0] c0000000002e3fb8 .ext4_ext_map_blocks+0x628/0x1710
[c000001bd1aaf420] c0000000002b2974 .ext4_map_blocks+0x224/0x310
[c000001bd1aaf4d0] c0000000002b7f2c .mpage_da_map_and_submit+0xbc/0x490
[c000001bd1aaf5a0] c0000000002b8688 .write_cache_pages_da+0x2c8/0x430
[c000001bd1aaf720] c0000000002b8b28 .ext4_da_writepages+0x338/0x670
[c000001bd1aaf8d0] c000000000157280 .do_writepages+0x40/0x90
[c000001bd1aaf940] c0000000001ea830 .writeback_single_inode+0xe0/0x530
[c000001bd1aafa00] c0000000001eb680 .writeback_sb_inodes+0x210/0x300
[c000001bd1aafb20] c0000000001ebc84 .__writeback_inodes_wb+0xd4/0x140
[c000001bd1aafbe0] c0000000001ebfec .wb_writeback+0x2fc/0x3e0
[c000001bd1aafce0] c0000000001ed770 .wb_do_writeback+0x2f0/0x300
[c000001bd1aafdf0] c0000000001ed848 .bdi_writeback_thread+0xc8/0x340
[c000001bd1aafed0] c0000000000c5494 .kthread+0xb4/0xc0
[c000001bd1aaff90] c000000000021f48 .kernel_thread+0x54/0x70

This is due to getting ext_depth(inode) == 0x101 and therefore running
off the end of the path array in ext4_ext_drop_refs into following
unallocated structures.

This fixes it by adding the necessary le16_to_cpu.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

b4611abf

12 12月, 2011 1 次提交

ext4: fix ext4_end_io_dio() racing against fsync() · b5a7e970

由 Theodore Ts'o 提交于 12月 12, 2011

We need to make sure iocb->private is cleared *before* we put the
io_end structure on i_completed_io_list.  Otherwise fsync() could
potentially run on another CPU and free the iocb structure out from
under us.
Reported-by: NKent Overstreet <koverstreet@google.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

b5a7e970

25 11月, 2011 1 次提交

ext4: fix racy use-after-free in ext4_end_io_dio() · 4c81f045

由 Tejun Heo 提交于 11月 24, 2011

ext4_end_io_dio() queues io_end->work and then clears iocb->private;
however, io_end->work calls aio_complete() which frees the iocb
object.  If that slab object gets reallocated, then ext4_end_io_dio()
can end up clearing someone else's iocb->private, this use-after-free
can cause a leak of a struct ext4_io_end_t structure.

Detected and tested with slab poisoning.

[ Note: Can also reproduce using 12 fio's against 12 file systems with the
  following configuration file:

  [global]
  direct=1
  ioengine=libaio
  iodepth=1
  bs=4k
  ba=4k
  size=128m

  [create]
  filename=${TESTDIR}
  rw=write

  -- tytso ]

Google-Bug-Id: 5354697
Signed-off-by: NTejun Heo <tj@kernel.org>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Reported-by: NKent Overstreet <koverstreet@google.com>
Tested-by: NKent Overstreet <koverstreet@google.com>
Cc: stable@kernel.org

4c81f045

22 11月, 2011 1 次提交

ext4: fix up a undefined error in ext4_free_blocks in debugging code · 6e58ad69

由 Yongqiang Yang 提交于 11月 21, 2011

sbi is not defined, so let ext4_free_blocks use EXT4_SB(sb) instead
when EXT4FS_DEBUG is defined.
Signed-off-by: NYongqiang Yang <xiaoqiangnk@gmail.com>

6e58ad69

08 11月, 2011 1 次提交

ext4: add blk_finish_plug in error case of writepages. · 3c1fcb2c

由 Namjae Jeon 提交于 11月 07, 2011

blk_finish_plug is needed in error case of writepages.
Signed-off-by: NNamjae Jeon <linkinjeon@gmail.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

3c1fcb2c

07 11月, 2011 2 次提交

ext4: Remove kernel_lock annotations · 2397256d

由 Richard Weinberger 提交于 11月 07, 2011

The BKL is gone, these annotations are useless.
Signed-off-by: NRichard Weinberger <richard@nod.at>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

2397256d

ext4: ignore journalled data options on remount if fs has no journal · eb513689

由 Theodore Ts'o 提交于 11月 07, 2011

This avoids a confusing failure in the init scripts when the
/etc/fstab has data=writeback or data=journal but the file system does
not have a journal.  So check for this case explicitly, and warn the
user that we are ignoring the (pointless, since they have no journal)
data=* mount option.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

eb513689

02 11月, 2011 4 次提交

filesystems: add set_nlink() · bfe86848

由 Miklos Szeredi 提交于 10月 28, 2011

Replace remaining direct i_nlink updates with a new set_nlink()
updater function.
Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
Tested-by: NToshiyuki Okajima <toshi.okajima@jp.fujitsu.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

bfe86848

filesystems: add missing nlink wrappers · 6d6b77f1

由 Miklos Szeredi 提交于 10月 28, 2011

Replace direct i_nlink updates with the respective updater function
(inc_nlink, drop_nlink, clear_nlink, inode_dec_link_count).
Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>

6d6b77f1

ext4: let ext4_ext_rm_leaf work with EXT_DEBUG defined · bf52c6f7

由 Yongqiang Yang 提交于 11月 01, 2011

The variable 'block' is removed by commit 750c9c47, so use the
replacement ex_ee_block instead.
Signed-off-by: NYongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

bf52c6f7

ext4: fix a syntax error in ext4_ext_insert_extent when debugging enabled · 32de6756

由 Yongqiang Yang 提交于 11月 01, 2011

This patch fixes a syntax error which omits a comma. Besides this,
logical block number is unsigend 32 bits, so printk should use %u
instead %d.
Signed-off-by: NYongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

32de6756

01 11月, 2011 9 次提交

treewide: use __printf not __attribute__((format(printf,...))) · b9075fa9

由 Joe Perches 提交于 10月 31, 2011

Standardize the style for compiler based printf format verification.
Standardized the location of __printf too.

Done via script and a little typing.

$ grep -rPl --include=*.[ch] -w "__attribute__" * | \
  grep -vP "^(tools|scripts|include/linux/compiler-gcc.h)" | \
  xargs perl -n -i -e 'local $/; while (<>) { s/\b__attribute__\s*\(\s*\(\s*format\s*\(\s*printf\s*,\s*(.+)\s*,\s*(.+)\s*\)\s*\)\s*\)/__printf($1, $2)/g ; print; }'

[akpm@linux-foundation.org: revert arch bits]
Signed-off-by: NJoe Perches <joe@perches.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b9075fa9

ext4: warn if direct reclaim tries to writeback pages · 966dbde2

由 Mel Gorman 提交于 10月 31, 2011

Direct reclaim should never writeback pages.  Warn if an attempt is made.
Signed-off-by: NMel Gorman <mgorman@suse.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Alex Elder <aelder@sgi.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

966dbde2

ext4: fix a typo in struct ext4_allocation_context · ff3fc173

由 Robin Dong 提交于 10月 31, 2011

This patch changes "bext" to "best".
Signed-off-by: NRobin Dong <sanbai@taobao.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

ff3fc173

ext4: Don't normalize an falloc request if it can fit in 1 extent. · 3c6fe770

由 Greg Harm 提交于 10月 31, 2011

If an fallocate request fits in EXT_UNINIT_MAX_LEN, then set the
EXT4_GET_BLOCKS_NO_NORMALIZE flag. For larger fallocate requests,
let mballoc.c normalize the request.

This fixes a problem where large requests were being split into
non-contiguous extents due to commit 556b27ab: ext4: do not
normalize block requests from fallocate.

Testing: 
*) Checked that 8.x MB falloc'ed files are still laid down next to
each other (contiguously).
*) Checked that the maximum size extent (127.9MB) is allocated as 1
extent.
*) Checked that a 1GB file is somewhat contiguous (often 5-6
non-contiguous extents now).
*) Checked that a 120MB file can still be falloc'ed even if there are
no single extents large enough to hold it.
Signed-off-by: NGreg Harm <gharm@google.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

3c6fe770

ext4: remove comments about extent mount option in ext4_new_inode() · 4af83508

由 Eryu Guan 提交于 10月 31, 2011

Remove comments about 'extent' mount option in ext4_new_inode(), since
it's no longer exists.
Signed-off-by: NEryu Guan <guaneryu@gmail.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

4af83508

ext4: let ext4_discard_partial_buffers handle unaligned range correctly · edb5ac89

由 Yongqiang Yang 提交于 10月 31, 2011

As comment says, we should handle unaligned range rather than aligned
one.  This fixes a bug found by running xfstests #91.
Signed-off-by: NYongqiang Yang <xiaoqiangnk@gmail.com>

edb5ac89

Y
ext4: return ENOMEM if find_or_create_pages fails · 5129d05f
由 Yongqiang Yang 提交于 10月 31, 2011
```
Signed-off-by: NYongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
```
5129d05f
Y
ext4: move vars to local scope in ext4_discard_partial_page_buffers_no_lock() · e260daf2
由 Yongqiang Yang 提交于 10月 31, 2011
```
Signed-off-by: NYongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
```
e260daf2

ext4: Create helper function for EXT4_IO_END_UNWRITTEN and i_aiodio_unwritten · 0edeb71d

由 Tao Ma 提交于 10月 31, 2011

EXT4_IO_END_UNWRITTEN flag set and the increase of i_aiodio_unwritten
should be done simultaneously since ext4_end_io_nolock always clear
the flag and decrease the counter in the same time.

We have found some bugs that the flag is set while leaving
i_aiodio_unwritten unchanged(commit 32c80b32). So this patch just tries
to create a helper function to wrap them to avoid any future bug.
The idea is inspired by Eric.

Cc: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: NTao Ma <boyu.mt@taobao.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

0edeb71d

31 10月, 2011 4 次提交

ext4: optimize locking for end_io extent conversion · b82e384c

由 Theodore Ts'o 提交于 10月 31, 2011

Now that we are doing the locking correctly, we need to grab the
i_completed_io_lock() twice per end_io.  We can clean this up by
removing the structure from the i_complted_io_list, and use this as
the locking mechanism to prevent ext4_flush_completed_IO() racing
against ext4_end_io_work(), instead of clearing the
EXT4_IO_END_UNWRITTEN in io->flag.

In addition, if the ext4_convert_unwritten_extents() returns an error,
we no longer keep the end_io structure on the linked list.  This
doesn't help, because it tends to lock up the file system and wedges
the system.  That's one way to call attention to the problem, but it
doesn't help the overall robustness of the system.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

b82e384c

ext4: remove unnecessary call to waitqueue_active() · 4e298021

由 Theodore Ts'o 提交于 10月 30, 2011

The usage of waitqueue_active() is not necessary, and introduces (I
believe) a hard-to-hit race.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

4e298021

ext4: Use correct locking for ext4_end_io_nolock() · d73d5046

由 Tao Ma 提交于 10月 30, 2011

We must hold i_completed_io_lock when manipulating anything on the
i_completed_io_list linked list.  This includes io->lock, which we
were checking in ext4_end_io_nolock().

So move this check to ext4_end_io_work().  This also has the bonus of
avoiding extra work if it is already done without needing to take the
mutex.
Signed-off-by: NTao Ma <boyu.mt@taobao.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

d73d5046

writeback: Add a 'reason' to wb_writeback_work · 0e175a18

由 Curt Wohlgemuth 提交于 10月 07, 2011

This creates a new 'reason' field in a wb_writeback_work
structure, which unambiguously identifies who initiates
writeback activity.  A 'wb_reason' enumeration has been
added to writeback.h, to enumerate the possible reasons.

The 'writeback_work_class' and tracepoint event class and
'writeback_queue_io' tracepoints are updated to include the
symbolic 'reason' in all trace events.

And the 'writeback_inodes_sbXXX' family of routines has had
a wb_stats parameter added to them, so callers can specify
why writeback is being started.
Acked-by: NJan Kara <jack@suse.cz>
Signed-off-by: NCurt Wohlgemuth <curtw@google.com>
Signed-off-by: NWu Fengguang <fengguang.wu@intel.com>

0e175a18

29 10月, 2011 5 次提交

ext4: fix race in xattr block allocation path · 6d6a4351

由 Eric Sandeen 提交于 10月 29, 2011

Ceph users reported that when using Ceph on ext4, the filesystem
would often become corrupted, containing inodes with incorrect
i_blocks counters.

I managed to reproduce this with a very hacked-up "streamtest"
binary from the Ceph tree.

Ceph is doing a lot of xattr writes, to out-of-inode blocks.
There is also another thread which does sync_file_range and close,
of the same files.  The problem appears to happen due to this race:

sync/flush thread               xattr-set thread
-----------------               ----------------

do_writepages                   ext4_xattr_set
ext4_da_writepages              ext4_xattr_set_handle
mpage_da_map_blocks             ext4_xattr_block_set
        set DELALLOC_RESERVE
                                ext4_new_meta_blocks
                                        ext4_mb_new_blocks
                                                if (!i_delalloc_reserved_flag)
                                                        vfs_dq_alloc_block
ext4_get_blocks
	down_write(i_data_sem)
        set i_delalloc_reserved_flag
	...
	up_write(i_data_sem)
                                        if (i_delalloc_reserved_flag)
                                                vfs_dq_alloc_block_nofail


In other words, the sync/flush thread pops in and sets
i_delalloc_reserved_flag on the inode, which makes the xattr thread
think that it's in a delalloc path in ext4_new_meta_blocks(),
and add the block for a second time, after already having added
it once in the !i_delalloc_reserved_flag case in ext4_mb_new_blocks

The real problem is that we shouldn't be using the DELALLOC_RESERVED
state flag, and instead we should be passing
EXT4_GET_BLOCKS_DELALLOC_RESERVE down to ext4_map_blocks() instead of
using an inode state flag.  We'll fix this for now with using
i_data_sem to prevent this race, but this is really not the right way
to fix things.
Signed-off-by: NEric Sandeen <sandeen@redhat.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

6d6a4351

ext4: trace punch_hole correctly in ext4_ext_map_blocks · e7b319e3

由 Yongqiang Yang 提交于 10月 29, 2011

When ext4_ext_map_blocks() is called by punch_hole, trace should
trace blocks punched out.
Signed-off-by: NYongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

e7b319e3

ext4: clean up AGGRESSIVE_TEST code · 02dc62fb

由 Yongqiang Yang 提交于 10月 29, 2011

Signed-off-by: NYongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

02dc62fb

ext4: move variables to their scope · 81fdbb4a

由 Yongqiang Yang 提交于 10月 29, 2011

Signed-off-by: NYongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

81fdbb4a

ext4: fix quota accounting during migration · 5cb81dab

由 Dmitry Monakhov 提交于 10月 29, 2011

The tmp_inode should have same uid/gid as the original inode.
Otherwise new metadata blocks will be accounted to wrong quota-id,
which will result in a quota leak after the inode migration is
completed.
Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

5cb81dab

OpenHarmony / kernel_linux 上一次同步 3 年多

OpenHarmony / kernel_linux
上一次同步 3 年多