提交 · a75ae78f087f933ab3432e98bb4dbbf2196cf6d5 · openeuler / raspberrypi-kernel

04 4月, 2013 4 次提交

ext4: support simple conversion of extent-mapped inodes to use i_blocks · 996bb9fd

由 Theodore Ts'o 提交于 4月 03, 2013

In order to make it simpler to test the code which support
i_blocks/indirect-mapped inodes, support the conversion of inodes
which are less than 12 blocks and which are contained in no more than
a single extent.

The primary intended use of this code is to converting freshly created
zero-length files and empty directories.

Note that the version of chattr in e2fsprogs 1.42.7 and earlier has a
check that prevents the clearing of the extent flag.  A simple patch
which allows "chattr -e <file>" to work will be checked into the
e2fsprogs git repository.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

996bb9fd

ext4: refactor truncate code · 819c4920

由 Theodore Ts'o 提交于 4月 03, 2013

Move common code in ext4_ind_truncate() and ext4_ext_truncate() into
ext4_truncate().  This saves over 60 lines of code.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

819c4920

ext4: refactor punch hole code · 26a4c0c6

由 Theodore Ts'o 提交于 4月 03, 2013

Move common code in ext4_ind_punch_hole() and ext4_ext_punch_hole()
into ext4_punch_hole(). This saves over 150 lines of code.

This also fixes a potential bug when the punch_hole() code is racing
against indirect-to-extents or extents-to-indirect migation. We are
currently using i_mutex to protect against changes to the inode flag;
specifically, the append-only, immutable, and extents inode flags. So
we need to take i_mutex before deciding whether to use the
extents-specific or indirect-specific punch_hole code.

Also, there was a missing call to ext4_inode_block_unlocked_dio() in
the indirect punch codepath. This was added in commit 02d262df
to block DIO readers racing against the punch operation in the
codepath for extent-mapped inodes, but it was missing for
indirect-block mapped inodes. One of the advantages of refactoring
the code is that it makes such oversights much less likely.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

26a4c0c6

ext4: collapse handling of data=ordered and data=writeback codepaths · 74d553aa

由 Theodore Ts'o 提交于 4月 03, 2013

The only difference between how we handle data=ordered and
data=writeback is a single call to ext4_jbd2_file_inode().  Eliminate
code duplication by factoring out redundant the code paths.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: NLukas Czerner <lczerner@redhat.com>

74d553aa

20 3月, 2013 1 次提交

ext4: fix ext4_evict_inode() racing against workqueue processing code · 1ada47d9

由 Theodore Ts'o 提交于 3月 20, 2013

Commit 84c17543 (ext4: move work from io_end to inode) triggered a
regression when running xfstest #270 when the file system is mounted
with dioread_nolock.

The problem is that after ext4_evict_inode() calls ext4_ioend_wait(),
this guarantees that last io_end structure has been freed, but it does
not guarantee that the workqueue structure, which was moved into the
inode by commit 84c17543, is actually finished.  Once
ext4_flush_completed_IO() calls ext4_free_io_end() on CPU #1, this
will allow ext4_ioend_wait() to return on CPU #2, at which point the
evict_inode() codepath can race against the workqueue code on CPU #1
accessing EXT4_I(inode)->i_unwritten_work to find the next item of
work to do.

Fix this by calling cancel_work_sync() in ext4_ioend_wait(), which
will be renamed ext4_ioend_shutdown(), since it is only used by
ext4_evict_inode().  Also, move the call to ext4_ioend_shutdown()
until after truncate_inode_pages() and filemap_write_and_wait() are
called, to make sure all dirty pages have been written back and
flushed from the page cache first.

BUG: unable to handle kernel NULL pointer dereference at   (null)
IP: [<c01dda6a>] cwq_activate_delayed_work+0x3b/0x7e
*pdpt = 0000000030bc3001 *pde = 0000000000000000 
Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
Modules linked in:
Pid: 6, comm: kworker/u:0 Not tainted 3.8.0-rc3-00013-g84c17543-dirty #91 Bochs Bochs
EIP: 0060:[<c01dda6a>] EFLAGS: 00010046 CPU: 0
EIP is at cwq_activate_delayed_work+0x3b/0x7e
EAX: 00000000 EBX: 00000000 ECX: f505fe54 EDX: 00000000
ESI: ed5b697c EDI: 00000006 EBP: f64b7e8c ESP: f64b7e84
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: 00000000 CR3: 30bc2000 CR4: 000006f0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
Process kworker/u:0 (pid: 6, ti=f64b6000 task=f64b4160 task.ti=f64b6000)
Stack:
 f505fe00 00000006 f64b7e9c c01de3d7 f6435540 00000003 f64b7efc c01def1d
 f6435540 00000002 00000000 0000008a c16d0808 c040a10b c16d07d8 c16d08b0
 f505fe00 c16d0780 00000000 00000000 ee153df4 c1ce4a30 c17d0e30 00000000
Call Trace:
 [<c01de3d7>] cwq_dec_nr_in_flight+0x71/0xfb
 [<c01def1d>] process_one_work+0x5d8/0x637
 [<c040a10b>] ? ext4_end_bio+0x300/0x300
 [<c01e3105>] worker_thread+0x249/0x3ef
 [<c01ea317>] kthread+0xd8/0xeb
 [<c01e2ebc>] ? manage_workers+0x4bb/0x4bb
 [<c023a370>] ? trace_hardirqs_on+0x27/0x37
 [<c0f1b4b7>] ret_from_kernel_thread+0x1b/0x28
 [<c01ea23f>] ? __init_kthread_worker+0x71/0x71
Code: 01 83 15 ac ff 6c c1 00 31 db 89 c6 8b 00 a8 04 74 12 89 c3 30 db 83 05 b0 ff 6c c1 01 83 15 b4 ff 6c c1 00 89 f0 e8 42 ff ff ff <8b> 13 89 f0 83 05 b8 ff 6c c1
 6c c1 00 31 c9 83
EIP: [<c01dda6a>] cwq_activate_delayed_work+0x3b/0x7e SS:ESP 0068:f64b7e84
CR2: 0000000000000000
---[ end trace a1923229da53d8a4 ]---
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>

1ada47d9

12 3月, 2013 1 次提交

ext4: use atomic64_t for the per-flexbg free_clusters count · 90ba983f

由 Theodore Ts'o 提交于 3月 11, 2013

A user who was using a 8TB+ file system and with a very large flexbg
size (> 65536) could cause the atomic_t used in the struct flex_groups
to overflow.  This was detected by PaX security patchset:

http://forums.grsecurity.net/viewtopic.php?f=3&t=3289&p=12551#p12551

This bug was introduced in commit 9f24e420, so it's been around
since 2.6.30.  :-(

Fix this by using an atomic64_t for struct orlav_stats's
free_clusters.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: NLukas Czerner <lczerner@redhat.com>
Cc: stable@vger.kernel.org

90ba983f

02 3月, 2013 1 次提交

ext4: use percpu counter for extent cache count · 1ac6466f

由 Theodore Ts'o 提交于 3月 02, 2013

Use a percpu counter rather than atomic types for shrinker accounting.
There's no need for ultimate accuracy in the shrinker, so this
should come a little more cheaply.  The percpu struct is somewhat
large, but there was a big gap before the cache-aligned
s_es_lru_lock anyway, and it fits nicely in there.
Signed-off-by: NEric Sandeen <sandeen@redhat.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

1ac6466f

01 3月, 2013 1 次提交

ext4: optimize ext4_es_shrink() · 24630774

由 Theodore Ts'o 提交于 2月 28, 2013

When the system is under memory pressure, ext4_es_srhink() will get
called very often.  So optimize returning the number of items in the
file system's extent status cache by keeping a per-filesystem count,
instead of calculating it each time by scanning all of the inodes in
the extent status cache.

Also rename the slab used for the extent status cache to be
"ext4_extent_status" so it's obviousl the slab in question is created
by ext4.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Cc: Zheng Liu <gnehzuil.liu@gmail.com>

24630774

18 2月, 2013 4 次提交

ext4: reclaim extents from extent status tree · 74cd15cd

由 Zheng Liu 提交于 2月 18, 2013

Although extent status is loaded on-demand, we also need to reclaim
extent from the tree when we are under a heavy memory pressure because
in some cases fragmented extent tree causes status tree costs too much
memory.

Here we maintain a lru list in super_block.  When the extent status of
an inode is accessed and changed, this inode will be move to the tail
of the list.  The inode will be dropped from this list when it is
cleared.  In the inode, a counter is added to count the number of
cached objects in extent status tree.  Here only written/unwritten/hole
extent is counted because delayed extent doesn't be reclaimed due to
fiemap, bigalloc and seek_data/hole need it.  The counter will be
increased as a new extent is allocated, and it will be decreased as a
extent is freed.

In this commit we use normal shrinker framework to reclaim memory from
the status tree.  ext4_es_reclaim_extents_count() traverses the lru list
to count the number of reclaimable extents.  ext4_es_shrink() tries to
reclaim written/unwritten/hole extents from extent status tree.  The
inode that has been shrunk is moved to the tail of lru list.
Signed-off-by: NZheng Liu <wenqing.lz@taobao.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Cc: Jan kara <jack@suse.cz>

74cd15cd

ext4: remove single extent cache · 69eb33dc

由 Zheng Liu 提交于 2月 18, 2013

Single extent cache could be removed because we have extent status tree
as a extent cache, and it would be better.
Signed-off-by: NZheng Liu <wenqing.lz@taobao.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Cc: Jan kara <jack@suse.cz>

69eb33dc

ext4: lookup block mapping in extent status tree · d100eef2

由 Zheng Liu 提交于 2月 18, 2013

After tracking all extent status, we already have a extent cache in
memory.  Every time we want to lookup a block mapping, we can first
try to lookup it in extent status tree to avoid a potential disk I/O.

A new function called ext4_es_lookup_extent is defined to finish this
work.  When we try to lookup a block mapping, we always call
ext4_map_blocks and/or ext4_da_map_blocks.  So in these functions we
first try to lookup a block mapping in extent status tree.

A new flag EXT4_GET_BLOCKS_NO_PUT_HOLE is used in ext4_da_map_blocks
in order not to put a hole into extent status tree because this hole
will be converted to delayed extent in the tree immediately.
Signed-off-by: NZheng Liu <wenqing.lz@taobao.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Cc: Jan kara <jack@suse.cz>

d100eef2

ext4: track all extent status in extent status tree · f7fec032

由 Zheng Liu 提交于 2月 18, 2013

By recording the phycisal block and status, extent status tree is able
to track the status of every extents.  When we call _map_blocks
functions to lookup an extent or create a new written/unwritten/delayed
extent, this extent will be inserted into extent status tree.

We don't load all extents from disk in alloc_inode() because it costs
too much memory, and if a file is opened and closed frequently it will
takes too much time to load all extent information.  So currently when
we create/lookup an extent, this extent will be inserted into extent
status tree.  Hence, the extent status tree may not comprehensively
contain all of the extents found in the file.

Here a condition we need to take care is that an extent might contains
unwritten and delayed status simultaneously because an extent is delayed
allocated and could be allocated by fallocate.  At this time we need to
keep delayed status because later we need to update delayed reservation
space using it.
Signed-off-by: NZheng Liu <wenqing.lz@taobao.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Cc: Jan kara <jack@suse.cz>

f7fec032

10 2月, 2013 2 次提交

ext4: start handle at the last possible moment when creating inodes · 1139575a

由 Theodore Ts'o 提交于 2月 09, 2013

In ext4_{create,mknod,mkdir,symlink}(), don't start the journal handle
until the inode has been succesfully allocated. In order to do this,
we need to start the handle in the ext4_new_inode(). So create a new
variant of this function, ext4_new_inode_start_handle(), so the handle
can be created at the last possible minute, before we need to modify
the inode allocation bitmap block.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

1139575a

ext4: fix the number of credits needed for acl ops with inline data · 95eaefbd

由 Theodore Ts'o 提交于 2月 09, 2013

Operations which modify extended attributes may need extra journal
credits if inline data is used, since there is a chance that some
extended attributes may need to get pushed to an external attribute
block.

Changes to reflect this was made in xattr.c, but they were missed in
fs/ext4/acl.c.  To fix this, abstract the calculation of the number of
credits needed for xattr operations to an inline function defined in
ext4_jbd2.h, and use it in acl.c and xattr.c.

Also move the function declarations used in inline.c from xattr.h
(where they are non-obviously hidden, and caused problems since
ext4_jbd2.h needs to use the function ext4_has_inline_data), and move
them to ext4.h.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: NTao Ma <boyu.mt@taobao.com>
Reviewed-by: NJan Kara <jack@suse.cz>

95eaefbd

09 2月, 2013 1 次提交

ext4: move the jbd2 wrapper functions out of super.c · 722887dd

由 Theodore Ts'o 提交于 2月 08, 2013

Move the jbd2 wrapper functions which start and stop handles out of
super.c, where they don't really logically belong, and into
ext4_jbd2.c.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

722887dd

28 1月, 2013 3 次提交

ext4: move work from io_end to inode · 84c17543

由 Jan Kara 提交于 1月 28, 2013

It does not make much sense to have struct work in ext4_io_end_t
because we always use it for only one ext4_io_end_t per inode (the
first one in the i_completed_io list). So just move the structure to
inode itself.  This also allows for a small simplification in
processing io_end structures.
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

84c17543

ext4: Always use ext4_bio_write_page() for writeout · 36ade451

由 Jan Kara 提交于 1月 28, 2013

Currently we sometimes used block_write_full_page() and sometimes
ext4_bio_write_page() for writeback (depending on mount options and call
path). Let's always use ext4_bio_write_page() to simplify things a bit.
Reviewed-by: NZheng Liu <wenqing.lz@taobao.com>
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

36ade451

ext4: add punching hole support for non-extent-mapped files · 8bad6fc8

由 Zheng Liu 提交于 1月 28, 2013

This patch add supports for indirect file support punching hole.  It
is almost the same as ext4_ext_punch_hole.  First, we invalidate all
pages between this hole, and then we try to deallocate all blocks of
this hole.

A recursive function is used to handle deallocation of blocks.  In
this function, it iterates over the entries in inode's i_blocks or
indirect blocks, and try to free the block for each one of them.

After applying this patch, xfstest #255 will not pass w/o extent because
indirect-based file doesn't support unwritten extents.
Signed-off-by: NZheng Liu <wenqing.lz@taobao.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

8bad6fc8

13 1月, 2013 1 次提交

ext4: trigger the lazy inode table initialization after resize · 7f511862

由 Theodore Ts'o 提交于 1月 13, 2013

After we have finished extending the file system, we need to trigger a
the lazy inode table thread to zero out the inode tables.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

7f511862

11 12月, 2012 13 次提交

ext4: ensure Inode flags consistency are checked at build time · 9a4c8019

由 Carlos Maiolino 提交于 12月 10, 2012


Flags being used by atomic operations in inode flags (e.g.
ext4_test_inode_flag(), should be consistent with that actually stored
in inodes, i.e.: EXT4_XXX_FL.

It ensures that this consistency is checked at build-time, not at
run-time.

Currently, the flags consistency are being checked at run-time, but,
there is no real reason to not do a build-time check instead of a
run-time check. The code is comparing macro defined values with enum
type variables, where both are constants, so, there is no problem in
comparing constants at build-time.

enum variables are treated as constants by the C compiler, according
to the C99 specs (see www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf 
sec. 6.2.5, item 16), so, there is no real problem in comparing an
enumeration type at build time
Signed-off-by: NCarlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

9a4c8019

ext4: Remove CONFIG_EXT4_FS_XATTR · 939da108

由 Tao Ma 提交于 12月 10, 2012

Ted has sent out a RFC about removing this feature. Eric and Jan
confirmed that both RedHat and SUSE enable this feature in all their
product.  David also said that "As far as I know, it's enabled in all
Android kernels that use ext4."  So it seems OK for us.

And what's more, as inline data depends its implementation on xattr,
and to be frank, I don't run any test again inline data enabled while
xattr disabled.  So I think we should add inline data and remove this
config option in the same release.

[ The savings if you disable CONFIG_EXT4_FS_XATTR is only 27k, which
  isn't much in the grand scheme of things.  Since no one seems to be
  testing this configuration except for some automated compile farms, on
  balance we are better removing this config option, and so that it is
  effectively always enabled. -- tytso ]

Cc: David Brown <davidb@codeaurora.org>
Cc: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: NJan Kara <jack@suse.cz>
Signed-off-by: NTao Ma <boyu.mt@taobao.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

939da108

ext4: enable ext4 inline support · f08225d1

由 Tao Ma 提交于 12月 10, 2012

Signed-off-by: NTao Ma <boyu.mt@taobao.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

f08225d1

ext4: make ext4_delete_entry generic · 05019a9e

由 Tao Ma 提交于 12月 10, 2012

Currently ext4_delete_entry() is used only for dir entry removing from
a dir block.  So let us create a new function
ext4_generic_delete_entry and this function takes a entry_buf and a
buf_size so that it can be used for inline data.
Signed-off-by: NTao Ma <boyu.mt@taobao.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

05019a9e

ext4: create a new function search_dir · 7335cd3b

由 Tao Ma 提交于 12月 10, 2012

search_dirblock is used to search a dir block, but the code is almost
the same for searching an inline dir.

So create a new fuction search_dir and let search_dirblock call it.
Signed-off-by: NTao Ma <boyu.mt@taobao.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

7335cd3b

ext4: let ext4_readdir handle inline data · 65d165d9

由 Tao Ma 提交于 12月 10, 2012

For "." and "..", we just call filldir by ourselves
instead of iterating the real dir entry.
Signed-off-by: NTao Ma <boyu.mt@taobao.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

65d165d9

ext4: let add_dir_entry handle inline data properly · 3c47d541

由 Tao Ma 提交于 12月 10, 2012

This patch let add_dir_entry handle the inline data case. So the
dir is initialized as inline dir first and then we can try to add
some files to it, when the inline space can't hold all the entries,
a dir block will be created and the dir entry will be moved to it.

Also for an inlined dir, "." and ".." are removed and we only use
4 bytes to store the parent inode number. These 2 entries will be
added when we convert an inline dir to a block-based one.

[ Folded in patch from Dan Carpenter to remove an unused variable. ]
Signed-off-by: NTao Ma <boyu.mt@taobao.com>
Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

3c47d541

ext4: create __ext4_insert_dentry for dir entry insertion · 978fef91

由 Tao Ma 提交于 12月 10, 2012

The old add_dirent_to_buf handles all the work related to the
work of adding dir entry to a dir block. Now we have inline data,
so create 2 new function __ext4_find_dest_de and __ext4_insert_dentry
that do the real work and let add_dirent_to_buf call them.
Signed-off-by: NTao Ma <boyu.mt@taobao.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

978fef91

ext4: refactor __ext4_check_dir_entry() to accept start and size · 226ba972

由 Tao Ma 提交于 12月 10, 2012

The __ext4_check_dir_entry() function() is used to check whether the
de is over the block boundary.  Now with inline data, it could be
within the block boundary while exceeds the inode size.  So check this
function to check the overflow more precisely.
Signed-off-by: NTao Ma <boyu.mt@taobao.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

226ba972

ext4: make ext4_init_dot_dotdot for inline dir usage · a774f9c2

由 Tao Ma 提交于 12月 10, 2012

Currently, the initialization of dot and dotdot are encapsulated in
ext4_mkdir and also bond with dir_block. So create a new function
named ext4_init_new_dir and the initialization is moved to
ext4_init_dot_dotdot. Now it will called either in the normal non-inline
case(rec_len of ".." will cover the whole block) or when we converting an
inline dir to a block(rec len of ".." will be the real length). The start
of the next entry is also returned for inline dir usage.
Signed-off-by: NTao Ma <boyu.mt@taobao.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

a774f9c2

ext4: add delalloc support for inline data · 9c3569b5

由 Tao Ma 提交于 12月 10, 2012

For delayed allocation mode, we write to inline data if the file
is small enough. And in case of we write to some offset larger
than the inline size, the 1st page is dirtied, so that
ext4_da_writepages can handle the conversion. When the 1st page
is initialized with blocks, the inline part is removed.
Signed-off-by: NTao Ma <boyu.mt@taobao.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

9c3569b5

ext4: add normal write support for inline data · f19d5870

由 Tao Ma 提交于 12月 10, 2012

For a normal write case (not journalled write, not delayed
allocation), we write to the inline if the file is small and convert
it to an extent based file when the write is larger than the max
inline size.
Signed-off-by: NTao Ma <boyu.mt@taobao.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

f19d5870

ext4: add the basic function for inline data support · 67cf5b09

由 Tao Ma 提交于 12月 10, 2012

Implement inline data with xattr.

Now we use "system.data" to store xattr, and the xattr will
be extended if the i_size is increased while we don't release
the space during truncate.
Signed-off-by: NTao Ma <boyu.mt@taobao.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

67cf5b09

29 11月, 2012 1 次提交

ext4: rationalize ext4_extents.h inclusion · 4a092d73

由 Theodore Ts'o 提交于 11月 28, 2012

Previously, ext4_extents.h was being included at the end of ext4.h,
which was bad for a number of reasons: (a) it was not being included
in the expected place, and (b) it caused the header to be included
multiple times.  There were #ifdef's to prevent this from causing any
problems, but it still was unnecessary.

By moving the function declarations that were in ext4_extents.h to
ext4.h, which is standard practice for where the function declarations
for the rest of ext4.h can be found, we can remove ext4_extents.h from
being included in ext4.h at all, and then we can only include
ext4_extents.h where it is needed in ext4's source files.

It should be possible to move a few more things into ext4.h, and
further reduce the number of source files that need to #include
ext4_extents.h, but that's a cleanup for another day.
Reported-by: NSachin Kamat <sachin.kamat@linaro.org>
Reported-by: NWei Yongjun <weiyj.lk@gmail.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

4a092d73

19 11月, 2012 1 次提交

Fix misspellings of "whether" in comments. · 48fc7f7e

由 Adam Buchbinder 提交于 9月 19, 2012

"Whether" is misspelled in various comments across the tree; this
fixes them. No code changes.
Signed-off-by: NAdam Buchbinder <adam.buchbinder@gmail.com>
Signed-off-by: NJiri Kosina <jkosina@suse.cz>

48fc7f7e

09 11月, 2012 2 次提交

ext4: reimplement ext4_find_delay_alloc_range on extent status tree · 7d1b1fbc

由 Zheng Liu 提交于 11月 08, 2012

Signed-off-by: NYongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: NAllison Henderson <achender@linux.vnet.ibm.com>
Signed-off-by: NZheng Liu <wenqing.lz@taobao.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

7d1b1fbc

ext4: add data structures for the extent status tree · c0677e6d

由 Zheng Liu 提交于 11月 08, 2012

This patch adds two structures that supports extent status tree, extent_status
and ext4_es_tree. Currently extent_status is used to track a delay extent for
an inode, which record the start block and the length of the delay extent.
ext4_es_tree is used to store all extent_status for an inode in memory.
Signed-off-by: NYongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: NAllison Henderson <achender@linux.vnet.ibm.com>
Signed-off-by: NZheng Liu <wenqing.lz@taobao.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

c0677e6d

22 10月, 2012 1 次提交

ext4: Checksum the block bitmap properly with bigalloc enabled · 79f1ba49

由 Tao Ma 提交于 10月 22, 2012

In mke2fs, we only checksum the whole bitmap block and it is right.
While in the kernel, we use EXT4_BLOCKS_PER_GROUP to indicate the
size of the checksumed bitmap which is wrong when we enable bigalloc.
The right size should be EXT4_CLUSTERS_PER_GROUP and this patch fixes
it.

Also as every caller of ext4_block_bitmap_csum_set and
ext4_block_bitmap_csum_verify pass in EXT4_BLOCKS_PER_GROUP(sb)/8,
we'd better removes this parameter and sets it in the function itself.
Signed-off-by: NTao Ma <boyu.mt@taobao.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: NLukas Czerner <lczerner@redhat.com>
Cc: stable@vger.kernel.org

79f1ba49

10 10月, 2012 1 次提交

ext4: fix metadata checksum calculation for the superblock · 06db49e6

由 Theodore Ts'o 提交于 10月 10, 2012

The function ext4_handle_dirty_super() was calculating the superblock
on the wrong block data.  As a result, when the superblock is modified
while it is mounted (most commonly, when inodes are added or removed
from the orphan list), the superblock checksum would be wrong.  We
didn't notice because the superblock *was* being correctly calculated
in ext4_commit_super(), and this would get called when the file system
was unmounted.  So the problem only became obvious if the system
crashed while the file system was mounted.

Fix this by removing the poorly designed function signature for
ext4_superblock_csum_set(); if it only took a single argument, the
pointer to a struct superblock, the ambiguity which caused this
mistake would have been impossible.
Reported-by: NGeorge Spelvin <linux@horizon.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org

06db49e6

05 10月, 2012 1 次提交

ext4: fix ext4_flush_completed_IO wait semantics · c278531d

由 Dmitry Monakhov 提交于 10月 05, 2012

BUG #1) All places where we call ext4_flush_completed_IO are broken
    because buffered io and DIO/AIO goes through three stages
    1) submitted io,
    2) completed io (in i_completed_io_list) conversion pended
    3) finished  io (conversion done)
    And by calling ext4_flush_completed_IO we will flush only
    requests which were in (2) stage, which is wrong because:
     1) punch_hole and truncate _must_ wait for all outstanding unwritten io
      regardless to it's state.
     2) fsync and nolock_dio_read should also wait because there is
        a time window between end_page_writeback() and ext4_add_complete_io()
        As result integrity fsync is broken in case of buffered write
        to fallocated region:
        fsync                                      blkdev_completion
	 ->filemap_write_and_wait_range
                                                   ->ext4_end_bio
                                                     ->end_page_writeback
          <-- filemap_write_and_wait_range return
	 ->ext4_flush_completed_IO
   	 sees empty i_completed_io_list but pended
   	 conversion still exist
                                                     ->ext4_add_complete_io

BUG #2) Race window becomes wider due to the 'ext4: completed_io
locking cleanup V4' patch series

This patch make following changes:
1) ext4_flush_completed_io() now first try to flush completed io and when
   wait for any outstanding unwritten io via ext4_unwritten_wait()
2) Rename function to more appropriate name.
3) Assert that all callers of ext4_flush_unwritten_io should hold i_mutex to
   prevent endless wait
Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: NJan Kara <jack@suse.cz>

c278531d

29 9月, 2012 1 次提交

ext4: serialize dio nonlocked reads with defrag workers · 17335dcc

由 Dmitry Monakhov 提交于 9月 29, 2012

Inode's block defrag and ext4_change_inode_journal_flag() may
affect nonlocked DIO reads result, so proper synchronization
required.

- Add missed inode_dio_wait() calls where appropriate
- Check inode state under extra i_dio_count reference.
Reviewed-by: NJan Kara <jack@suse.cz>
Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>

17335dcc