Commits · 8d5d02e6b176565c77ff03604908b1453a22044d · openeuler / raspberrypi-kernel

29 Sep, 2009 4 commits

ext4: async direct IO for holes and fallocate support · 8d5d02e6

Authored by Mingming Cao on Sep 28, 2009

For async direct I/O that covers holes or fallocated space, the end_io
callback function now queues the conversion work on a workqueue but
doesn't flush the work right away, as that might take too long.

But when fsync() is called after all the data I/O has completed, the user
expects the metadata to be updated as well before fsync() returns.

Thus we need to flush the conversion work when fsync() is called.
This patch keeps track of a list of completed async direct I/O requests
that have work queued on the workqueue.  When fsync() is called, it goes
through the list and does the conversion.
Signed-off-by: Mingming Cao <cmm@us.ibm.com>

8d5d02e6
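
The bookkeeping described in 8d5d02e6 can be pictured with a small userspace sketch. It is not the ext4 code: `struct pending_io`, `struct inode_state`, `end_io_queue()` and `fsync_flush()` are hypothetical names, and setting a flag stands in for the real extent conversion. It only shows the idea of deferring work at I/O completion and draining the list at fsync() time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one completed async direct I/O request whose
 * extent conversion has been queued but not yet performed. */
struct pending_io {
    struct pending_io *next;
    int converted;              /* 0 = conversion pending, 1 = done */
};

/* Hypothetical per-inode state: list of completed, unconverted requests. */
struct inode_state {
    struct pending_io *completed;
};

/* end_io side: remember the request and defer the conversion itself. */
void end_io_queue(struct inode_state *ino, struct pending_io *io)
{
    assert(ino != NULL && io != NULL);
    io->converted = 0;
    io->next = ino->completed;
    ino->completed = io;
}

/* fsync side: walk the list and perform every outstanding conversion.
 * Returns the number of conversions performed. */
int fsync_flush(struct inode_state *ino)
{
    int done = 0;

    assert(ino != NULL);
    while (ino->completed != NULL) {
        struct pending_io *io = ino->completed;

        ino->completed = io->next;
        io->next = NULL;
        io->converted = 1;      /* stands in for the extent conversion */
        done++;
    }
    return done;
}
```

With this shape, a second flush on the same inode finds an empty list and does nothing, which is the behaviour one would want from repeated fsync() calls.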

ext4: Use end_io callback to avoid direct I/O fallback to buffered I/O · 4c0425ff

Authored by Mingming Cao on Sep 28, 2009

Currently the DIO VFS code passes create = 0 when writing to the
middle of a file.  It does this to avoid block allocation for holes, so
as not to expose stale data when there is a parallel buffered read
(which does not hold the i_mutex lock).  Direct I/O writes into holes
fall back to buffered I/O for this reason.

Since preallocated extents are treated as holes when doing a
get_block() lookup (the buffer is not mapped), direct I/O over fallocated
space also falls back to buffered I/O.  Thus ext4 silently falls
back to buffered I/O in the above two cases, which is undesirable.

To fix this, this patch creates uninitialized extents when a direct I/O
write goes into holes in sparse files, and registers an end_io callback
which converts the uninitialized extent to an initialized extent after the
I/O is completed.
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

4c0425ff

ext4: Split uninitialized extents for direct I/O · 0031462b

Authored by Mingming Cao on Sep 28, 2009

When writing into an uninitialized extent via direct I/O, and the direct
I/O doesn't exactly cover the uninitialized extent, split the extent
into uninitialized and initialized extents before submitting the I/O.
This avoids needing to deal with an ENOSPC error in the end_io
callback that gets used for direct I/O.

When the I/O is complete, the written extent will be marked as initialized.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

0031462b
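
A rough sketch of the splitting idea in 0031462b, under simplifying assumptions: `struct extent`, `split_for_write()` and `mark_written()` are hypothetical, extents are plain block ranges, and the written piece is kept uninitialized until completion, following the last sentence of the message. The point is that any step that could need new space happens before the I/O is submitted, so the completion step only flips a flag.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified extent: a run of logical blocks plus a flag. */
struct extent {
    uint32_t start;     /* first logical block */
    uint32_t len;       /* number of blocks */
    int uninit;         /* 1 = uninitialized, 0 = initialized */
};

/*
 * Split the uninitialized extent `ex` around the write that covers
 * [wstart, wstart + wlen).  Up to three pieces are stored in out[] and
 * their count is returned.  All pieces stay uninitialized at this point.
 */
int split_for_write(struct extent ex, uint32_t wstart, uint32_t wlen,
                    struct extent out[3])
{
    int n = 0;
    uint32_t ex_end = ex.start + ex.len;
    uint32_t w_end = wstart + wlen;

    assert(ex.uninit && wlen > 0);
    assert(wstart >= ex.start && w_end <= ex_end);

    if (wstart > ex.start)                      /* piece before the write */
        out[n++] = (struct extent){ ex.start, wstart - ex.start, 1 };
    out[n++] = (struct extent){ wstart, wlen, 1 };  /* piece being written */
    if (w_end < ex_end)                         /* piece after the write */
        out[n++] = (struct extent){ w_end, ex_end - w_end, 1 };
    return n;
}

/* Completion side: only a flag changes, so nothing here can hit ENOSPC. */
void mark_written(struct extent *e)
{
    assert(e != 0);
    e->uninit = 0;
}
```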

ext4: release reserved quota when block reservation for delalloc retry · 9f0ccfd8

Authored by Mingming Cao on Sep 28, 2009

ext4_da_reserve_space() can reserve quota blocks multiple times if
ext4_claim_free_blocks() fails and we retry the allocation. We should
release the quota reservation before restarting.

Bug found by Jan Kara.
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

9f0ccfd8
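
The double-reservation problem fixed in 9f0ccfd8 is easy to reproduce in a toy model. The sketch below is not the kernel function: `struct acct`, `claim_free_blocks()` and `reserve_space()` are hypothetical, and the `reclaim` parameter simply assumes that each failed attempt frees a few blocks before the retry. It shows why the quota reservation has to be undone before looping.

```c
#include <assert.h>

/* Hypothetical counters standing in for quota and free-block accounting. */
struct acct {
    long quota_reserved;    /* blocks currently reserved against quota */
    long free_blocks;       /* blocks that can still be claimed */
};

static int claim_free_blocks(struct acct *a, long n)
{
    if (a->free_blocks < n)
        return -1;
    a->free_blocks -= n;
    return 0;
}

/*
 * Reserve n blocks, retrying up to max_retries times.  `reclaim` is the
 * number of blocks each failed attempt is assumed to free before retrying.
 */
int reserve_space(struct acct *a, long n, int max_retries, long reclaim)
{
    int attempt;

    assert(a != 0 && n > 0 && max_retries >= 0);
    for (attempt = 0; attempt <= max_retries; attempt++) {
        a->quota_reserved += n;         /* quota is reserved first */
        if (claim_free_blocks(a, n) == 0)
            return 0;
        a->quota_reserved -= n;         /* release before restarting */
        a->free_blocks += reclaim;
    }
    return -1;
}
```

Without the release line, every failed attempt would leave another `n` blocks charged to quota, which is the behaviour the commit message describes.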

30 Sep, 2009 1 commit

ext4: Adjust ext4_da_writepages() to write out larger contiguous chunks · 55138e0b

Authored by Theodore Ts'o on Sep 29, 2009

Work around problems in the writeback code to force out writebacks in
larger chunks than just 4 MB, which is just too small.  This also works
around limitations in the ext4 block allocator, which can't allocate
more than 2048 blocks at a time.  So we need to defeat the round-robin
characteristics of the writeback code and try to write out as many
blocks in one inode before allowing the writeback code to move on to
another inode.  We add a new per-filesystem tunable,
max_writeback_mb_bump, which caps this to a default of 128 MB per
inode.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

55138e0b

28 Sep, 2009 1 commit

ext4: Fix hueristic which avoids group preallocation for closed files · 71780577

Authored by Theodore Ts'o on Sep 28, 2009

The heuristic was designed to avoid using locality group preallocation
when writing the last segment of a closed file.  Fix it by deferring
the setting of size to the maximum of size and isize until after we
check whether size == isize.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

71780577
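
The ordering issue in 71780577 can be shown in isolation. This is a minimal sketch with hypothetical helper names, not the mballoc code: it only contrasts taking the maximum before the comparison with taking it afterwards.

```c
#include <assert.h>

/* Buggy order: size is raised to max(size, isize) before the comparison,
 * so a write that ends short of isize also appears to end at isize. */
int ends_at_isize_buggy(long long size, long long isize)
{
    assert(size >= 0 && isize >= 0);
    size = size > isize ? size : isize;
    return size == isize;
}

/* Fixed order: compare first, take the maximum afterwards. */
int ends_at_isize_fixed(long long size, long long isize)
{
    int at_end;

    assert(size >= 0 && isize >= 0);
    at_end = (size == isize);
    size = size > isize ? size : isize;
    (void)size;     /* later logic would go on to use the adjusted size */
    return at_end;
}
```

For example, with `size = 10` and `isize = 100` the buggy variant reports a match while the fixed one does not.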

27 Sep, 2009 1 commit

ext4: Use ext4_msg() for ext4_da_writepage() errors · 1693918e

Authored by Theodore Ts'o on Sep 26, 2009

This allows the user to see what filesystem was involved with a
particular ext4_da_writepage() error.  Also, use KERN_CRIT which is
more appropriate than KERN_EMERG.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

1693918e

22 Sep, 2009 2 commits

const: make struct super_block::s_qcop const · 0d54b217

Authored by Alexey Dobriyan on Sep 21, 2009

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

0d54b217

const: make struct super_block::dq_op const · 61e225dc

Authored by Alexey Dobriyan on Sep 21, 2009

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

61e225dc

21 Sep, 2009 1 commit

trivial: fix typo "to to" in multiple files · fd589a8f

Authored by Anand Gadiyar on Jul 16, 2009

Signed-off-by: Anand Gadiyar <gadiyar@ti.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

fd589a8f

17 Sep, 2009 9 commits

ext4: replace MAX_DEFRAG_SIZE with EXT_MAX_BLOCK · 0a80e986

Authored by Eric Sandeen on Sep 17, 2009

There's no reason to redefine the maximum allowable offset
in an extent-based file just for defrag;
EXT_MAX_BLOCK already does this.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

0a80e986

ext4: Fix the alloc on close after a truncate hueristic · 5534fb5b

Authored by Theodore Ts'o on Sep 17, 2009

In an attempt to avoid doing an unneeded flush after opening a
(previously non-existent) file with O_CREAT|O_TRUNC, the code only
triggered the heuristic if ei->disksize was non-zero.  Turns out that
the VFS doesn't call ->truncate() if the file doesn't exist, and
ei->disksize is always zero even if the file previously existed.  So
remove the test, since it isn't necessary and in fact disabled the
heuristic.

Thanks to Clemens Eisserer, who reported that he was seeing problems
with files written using kwrite and eclipse after sudden crashes caused
by a buggy Intel video driver.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

5534fb5b

ext4: Add a tracepoint for ext4_alloc_da_blocks() · fb40ba0d

Authored by Theodore Ts'o on Sep 16, 2009

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

fb40ba0d

ext4: store EXT4_EXT_MIGRATE in i_state instead of i_flags · 1b9c12f4

Authored by Theodore Ts'o on Sep 17, 2009

EXT4_EXT_MIGRATE is only intended to be used for an in-memory flag,
and the hex value assigned to it collides with FS_DIRECTIO_FL (which
is also stored in i_flags).  There's no reason for the
EXT4_EXT_MIGRATE bit to be stored in i_flags, so we switch it to use
i_state instead.

Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

1b9c12f4

ext4: limit block allocations for indirect-block files to < 2^32 · fb0a387d

Authored by Eric Sandeen on Sep 16, 2009

Today, the ext4 allocator will happily allocate blocks past
2^32 for indirect-block files, which results in the block
numbers getting truncated, and corruption ensues.

This patch limits such allocations to < 2^32, and adds
BUG_ONs if we do get blocks larger than that.

This should address RH Bug 519471, "ext4 bitmap allocator
must limit blocks to < 2^32".

* ext4_find_goal() is modified to choose a goal < UINT_MAX,
  so that our starting point is in an acceptable range.

* ext4_xattr_block_set() is modified such that the goal block
  is < UINT_MAX, as above.

* ext4_mb_regular_allocator() is modified so that the group
  search does not continue into groups which are too high

* ext4_mb_use_preallocated() has a check that we don't use
  preallocated space which is too far out

* ext4_alloc_blocks() and ext4_xattr_block_set() add some BUG_ONs

No attempt has been made to limit inode locations to < 2^32,
so we may wind up with blocks far from their inodes.  Doing
this much already will lead to some odd ENOSPC issues when the
"lower 32" gets full, and further restricting inodes could
make that even weirder.

For high inodes, choosing a goal of the original, % UINT_MAX,
may be a bit odd, but then we're in an odd situation anyway,
and I don't know of a better heuristic.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

fb0a387d
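
The corruption mechanism named in fb0a387d is plain integer truncation, which a few lines of C make concrete. The helper names below are hypothetical and the code is a sketch rather than the allocator itself. `clamp_goal()` mirrors the "% UINT_MAX" wording of the message and assumes a platform where `unsigned int` is 32 bits wide.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Indirect-block files keep block numbers in 32-bit slots, so storing a
 * larger number silently drops its high bits. */
uint32_t store_in_indirect_slot(uint64_t block)
{
    return (uint32_t)block;
}

/* Does the block number survive a round trip through a 32-bit slot? */
int fits_indirect(uint64_t block)
{
    return block <= UINT32_MAX;
}

/* Pull an allocation goal back into the acceptable range, following the
 * "original % UINT_MAX" heuristic mentioned in the commit message. */
uint64_t clamp_goal(uint64_t goal)
{
    uint64_t clamped = goal % UINT_MAX;

    assert(clamped < UINT_MAX);
    return clamped;
}
```

A block number of 2^32 + 5 would be stored as 5 in such a slot, so later reads would go to an unrelated block, which is the truncation the patch is guarding against.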

ext4: Fix different block exchange issue in EXT4_IOC_MOVE_EXT · c40ce3c9

Authored by Akira Fujita on Sep 16, 2009

If the logical block offset of the original file passed to
EXT4_IOC_MOVE_EXT differs from the donor file's,
a calculation error occurs in ext4_calc_swap_extents(),
so the wrong blocks are exchanged between the original file and the donor file.
As a result, we hit ext4_error() in check_block_validity().
To detect the logical offset difference in EXT4_IOC_MOVE_EXT,
add checks to mext_calc_swap_extents() and treat it as an error,
since EXT4_IOC_MOVE_EXT must exchange data between the same logical blocks.
Reported-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

c40ce3c9

ext4: Add null extent check to ext_get_path · 347fa6f1

Authored by Akira Fujita on Sep 16, 2009

The path structure returned by ext4_ext_find_extent() may point to
a null extent, because the layout of an extent tree can change
while data blocks are being exchanged in ext4_move_extents().
As a solution, the patch adds a null extent check
to ext_get_path().
Reported-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

347fa6f1

ext4: Replace BUG_ON() with ext4_error() in move_extents.c · 2147b1a6

Authored by Akira Fujita on Sep 16, 2009

Replace BUG_ON() calls with calls to ext4_error()
to print an error message if EXT4_IOC_MOVE_EXT fails
for some reason.  This will help with debugging.
Ted pointed this out, thanks.
Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

2147b1a6

ext4: Replace get_ext_path macro with an inline funciton · e8505970

Authored by Akira Fujita on Sep 16, 2009

Replace get_ext_path macro with an inline function,
since this macro looks like a function call but its arguments
get modified. Ted pointed this out, thanks.
Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

e8505970
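
The readability concern behind e8505970 is generic C, so it can be illustrated without the ext4 sources. In the sketch below everything is hypothetical (`struct path`, `table`, `GET_PATH_MACRO`, `get_path()`); it is not the original get_ext_path. The macro assigns to two of its arguments even though the call site looks like an ordinary function call, while the inline function makes the out-parameter explicit with `&` and returns the status.

```c
#include <assert.h>
#include <stddef.h>

struct path { int depth; };

static struct path table[2] = { { 1 }, { 2 } };

/* Macro form: reads like a call, yet assigns to `path` and `ret`. */
#define GET_PATH_MACRO(path, idx, ret)                      \
    do {                                                    \
        (ret) = ((idx) >= 0 && (idx) < 2) ? 0 : -1;         \
        (path) = (ret) ? NULL : &table[(idx)];              \
    } while (0)

/* Inline-function form: the out-parameter is an explicit pointer and the
 * status comes back as the return value. */
static inline int get_path(struct path **path, int idx)
{
    assert(path != NULL);
    if (idx < 0 || idx >= 2) {
        *path = NULL;
        return -1;
    }
    *path = &table[idx];
    return 0;
}
```

At a call site, `GET_PATH_MACRO(p, 1, ret)` gives no hint that `p` and `ret` change, whereas `ret = get_path(&p, 1)` does.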

16 Sep, 2009 1 commit

HWPOISON: Enable .remove_error_page for migration aware file systems · aa261f54

Authored by Andi Kleen on Sep 16, 2009

Enable removing of corrupted pages through truncation
for a bunch of file systems: ext*, xfs, gfs2, ocfs2, ntfs
These should cover most server needs.

I chose the set of migration aware file systems for this
for now, assuming they have been especially audited.
But in general it should be safe for all file systems
on the data area that support read/write and truncate.

Caveat: the hardware error handler does not take i_mutex
for now before calling the truncate function. Is that ok?

Cc: tytso@mit.edu
Cc: hch@infradead.org
Cc: mfasheh@suse.com
Cc: aia21@cantab.net
Cc: hugh.dickins@tiscali.co.uk
Cc: swhiteho@redhat.com
Signed-off-by: Andi Kleen <ak@linux.intel.com>

aa261f54

15 Sep, 2009 1 commit

ext4: Fix include/trace/events/ext4.h to work with Systemtap · 3661d286

Authored by Theodore Ts'o on Sep 14, 2009

Using relative pathnames in #include statements interacts badly with
SystemTap, since the fs/ext4/*.h header files are not packaged up as
part of a distribution kernel's header files. Since systemtap doesn't
use TP_fast_assign(), we can use a blind structure definition and then
make sure the needed header files are defined before the ext4 source
files #include the trace/events/ext4.h header file.

https://bugzilla.redhat.com/show_bug.cgi?id=512478

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

3661d286

14 Sep, 2009 1 commit

ext4: Remove syncing logic from ext4_file_write · 0d34ec62

Authored by Jan Kara on Aug 18, 2009

The syncing is now properly handled by generic_file_aio_write() so
no special ext4 code is needed.

CC: linux-ext4@vger.kernel.org
CC: tytso@mit.edu
Signed-off-by: Jan Kara <jack@suse.cz>

0d34ec62

12 Sep, 2009 1 commit

ext4: Fix initalization of s_flex_groups · 7ad9bb65

Authored by Theodore Ts'o on Sep 11, 2009

The s_flex_groups array should have been initialized using atomic_add
to sum up the free counts from the block groups that make up a
flex_bg. By using atomic_set, the value of the s_flex_groups array
was set to the values of the last block group in the flex_bg.

The impact of this bug is that the block and inode allocation
algorithms might not pick the best flex_bg for new allocation.

Thanks to Damien Guibouret for pointing out this problem!
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

7ad9bb65
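
The difference between setting and adding that 7ad9bb65 describes can be checked with a short userspace sketch. It uses C11 `<stdatomic.h>` as a stand-in for the kernel's atomic helpers, and the function names and the `group_free` array are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Overwriting pattern: the counter ends up holding only the free count of
 * the last block group in the flex group. */
int flex_free_by_set(const int *group_free, int ngroups)
{
    atomic_int counter;
    int i;

    assert(group_free != 0 && ngroups > 0);
    atomic_init(&counter, 0);
    for (i = 0; i < ngroups; i++)
        atomic_store(&counter, group_free[i]);
    return atomic_load(&counter);
}

/* Accumulating pattern: the counter holds the sum over all block groups
 * that make up the flex group. */
int flex_free_by_add(const int *group_free, int ngroups)
{
    atomic_int counter;
    int i;

    assert(group_free != 0 && ngroups > 0);
    atomic_init(&counter, 0);
    for (i = 0; i < ngroups; i++)
        atomic_fetch_add(&counter, group_free[i]);
    return atomic_load(&counter);
}
```

For free counts of 10, 20, 30 and 5, the first variant yields 5 and the second 65, so an allocator comparing flex groups by the first number would be working from misleading data.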

11 Sep, 2009 2 commits

ext4: Always set dx_node's fake_dirent explicitly. · 1f7bebb9

Authored by Andreas Schlick on Sep 10, 2009

When ext4_dx_add_entry() has to split an index node, it has to ensure that
name_len of dx_node's fake_dirent is also zero, because otherwise e2fsck
won't recognise it as an intermediate htree node and consider the htree to
be corrupted.
Signed-off-by: Andreas Schlick <schlick@lavabit.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

1f7bebb9

ext4: Don't update superblock write time when filesystem is read-only · 71290b36

Authored by Theodore Ts'o on Sep 10, 2009

This avoids updating the superblock write time when we are mounting
the root file system read-only but need to replay the journal.  For
people who are east of GMT and who make their clock tick in localtime
for Windows bug-for-bug compatibility, updating the write time at that
point will cause e2fsck to complain and force a full file system check.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

71290b36

10 Sep, 2009 4 commits

ext4: Clarify the locking details in mballoc · 08c3a813

Authored by Aneesh Kumar K.V on Sep 09, 2009

We don't need to take the alloc_sem lock when we are adding new
groups, since mballoc won't see the new group added until we bump
sbi->s_groups_count.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

08c3a813

ext4: check for need init flag in ext4_mb_load_buddy · f41c0750

Authored by Aneesh Kumar K.V on Sep 09, 2009

We should check for need init flag with the group's alloc_sem held, to
make sure while we are loading the buddy cache and holding a reference
to it, a file system resize can't add new blocks to same group.

The patch also drops the need init flag check in
ext4_mb_regular_allocator() because doing the check without holding
alloc_sem is racy.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

f41c0750

ext4: move ext4_mb_init_group() function earlier in the mballoc.c · b6a758ec

Authored by Aneesh Kumar K.V on Sep 09, 2009

This moves the function around so that it can be called from
ext4_mb_load_buddy().
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

b6a758ec

ext4: Make non-journal fsync work properly · 91ac6f43

Authored by Frank Mayhar on Sep 09, 2009

Teach ext4_write_inode() and ext4_do_update_inode() about non-journal
mode:  If we're not using a journal, ext4_write_inode() now calls
ext4_do_update_inode() (after getting the iloc via ext4_get_inode_loc())
with a new "do_sync" parameter.  If that parameter is nonzero _and_ we're
not using a journal, ext4_do_update_inode() calls sync_dirty_buffer()
instead of ext4_handle_dirty_metadata().

This problem was found in power-fail testing, checking the amount of
loss of files and blocks after a power failure when using fsync() and
when not using fsync().  It turned out that using fsync() was actually
worse than not doing so, possibly because it increased the likelihood
that the inodes would remain unflushed and would therefore be lost at
the power failure.
Signed-off-by: Frank Mayhar <fmayhar@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

91ac6f43

13 Sep, 2009 1 commit

ext4: Assure that metadata blocks are written during fsync in no journal mode · fe188c0e

Authored by Theodore Ts'o on Sep 12, 2009

When there is no journal present, we must attach buffer heads
associated with extent tree and indirect blocks to the inode's
mapping->private_list via mark_buffer_dirty_inode() so that
ext4_sync_file() --- which is called to service fsync() and
fdatasync() system calls --- can write out the inode's metadata blocks
by calling sync_mapping_buffers().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

fe188c0e

10 Sep, 2009 1 commit

ext4: Use bforget() in no journal mode for ext4_journal_{forget,revoke}() · c7acb4c1

Authored by Theodore Ts'o on Sep 09, 2009

When ext4 is using a journal, a metadata block which is deallocated
must be passed into the journal layer so it can be dropped from the
current transaction and/or revoked. This is done by calling the
functions ext4_journal_forget() and ext4_journal_revoke(), which call
jbd2_journal_forget(), and jbd2_journal_revoke(), respectively.

Since the jbd2_journal_forget() and jbd2_journal_revoke() call
bforget(), if ext4 is not using a journal, ext4_journal_forget() and
ext4_journal_revoke() must call bforget() to avoid a dirty metadata
block overwriting a block after it has been reallocated and reused for
another inode's data block.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

c7acb4c1

09 Sep, 2009 1 commit

ext[234]: move over to 'check_acl' permission model · 1d5ccd1c

Authored by Linus Torvalds on Aug 28, 2009

Don't implement per-filesystem 'extX_permission()' functions that have
to be called for every path component operation, and instead just expose
the actual ACL checking so that the VFS layer can now do it for us.
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

1d5ccd1c

08 Sep, 2009 1 commit

ext4: print more sysadmin-friendly message in check_block_validity() · 80e42468

Authored by Theodore Ts'o on Sep 08, 2009

Drop the WARN_ON(1), as the stack trace is not appropriate, since it is
triggered by file system corruption, and it misleads users into
thinking there is a kernel bug.  In addition, change the message
displayed by ext4_error() to make it clear that this is a file system
corruption problem.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

80e42468

10 Sep, 2009 1 commit

ext4: Take page lock before looking at attached buffer_heads flags · a827eaff

Authored by Aneesh Kumar K.V on Sep 09, 2009

In order to check whether the buffer_heads are mapped we need to hold
the page lock. Otherwise reclaim can clean up the attached buffer_heads.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

a827eaff

06 Sep, 2009 3 commits

ext4: Fix small typo for move_extent_per_page() · 44fc48f7

Authored by Akira Fujita on Sep 05, 2009

This function moves extents one page at a time, so change its name from
move_extent_par_page().
Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.co.jp>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

44fc48f7

ext4: Return exchanged blocks count to user space in failure · 8d666913

Authored by Akira Fujita on Sep 05, 2009

Return the exchanged blocks count (moved_len) to user space
if ext4_move_extents() fails partway through.
Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

8d666913

ext4: Remove unneeded BUG_ON() in ext4_move_extents() · daea696d

Authored by Akira Fujita on Sep 05, 2009

The ext4_move_extents() function checks with BUG_ON() whether the
exchanged blocks count matches the requested blocks count.  But if the
target range (orig_start + len) includes sparse block(s), 'moved_len'
(exchanged blocks count) does not agree with 'len' (requested blocks
count), since sparse blocks are not counted in 'moved_len'.  This causes
us to hit the BUG_ON(), even though the function succeeded.
Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

daea696d

17 Sep, 2009 1 commit

ext4: Fix wrong comparisons in mext_check_arguments() · 70d5d3dc

Authored by Akira Fujita on Sep 16, 2009

The mext_check_arguments() function in move_extents.c has wrong
comparisons.  orig_start, which is passed from user space, is in units
of blocks, but the inode's i_size is in bytes, so the checks do not
work correctly.  This faulty check leads to an overflow of 'len', which
then hits the BUG_ON() in ext4_move_extents().  The patch fixes this issue.
Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Reviewed-by: Greg Freemyer <greg.freemyer@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

70d5d3dc
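
The unit mismatch described in 70d5d3dc is easy to see with concrete numbers. The two helpers below are hypothetical and only sketch the idea; the actual kernel fix may be written differently. One compares a block number directly against a size in bytes, the other converts the size to blocks first. `blkbits` is assumed to be log2 of the block size, for example 12 for 4096-byte blocks.

```c
#include <assert.h>
#include <stdint.h>

/* Mismatched units: a block number is compared against a size in bytes,
 * so start blocks far beyond the end of the file are still accepted. */
int start_in_file_buggy(uint64_t orig_start_blk, uint64_t i_size_bytes)
{
    return orig_start_blk < i_size_bytes;
}

/* Consistent units: convert the byte size to blocks (rounding up) and
 * only then compare. */
int start_in_file_fixed(uint64_t orig_start_blk, uint64_t i_size_bytes,
                        unsigned blkbits)
{
    uint64_t blocks;

    assert(blkbits < 32);
    blocks = (i_size_bytes + (((uint64_t)1) << blkbits) - 1) >> blkbits;
    return orig_start_blk < blocks;
}
```

For an 8192-byte file with 4096-byte blocks, block 100 passes the first check but fails the second, since the file only has two blocks.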

06 Sep, 2009 2 commits

ext4: fix cache flush in ext4_sync_file · 5f3481e9

Authored by Christoph Hellwig on Sep 05, 2009

We need to flush the write cache unconditionally in ->fsync, otherwise
writes into already allocated blocks can get lost.  Writes into fully
allocated files are very common when using disk images for
virtualization, and without this fix can easily lose data after
an fdatasync, which is the typical implementation for a cache flush on
the virtual drive.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

5f3481e9

ext4: Remove journal_checksum mount option and enable it by default · d0646f7b

Authored by Theodore Ts'o on Sep 05, 2009

There's no real cost for the journal checksum feature, and we should
make sure it is enabled all the time.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

d0646f7b