Commits · cc967be54710d97c05229b2e5ba2d00df84ddd64 · openeuler / Kernel

17 May, 2010 8 commits

ext4: Make fsync sync new parent directories in no-journal mode · 14ece102

Authored by Frank Mayhar on May 17, 2010

Add a new ext4 state to tell us when a file has been newly created; use
that state in ext4_sync_file in no-journal mode to tell us when we need
to sync the parent directory as well as the inode and data itself.  This
fixes a problem in which a panic or power failure may lose the entire
file even when using fsync, since the parent directory entry is lost.

Addresses-Google-Bug: #2480057
Signed-off-by: Frank Mayhar <fmayhar@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

14ece102
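
The failure mode described in this commit (a file surviving fsync but its directory entry being lost) is the reason careful applications also fsync the containing directory after creating a file. The following is a minimal userspace sketch of that pattern, assuming a POSIX system; `create_durable` is a hypothetical helper name and is not part of the patch, which handles the case inside the kernel.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create dir/name, write len bytes, then make both the
 * file contents and its directory entry durable.
 * Returns 0 on success, -1 on any error. */
static int create_durable(const char *dir, const char *name,
                          const void *buf, size_t len)
{
    int dfd = open(dir, O_RDONLY | O_DIRECTORY);
    if (dfd < 0)
        return -1;

    int fd = openat(dfd, name, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        close(dfd);
        return -1;
    }

    int rc = -1;
    if (write(fd, buf, len) == (ssize_t)len &&
        fsync(fd) == 0 &&   /* flush the inode and its data */
        fsync(dfd) == 0)    /* flush the parent directory entry */
        rc = 0;

    close(fd);
    close(dfd);
    return rc;
}
```

Without the second `fsync()`, the file's blocks may reach the disk while the name that points to them does not, which is the data-loss case the commit message describes.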

ext4: Drop whitespace at end of lines · 60e6679e

Authored by Theodore Ts'o on May 17, 2010

This patch was generated using:

#!/usr/bin/perl -i
while (<>) {
    s/[ 	]+$//;
    print;
}

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

60e6679e

ext4: Fix compat EXT4_IOC_ADD_GROUP · 4d92dc0f

Authored by Ben Hutchings on May 17, 2010

struct ext4_new_group_input needs to be converted because u64 has
only 32-bit alignment on some 32-bit architectures, notably i386.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

4d92dc0f
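
To illustrate why the structure has to be converted rather than copied, here is a small sketch in C. The two structures are hypothetical, trimmed-down layouts (not the real ext4 definitions), and the `aligned(4)` typedef is a GCC/Clang extension used here to mimic how i386 lays out a 64-bit integer.

```c
#include <stddef.h>
#include <stdint.h>

/* 64-bit integer with only 4-byte alignment, as on i386.
 * Reducing alignment via a typedef attribute is a GCC/Clang extension. */
typedef uint64_t __attribute__((aligned(4))) compat_u64_sketch;

/* Hypothetical layout as a 32-bit caller would pass it. */
struct group_input_compat {
    uint32_t group;
    compat_u64_sketch block_bitmap;   /* lands at offset 4 */
    uint32_t blocks_count;
};

/* Hypothetical layout as a 64-bit kernel would expect it. */
struct group_input_native {
    uint32_t group;
    uint64_t block_bitmap;            /* offset 8 where u64 is 8-byte aligned */
    uint32_t blocks_count;
};

/* Field-by-field conversion; a raw memory copy would misplace every
 * field that follows the first 64-bit member. */
static struct group_input_native
group_input_from_compat(const struct group_input_compat *c)
{
    struct group_input_native n = {
        .group = c->group,
        .block_bitmap = c->block_bitmap,
        .blocks_count = c->blocks_count,
    };
    return n;
}
```

Because the 64-bit member sits at a different offset in the two layouts, the compat ioctl handler has to translate the argument before handing it to the native code path.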

ext4: Conditionally define compat ioctl numbers · 899ad0ce

Authored by Ben Hutchings on May 17, 2010

It is unnecessary, and in general impossible, to define the compat
ioctl numbers except when building the filesystem with CONFIG_COMPAT
defined.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

899ad0ce

ext4: Use bitops to read/modify i_flags in struct ext4_inode_info · 12e9b892

Authored by Dmitry Monakhov on May 16, 2010

At several places we modify EXT4_I(inode)->i_flags without holding
i_mutex (ext4_do_update_inode, ...). These modifications are racy and
we can lose updates to i_flags. So convert handling of i_flags to use
bitops which are atomic.

https://bugzilla.kernel.org/show_bug.cgi?id=15792

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

12e9b892
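
The race described here is the classic lost update from an unlocked read-modify-write on a shared flag word. As a rough userspace analogy, the sketch below uses C11 atomics in place of the kernel's bitops; `flag_word`, `flag_set`, `flag_clear` and `flag_test` are hypothetical names chosen for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag word standing in for an inode's i_flags. */
typedef atomic_ulong flag_word;

/* Atomically set one bit; concurrent setters of other bits are not lost. */
static void flag_set(flag_word *w, unsigned bit)
{
    atomic_fetch_or(w, 1UL << bit);
}

/* Atomically clear one bit. */
static void flag_clear(flag_word *w, unsigned bit)
{
    atomic_fetch_and(w, ~(1UL << bit));
}

/* Read the current value of one bit. */
static bool flag_test(flag_word *w, unsigned bit)
{
    return (atomic_load(w) >> bit) & 1UL;
}
```

A plain `flags |= mask` compiles to a separate load, OR and store, so two threads updating different bits can overwrite each other's change; the atomic fetch-or and fetch-and operations make each update a single indivisible step.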

ext4: Convert calls of ext4_error() to EXT4_ERROR_INODE() · 24676da4

Authored by Theodore Ts'o on May 16, 2010

EXT4_ERROR_INODE() tends to provide better error information and in a
more consistent format.  Some errors were not even identifying the inode
or directory which was corrupted, which made them not very useful.

Addresses-Google-Bug: #2507977
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

24676da4

ext4: Add new abstraction ext4_map_blocks() underneath ext4_get_blocks() · e35fd660

Authored by Theodore Ts'o on May 16, 2010

Jack up ext4_get_blocks() and add a new function, ext4_map_blocks(),
which uses a much smaller structure, struct ext4_map_blocks, which is
20 bytes, as opposed to a struct buffer_head, which is nearly 5 times
bigger on an x86_64 machine. By switching things to use
ext4_map_blocks(), we can save stack space since we can avoid
allocating a struct buffer_head on the stack.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

e35fd660

ext4: check for a good block group before loading buddy pages · 8a57d9d6

Authored by Curt Wohlgemuth on May 16, 2010

This adds a new field in ext4_group_info to cache the largest available
block range in a block group, and doesn't load the buddy pages until *after*
we've done a sanity check on the block group.

With large allocation requests (e.g., fallocate(), 8MiB) and relatively full
partitions, it's easy to have no block groups with a block extent large
enough to satisfy the input request length.  This currently causes the loop
during cr == 0 in ext4_mb_regular_allocator() to load the buddy bitmap pages
for EVERY block group.  That can be a lot of pages.  The patch below allows
us to call ext4_mb_good_group() BEFORE we load the buddy pages (although we
have to check again after we lock the block group).

Addresses-Google-Bug: #2578108
Addresses-Google-Bug: #2704453
Signed-off-by: Curt Wohlgemuth <curtw@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

8a57d9d6
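
The idea of checking cached per-group data before doing the expensive load can be sketched as follows. This is a simplified model, not the mballoc code: `struct group_info`, `group_looks_good`, `load_buddy_and_allocate` and the `buddy_loads` counter are all hypothetical stand-ins.

```c
#include <stddef.h>

/* Hypothetical per-group summary kept in memory. */
struct group_info {
    unsigned free_blocks;        /* total free blocks in the group */
    unsigned largest_free_run;   /* cached largest contiguous free range */
};

/* Counts how often the expensive step runs; it stands in for reading
 * the buddy bitmap pages of a group. */
static unsigned buddy_loads;

static int load_buddy_and_allocate(const struct group_info *g, unsigned want)
{
    buddy_loads++;
    return g->largest_free_run >= want;
}

/* Cheap pre-check that only looks at cached data. */
static int group_looks_good(const struct group_info *g, unsigned want)
{
    return g->free_blocks >= want && g->largest_free_run >= want;
}

/* Returns the index of the first group that satisfies the request, or -1. */
static int find_group(const struct group_info *groups, size_t n, unsigned want)
{
    for (size_t i = 0; i < n; i++) {
        if (!group_looks_good(&groups[i], want))
            continue;   /* skipped without loading anything */
        if (load_buddy_and_allocate(&groups[i], want))
            return (int)i;
    }
    return -1;
}
```

With a large request on a nearly full filesystem, most groups fail the cheap check, so the costly load runs only for the few groups that could plausibly satisfy the request.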

06 March, 2010 1 commit

pass writeback_control to ->write_inode · a9185b41

Authored by Christoph Hellwig on March 05, 2010

This gives the filesystem more information about the writeback that
is happening.  Trond requested this for the NFS unstable write handling,
and other filesystems might benefit from this too by being able to
distinguish between the different callers in more detail.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

a9185b41

04 March, 2010 1 commit

ext4: consolidate in_range() definitions · 731eb1a0

Authored by Akinobu Mita on March 03, 2010

There are duplicate macro definitions of in_range() in mballoc.h and
balloc.c.  This consolidates these two definitions into ext4.h, and
changes extents.c to use in_range() as well.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger@sun.com>

731eb1a0
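
For readers unfamiliar with the macro, a range check of this kind can be written as below. This is a sketch of what a single shared definition might look like; the `block_in_extent` caller is a hypothetical example and does not come from the patch.

```c
/* True when b lies in [first, first + len - 1]. The arguments are
 * evaluated more than once, so they must not have side effects. */
#define in_range(b, first, len) \
    ((b) >= (first) && (b) <= (first) + (len) - 1)

/* Hypothetical caller: does a block number fall inside an extent
 * that starts at 'start' and covers 'len' blocks? */
static int block_in_extent(unsigned long long blk,
                           unsigned long long start, unsigned len)
{
    return in_range(blk, start, len);
}
```

Keeping one definition in a shared header avoids the two copies drifting apart, for example one using an inclusive upper bound and the other an exclusive one.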

03 March, 2010 1 commit

ext4: Convert BUG_ON checks to use ext4_error() instead · 273df556

Authored by Frank Mayhar on March 02, 2010

Convert a bunch of BUG_ONs to emit an ext4_error() message and return
EIO.  This is a first pass and most notably does _not_ cover
mballoc.c, which is a morass of void functions.
Signed-off-by: Frank Mayhar <fmayhar@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

273df556
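
The shape of this conversion, replacing a fatal assertion with a reported error and an error return, can be sketched in userspace C. `report_error` and `check_depth` are hypothetical names; the real patch uses ext4_error() and operates on kernel data structures.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for an error-reporting helper. */
static void report_error(const char *func, const char *msg)
{
    fprintf(stderr, "error in %s: %s\n", func, msg);
}

/* Before: an assertion such as assert(depth <= max_depth) would abort.
 * After: report the inconsistency and let the caller handle -EIO. */
static int check_depth(int depth, int max_depth)
{
    if (depth > max_depth) {
        report_error(__func__, "tree depth exceeds maximum");
        return -EIO;
    }
    return 0;
}
```

This only works when the function can return a status, which is why the commit message singles out code made of void functions as not yet covered.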

05 March, 2010 1 commit

ext4: use ext4_get_block_write in buffer write · 744692dc

Authored by Jiaying Zhang on March 04, 2010

Allocate uninitialized extent before ext4 buffer write and
convert the extent to initialized after io completes.
The purpose is to make sure an extent can only be marked
initialized after it has been written with new data so
we can safely drop the i_mutex lock in ext4 DIO read without
exposing stale data. This helps to improve multi-thread DIO
read performance on high-speed disks.

Skip the nobh and data=journal mount cases to make things simple for now.
Signed-off-by: Jiaying Zhang <jiayingz@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

744692dc

03 March, 2010 1 commit

ext4: mechanical rename some of the direct I/O get_block's identifiers · c7064ef1

Authored by Jiaying Zhang on March 02, 2010

This commit renames some of the direct I/O's block allocation flags,
variables, and functions introduced in Mingming's "Direct IO for holes
and fallocate" patches so that they can be used by ext4's buffered
write path as well. Also changed the related function comments
accordingly to cover both direct write and buffered write cases.
Signed-off-by: Jiaying Zhang <jiayingz@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

c7064ef1

24 February, 2010 1 commit

ext4: Add flag to files with blocks intentionally past EOF · c8d46e41

Authored by Jiaying Zhang on February 24, 2010

fallocate() may potentially instantiate blocks past EOF, depending
on the flags used when it is called.

e2fsck currently has a test for blocks past i_size, and it
sometimes trips up - noticeably on xfstests 013 which runs fsstress.

This patch from Jiaying does fix it up - it (along with
e2fsprogs updates and other patches recently from Aneesh) has
survived many fsstress runs in a row.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jiaying Zhang <jiayingz@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

c8d46e41

17 February, 2010 1 commit

percpu: add __percpu sparse annotations to fs · 003cb608

Authored by Tejun Heo on February 02, 2010

Add __percpu sparse annotations to fs.

These annotations are to make sparse consider percpu variables to be
in a different address space and warn if accessed without going
through percpu accessors.  This patch doesn't affect normal builds.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Alex Elder <aelder@sgi.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>

003cb608

16 February, 2010 1 commit

ext4: move __func__ into a macro for ext4_warning, ext4_error · 12062ddd

Authored by Eric Sandeen on February 15, 2010

Just a pet peeve of mine; we had a mishmash of calls with either __func__
or "function_name" and the latter tends to get out of sync.

I think it's easier to just hide the __func__ in a macro, and it'll
be consistent from then on.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

12062ddd
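
The technique of hiding `__func__` behind a macro can be shown with a small standalone sketch. The names `fs_warning`, `warning_impl`, `last_warning_from` and `truncate_inode` are hypothetical and chosen only for illustration; the actual ext4 macros differ.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Last message and originating function, kept so callers can inspect them. */
static char last_message[256];
static const char *last_func = "";

static void warning_impl(const char *func, const char *fmt, ...)
{
    char body[200];
    va_list ap;

    va_start(ap, fmt);
    vsnprintf(body, sizeof(body), fmt, ap);
    va_end(ap);

    last_func = func;
    snprintf(last_message, sizeof(last_message),
             "warning (%s): %s", func, body);
}

/* Callers no longer pass a function name; the macro supplies __func__. */
#define fs_warning(...) warning_impl(__func__, __VA_ARGS__)

/* Returns nonzero if the most recent warning came from 'func'. */
static int last_warning_from(const char *func)
{
    return strcmp(last_func, func) == 0;
}

/* Hypothetical caller. */
static void truncate_inode(unsigned long ino)
{
    fs_warning("inode %lu has bad size", ino);
}
```

Because the compiler fills in the name at each call site, renaming or moving a function can no longer leave a stale hand-written name in its messages.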

25 January, 2010 2 commits

ext4: Reserve INCOMPAT_EA_INODE and INCOMPAT_DIRDATA feature codepoints · f710b4b9

Authored by Theodore Ts'o on January 25, 2010

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

f710b4b9

ext4: Use bitops to read/modify EXT4_I(inode)->i_state · 19f5fb7a

Authored by Theodore Ts'o on January 24, 2010

At several places we modify EXT4_I(inode)->i_state without holding
i_mutex (ext4_release_file, ext4_bmap, ext4_journalled_writepage,
ext4_do_update_inode, ...). These modifications are racy and we can
lose updates to i_state. So convert handling of i_state to use bitops
which are atomic.

Cc: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

19f5fb7a

15 January, 2010 1 commit

ext4: Drop EXT4_GET_BLOCKS_UPDATE_RESERVE_SPACE flag · 1296cc85

Authored by Aneesh Kumar K.V on January 15, 2010

We should update the reserved space if it is a delalloc buffer,
and that is indicated by the EXT4_GET_BLOCKS_DELALLOC_RESERVE flag.
So use EXT4_GET_BLOCKS_DELALLOC_RESERVE in place of
EXT4_GET_BLOCKS_UPDATE_RESERVE_SPACE.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

1296cc85

25 January, 2010 1 commit

ext4: Fix quota accounting error with fallocate · 5f634d06

Authored by Aneesh Kumar K.V on January 25, 2010

When we fallocate a region of the file which we had recently written,
and which is still in the page cache marked as delayed allocated blocks,
we need to make sure we don't do the quota update on the writepage path.
This is because the needed quota update would have already been done
by fallocate.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

5f634d06

01 January, 2010 1 commit

ext4: Calculate metadata requirements more accurately · 9d0be502

Authored by Theodore Ts'o on January 01, 2010

In the past, ext4_calc_metadata_amount(), and its sub-functions
ext4_ext_calc_metadata_amount() and ext4_indirect_calc_metadata_amount()
badly over-estimated the number of metadata blocks that might be
required for delayed allocation blocks. This didn't matter as much
when functions which managed the reserved metadata blocks were more
aggressive about dropping reserved metadata blocks as delayed
allocation blocks were written, but unfortunately they were too
aggressive. This was fixed in commit 0637c6f4, but as a result the
over-estimation by ext4_calc_metadata_amount() would lead to reserving
2-3 times the number of pending delayed allocation blocks as
potentially required metadata blocks. So if there is 1 megabyte of
blocks which have not yet been allocated, up to 3 megabytes of
space would get reserved out of the user's quota and from the file
system free space pool until all of the inode's data blocks have been
allocated.

This commit addresses this problem by much more accurately estimating
the number of metadata blocks that will be required. It will still
somewhat over-estimate the number of blocks needed, since it must make
a worst case estimate not knowing which physical blocks will be
needed, but it is much more accurate than before.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

9d0be502

23 December, 2009 1 commit

ext4: Convert to generic reserved quota's space management. · a9e7f447

Authored by Dmitry Monakhov on December 14, 2009

This patch also fixes write vs chown race condition.
Acked-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Jan Kara <jack@suse.cz>

a9e7f447

23 January, 2010 1 commit

ext4: Add block validity check when truncating indirect block mapped inodes · 1f2acb60

Authored by Theodore Ts'o on January 22, 2010

Add checks to ext4_free_branches() to make sure a block number found
in an indirect block is valid before trying to free it.  If a bad
block number is found, stop freeing the indirect block immediately,
since the file system is corrupt and we will need to run fsck anyway.
This also avoids spamming the logs, and specifically avoids
driver-level "attempt to access beyond end of device" errors obscuring
what is really going on.

If you get *really*, *really*, *really* unlucky, without this patch, a
supposed indirect block containing garbage might contain a reference
to a primary block group descriptor, in which case
ext4_free_branches() could end up zero'ing out a block group
descriptor block.  If one of the block bitmaps for a block
group described by that bg descriptor block is then not in memory, it is
read in by ext4_read_block_bitmap().  This function calls
ext4_valid_block_bitmap(), which assumes that bg_inode_table() was
validated at mount time and hasn't been modified since.  Since this
assumption is no longer valid, it's possible for the value
(ext4_inode_table(sb, desc) - group_first_block) to go negative, which
will cause ext4_find_next_zero_bit() to trigger a kernel GPF.

Addresses-Google-Bug: #2220436
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

1f2acb60

05 February, 2010 1 commit

ext4: fix async i/o writes beyond 4GB to a sparse file · a1de02dc

Authored by Eric Sandeen on February 04, 2010

The "offset" member in ext4_io_end holds bytes, not blocks, so
ext4_lblk_t is wrong - and too small (u32).

This caused the async i/o writes to sparse files beyond 4GB to fail
when they wrapped around to 0.

Also fix up the type of arguments to ext4_convert_unwritten_extents();
it gets ssize_t from ext4_end_aio_dio_nolock() and
ext4_ext_direct_IO().
Reported-by: Giel de Nijs <giel@vectorwise.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>

a1de02dc
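
The wraparound this commit fixes is easy to reproduce in isolation. The sketch below uses a hypothetical `lblk32_t` typedef as a stand-in for a 32-bit block-number type and shows what happens when a byte offset past 4 GiB is stored in it.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit logical block number type. */
typedef uint32_t lblk32_t;

/* Buggy: a byte offset stored in a 32-bit type wraps at 4 GiB. */
static uint64_t offset_via_u32(uint64_t byte_offset)
{
    lblk32_t stored = (lblk32_t)byte_offset;   /* high 32 bits are discarded */
    return stored;
}

/* Fixed: keep the byte offset in a 64-bit type. */
static uint64_t offset_via_u64(uint64_t byte_offset)
{
    uint64_t stored = byte_offset;
    return stored;
}
```

An offset of 4 GiB plus 4096 bytes comes back as 4096 from the 32-bit version, so the completion handler would act on the wrong part of the file.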

09 December, 2009 1 commit

ext4: Wait for proper transaction commit on fsync · b436b9be

Authored by Jan Kara on December 08, 2009

We cannot rely on buffer dirty bits during fsync because pdflush can come
before fsync is called and clear dirty bits without forcing a transaction
commit. What we do is that we track which transaction has last changed
the inode and which transaction last changed allocation and force it to
disk on fsync.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

b436b9be

23 November, 2009 1 commit

ext4: call ext4_forget() from ext4_free_blocks() · e6362609

Authored by Theodore Ts'o on November 23, 2009

Add the facility for ext4_forget() to be called from
ext4_free_blocks().  This simplifies the code in a large number of
places, and centralizes most of the work of calling ext4_forget() into
a single place.

Also fix a bug in the extents migration code; it wasn't calling
ext4_forget() when releasing the indirect blocks during the
conversion.  As a result, if the system crashed during or shortly after
the extents migration, and the released indirect blocks get reused as
data blocks, the journal replay would corrupt the data blocks.  With
this new patch, fixing this bug was as simple as adding the
EXT4_FREE_BLOCKS_FORGET flags to the call to ext4_free_blocks().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>

e6362609

22 November, 2009 1 commit

ext4: fold ext4_free_blocks() and ext4_mb_free_blocks() · 44338711

Authored by Theodore Ts'o on November 22, 2009

ext4_mb_free_blocks() is only called by ext4_free_blocks(), and the
latter function doesn't really do much. So merge the two functions
together, such that ext4_free_blocks() is now found in
fs/ext4/mballoc.c. This saves about 200 bytes of compiled text space.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

44338711

23 November, 2009 1 commit

ext4: move ext4_forget() to ext4_jbd2.c · d6797d14

Authored by Theodore Ts'o on November 22, 2009

The ext4_forget() function better belongs in ext4_jbd2.c.  This will
allow us to do some cleanup of the ext4_journal_revoke() and
ext4_journal_forget() functions, as well as giving us better error
reporting since we can report the caller of ext4_forget() when things
go wrong.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

d6797d14

20 November, 2009 1 commit

ext4: make trim/discard optional (and off by default) · 5328e635

Authored by Eric Sandeen on November 19, 2009

It is anticipated that when sb_issue_discard starts doing
real work on trim-capable devices, we may see issues.  Make
this mount-time optional, and default it to off until we know
that things are working out OK.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

5328e635

10 November, 2009 1 commit

ext4: skip conversion of uninit extents after direct IO if there isn't any · 5f524950

Authored by Mingming on November 10, 2009

At the end of direct I/O operation, ext4_ext_direct_IO() always called
ext4_convert_unwritten_extents(), regardless of whether there were any
unwritten extents involved in the I/O or not.

This commit adds a state flag so that ext4_ext_direct_IO() only calls
ext4_convert_unwritten_extents() when necessary.
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

5f524950

03 November, 2009 1 commit

Revert "ext4: Remove journal_checksum mount option and enable it by default" · d4da6c9c

Authored by Linus Torvalds on November 02, 2009

This reverts commit d0646f7b, as
requested by Eric Sandeen.

It can basically cause an ext4 filesystem to miss recovery (and thus get
mounted with errors) if the journal checksum does not match.

Quoth Eric:

   "My hand-wavy hunch about what is happening is that we're finding a
    bad checksum on the last partially-written transaction, which is
    not surprising, but if we have a wrapped log and we're doing the
    initial scan for head/tail, and we abort scanning on that bad
    checksum, then we are essentially running an unrecovered filesystem.

    But that's hand-wavy and I need to go look at the code.

    We lived without journal checksums on by default until now, and at
    this point they're doing more harm than good, so we should revert
    the default-changing commit until we can fix it and do some good
    power-fail testing with the fixes in place."

See

	http://bugzilla.kernel.org/show_bug.cgi?id=14354

for all the gory details.
Requested-by: Eric Sandeen <sandeen@redhat.com>
Cc: Theodore Tso <tytso@mit.edu>
Cc: Alexey Fisher <bug-track@fisher-privat.net>
Cc: Maxim Levitsky <maximlevitsky@gmail.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Mathias Burén <mathias.buren@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d4da6c9c

30 September, 2009 2 commits

ext4: Fix time encoding with extra epoch bits · c1fccc06

Authored by Theodore Ts'o on September 30, 2009

"Looking at ext4.h, I think the setting of extra time fields forgets to
mask the epoch bits so the epoch part overwrites nsec part. The second
change is only for coherency (2 -> EXT4_EPOCH_BITS)."

Thanks to Damien Guibouret for pointing out this problem.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

c1fccc06
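
The masking problem quoted in this commit can be demonstrated with a simplified model of a packed "extra" timestamp field: the low 2 bits hold epoch bits and the remaining bits hold nanoseconds. This is an illustrative sketch, not the ext4 code; `EPOCH_BITS`, `EPOCH_MASK` and the function names are assumptions made for the example.

```c
#include <stdint.h>

#define EPOCH_BITS 2
#define EPOCH_MASK ((1u << EPOCH_BITS) - 1)

/* Buggy variant: the high seconds bits are OR-ed in unmasked, so any
 * bits above the low two spill into the nanoseconds part. */
static uint32_t encode_extra_unmasked(int64_t sec, uint32_t nsec)
{
    uint32_t high = (uint32_t)((uint64_t)sec >> 32);
    return high | (nsec << EPOCH_BITS);
}

/* Fixed variant: only the epoch bits are kept from the high word. */
static uint32_t encode_extra_masked(int64_t sec, uint32_t nsec)
{
    uint32_t high = (uint32_t)((uint64_t)sec >> 32);
    return (high & EPOCH_MASK) | (nsec << EPOCH_BITS);
}

/* Recover the nanoseconds part from the packed field. */
static uint32_t decode_nsec(uint32_t extra)
{
    return extra >> EPOCH_BITS;
}
```

For a negative seconds value, the upper 32 bits are all ones; without the mask they overwrite the whole field and the nanoseconds can no longer be recovered.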

ext4: Use tracepoints for mb_history trace file · 296c355c

Authored by Theodore Ts'o on September 30, 2009

The /proc/fs/ext4/<dev>/mb_history was maintained manually, and had a
number of problems: it required a largish amount of memory to be
allocated for each ext4 filesystem, and the s_mb_history_lock
introduced a CPU contention problem.

By ripping out the mb_history code and replacing it with ftrace
tracepoints, we get more functionality: timestamps, event
filtering, the ability to correlate mballoc history with other ext4
tracepoints, etc.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

296c355c

29 September, 2009 3 commits

ext4: async direct IO for holes and fallocate support · 8d5d02e6

Authored by Mingming Cao on September 28, 2009

For async direct IO that covers holes or fallocated space, the end_io
callback function now queues the conversion work on a workqueue but
doesn't flush the work right away, as that might take too long.

But when fsync is called after all the data I/O has completed, the user
expects the metadata to be updated as well before fsync returns.

Thus we need to flush the conversion work when fsync() is called.
This patch keeps track of a list of completed async direct IOs that
have work queued on the workqueue.  When fsync() is called, it will go
through the list and do the conversion.
Signed-off-by: Mingming Cao <cmm@us.ibm.com>

8d5d02e6

ext4: Use end_io callback to avoid direct I/O fallback to buffered I/O · 4c0425ff

Authored by Mingming Cao on September 28, 2009

Currently the DIO VFS code passes create = 0 when writing to the
middle of file.  It does this to avoid block allocation for holes, so
as not to expose stale data out when there is a parallel buffered read
(which does not hold the i_mutex lock).  Direct I/O writes into holes
fall back to buffered IO for this reason.

Since preallocated extents are treated as holes when doing a
get_block() look up (buffer is not mapped), direct IO over fallocate
also falls back to buffered IO.  Thus ext4 actually silently falls
back to buffered IO in above two cases, which is undesirable.

To fix this, this patch creates uninitialized extents when a direct I/O
write goes into holes in sparse files, and registers an end_io callback which
converts the uninitialized extent to an initialized extent after the
I/O is completed.
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

4c0425ff

ext4: Split uninitialized extents for direct I/O · 0031462b

Authored by Mingming Cao on September 28, 2009

When writing into an uninitialized extent via direct I/O, and the direct
I/O doesn't exactly cover the uninitialized extent, split the extent
into uninitialized and initialized extents before submitting the I/O.
This avoids needing to deal with an ENOSPC error in the end_io
callback that gets used for direct I/O.

When the IO is complete, the written extent will be marked as initialized.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

0031462b
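
The splitting step can be modelled on plain block ranges. The sketch below is a simplified illustration under stated assumptions: extents are just (start, length, initialized) triples, block arithmetic does not overflow, and the written piece is flagged initialized by a separate call once the I/O completes. `struct extent_piece`, `split_for_write` and `mark_written` are hypothetical names.

```c
#include <stdint.h>

struct extent_piece {
    uint32_t start;      /* first logical block */
    uint32_t len;        /* number of blocks */
    int initialized;     /* 1 once the blocks hold valid data */
};

/* Split the uninitialized extent [ext_start, ext_start + ext_len) around
 * the write [wr_start, wr_start + wr_len), which must be non-empty and lie
 * inside the extent. Writes up to three pieces to out[], stores the index
 * of the piece covering the write in *written, and returns the number of
 * pieces, or -1 on bad input. */
static int split_for_write(uint32_t ext_start, uint32_t ext_len,
                           uint32_t wr_start, uint32_t wr_len,
                           struct extent_piece out[3], int *written)
{
    uint32_t ext_end = ext_start + ext_len;
    uint32_t wr_end = wr_start + wr_len;
    int n = 0;

    if (wr_len == 0 || wr_start < ext_start || wr_end > ext_end)
        return -1;

    if (wr_start > ext_start)   /* untouched head stays uninitialized */
        out[n++] = (struct extent_piece){ ext_start, wr_start - ext_start, 0 };

    *written = n;               /* piece that the I/O will fill */
    out[n++] = (struct extent_piece){ wr_start, wr_len, 0 };

    if (wr_end < ext_end)       /* untouched tail stays uninitialized */
        out[n++] = (struct extent_piece){ wr_end, ext_end - wr_end, 0 };

    return n;
}

/* Called when the I/O completes: the written piece now holds valid data. */
static void mark_written(struct extent_piece *p)
{
    p->initialized = 1;
}
```

Doing the split before the I/O is submitted means any allocation it needs happens up front, so the completion step only flips a flag and cannot run out of space.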

30 September, 2009 1 commit

ext4: Adjust ext4_da_writepages() to write out larger contiguous chunks · 55138e0b

Authored by Theodore Ts'o on September 29, 2009

Work around problems in the writeback code to force out writebacks in
larger chunks than just 4mb, which is just too small.  This also works
around limitations in the ext4 block allocator, which can't allocate
more than 2048 blocks at a time.  So we need to defeat the round-robin
characteristics of the writeback code and try to write out as many
blocks in one inode before allowing the writeback code to move on to
another inode.  We add a new per-filesystem tunable,
max_writeback_mb_bump, which caps this to a default of 128mb per
inode.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

55138e0b

17 September, 2009 3 commits

ext4: replace MAX_DEFRAG_SIZE with EXT_MAX_BLOCK · 0a80e986

Authored by Eric Sandeen on September 17, 2009

There's no reason to redefine the maximum allowable offset
in an extent-based file just for defrag; 
EXT_MAX_BLOCK already does this.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

0a80e986

ext4: store EXT4_EXT_MIGRATE in i_state instead of i_flags · 1b9c12f4

Authored by Theodore Ts'o on September 17, 2009

EXT4_EXT_MIGRATE is only intended to be used for an in-memory flag,
and the hex value assigned to it collides with FS_DIRECTIO_FL (which
is also stored in i_flags).  There's no reason for the
EXT4_EXT_MIGRATE bit to be stored in i_flags, so we switch it to use
i_state instead.

Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

1b9c12f4

ext4: limit block allocations for indirect-block files to < 2^32 · fb0a387d

Authored by Eric Sandeen on September 16, 2009

Today, the ext4 allocator will happily allocate blocks past
2^32 for indirect-block files, which results in the block
numbers getting truncated, and corruption ensues.

This patch limits such allocations to < 2^32, and adds
BUG_ONs if we do get blocks larger than that.

This should address RH Bug 519471, ext4 bitmap allocator 
must limit blocks to < 2^32

* ext4_find_goal() is modified to choose a goal < UINT_MAX,
  so that our starting point is in an acceptable range.

* ext4_xattr_block_set() is modified such that the goal block
  is < UINT_MAX, as above.

* ext4_mb_regular_allocator() is modified so that the group
  search does not continue into groups which are too high

* ext4_mb_use_preallocated() has a check that we don't use
  preallocated space which is too far out

* ext4_alloc_blocks() and ext4_xattr_block_set() add some BUG_ONs

No attempt has been made to limit inode locations to < 2^32,
so we may wind up with blocks far from their inodes.  Doing
this much already will lead to some odd ENOSPC issues when the
"lower 32" gets full, and further restricting inodes could
make that even weirder.

For high inodes, choosing a goal of the original, % UINT_MAX,
may be a bit odd, but then we're in an odd situation anyway,
and I don't know of a better heuristic.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

fb0a387d

openeuler / Kernel synced successfully about 1 year ago