Commits · a521100231f816f8cdd9c8e77da14ff1e42c2b17 · OpenHarmony / kernel_linux

05 Sep, 2014 1 commit

ext4: pass allocation_request struct to ext4_(alloc,splice)_branch · a5211002

Committed by Theodore Ts'o on Sep 04, 2014

Instead of initializing the allocation_request structure in
ext4_alloc_branch(), set it up in ext4_ind_map_blocks(), and then pass
it to ext4_alloc_branch() and ext4_splice_branch().

This allows ext4_ind_map_blocks to pass flags in the allocation
request structure without having to add Yet Another argument to
ext4_alloc_branch().

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>

a5211002

02 Sep, 2014 16 commits

ext4: track extent status tree shrinker delay statictics · eb68d0e2

Committed by Zheng Liu on Sep 01, 2014

This commit adds some statistics to the extent status tree shrinker.  The
purpose of adding these is to collect more details when we
encounter a stall caused by the extent status tree shrinker.  Here we count
the following statistics:
  stats:
    the number of all objects on all extent status trees
    the number of reclaimable objects on lru list
    cache hits/misses
    the last sorted interval
    the number of inodes on lru list
  average:
    scan time for shrinking some objects
    the number of shrunk objects
  maximum:
    the inode that has max nr. of objects on lru list
    the maximum scan time for shrinking some objects

The output looks like below:
  $ cat /proc/fs/ext4/sda1/es_shrinker_info
  stats:
    28228 objects
    6341 reclaimable objects
    5281/631 cache hits/misses
    586 ms last sorted interval
    250 inodes on lru list
  average:
    153 us scan time
    128 shrunk objects
  maximum:
    255 inode (255 objects, 198 reclaimable)
    125723 us max scan time

If the lru list has never been sorted, the following line will not be
printed:
    586 ms last sorted interval
If there is an empty lru list, the following lines also will not be
printed:
    250 inodes on lru list
  ...
  maximum:
    255 inode (255 objects, 198 reclaimable)
    0 us max scan time

Meanwhile in this commit a new trace point is defined to print some
details in __ext4_es_shrink().

Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Jan Kara <jack@suse.cz>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

eb68d0e2

ext4: improve extents status tree trace point · e963bb1d

Committed by Zheng Liu on Sep 01, 2014

This commit improves the trace point of the extents status tree.  We rename
trace_ext4_es_shrink_enter in ext4_es_count() because it is also used
in ext4_es_scan() and the two cannot be told apart in the trace output.

Furthermore, this commit fixes a variable name in the trace point to
keep it consistent with the others.

Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Jan Kara <jack@suse.cz>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

e963bb1d

ext4: fix comments about get_blocks · d91bd2c1

Committed by Seunghun Lee on Sep 01, 2014

get_blocks is renamed to get_block.

Signed-off-by: Seunghun Lee <waydi1@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

d91bd2c1

ext4: enable block_validity by default · 45f1a9c3

Committed by Darrick J. Wong on Sep 01, 2014

Enable by default the block_validity feature, which checks for
collisions between newly allocated blocks and critical system
metadata.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

45f1a9c3

ext4: rename ext4_ext_find_extent() to ext4_find_extent() · ed8a1a76

Committed by Theodore Ts'o on Sep 01, 2014

Make the function name less redundant.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>

ed8a1a76

ext4: reuse path object in ext4_move_extents() · 3bdf14b4

Committed by Theodore Ts'o on Sep 01, 2014

Reuse the path object in ext4_move_extents() so we don't unnecessarily
free and reallocate it.

Also clean up the get_ext_path() wrapper so that it has the same
semantics of freeing the path object on error as ext4_ext_find_extent().

Signed-off-by: Theodore Ts'o <tytso@mit.edu>

3bdf14b4

ext4: reuse path object in ext4_ext_shift_extents() · ee4bd0d9

Committed by Theodore Ts'o on Sep 01, 2014

Now that the semantics of ext4_ext_find_extent() are much cleaner,
it's safe and more efficient to reuse the path object across the
multiple calls to ext4_ext_find_extent() in ext4_ext_shift_extents().

Signed-off-by: Theodore Ts'o <tytso@mit.edu>

ee4bd0d9

ext4: teach ext4_ext_find_extent() to realloc path if necessary · 10809df8

Committed by Theodore Ts'o on Sep 01, 2014

This adds additional safety in case for some reason we end up reusing a
path structure which isn't big enough for the current depth of the inode.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>

10809df8

ext4: allow a NULL argument to ext4_ext_drop_refs() · b7ea89ad

Committed by Theodore Ts'o on Sep 01, 2014

Teach ext4_ext_drop_refs() to accept a NULL argument, much like
kfree().  This allows us to drop a lot of checks to make sure path is
non-NULL before calling ext4_ext_drop_refs().

Signed-off-by: Theodore Ts'o <tytso@mit.edu>

b7ea89ad

ext4: call ext4_ext_drop_refs() from ext4_ext_find_extent() · 523f431c

Committed by Theodore Ts'o on Sep 01, 2014

In nearly all of the calls to ext4_ext_find_extent() where the caller
is trying to recycle the path object, ext4_ext_drop_refs() gets called
to release the buffer heads before the path object gets overwritten.
To simplify things for the callers, and to avoid the possibility of a
memory leak, make ext4_ext_find_extent() responsible for dropping the
buffers.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>

523f431c

ext4: drop EXT4_EX_NOFREE_ON_ERR from rest of extents handling code · dfe50809

Committed by Theodore Ts'o on Sep 01, 2014

Drop EXT4_EX_NOFREE_ON_ERR from ext4_ext_create_new_leaf(),
ext4_split_extent(), ext4_convert_unwritten_extents_endio().

This requires fixing all of their callers to potentially allow
ext4_ext_find_extent() to free the struct ext4_ext_path object in case
of an error, and there are interlocking dependencies all the way up to
ext4_ext_map_blocks(), ext4_swap_extents(), and
ext4_ext_remove_space().

Once this is done, we can drop the EXT4_EX_NOFREE_ON_ERR flag since it
is no longer necessary.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>

dfe50809

ext4: drop EXT4_EX_NOFREE_ON_ERR in convert_initialized_extent() · 4f224b8b

Committed by Theodore Ts'o on Sep 01, 2014

Transfer responsibility of freeing struct ext4_ext_path on error to
ext4_ext_find_extent().

Signed-off-by: Theodore Ts'o <tytso@mit.edu>

4f224b8b

ext4: collapse ext4_convert_initialized_extents() · e8b83d93

Committed by Theodore Ts'o on Sep 01, 2014

The function ext4_convert_initialized_extents() is only called by a
single function --- ext4_ext_convert_initalized_extents().  Inline the
code and get rid of the unnecessary bits in order to simplify the code.

Rename ext4_ext_convert_initalized_extents() to
convert_initalized_extents() since it's a static function that is
actually only used in a single caller, ext4_ext_map_blocks().

Signed-off-by: Theodore Ts'o <tytso@mit.edu>

e8b83d93

ext4: teach ext4_ext_find_extent() to free path on error · 705912ca

Committed by Theodore Ts'o on Sep 01, 2014

Right now, there are places where it is all too easy to leak memory
on an error path, via a usage like this:

	struct ext4_ext_path *path = NULL;

	while (...) {
		...
		path = ext4_ext_find_extent(inode, block, path, 0);
		if (IS_ERR(path)) {
			/* oops, if path was non-NULL before the call to
			   ext4_ext_find_extent, we've leaked it!  :-(  */
			...
			return PTR_ERR(path);
		}
		...
	}

Unfortunately, there are some code paths where we are doing the following
instead:

	path = ext4_ext_find_extent(inode, block, orig_path, 0);

and where it's important that we _not_ free orig_path in the case
where ext4_ext_find_extent() returns an error.

So change the function signature of ext4_ext_find_extent() so that it
takes a struct ext4_ext_path ** for its third argument, and by
default, on an error, it will free the struct ext4_ext_path, and then
zero out the struct ext4_ext_path * pointer.  In order to avoid
causing problems, we add a flag EXT4_EX_NOFREE_ON_ERR which causes
ext4_ext_find_extent() to use the original behavior of forcing the
caller to deal with freeing the original path pointer on the error
case.

The goal is to get rid of EXT4_EX_NOFREE_ON_ERR entirely, but this
allows for a gentle transition and makes the patches easier to verify.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>

705912ca
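The ownership rule this commit describes can be sketched in userspace C. This is a minimal illustration, not the kernel code: `struct path_obj`, `find_extent_sketch()`, and `NOFREE_ON_ERR_SKETCH` are hypothetical stand-ins for `struct ext4_ext_path`, `ext4_ext_find_extent()`, and `EXT4_EX_NOFREE_ON_ERR`, a plain negative return code replaces the kernel's pointer-encoded error, and a negative block number is used only to simulate a failed lookup.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct ext4_ext_path. */
struct path_obj {
	int depth;
};

/* Hypothetical flag playing the role of EXT4_EX_NOFREE_ON_ERR. */
#define NOFREE_ON_ERR_SKETCH 0x1

/*
 * Look up `block`, reusing *ppath when the caller passes one in and
 * allocating a fresh object otherwise.  On failure the object is freed
 * and *ppath is set to NULL, unless NOFREE_ON_ERR_SKETCH is set, in
 * which case the caller keeps ownership of whatever *ppath holds.
 * A negative block number simulates a lookup error.
 */
int find_extent_sketch(long block, struct path_obj **ppath, int flags)
{
	struct path_obj *path;

	assert(ppath != NULL);
	path = *ppath;

	if (path == NULL) {
		path = calloc(1, sizeof(*path));
		if (path == NULL)
			return -ENOMEM;
		*ppath = path;
	}

	if (block < 0) {
		if (!(flags & NOFREE_ON_ERR_SKETCH)) {
			free(path);
			*ppath = NULL;	/* caller cannot free it twice */
		}
		return -EIO;
	}

	path->depth = 1;
	return 0;
}
```

With this shape, a loop that keeps passing `&path` can simply return on error without leaking, while a caller that must preserve its original object passes the flag and frees it itself.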

ext4: fix accidental flag aliasing in ext4_map_blocks flags · bd30d702

Committed by Theodore Ts'o on Sep 01, 2014

Commit b8a86845 introduced an accidental flag aliasing between
EXT4_EX_NOCACHE and EXT4_GET_BLOCKS_CONVERT_UNWRITTEN.

Fortunately, this didn't introduce any untoward side effects --- we
got lucky.  Nevertheless, fix this and leave a warning to hopefully
prevent this from happening in the future.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>

bd30d702
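Flag aliasing of this kind happens when two flag families share one integer and a new flag reuses a bit that is already taken. The commit message only mentions leaving a warning; the compile-time check below is a different, purely illustrative technique. All names and values are hypothetical and are not the kernel's definitions.

```c
#include <assert.h>

/* Hypothetical flag values for illustration only; not the kernel's. */
#define GET_BLOCKS_CREATE_SKETCH	0x0001
#define GET_BLOCKS_CONVERT_SKETCH	0x0100
#define EX_NOCACHE_SKETCH		0x40000000

/* Union of every "get blocks" style flag carried in the shared word. */
#define GET_BLOCKS_MASK_SKETCH \
	(GET_BLOCKS_CREATE_SKETCH | GET_BLOCKS_CONVERT_SKETCH)

/* Fails the build if the extent-layer flag reuses a bit from the mask. */
static_assert((EX_NOCACHE_SKETCH & GET_BLOCKS_MASK_SKETCH) == 0,
	      "extent flag aliases a get-blocks flag");

/* Runtime form of the same check: nonzero when the two values share a bit. */
int flags_alias(unsigned int a, unsigned int b)
{
	return (a & b) != 0;
}
```

The check only helps if the mask is kept up to date whenever a flag is added, which is the same discipline a warning comment asks for.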

ext4: fix ZERO_RANGE bug hidden by flag aliasing · 713e8dde

Committed by Theodore Ts'o on Sep 01, 2014

We accidentally aliased the EXT4_EX_NOCACHE and EXT4_GET_BLOCKS_CONVERT_UNWRITTEN
flags, which apparently was hiding a bug that was unmasked when this
flag aliasing issue was addressed (see the subsequent commit).  The
reproduction case was:

   fsx -N 10000 -l 500000 -r 4096 -t 4096 -w 4096 -Z -R -W /vdb/junk

... which would cause fsx to report corruption in the data file.

The fix we have is a bit of an overkill, but I'd much rather be
conservative for now, and we can optimize ZERO_RANGE_FL handling
later.  The fact that we need to zap the extent_status cache for the
inode is unfortunate, but correctness is far more important than
performance.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: Namjae Jeon <namjae.jeon@samsung.com>

713e8dde

01 Sep, 2014 1 commit

ext4: fix ext4_swap_extents() error handling · 19008f6d

Committed by Theodore Ts'o on Aug 31, 2014

If ext4_ext_find_extent() returns an error, we have to clear path1 or
path2 or else we would end up trying to free an ERR_PTR, which would
be bad.

Also eliminate some redundant code and mark the error paths as unlikely().

Signed-off-by: Theodore Ts'o <tytso@mit.edu>

19008f6d

31 Aug, 2014 3 commits

ext4: refactor ext4_move_extents code base · fcf6b1b7

Committed by Dmitry Monakhov on Aug 30, 2014

ext4_move_extents is too complex to review.  It duplicates almost
every function available elsewhere in the codebase.  It has a useless
artificial restriction, orig_offset == donor_offset.  But in fact the logic
of ext4_move_extents is very simple:

Iterate extents one by one (similar to ext4_fill_fiemap_extents)
   ->Iterate each page covered by the extent (similar to generic_perform_write)
     ->swap the extents covered by the page (can be shared with IOC_MOVE_DATA)

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

fcf6b1b7

ext4: use ext4_ext_next_allocated_block instead of mext_next_extent · f8fb4f41

Committed by Dmitry Monakhov on Aug 30, 2014

This allows us to make mext_next_extent static and potentially get rid
of it.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

f8fb4f41

ext4: use ext4_update_i_disksize instead of opencoded ones · ee124d27

Committed by Dmitry Monakhov on Aug 30, 2014

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

ee124d27

30 Aug, 2014 6 commits

ext4: remove a duplicate call in ext4_init_new_dir() · 52c826db

Committed by Wang Shilong on Aug 29, 2014

ext4_journal_get_write_access() has just been called in ext4_append(),
so calling it again here is redundant.

Signed-off-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

52c826db

ext4: convert do_split() to use the ERR_PTR convention · f8b3b59d

Committed by Theodore Ts'o on Aug 29, 2014

Signed-off-by: Theodore Ts'o <tytso@mit.edu>

f8b3b59d

ext4: convert dx_probe() to use the ERR_PTR convention · dd73b5d5

Committed by Theodore Ts'o on Aug 29, 2014

Signed-off-by: Theodore Ts'o <tytso@mit.edu>

dd73b5d5

ext4: convert ext4_bread() to use the ERR_PTR convention · 1c215028

Committed by Theodore Ts'o on Aug 29, 2014

Signed-off-by: Theodore Ts'o <tytso@mit.edu>

1c215028

ext4: convert ext4_getblk() to use the ERR_PTR convention · 10560082

Committed by Theodore Ts'o on Aug 29, 2014

Signed-off-by: Theodore Ts'o <tytso@mit.edu>

10560082

ext4: convert ext4_dx_find_entry() to use the ERR_PTR convention · 537d8f93

Committed by Theodore Ts'o on Aug 29, 2014

Signed-off-by: Theodore Ts'o <tytso@mit.edu>

537d8f93
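The "ERR_PTR convention" named in the five conversions above returns a negative errno encoded in the pointer value itself, so a function can report a specific error through a pointer return. The userspace sketch below re-implements the idea in simplified form; the helpers are lowercase stand-ins rather than the kernel's macros, and `getblk_sketch()` with its "NULL means nothing there" behavior is a hypothetical example, not a copy of `ext4_getblk()`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Largest errno value reserved at the top of the address space (sketch). */
#define MAX_ERRNO_SKETCH 4095

/* Encode a negative errno as a pointer value. */
void *err_ptr(long error)
{
	assert(error < 0 && error >= -MAX_ERRNO_SKETCH);
	return (void *)(intptr_t)error;
}

/* Recover the errno from an encoded pointer. */
long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True when the pointer lies in the range reserved for encoded errors. */
int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

static char block_buf[16];

/*
 * Hypothetical lookup: a negative block simulates an I/O error, block 0
 * simulates "nothing mapped here", anything else returns a buffer.
 */
void *getblk_sketch(long block)
{
	if (block < 0)
		return err_ptr(-EIO);
	if (block == 0)
		return NULL;
	return block_buf;
}
```

A caller checks `is_err()` first and propagates `ptr_err()`, which leaves NULL free to carry a separate, non-error meaning.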
29 Aug, 2014 3 commits

ext4: fix same-dir rename when inline data directory overflows · d80d448c

Committed by Darrick J. Wong on Aug 27, 2014

When performing a same-directory rename, it's possible that adding or
setting the new directory entry will cause the directory to overflow
the inline data area, which causes the directory to be converted to an
extent-based directory.  Under this circumstance it is necessary to
re-read the directory when deleting the old dirent because the "old
directory" context still points to i_block in the inode table, which
is now an extent tree root!  The delete fails with an FS error, and
the subsequent fsck complains about incorrect link counts and
hardlinked directories.

Test case (originally found with flat_dir_test in the metadata_csum
test program):

# mkfs.ext4 -O inline_data /dev/sda
# mount /dev/sda /mnt
# mkdir /mnt/x
# touch /mnt/x/changelog.gz /mnt/x/copyright /mnt/x/README.Debian
# sync
# for i in /mnt/x/*; do mv $i $i.longer; done
# ls -la /mnt/x/
total 0
-rw-r--r-- 1 root root 0 Aug 25 12:03 changelog.gz.longer
-rw-r--r-- 1 root root 0 Aug 25 12:03 copyright
-rw-r--r-- 1 root root 0 Aug 25 12:03 copyright.longer
-rw-r--r-- 1 root root 0 Aug 25 12:03 README.Debian.longer

(Hey!  Why are there four files now??)

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org

d80d448c

jbd2: fix descriptor block size handling errors with journal_csum · db9ee220

Committed by Darrick J. Wong on Aug 27, 2014

It turns out that there are some serious problems with the on-disk
format of journal checksum v2.  The foremost is that the function to
calculate descriptor tag size returns sizes that are too big.  This
causes alignment issues on some architectures and is compounded by the
fact that some parts of jbd2 use the structure size (incorrectly) to
determine the presence of a 64bit journal instead of checking the
feature flags.

Therefore, introduce journal checksum v3, which enlarges the
descriptor block tag format to allow for full 32-bit checksums of
journal blocks, fix the journal tag function to return the correct
sizes, and fix the jbd2 recovery code to use feature flags to
determine 64bitness.

Add a few function helpers so we don't have to open-code quite so
many pieces.

Switching to a 16-byte block size was found to increase journal size
overhead by a maximum of 0.1%, to convert a 32-bit journal with no
checksumming to a 32-bit journal with checksum v3 enabled.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reported-by: TR Reardon <thomas_reardon@hotmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org

db9ee220

ext4: update i_disksize coherently with block allocation on error path · 6603120e

Committed by Dmitry Monakhov on Aug 27, 2014

With delalloc blocks, i_disksize may be less than i_size.  So we
have to update i_disksize each time we allocate and submit some
blocks beyond i_disksize.  We weren't doing this on the error paths,
so fix this.

testcase: xfstest generic/019

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org

6603120e

28 Aug, 2014 2 commits

ext4: fix transaction issues for ext4_fallocate and ext_zero_range · c174e6d6

Committed by Dmitry Monakhov on Aug 27, 2014

After commit f282ac19 we use different transactions for
preallocation and the i_disksize update, which results in complaints from fsck
after a power failure.  Spotted by generic/019.  IMHO this is a regression
because the fs becomes inconsistent; moreover, 'e2fsck -p' will no longer
work (which drives admins crazy).  The same-transaction requirement
also applies to ctime and mtime updates.

testcase: xfstest generic/019

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org

c174e6d6

ext4: fix incorect journal credits reservation in ext4_zero_range · 69dc9536

Committed by Dmitry Monakhov on Aug 27, 2014

Currently we reserve only 4 blocks, but in the worst-case scenario
ext4_zero_partial_blocks() may want to zero out and convert two
non-adjacent blocks.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org

69dc9536

24 Aug, 2014 3 commits

ext4: move i_size,i_disksize update routines to helper function · 4631dbf6

Committed by Dmitry Monakhov on Aug 23, 2014

Cc: stable@vger.kernel.org # needed for bug fix patches
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

4631dbf6

ext4: fix BUG_ON in mb_free_blocks() · c99d1e6e

Committed by Theodore Ts'o on Aug 23, 2014

If we suffer a block allocation failure (for example due to a memory
allocation failure), it's possible that we will call
ext4_discard_allocated_blocks() before we've actually allocated any
blocks.  In that case, fe_len and fe_start in ac->ac_f_ex will still
be zero, and this will result in mb_free_blocks(inode, e4b, 0, 0)
triggering the BUG_ON in mb_free_blocks():

	BUG_ON(last >= (sb->s_blocksize << 3));

Fix this by bailing out of ext4_discard_allocated_blocks() if fe_len
is zero.

Also fix a missing ext4_mb_unload_buddy() call in
ext4_discard_allocated_blocks().

Google-Bug-Id: 16844242

Fixes: 86f0afd4
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org

c99d1e6e
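Why a zero-length free would trip a range check like the one quoted in this commit can be sketched in userspace C. The sketch assumes the last block index is a signed int computed as `first + count - 1` and that it is compared against an unsigned long limit; both are assumptions about the surrounding code, not something the commit message states. The function names are hypothetical.

```c
#include <assert.h>

/*
 * Nonzero when the range starting at `first` with `count` blocks would
 * fail a "last >= blocksize * 8" style check.  With first == 0 and
 * count == 0, `last` is -1, which becomes the largest unsigned long
 * once converted for the comparison.
 */
int range_check_trips(int first, int count, unsigned long blocksize)
{
	int last = first + count - 1;

	assert(blocksize > 0);
	return (unsigned long)last >= (blocksize << 3);
}

/*
 * Guarded caller in the spirit of the fix: bail out before the check
 * when nothing was allocated.  Returns 0 for "nothing to do", 1 when
 * the range is valid, and -1 when it would trip the check.
 */
int discard_sketch(int start, int len, unsigned long blocksize)
{
	if (len == 0)
		return 0;
	return range_check_trips(start, len, blocksize) ? -1 : 1;
}
```

With a 4096-byte block size the limit is 32768 bits, so the empty range fails the raw check while the guarded caller returns early.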

ext4: propagate errors up to ext4_find_entry()'s callers · 36de9286

Committed by Theodore Ts'o on Aug 23, 2014

If we run into some kind of error, such as ENOMEM, while calling
ext4_getblk() or ext4_dx_find_entry(), we need to make sure this error
gets propagated up to ext4_find_entry() and then to its callers.  This
way, transient errors such as ENOMEM can get propagated to the VFS.
This is important so that the system calls return the appropriate
error, and also so that in the case of ext4_lookup(), we return an
error instead of a NULL inode, since that will result in a negative
dentry cache entry that will stick around long past the OOM condition
which caused a transient ENOMEM error.

Google-Bug-Id: #17142205
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org

36de9286

08 Aug, 2014 1 commit

fs: call rename2 if exists · 7177a9c4

Committed by Miklos Szeredi on Jul 23, 2014

Christoph Hellwig suggests:

1) make vfs_rename call ->rename2 if it exists instead of ->rename
2) switch all filesystems that you're adding NOREPLACE support for to
   use ->rename2
3) see how many ->rename instances we'll have left after a few
   iterations of 2.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

7177a9c4

31 Jul, 2014 1 commit

ext4: fix ext4_discard_allocated_blocks() if we can't allocate the pa struct · 86f0afd4

Committed by Theodore Ts'o on Jul 30, 2014

If there is a failure while allocating the preallocation structure, a
number of blocks can end up getting marked in the in-memory buddy
bitmap, and then not getting released.  This can result in the
following corruption getting reported by the kernel:

EXT4-fs error (device sda3): ext4_mb_generate_buddy:758: group 1126,
12793 clusters in bitmap, 12729 in gd

In that case, we need to release the blocks using mb_free_blocks().

Tested: fs smoke test; also demonstrated that with injected errors,
	the file system is no longer getting corrupted

Google-Bug-Id: 16657874
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org

86f0afd4

29 Jul, 2014 2 commits

ext4: fix COLLAPSE RANGE test for bigalloc file systems · ee98fa3a

Committed by Namjae Jeon on Jul 29, 2014

Blocks in the collapse range should be collapsed per cluster unit when
bigalloc is enabled.  If bigalloc is not enabled, EXT4_CLUSTER_SIZE will
be the same as EXT4_BLOCK_SIZE.

With this bug fixed, this patch enables COLLAPSE_RANGE for bigalloc, which
fixes a large number of xfstest failures which use fsx.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

ee98fa3a

ext4: check inline directory before converting · 40b163f1

Committed by Darrick J. Wong on Jul 28, 2014

Before converting an inline directory to a regular directory, check
the directory entries to make sure they're not obviously broken.
This helps us to avoid a BUG_ON if one of the dirents is trashed.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>

40b163f1

28 Jul, 2014 1 commit

ext4: fix incorrect locking in move_extent_per_page · 6e263146

Committed by Dmitry Monakhov on Jul 27, 2014

If we have to copy data we must drop i_data_sem, because
get_blocks() will be called inside mext_page_mkuptodate(); later we must
reacquire it because we are about to change the extent tree.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>

6e263146

OpenHarmony / kernel_linux, last synced about 4 years ago
