Commits · 66a83cde47deb4e8874539326e12e88ed82158d3 · openanolis / cloud-kernel

26 Oct, 2011 7 commits

ext4: remove unused variable in ext4_mb_generate_from_pa() · 66a83cde

Committed by Robin Dong on Oct 26, 2011

The variable 'count' in function ext4_mb_generate_from_pa() looks
useless, so remove it.
Signed-off-by: Robin Dong <sanbai@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: use stream-alloc when mb_group_prealloc set to zero · ebbe0277

Committed by Robin Dong on Oct 26, 2011

The kernel will crash on 

ext4_mb_mark_diskspace_used:
	BUG_ON(ac->ac_b_ex.fe_len <= 0);

after we set /sys/fs/ext4/sda/mb_group_prealloc to zero and create new files in an ext4 filesystem.

The reason is that ac_b_ex.fe_len is also set to zero (the value of mb_group_prealloc) in
ext4_mb_normalize_group_request, because ac_flags contains EXT4_MB_HINT_GROUP_ALLOC.

I think that when someone sets mb_group_prealloc to zero, it means DO NOT USE GROUP PREALLOCATION,
so we should set the allocation strategy to STREAM in this case.
Signed-off-by: Robin Dong <sanbai@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: let ext4_page_mkwrite stop started handle in failure · fcbb5515

Committed by Yongqiang Yang on Oct 26, 2011

The started journal handle should be stopped in the failure case.
Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Acked-by: Jan Kara <jack@suse.cz>
Cc: stable@kernel.org

ext4: handle NULL p_ext in ext4_ext_next_allocated_block() · 6f8ff537

Committed by Curt Wohlgemuth on Oct 26, 2011

In ext4_ext_next_allocated_block(), the path[depth] might
have a p_ext that is NULL -- see ext4_ext_binsearch().  In
such a case, dereferencing it will crash the machine.

This patch checks for p_ext == NULL in
ext4_ext_next_allocated_block() before dereferencing it.

Tested using a hand-crafted inode with eh_entries == 0 in
an extent block; verified that running FIEMAP on it crashes
without this patch and works fine with it.
Signed-off-by: Curt Wohlgemuth <curtw@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: error handling fix in ext4_ext_convert_to_initialized() · f85b287a

Committed by Dan Carpenter on Oct 26, 2011

When allocated is unsigned it breaks the error handling at the end
of the function when we call:
	allocated = ext4_split_extent(...);
	if (allocated < 0)
		err = allocated;

I've made it a signed int instead of unsigned.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
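
The pitfall in this commit can be reproduced outside the kernel. The following is a minimal user-space sketch, not the ext4 code itself: `split_extent_stub()` is a hypothetical stand-in for `ext4_split_extent()`, and the two `convert_*` helpers are illustrative names that do not appear in the patch.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for ext4_split_extent(): returns a negative
 * errno-style code on failure, or a positive block count on success.
 */
static int split_extent_stub(int fail)
{
	return fail ? -EIO : 8;
}

/*
 * Buggy pattern: the result lands in an unsigned variable, so the
 * "< 0" test can never be true and the error is silently dropped.
 */
static int convert_buggy(int fail)
{
	unsigned int allocated = split_extent_stub(fail);
	int err = 0;

	if (allocated < 0)	/* always false for an unsigned type */
		err = allocated;
	return err;
}

/* Fixed pattern: a signed variable lets the error propagate. */
static int convert_fixed(int fail)
{
	int allocated = split_extent_stub(fail);
	int err = 0;

	if (allocated < 0)
		err = allocated;
	return err;
}
```

With these definitions, `convert_buggy(1)` reports success even though the stub failed, while `convert_fixed(1)` returns `-EIO`. Some compilers can flag the always-false comparison when extra warnings are enabled.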

ext4: use ext4_reserve_inode_write in ext4_xattr_set_handle · 66543617

Committed by Eric Sandeen on Oct 26, 2011

ext4_mark_iloc_dirty() says:

 * The caller must have previously called ext4_reserve_inode_write().
 * Give this, we know that the caller already has write access to iloc->bh.

ext4_xattr_set_handle, however, just open-codes it.  May as well use
the helper function for consistency.

No bug here, just tidiness.

(Note: on cleanup path, ext4_reserve_inode_write sets
the bh to NULL if it returns an error, and brelse() of 
a null bh is handled gracefully).
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: avoid setting directory i_nlink to zero · 909a4cf1

Committed by Andreas Dilger on Oct 26, 2011

If a directory has more than EXT4_LINK_MAX subdirectories, the nlink
count is set to 1.  Subsequently, if any subdirectories are deleted,
ext4_dec_count() decrements the i_nlink count, which may go to 0
temporarily before being incremented back to 1.

While this is done under i_mutex, which prevents races for directory
and inode operations that check i_nlink, the temporary i_nlink == 0
case is exposed to userspace via stat() and similar calls that do not
hold i_mutex.

Instead, change the code to not decrement i_nlink count for any
directories that do not already have i_nlink larger than 2.
Reported-by: Cliff White <cliffw@whamcloud.com>
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
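
The rule in the last paragraph of this commit message can be sketched as a small guard. This is an illustration under stated assumptions, not the kernel function: `struct toy_inode` and `toy_dec_count()` are hypothetical names standing in for the real inode and `ext4_dec_count()`.

```c
#include <stdbool.h>

/* Hypothetical, simplified inode: just the two fields the rule needs. */
struct toy_inode {
	bool is_dir;
	unsigned int nlink;
};

/*
 * Decrement the link count, except for directories whose count is not
 * already larger than 2. A directory that overflowed the link limit
 * keeps nlink == 1 and is never pushed down to 0, even temporarily.
 */
static void toy_dec_count(struct toy_inode *inode)
{
	if (!inode->is_dir || inode->nlink > 2)
		inode->nlink--;
}
```

A directory with `nlink == 1` is left untouched, a directory with `nlink == 3` drops to 2, and regular files are decremented as before.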

25 Oct, 2011 3 commits

ext4: prevent stack overrun in ext4_file_open · cf803903

Committed by Darrick J. Wong on Oct 25, 2011

In ext4_file_open, the filesystem records the mountpoint of the first
file that is opened after mounting the filesystem.  It does this by
allocating a 64-byte stack buffer, calling d_path() to grab the mount
point through which this file was accessed, and then memcpy()ing 64
bytes into the superblock's s_last_mounted field, starting from the
return value of d_path(), which is stored as "cp".  However, if cp >
buf (which it frequently is since path components are prepended
starting at the end of buf) then we can end up copying stack data into
the superblock.

Writing stack variables into the superblock doesn't sound like a great
idea, so use strlcpy instead.  Andi Kleen suggested using strlcpy
instead of strncpy.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
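
The overrun in this commit comes from copying a fixed 64 bytes starting at a pointer that sits in the middle of a 64-byte buffer. The user-space sketch below illustrates the safer pattern. It rests on assumptions: `toy_strlcpy()` is a minimal stand-in for the kernel's strlcpy, `toy_path_at_end()` only imitates how d_path() places its result at the end of the buffer, and `record_last_mounted()` is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/*
 * Minimal stand-in for strlcpy(): copies at most size - 1 bytes and
 * always NUL-terminates when size > 0. Returns strlen(src).
 */
static size_t toy_strlcpy(char *dst, const char *src, size_t size)
{
	size_t len = strlen(src);

	if (size > 0) {
		size_t n = len < size - 1 ? len : size - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return len;
}

/*
 * Imitates d_path(): the string is built at the END of buf, so the
 * returned pointer is usually greater than buf.
 */
static char *toy_path_at_end(char *buf, size_t buflen, const char *path)
{
	size_t len = strlen(path) + 1;

	if (len > buflen)
		return NULL;
	return memcpy(buf + buflen - len, path, len);
}

/* Copy the path into last_mounted without reading past buf. */
static int record_last_mounted(char *last_mounted, size_t size,
			       const char *path)
{
	char buf[64];
	char *cp = toy_path_at_end(buf, sizeof(buf), path);

	if (!cp)
		return -1;
	/*
	 * memcpy(last_mounted, cp, 64) would read (cp - buf) bytes
	 * beyond the end of buf; a string copy stops at the NUL.
	 */
	toy_strlcpy(last_mounted, cp, size);
	return 0;
}
```

The string copy stops at the terminating NUL inside `buf`, so nothing beyond the stack buffer is read, whatever offset the path starts at.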

ext4: update EOFBLOCKS flag on fallocate properly · a4e5d88b

Committed by Dmitry Monakhov on Oct 25, 2011

EOFBLOCK_FL should be updated if fallocate is called without FALLOCATE_FL_KEEP_SIZE.
Currently this happens only if a new extent was allocated.

TESTCASE:
fallocate test_file -n -l4096
fallocate test_file -l4096
The last fallocate command has updated the size but kept EOFBLOCK_FL set, and
fsck will complain about that.

Also remove the ping-pong in ext4_fallocate() in the case of new extents,
where ext4_ext_map_blocks() clears the EOFBLOCKS bit and
ext4_falloc_update_inode() later restores it again.
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: remove messy logic from ext4_ext_rm_leaf · 750c9c47

Committed by Dmitry Monakhov on Oct 25, 2011

- Both callers (truncate and punch_hole) already align the left end point,
  so we no longer need the split logic here.
- Remove dead duplicated code.
- Call ext4_ext_dirty only after we have updated eh_entries, otherwise
  we'll lose the entries update. Regression caused by d583fb87;
  see test case 266 in xfstests (http://patchwork.ozlabs.org/patch/120872)
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

22 Oct, 2011 1 commit

ext4: cleanup ext4_ext_grow_indepth code · 1939dd84

Committed by Dmitry Monakhov on Oct 22, 2011

Currently the code gives the impression that the grow procedure is very complicated
and that some mythical paths and blocks are involved. But in fact growing in depth
is a relatively simple procedure:
 1) Just create a new meta block and copy the root data to that block.
 2) Convert the root from extent to index if the old depth == 0
 3) Update the root block pointer

This patch does the following:
 - Reorganize the code to make it more self-explanatory
 - Do not pass the path parameter to new_meta_block(), in order to
   provoke allocation from the inode's group, because the top-level block
   should sit closer to its inode, not to the leaf data block.

   [ This happens anyway, due to logic in mballoc; we should drop
     the path parameter from new_meta_block() entirely.  -- tytso ]
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

21 Oct, 2011 3 commits

ext4: Allow quota file use root reservation · 45dc63e7

Committed by Dmitry Monakhov on Oct 20, 2011

The quota file is filesystem metadata, so it is reasonable to permit it to use
the root reservation if necessary. This patch fixes the failure of xfstests test 265.
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: fix the deadlock in mpage_da_map_and_submit() · 8de49e67

Committed by Kazuya Mio on Oct 20, 2011

If ext4_jbd2_file_inode() in mpage_da_map_and_submit() fails due to
journal abort, this function returns to the caller without unlocking the
page.  This leads to a deadlock, and the patch fixes this issue by
calling mpage_da_submit_io().
Signed-off-by: Kazuya Mio <k-mio@sx.jp.nec.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: fix deadlock in ext4_ordered_write_end() · 09e0834f

Committed by Akira Fujita on Oct 20, 2011

If ext4_jbd2_file_inode() in ext4_ordered_write_end() fails for some
reason, this function returns to the caller without unlocking the page.
This leads to a deadlock, and the patch fixes this issue.
Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

18 Oct, 2011 7 commits

ext4: quiet sparse noise about plain integer as NULL pointer · ee90d57e

Committed by H Hartley Sweeten on Oct 18, 2011

The third parameter to ext4_free_blocks is a struct buffer_head *.  This
parameter should be NULL not 0.

This quiets the sparse noise:

warning: Using plain integer as NULL pointer
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: add __user decoration to calls of copy_{from,to}_user() · e6705f7c

Committed by H Hartley Sweeten on Oct 18, 2011

This quiets the sparse noise:

warning: incorrect type in argument 2 (different address spaces)
   expected void const [noderef] <asn:1>*from
   got struct fstrim_range *<noident>
warning: incorrect type in argument 1 (different address spaces)
   expected void [noderef] <asn:1>*to
   got struct fstrim_range *<noident>
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: functions should not be declared extern · e0cbee3e

Committed by H Hartley Sweeten on Oct 18, 2011

The function declarations in ext4.h are already marked extern, so it's
not necessary to do so in the .c files.

This quiets the sparse noise:

warning: function 'ext4_flush_completed_IO' with external linkage has definition
warning: function 'ext4_init_inode_table' with external linkage has definition
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: add block plug for .writepages · 1bce63d1

Committed by Shaohua Li on Oct 18, 2011

Add block plug for ext4 .writepages. Though ext4 .writepages
already handles request merge very well, block plug is still
helpful to reduce block lock contention.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: Fix comparison endianness problem in MMP initialization · f6f96fdb

Committed by Darrick J. Wong on Oct 18, 2011

As part of startup, the MMP initialization code does this:

mmp->mmp_seq = seq = cpu_to_le32(mmp_new_seq());

Next, mmp->mmp_seq is written out to disk, a delay happens, and then
the MMP block is read back in and the sequence value is tested:

if (seq != le32_to_cpu(mmp->mmp_seq)) {
	/* fail the mount */

On a LE system such as x86, the *le32* functions do nothing and this
works.  Unfortunately, on a BE system such as ppc64, this comparison
becomes:

if (cpu_to_le32(new_seq) != le32_to_cpu(cpu_to_le32(new_seq))) {
	/* fail the mount */

Except for a few palindromic sequence numbers, this test always causes
the mount to fail, which makes MMP filesystems generally unmountable
on ppc64.  The attached patch fixes this situation.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
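
The failure mode in this commit is easy to simulate on any machine by modelling a big-endian host, where both conversion helpers are byte swaps. The sketch below is an assumption-laden illustration: `be_host_cpu_to_le32()`, `be_host_le32_to_cpu()`, and the two `mmp_check_*` functions are hypothetical names, and the "fixed" variant shows one plausible way to keep the comparison in CPU byte order, not necessarily the exact change made by the patch.

```c
#include <stdint.h>

/* Reverse the four bytes of a 32-bit value. */
static uint32_t swab32(uint32_t x)
{
	return ((x & 0x000000ffu) << 24) |
	       ((x & 0x0000ff00u) << 8)  |
	       ((x >> 8) & 0x0000ff00u)  |
	       (x >> 24);
}

/* On a simulated big-endian host both conversions swap bytes. */
static uint32_t be_host_cpu_to_le32(uint32_t x) { return swab32(x); }
static uint32_t be_host_le32_to_cpu(uint32_t x) { return swab32(x); }

/*
 * Buggy pattern: seq keeps the little-endian value but is later
 * compared against a CPU-order value. Returns 1 if the check passes.
 */
static int mmp_check_buggy(uint32_t new_seq)
{
	uint32_t on_disk = be_host_cpu_to_le32(new_seq);
	uint32_t seq = on_disk;

	return seq == be_host_le32_to_cpu(on_disk);
}

/* Possible fix: keep seq in CPU order, convert only the disk copy. */
static int mmp_check_fixed(uint32_t new_seq)
{
	uint32_t seq = new_seq;
	uint32_t on_disk = be_host_cpu_to_le32(seq);

	return seq == be_host_le32_to_cpu(on_disk);
}
```

In this model the buggy check passes only for byte-palindromic values such as `0xAABBBBAA`, which matches the "few palindromic sequence numbers" mentioned in the message, while the fixed check passes for every value.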

ext4: MMP: fix error message rate-limiting logic in kmmpd · bdfc230f

Committed by Nikitas Angelinas on Oct 18, 2011

Current logic would print an error message only once, and then
'failed_writes' would stay at 1.  Rework the loop to increment
'failed_writes' and print the error message every
s_mmp_update_interval * 60 seconds, as intended according to the
comment.
Signed-off-by: Nikitas Angelinas <nikitas_angelinas@xyratex.com>
Signed-off-by: Andrew Perepechko <andrew_perepechko@xyratex.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Acked-by: Andreas Dilger <adilger@dilger.ca>

ext4: MMP: kmmpd should use nodename from init_uts_ns.name, not sysname · 215fc6af

Committed by Nikitas Angelinas on Oct 18, 2011

sysname holds "Linux" by default, i.e. what appears when doing a "uname
-s"; nodename should be used to print the machine's hostname, i.e. what
is returned when doing a "uname -n" or "hostname", and what
gethostname(2)/sethostname(2) manipulate, in order to notify the
administrator of the node which is contending to mount the filesystem.
Acked-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Nikitas Angelinas <nikitas_angelinas@xyratex.com>
Signed-off-by: Andrew Perepechko <andrew_perepechko@xyratex.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
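
The difference between the two fields is visible from user space through uname(2), which fills in the same pair of names. The helper below is a small sketch with a hypothetical name, `get_node_name()`; it is a user-space analogue for illustration and does not show how kmmpd reads the value inside the kernel.

```c
#include <stddef.h>
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Copy the machine's nodename (what "uname -n" prints) into out.
 * Returns 0 on success, -1 on error or if out is too small.
 */
static int get_node_name(char *out, size_t size)
{
	struct utsname u;
	int n;

	if (size == 0 || uname(&u) < 0)
		return -1;
	/* u.sysname would be the OS name, e.g. what "uname -s" prints. */
	n = snprintf(out, size, "%s", u.nodename);
	if (n < 0 || (size_t)n >= size)
		return -1;
	return 0;
}
```

Reporting `u.nodename` rather than `u.sysname` is what lets an administrator tell which host is contending for the filesystem.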

17 Oct, 2011 1 commit

ext4: avoid stamping on other memories in ext4_ext_insert_index() · f472e026

Committed by Tao Ma on Oct 17, 2011

Add a sanity check to make sure ix hasn't gone beyond the valid bounds
of the extent block.
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

09 Oct, 2011 6 commits

ext4: fix ext4 so it works without CONFIG_PROC_FS · d44651d0

Committed by Fabrice Jouhaud on Oct 08, 2011

This fixes a bug which was introduced in dd68314c.  The problem
came from the test of the return value of proc_mkdir, which is always
false without procfs, and this would cause initialization of ext4 to fail.
Signed-off-by: Fabrice Jouhaud <yargil@free.fr>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: use le32_to_cpu for ext4_extent_idx.ei_block in ext4_ext_search_left() · 6ee3b212

Committed by Tao Ma on Oct 08, 2011

ext4_extent_idx.ei_block is __le32, so use le32_to_cpu() in
ext4_ext_search_left().
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: remove the obsolete/broken EXT4_IOC_WAIT_FOR_READONLY ioctl · 7fd59c83

Committed by Tao Ma on Oct 08, 2011

There are no users of the EXT4_IOC_WAIT_FOR_READONLY ioctl, and it is
also broken.  No one sets the set_ro_timer, no one wakes us up, and our
state is set to TASK_INTERRUPTIBLE, not RUNNING.  So remove it.
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: fix the comment describing ext4_ext_search_right() · df3ab170

Committed by Tao Ma on Oct 08, 2011

The comment describing what ext4_ext_search_right() does is incorrect.
We return 0 in *phys when *logical is the 'largest' allocated block,
not smallest.  

Fix a few other typos while we're at it.

Cc: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>

ext4: remove deprecated oldalloc · 4113c4ca

Committed by Lukas Czerner on Oct 08, 2011

For a long time now, orlov has been the default block allocator in
ext4. It performs better than the old one and no one seems to claim
otherwise, so we can safely drop the old allocator and make the oldalloc
and orlov mount options deprecated.

This is part of the effort to reduce the number of ext4 options and hence the
test matrix.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: documentation: remove acl and user_xattr mount options · af909a57

Committed by Theodore Ts'o on Oct 08, 2011

The acl and user_xattr mount options are no longer needed since those
features are enabled by default if configured in (see commit
ea663336). We cannot easily deprecate
the mount options themselves (since it is probably too early), but we can
remove them from the documentation first.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

07 Oct, 2011 1 commit

ext4: Free resources in some error path in ext4_fill_super · dcf2d804

Committed by Tao Ma on Oct 06, 2011

Some of the error paths in ext4_fill_super don't release the
resources properly. So this patch tries to release them
in the right way.
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

06 Oct, 2011 1 commit

ext4: Free resources in ext4_mb_init()'s error paths · 7aa0baea

Committed by Tao Ma on Oct 06, 2011

In commit 79a77c5a, we moved ext4_mb_init_backend after the allocation
of s_locality_group to avoid a memory leak in the error path, but there are
still some other error paths in ext4_mb_init that need to do the same
work. So this patch adds cleanup to all the error paths of ext4_mb_init. All
the pointers are also reset to NULL in case the caller would otherwise double free them.
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

10 Sep, 2011 10 commits

ext4: attempt to fix race in bigalloc code path · 5356f261

Committed by Aditya Kali on Sep 09, 2011

Currently, there exists a race between delayed allocated writes and
the writeback when bigalloc feature is in use. The race was because we
wanted to determine what blocks in a cluster are under delayed
allocation and we were using buffer_delayed(bh) check for it. But, the
writeback codepath clears this bit without any synchronization which
resulted in a race and an ext4 warning similar to:

EXT4-fs (ram1): ext4_da_update_reserve_space: ino 13, used 1 with only 0
		reserved data blocks

The race existed in two places.
(1) between ext4_find_delalloc_range() and ext4_map_blocks() when called from
    writeback code path.
(2) between ext4_find_delalloc_range() and ext4_da_get_block_prep() (where
    buffer_delayed(bh) is set).

To fix (1), this patch introduces a new buffer_head state bit -
BH_Da_Mapped.  This bit is set under the protection of
EXT4_I(inode)->i_data_sem when we have actually mapped the delayed
allocated blocks during the writeout time. We can now reliably check
for this bit inside ext4_find_delalloc_range() to determine whether
the reservation for the blocks has already been claimed or not.

To fix (2), it was necessary to set buffer_delay(bh) under the
protection of i_data_sem.  So, I extracted the very beginning of
ext4_map_blocks into a new function - ext4_da_map_blocks() - and
performed the required setting of bh_delay bit and the quota
reservation under the protection of i_data_sem.  These two fixes make
the checking of buffer_delay(bh) and buffer_da_mapped(bh) consistent,
thus removing the race.

Tested: I was able to reproduce the problem by running 'dd' and
'fsync' in parallel. Also, xfstests sometimes used to reproduce this
race. After the fix both my test and xfstests were successful and no
race (warning message) was observed.

Google-Bug-Id: 4997027
Signed-off-by: Aditya Kali <adityakali@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: add some tracepoints in ext4/extents.c · d8990240

Committed by Aditya Kali on Sep 09, 2011

This patch adds some tracepoints in ext4/extents.c and updates a tracepoint in
ext4/inode.c.

Tested: Built and ran the kernel and verified that these tracepoints work.
Also ran xfstests.
Signed-off-by: Aditya Kali <adityakali@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: rename ext4_has_free_blocks() to ext4_has_free_clusters() · df55c99d

Committed by Theodore Ts'o on Sep 09, 2011

Rename the function so it is more clear what is going on.  Also rename
the various variables so it's clearer what's happening.

Also fix a missing blocks to cluster conversion when reading the
number of reserved blocks for root.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: rename ext4_claim_free_blocks() to ext4_claim_free_clusters() · e7d5f315

Committed by Theodore Ts'o on Sep 09, 2011

This function really claims a number of free clusters, not blocks, so
rename it so it's clearer what's going on.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: rename ext4_free_blocks_after_init() to ext4_free_clusters_after_init() · cff1dfd7

Committed by Theodore Ts'o on Sep 09, 2011

This function really returns the number of free clusters after an
uninitialized block bitmap has been initialized.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: rename ext4_count_free_blocks() to ext4_count_free_clusters() · 5dee5437

Committed by Theodore Ts'o on Sep 09, 2011

This function really counts the free clusters reported in the block
group descriptors, so rename it to reduce confusion.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: Rename ext4_free_blks_{count,set}() to refer to clusters · 021b65bb

Committed by Theodore Ts'o on Sep 09, 2011

The field bg_free_blocks_count_{lo,high} in the block group
descriptor has been repurposed to hold the number of free clusters for
bigalloc functions.  So rename the functions so it makes it easier to
read and audit the block allocation and block freeing code.

Note: at this point in bigalloc development we don't support
online resize, so this also makes it really obvious all of the places
we need to fix up to add support for online resize.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: enable mounting bigalloc as read/write · 6f16b606

Committed by Theodore Ts'o on Sep 09, 2011

Now that we have implemented all of the changes needed for bigalloc,
we can finally enable it!
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: Fix bigalloc quota accounting and i_blocks value · 7b415bf6

Committed by Aditya Kali on Sep 09, 2011

With bigalloc changes, the i_blocks value was not correctly set (it was still
set to number of blocks being used, but in case of bigalloc, we want i_blocks
to represent the number of clusters being used). Since the quota subsystem sets
the i_blocks value, this patch fixes the quota accounting and makes sure that
the i_blocks value is set correctly.
Signed-off-by: Aditya Kali <adityakali@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: tune mballoc's default group prealloc size for bigalloc file systems · 27baebb8

Committed by Theodore Ts'o on Sep 09, 2011

The default group preallocation size had been previously set to 512
blocks/clusters, regardless of the block/cluster size.  This is
probably too big for large cluster sizes.  So adjust the default so
that it is 2 megabytes or 32 clusters, whichever is larger.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
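
The sizing rule stated in this commit message ("2 megabytes or 32 clusters, whichever is larger") can be worked through with a few lines of arithmetic. The function below is a sketch that reads the message literally and works in bytes; `default_group_prealloc()` and the two constants are hypothetical names, and the kernel's own computation may be expressed differently (for example in blocks rather than bytes).

```c
#include <stdint.h>

#define PREALLOC_TARGET_BYTES	(2u * 1024u * 1024u)	/* 2 MiB */
#define PREALLOC_MIN_CLUSTERS	32u

/*
 * Default group preallocation size, in clusters, for a given cluster
 * size in bytes. Returns 0 for an invalid (zero) cluster size.
 */
static uint32_t default_group_prealloc(uint32_t cluster_bytes)
{
	uint32_t by_size;

	if (cluster_bytes == 0)
		return 0;
	by_size = PREALLOC_TARGET_BYTES / cluster_bytes;
	return by_size > PREALLOC_MIN_CLUSTERS ? by_size
					       : PREALLOC_MIN_CLUSTERS;
}
```

Under this reading, a 4 KiB cluster gives 512 clusters (the old default), a 64 KiB cluster gives 32, and a 1 MiB cluster is raised from 2 to the 32-cluster floor.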

openanolis / cloud-kernel synced successfully almost 2 years ago