Commits · 6fd7a46781999c32f423025767e43b349b967d57 · OpenHarmony / kernel_linux

27 Feb, 2011 3 commits

ext4: enable mblk_io_submit by default · 6fd7a467

Committed by Theodore Ts'o on Feb 26, 2011

Now that we've fixed the file corruption bug in commit d50bdd5a,
it's time to enable mblk_io_submit by default.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

6fd7a467

ext4: fix ext4_da_block_invalidatepages() to handle page range properly · c7f5938a

Committed by Curt Wohlgemuth on Feb 26, 2011

If ext4_da_block_invalidatepages() is called because of a
failure from ext4_map_blocks() in mpage_da_map_and_submit(),
it's supposed to clean up -- including unlock -- all the
pages in the mpd structure.  But these values may not match
up, even on a system in which block size == page size:

   mpd->b_blocknr != mpd->first_page
   mpd->b_size != (mpd->next_page - mpd->first_page)

ext4_da_block_invalidatepages() has been using b_blocknr and
b_size; this patch changes it to use first_page and
next_page.

Tested:  I injected a small number (5%) of failures in
ext4_map_blocks() in the case that the flags contain
EXT4_GET_BLOCKS_DELALLOC_RESERVE, and ran fsstress on this
kernel.  Without this patch, I got hung tasks every time.
With this patch, I see no hangs in many runs of fsstress.
Signed-off-by: Curt Wohlgemuth <curtw@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

c7f5938a

ext4: mark multi-page IO complete on mapping failure · e0fd9b90

Committed by Curt Wohlgemuth on Feb 26, 2011

In mpage_da_map_and_submit(), if we have a delayed block
allocation failure from ext4_map_blocks(), we need to mark
the IO as complete, by setting

      mpd->io_done = 1;

Otherwise, we could end up submitting the pages in an outer
loop; since they are unlocked on mapping failure in
ext4_da_block_invalidatepages(), this will cause a bug check
in mpage_da_submit_io().

I tested this by injecting failures into ext4_map_blocks().
Without this patch, a simple fsstress run will bug check;
with the patch, it works fine.
Signed-off-by: Curt Wohlgemuth <curtw@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

e0fd9b90

25 Feb, 2011 5 commits

ext4: mballoc: don't replace the current preallocation group unnecessarily · 5a54b2f1

Committed by Coly Li on Feb 24, 2011

In ext4_mb_check_group_pa(), the current preallocation space is
replaced with a new preallocation space when the two have the same
distance from the goal block.

This doesn't actually gain us anything, so change things so that the
function only switches to the new preallocation group if its distance
from the goal block is strictly smaller than the current preallocation
group's distance from the goal block.
Signed-off-by: Coly Li <bosong.ly@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

5a54b2f1
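
The tie-breaking rule described in commit 5a54b2f1 can be sketched in plain C. This is a simplified userspace illustration, not the kernel's ext4_mb_check_group_pa(): the struct `pa_sketch` and the helpers `pa_distance()` and `pick_closer_pa()` are hypothetical names, and only the strict "less than" comparison is taken from the commit message.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a preallocation space: only its start block. */
struct pa_sketch {
    uint64_t pstart;
};

/* Absolute distance between a preallocation's start and the goal block. */
static uint64_t pa_distance(const struct pa_sketch *pa, uint64_t goal)
{
    return pa->pstart > goal ? pa->pstart - goal : goal - pa->pstart;
}

/*
 * Return the preallocation to keep. The candidate replaces the current one
 * only when it is strictly closer to the goal; on a tie the current one stays.
 */
static const struct pa_sketch *
pick_closer_pa(const struct pa_sketch *cur, const struct pa_sketch *cand,
               uint64_t goal)
{
    if (cur == NULL)
        return cand;
    if (pa_distance(cand, goal) < pa_distance(cur, goal))
        return cand;
    return cur;
}
```

With a goal of block 100, a current preallocation at block 90 is kept when the candidate starts at block 110 (both are 10 blocks away), and is replaced only by a candidate that is closer, such as one at block 105.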

ext4: clarify description of ac_g_ex in struct ext4_allocation_context · 58696f3a

Committed by Coly Li on Feb 24, 2011

Signed-off-by: Coly Li <bosong.ly@taobao.com>
Cc: Alex Tomas <alex@clusterfs.com>
Cc: Theodore Tso <tytso@google.com>

58696f3a

mballoc: add comments to ext4_mb_mark_free_simple() · 7c786059

Committed by Coly Li on Feb 24, 2011

This patch adds comments to ext4_mb_mark_free_simple() to make it more
understandable.
Signed-off-by: Coly Li <bosong.ly@taobao.com>
Cc: Alex Tomas <alex@clusterfs.com>
Cc: Theodore Tso <tytso@google.com>

7c786059

ext4: remove unnecessary call to mb_find_buddy() in debugging code · 235772da

Committed by Coly Li on Feb 24, 2011

In __mb_check_buddy(), look at the code below:
  591         fstart = -1;
  592         buddy = mb_find_buddy(e4b, 0, &max);
  593         for (i = 0; i < max; i++) {
  594                 if (!mb_test_bit(i, buddy)) {
  595                         MB_CHECK_ASSERT(i >= e4b->bd_info->bb_first_free);
  596                         if (fstart == -1) {
  597                                 fragments++;
  598                                 fstart = i;
  599                         }
  600                         continue;
  601                 }
  602                 fstart = -1;
  603                 /* check used bits only */
  604                 for (j = 0; j < e4b->bd_blkbits + 1; j++) {
  605                         buddy2 = mb_find_buddy(e4b, j, &max2);
  606                         k = i >> j;
  607                         MB_CHECK_ASSERT(k < max2);
  608                         MB_CHECK_ASSERT(mb_test_bit(k, buddy2));
  609                 }
  610         }
  611         MB_CHECK_ASSERT(!EXT4_MB_GRP_NEED_INIT(e4b->bd_info));
  612         MB_CHECK_ASSERT(e4b->bd_info->bb_fragments == fragments);
  613
  614         grp = ext4_get_group_info(sb, e4b->bd_group);
  615         buddy = mb_find_buddy(e4b, 0, &max);

On line 592, buddy is fetched by mb_find_buddy() with order 0. Between
line 593 and line 615 buddy is not changed, so there is no need to
fetch it again from mb_find_buddy() with order 0.

We can safely remove the second mb_find_buddy() call on line 615.
Signed-off-by: Coly Li <bosong.ly@taobao.com>
Cc: Alex Tomas <alex@clusterfs.com>
Cc: Theodore Tso <tytso@google.com>

235772da

ext4: code cleanup in mb_find_buddy() · 84b775a3

Committed by Coly Li on Feb 24, 2011

The current code calculates max regardless of whether order is zero,
which is unnecessary. This cleanup patch sets max to
"1 << (e4b->bd_blkbits + 3)" only when order == 0.
Signed-off-by: Coly Li <bosong.ly@taobao.com>
Cc: Alex Tomas <alex@clusterfs.com>
Cc: Theodore Tso <tytso@google.com>

84b775a3
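
As a worked example of the expression quoted in commit 84b775a3, the sketch below evaluates `1 << (blkbits + 3)` for a given block-size shift. It assumes that bd_blkbits is the base-2 logarithm of the block size in bytes, which the commit message does not state; under that assumption the result is the number of bits in one block (bytes per block times 8). The function name `order0_max_bits()` is hypothetical.

```c
#include <stdint.h>

/*
 * Number of bits in one block: (1 << blkbits) bytes per block, times
 * 8 bits per byte, which is the same as shifting by blkbits + 3.
 * Assumes blkbits is log2 of the block size in bytes.
 */
static uint32_t order0_max_bits(unsigned int blkbits)
{
    return UINT32_C(1) << (blkbits + 3);
}
```

For a 4 KiB block (blkbits of 12) this gives 32768, and for a 1 KiB block (blkbits of 10) it gives 8192.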

24 Feb, 2011 4 commits

ext4: enable acls and user_xattr by default · ea663336

Committed by Eric Sandeen on Feb 23, 2011

There's no good reason to require the extra step of providing
a mount option for acl or user_xattr once the feature is configured
on; no other filesystem that I know of requires this.

Userspace patches have set these options in default mount options,
and this patch makes them default in the kernel.  At some point
we can start to deprecate the options, perhaps.

For now I've removed default mount option checks in show_options()
to be explicit about what's set, since it's changing the default,
but I'm open to alternatives if desired.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ea663336

ext4: Adjust minlen with discard_granularity in the FITRIM ioctl · 5c2ed62f

Committed by Lukas Czerner on Feb 23, 2011

Discard granularity tells us the minimum size of extent that can be
discarded by the device. If the user supplies a minimum extent that
should be discarded (range.minlen) which is smaller than the discard
granularity, increase minlen to the discard granularity, since there's
no point submitting trim requests that the device will reject anyway.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

5c2ed62f
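
The adjustment described in commit 5c2ed62f amounts to taking the larger of two values. The sketch below is a userspace illustration with a hypothetical helper name, `adjust_minlen()`; it assumes both values are expressed in the same unit and is not the kernel implementation.

```c
#include <stdint.h>

/*
 * Raise the caller-supplied minimum extent length to the device's discard
 * granularity when it is smaller. Both arguments must use the same unit
 * (an assumption of this sketch).
 */
static uint64_t adjust_minlen(uint64_t minlen, uint64_t discard_granularity)
{
    return minlen < discard_granularity ? discard_granularity : minlen;
}
```

A request with a minlen of 512 on a device with a granularity of 4096 is raised to 4096, while a minlen of 8192 is left unchanged.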

ext4: check if device supports discard in FITRIM ioctl · 41431792

Committed by Lukas Czerner on Feb 23, 2011

For a device that does not support discard, the FITRIM ioctl returns
-EOPNOTSUPP when blkdev_issue_discard() returns this error code, which
is how the user is informed that the device does not support discard.

If there are no suitable free extents to be trimmed, then FITRIM will
return success even though the device does not support discard, which
could confuse the user.  So check explicitly if the device supports
discard and return an error code at the beginning of the FITRIM ioctl
processing.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

41431792
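
The control-flow change in commit 41431792 can be illustrated with a small sketch. The types and names here (`trim_dev_sketch`, `trim_sketch()`) are hypothetical and the loop body is a placeholder; the point is only that the capability check happens before any extent is examined, so the return value no longer depends on whether a suitable free extent exists.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device description; the kernel consults its own state. */
struct trim_dev_sketch {
    bool supports_discard;
};

/*
 * Hypothetical trim entry point. The capability check comes first, so a
 * device without discard support always yields -EOPNOTSUPP, even when
 * there are no free extents to trim.
 */
static int trim_sketch(const struct trim_dev_sketch *dev, size_t nr_free_extents)
{
    size_t i;

    if (!dev->supports_discard)
        return -EOPNOTSUPP;

    for (i = 0; i < nr_free_extents; i++) {
        /* A real implementation would issue a discard for extent i here. */
    }
    return 0;
}
```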

ext4: mark file-local functions and variables as static · 0b75a840

Committed by Lukas Czerner on Feb 23, 2011

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

0b75a840

22 Feb, 2011 3 commits

ext4: allow inode_readahead_blks=0 (linux-2.6.37) · 5dbd571d

Committed by Alexander V. Lukyanov on Feb 21, 2011

I cannot disable the inode read-ahead feature of ext4 (on 2.6.37):

# echo 0 > /sys/fs/ext4/sda2/inode_readahead_blks 
bash: echo: write error: Invalid argument

On a server with lots of small files and random access this read-ahead makes
performance worse, and I'd like to disable it. I work around this problem
by using a value of 1, but it still reads an extra block.

This patch fixes the problem by checking for zero explicitly.
Signed-off-by: Alexander V. Lukyanov <lav@netis.ru>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

5dbd571d
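
A validation rule that treats zero as a special case, as commit 5dbd571d describes, might look like the sketch below. It assumes the original check required the value to be a power of two; the commit message does not say why 0 was rejected, so that rule and the helper names `is_pow2()` and `check_readahead_blks()` are assumptions made for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* True when v has exactly one bit set. Zero is not a power of two. */
static bool is_pow2(unsigned long v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/*
 * Accept zero (read-ahead disabled) or any power of two; reject the rest.
 * The power-of-two requirement is an assumption of this sketch.
 */
static int check_readahead_blks(unsigned long v)
{
    if (v != 0 && !is_pow2(v))
        return -EINVAL;
    return 0;
}
```

Without the explicit `v != 0` test, a plain power-of-two check would reject 0 with -EINVAL, which matches the "Invalid argument" error shown in the report.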

ext4: Fix sparse warning: Using plain integer as NULL pointer · 7dc57615

Committed by Peter Huewe on Feb 21, 2011

This patch fixes the warning "Using plain integer as NULL pointer",
generated by sparse, by replacing the offending 0s with NULL.
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

7dc57615

ext4: fix compile warnings with EXT4FS_DEBUG enabled · da488945

Committed by Theodore Ts'o on Feb 21, 2011

Compiling 2.6.38-rc1 with EXT4FS_DEBUG turned on produces the
following compile warnings. This patch fixes them.

  CC      fs/ext4/hash.o
  CC      fs/ext4/resize.o
fs/ext4/resize.c: In function 'setup_new_group_blocks':
fs/ext4/resize.c:233:2: warning: format '%#04llx' expects type 'long long
unsigned int', but argument 3 has type 'long unsigned int'
fs/ext4/resize.c:251:2: warning: format '%#04llx' expects type 'long long
unsigned int', but argument 3 has type 'long unsigned int'
  CC      fs/ext4/extents.o
  CC      fs/ext4/ext4_jbd2.o
  CC      fs/ext4/migrate.o
Reported-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

da488945
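
Warnings like the ones quoted in commit da488945 arise when a `%llx` conversion receives an `unsigned long` argument. One common way to silence them, shown below, is to cast the argument so that it matches the format on both 32-bit and 64-bit builds. The commit message does not say which fix the patch actually uses, and `format_block()` is a hypothetical helper written for this sketch.

```c
#include <stdio.h>

/*
 * Format a block number with the same conversion the warning mentions.
 * The cast makes the argument type match %llx regardless of how wide
 * unsigned long is on the target.
 */
static int format_block(char *buf, size_t len, unsigned long block)
{
    return snprintf(buf, len, "%#04llx", (unsigned long long)block);
}
```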

17 Feb, 2011 1 commit

vfs: fix BUG_ON() in fs/namei.c:1461 · 3abb17e8

Committed by Linus Torvalds on Feb 16, 2011

When Al moved the nameidata_dentry_drop_rcu_maybe() call into the
do_follow_link function in commit 844a3917 ("nothing in
do_follow_link() is going to see RCU"), he mistakenly left the

	BUG_ON(inode != path->dentry->d_inode);

behind.  Which would otherwise be ok, but that BUG_ON() really needs to
be _after_ dropping RCU, since the dentry isn't necessarily stable
otherwise.

So complete the code movement in that commit, and move the BUG_ON() into
do_follow_link() too.  This means that we need to pass in 'inode' as an
argument (just for this one use), but that's a small thing.  And
eventually we may be confident enough in our path lookup that we can
just remove the BUG_ON() and the unnecessary inode argument.
Reported-and-tested-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

3abb17e8

15 Feb, 2011 12 commits

s390: remove task_show_regs · 261cd298

Committed by Martin Schwidefsky on Feb 15, 2011

task_show_regs used to be a debugging aid in the early bringup days
of Linux on s390. Since /proc/<pid>/status is a world-readable file,
it is not a good idea to show the registers of a process there. The
only correct fix is to remove task_show_regs.
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

261cd298

get rid of nameidata_dentry_drop_rcu() calling nameidata_drop_rcu() · 4e924a4f

Committed by Al Viro on Feb 15, 2011

can't happen anymore and didn't work right anyway
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

4e924a4f

drop out of RCU in return_reval · f60aef7e

Committed by Al Viro on Feb 15, 2011

... thus killing the need to handle drop-from-RCU in d_revalidate()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

f60aef7e

split do_revalidate() into RCU and non-RCU cases · f5e1c1c1

Committed by Al Viro on Feb 15, 2011

fixing oopsen in lookup_one_len()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

f5e1c1c1

in do_lookup() split RCU and non-RCU cases of need_revalidate · 24643087

Committed by Al Viro on Feb 15, 2011

and use unlikely() instead of gotos, for fsck sake...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

24643087

nothing in do_follow_link() is going to see RCU · 844a3917

Committed by Al Viro on Feb 15, 2011

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

844a3917

Btrfs: check return value of alloc_extent_map() · c26a9203

Committed by Tsutomu Itoh on Feb 14, 2011

This patch adds checks on the return value of alloc_extent_map() in
several places. In addition, alloc_extent_map() returns only a valid
address or NULL, so checking with IS_ERR() is unnecessary; those
IS_ERR() checks are removed.
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

c26a9203

Btrfs - Fix memory leak in btrfs_init_new_device() · 67100f25

Committed by Ilya Dryomov on Feb 06, 2011

Memory allocated by calling kstrdup() should be freed.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

67100f25

btrfs: prevent heap corruption in btrfs_ioctl_space_info() · 51788b1b

Committed by Dan Rosenberg on Feb 14, 2011

Commit bf5fc093 refactored btrfs_ioctl_space_info() and introduced
several security issues.

space_args.space_slots is an unsigned 64-bit type controlled by a
possibly unprivileged caller.  The comparison as a signed int type
allows providing values that are treated as negative and cause the
subsequent allocation size calculation to wrap, or be truncated to 0.
By providing a size that's truncated to 0, kmalloc() will return
ZERO_SIZE_PTR.  It's also possible to provide a value smaller than the
slot count.  The subsequent loop ignores the allocation size when
copying data in, resulting in a heap overflow or write to ZERO_SIZE_PTR.

The fix changes the slot count type and comparison typecast to u64,
which prevents truncation or signedness errors, and also ensures that we
don't copy more data than we've allocated in the subsequent loop.  Note
that zero-size allocations are no longer possible since there is already
an explicit check for space_args.space_slots being 0 and truncation of
this value is no longer an issue.
Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
Signed-off-by: Josef Bacik <josef@redhat.com>
Reviewed-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

51788b1b
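
The truncation and signedness problem described in commit 51788b1b can be reproduced in isolation. The sketch below is not the Btrfs code: `min_as_int()` and `min_as_u64()` are hypothetical helpers that contrast a minimum computed after narrowing both operands to `int` with one computed on the full 64-bit values. Converting an out-of-range value to `int` is implementation-defined in C; the comments describe the wrap-around behaviour that GCC and Clang document.

```c
#include <stdint.h>

/*
 * Buggy pattern: both operands are narrowed to int before the comparison.
 * On GCC and Clang, 0x100000000 narrows to 0 and 0xFFFFFFFF narrows to -1,
 * so a huge request can come out as zero or as a negative count.
 */
static int min_as_int(uint64_t requested, uint64_t available)
{
    int a = (int)requested;
    int b = (int)available;

    return a < b ? a : b;
}

/* Fixed pattern: compare as unsigned 64-bit values, with no narrowing. */
static uint64_t min_as_u64(uint64_t requested, uint64_t available)
{
    return requested < available ? requested : available;
}
```

With 8 slots available, the 64-bit comparison always yields at most 8, so a buffer sized from that result cannot be smaller than the amount of data later copied into it.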

Btrfs: Fix balance panic · 6848ad64

Committed by Yan, Zheng on Feb 14, 2011

Mark the cloned backref_node as checked in clone_backref_node()
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

6848ad64

Btrfs: don't release pages when we can't clear the uptodate bits · e3f24cc5

Committed by Chris Mason on Feb 14, 2011

Btrfs tracks uptodate state in an rbtree as well as in the
page bits.  This is supposed to enable us to use block sizes other than
the page size, but there are a few parts still missing before that
completely works.

But, our readpage routine trusts this additional range based tracking
of uptodateness, much in the same way the buffer head up to date bits
are trusted for the other filesystems.

The problem is that sometimes we need to allocate memory in order to
split records in the rbtree, even when we are just clearing bits.  This
can be difficult when our clearing function is called with GFP_ATOMIC,
which can happen in the releasepage path.

So, what happens today looks like this:

releasepage called with GFP_ATOMIC
btrfs_releasepage calls clear_extent_bit
clear_extent_bit fails to allocate ram, leaving the up to date bit set
btrfs_releasepage returns success

The end result is the page being gone, but btrfs thinking the range is
up to date.   Later on if someone tries to read that same page, the
btrfs readpage code will return immediately thinking the page is already
up to date.

This commit fixes things to fail the releasepage when we can't clear the
extent state bits.  It covers both data pages and metadata tree blocks.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

e3f24cc5

Btrfs: fix page->private races · eb14ab8e

Committed by Chris Mason on Feb 10, 2011

There is a race where btrfs_releasepage can drop the
page->private contents just as alloc_extent_buffer is setting
up pages for metadata.  Because of how the Btrfs page flags work,
this results in us skipping the crc on the page during IO.

This patch solves the race by waiting until after the extent buffer
is inserted into the radix tree before it sets page private.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

eb14ab8e

14 Feb, 2011 11 commits

nfsd: break lease on unlink due to rename · 83f6b0c1

Committed by J. Bruce Fields on Feb 06, 2011

Commit 4795bb37 ("nfsd: break lease on unlink, link, and rename") only
broke the lease on the file that was being renamed, and didn't handle
the case where the target path refers to an already-existing file that
will be unlinked by a rename. In that case the target file should have
any leases broken as well.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

83f6b0c1

nfsd4: acquire only one lease per file · acfdf5c3

Committed by J. Bruce Fields on Jan 31, 2011

Instead of acquiring one lease each time another client opens a file,
nfsd can acquire just one lease to represent all of them, and reference
count it to determine when to release it.

This fixes a regression introduced by
c45821d2 "locks: eliminate fl_mylease
callback": after that patch, only the struct file * is used to determine
who owns a given lease.  But since we recently converted the server to
share a single struct file per open, if we acquire multiple leases on
the same file from nfsd, it then becomes impossible on unlocking a lease
to determine which of those leases (all of whom share the same struct
file *) we meant to remove.

Thanks to Takashi Iwai <tiwai@suse.de> for catching a bug in a previous
version of this patch.
Tested-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

acfdf5c3
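
The "one lease, reference counted" idea from commit acfdf5c3 can be sketched as follows. The struct `file_lease_sketch` and the helpers `lease_get()` and `lease_put()` are hypothetical, a boolean stands in for the real lease, and locking is omitted; this shows the counting pattern only, not nfsd's actual data structures.

```c
#include <stdbool.h>

/* Hypothetical per-file state: one lease shared by every open of the file. */
struct file_lease_sketch {
    int  users;      /* how many opens currently rely on the lease */
    bool lease_held; /* stands in for the single underlying lease */
};

/* Take a reference; only the first user actually acquires the lease. */
static void lease_get(struct file_lease_sketch *f)
{
    if (f->users++ == 0)
        f->lease_held = true;
}

/* Drop a reference; only the last user actually releases the lease. */
static void lease_put(struct file_lease_sketch *f)
{
    if (--f->users == 0)
        f->lease_held = false;
}
```

Because there is only ever one lease per file, releasing it never requires telling apart several leases that share the same struct file.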

nfsd4: modify fi_delegations under recall_lock · 5d926e8c

Committed by J. Bruce Fields on Feb 07, 2011

Modify fi_delegations only under the recall_lock, allowing us to use
that list on lease breaks.

Also some trivial cleanup to simplify later changes.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

5d926e8c

nfsd4: remove unused deleg dprintk's. · 65bc58f5

Committed by J. Bruce Fields on Feb 07, 2011

These aren't all that useful, and get in the way of the next steps.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

65bc58f5

nfsd4: split lease setting into separate function · edab9782

Committed by J. Bruce Fields on Jan 31, 2011

Split some code into a separate function, which we'll be adding more
to.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

edab9782

nfsd4: fix leak on allocation error · dd239cc0

Committed by J. Bruce Fields on Jan 31, 2011

Also share some common exit code.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

dd239cc0

nfsd4: add helper function for lease setup · 22d38c4c

Committed by J. Bruce Fields on Jan 31, 2011

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

22d38c4c

nfsd4: split up nfsd_break_deleg_cb · 6b57d9c8

Committed by J. Bruce Fields on Jan 31, 2011

We'll be adding some more code here soon.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

6b57d9c8

NFSD: memory corruption due to writing beyond the stat array · 3aa6e0aa

Committed by Konstantin Khorenko on Feb 01, 2011

If nfsd fails to find a file exported via NFS in the readahead cache, it
should increment the corresponding nfsdstats counter (ra_depth[10]), but due to a
bug it may instead write to ra_depth[11], corrupting the following field.

In a kernel with NFSDv4 compiled in the corruption takes the form of an
increment of a counter of the number of NFSv4 operation 0's received; since
there is no operation 0, this is harmless.

In a kernel with NFSDv4 disabled it corrupts whatever happens to be in the
memory beyond nfsdstats.
Signed-off-by: Konstantin Khorenko <khorenko@openvz.org>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

3aa6e0aa
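
The out-of-bounds write described in commit 3aa6e0aa is a classic histogram indexing problem: a counter array with valid indices 0 through 10 receives an index of 11. The sketch below shows one defensive way to keep a computed bucket index in range. The bucket formula, the constant `RA_BUCKETS`, and the function `ra_bucket()` are assumptions made for illustration; the actual patch may correct the index in a different way.

```c
#include <stddef.h>

/* Valid indices are 0..10; per the commit text, index 10 is the last slot. */
#define RA_BUCKETS 11

/*
 * Map a search depth onto a bucket, assuming buckets cover tenths of the
 * cache size (an assumption of this sketch). Any result past the last
 * bucket is clamped so that the write can never land outside the array.
 */
static size_t ra_bucket(unsigned int depth, unsigned int cache_size)
{
    size_t idx;

    if (cache_size == 0)
        return RA_BUCKETS - 1;
    idx = (size_t)depth * 10 / cache_size;
    return idx < RA_BUCKETS ? idx : RA_BUCKETS - 1;
}
```

With a cache size of 100, a depth of 110 would compute index 11 without the clamp, which is exactly one element past the end of an 11-entry array.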

NFSD: use nfserr for status after decode_cb_op_status · 0af3f814

Committed by Benny Halevy on Jan 13, 2011

Fixes bugs introduced in 85a56480
("NFSD: Update XDR decoders in NFSv4 callback client").

Cc: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

0af3f814

nfsd: don't leak dentry count on mnt_want_write failure · 541ce98c

Committed by J. Bruce Fields on Jan 14, 2011

The exit cleanup isn't quite right here.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

541ce98c

12 Feb, 2011 1 commit

jbd2: call __jbd2_log_start_commit with j_state_lock write locked · e4471831

Committed by Theodore Ts'o on Feb 12, 2011

On an SMP ARM system running ext4, I've received a report that the
first J_ASSERT in jbd2_journal_commit_transaction has been triggering:

	J_ASSERT(journal->j_running_transaction != NULL);

While investigating possible causes for this problem, I noticed that
__jbd2_log_start_commit() is getting called with j_state_lock only
read-locked, in spite of the fact that it might modify
j_commit_request.  Fix this by grabbing the necessary information so
we can test to see if we need to start a new transaction before
dropping the read lock, and then calling jbd2_log_start_commit() which
will grab the write lock.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

e4471831

OpenHarmony / kernel_linux · last synced more than 3 years ago
