Commits · 2c9c8f36c34e1defcaa7e4c298651998b47f5282 · openeuler / raspberrypi-kernel

23 Feb, 2011 1 commit

NFSD: fix decode_cb_sequence4resok · 2c9c8f36

Committed by Benny Halevy on Feb 22, 2011

Fix a bug introduced in patch
85a56480 "NFSD: Update XDR decoders in NFSv4 callback client".

Although decode_cb_sequence4resok ignores the highest slotid and target highest slotid,
it must account for their space in the xdr stream when calling xdr_inline_decode.

Cc: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

2c9c8f36
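
The accounting rule from the decode_cb_sequence4resok fix above can be sketched in user-space C. The `stream` type and the `inline_decode()` helper are hypothetical stand-ins for the kernel's xdr_stream and xdr_inline_decode(), and the field layout (a 16-byte session ID followed by four 32-bit words) is assumed from the NFSv4.1 definition of CB_SEQUENCE4resok. This is an illustration of the idea, not the patched kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SESSIONID_LEN 16 /* assumed session ID size in bytes */

/* Hypothetical stand-in for struct xdr_stream. */
struct stream {
    const uint8_t *p;   /* next unread byte */
    const uint8_t *end; /* one past the last byte */
};

/* Hypothetical stand-in for xdr_inline_decode(): reserve nbytes or fail. */
static const uint8_t *inline_decode(struct stream *s, size_t nbytes)
{
    const uint8_t *p = s->p;

    if ((size_t)(s->end - s->p) < nbytes)
        return NULL;
    s->p += nbytes;
    return p;
}

/* Read one big-endian 32-bit word. */
static uint32_t get_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Decode sequenceid and slotid. The two trailing slot IDs are ignored,
 * but their 8 bytes are still reserved so the stream ends up positioned
 * at the next result. Returns 0 on success, -1 if the buffer is short.
 */
static int decode_resok(struct stream *s, uint32_t *seqid, uint32_t *slotid)
{
    const uint8_t *p;

    assert(s != NULL && seqid != NULL && slotid != NULL);
    p = inline_decode(s, SESSIONID_LEN + 4 * 4);
    if (p == NULL)
        return -1;
    p += SESSIONID_LEN;        /* session ID comparison omitted */
    *seqid = get_be32(p);
    *slotid = get_be32(p + 4);
    /* p + 8 and p + 12 hold the two ignored slot IDs */
    return 0;
}
```

Reserving only 24 bytes here would still decode the two fields that are used, but it would leave the stream 8 bytes short of the next result.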

17 Feb, 2011 1 commit

nfsd: correctly handle return value from nfsd_map_name_to_* · 47c85291

Committed by NeilBrown on Feb 16, 2011

These functions return an nfs status, not a host_err.  So don't
try to convert before returning.

This is a regression introduced by
3c726023; I fixed up two of the callers,
but missed these two.

Cc: stable@kernel.org
Reported-by: Herbert Poetzl <herbert@13thfloor.at>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

47c85291

15 Feb, 2011 12 commits

s390: remove task_show_regs · 261cd298

Committed by Martin Schwidefsky on Feb 15, 2011

task_show_regs used to be a debugging aid in the early bringup days
of Linux on s390. /proc/<pid>/status is a world-readable file, so it
is not a good idea to show the registers of a process there. The only
correct fix is to remove task_show_regs.

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

261cd298

get rid of nameidata_dentry_drop_rcu() calling nameidata_drop_rcu() · 4e924a4f

Committed by Al Viro on Feb 15, 2011

can't happen anymore and didn't work right anyway

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

4e924a4f

drop out of RCU in return_reval · f60aef7e

Committed by Al Viro on Feb 15, 2011

... thus killing the need to handle drop-from-RCU in d_revalidate()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

f60aef7e

split do_revalidate() into RCU and non-RCU cases · f5e1c1c1

Committed by Al Viro on Feb 15, 2011

fixing oopsen in lookup_one_len()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

f5e1c1c1

in do_lookup() split RCU and non-RCU cases of need_revalidate · 24643087

Committed by Al Viro on Feb 15, 2011

and use unlikely() instead of gotos, for fsck sake...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

24643087

nothing in do_follow_link() is going to see RCU · 844a3917

Committed by Al Viro on Feb 15, 2011

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

844a3917

Btrfs: check return value of alloc_extent_map() · c26a9203

Committed by Tsutomu Itoh on Feb 14, 2011

Add a check on the return value of alloc_extent_map() in several places.
In addition, alloc_extent_map() returns only a valid address or NULL,
so checking with IS_ERR() is unnecessary. Remove the IS_ERR() checks.

Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

c26a9203
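
The reason IS_ERR() is the wrong test for an allocator that returns NULL on failure, as described in the alloc_extent_map() commit above, can be sketched in user-space C. `is_err()` mirrors the kernel's range test for error pointers. `extent_map_stub`, `alloc_stub()` and `use_stub()` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/*
 * User-space copy of the IS_ERR() idea: only pointers in the top
 * MAX_ERRNO bytes of the address space encode an errno value.
 */
static int is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct extent_map_stub { int dummy; }; /* hypothetical placeholder type */

/*
 * Hypothetical allocator that returns a valid pointer or NULL,
 * never an error-encoded pointer.
 */
static struct extent_map_stub *alloc_stub(int fail)
{
    return fail ? NULL : malloc(sizeof(struct extent_map_stub));
}

/* Correct caller: test for NULL. Returns 0 on success, -1 on failure. */
static int use_stub(int fail)
{
    struct extent_map_stub *em = alloc_stub(fail);

    assert(!is_err(em)); /* true even when em is NULL */
    if (!em)             /* an is_err(em) test would miss this failure */
        return -1;
    free(em);
    return 0;
}
```

Because NULL is not in the error-pointer range, a caller that checks only `is_err()` would go on to dereference NULL after a failed allocation.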

Btrfs - Fix memory leak in btrfs_init_new_device() · 67100f25

Committed by Ilya Dryomov on Feb 06, 2011

Memory allocated by calling kstrdup() should be freed.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

67100f25

btrfs: prevent heap corruption in btrfs_ioctl_space_info() · 51788b1b

Committed by Dan Rosenberg on Feb 14, 2011

Commit bf5fc093 refactored
btrfs_ioctl_space_info() and introduced several security issues.

space_args.space_slots is an unsigned 64-bit type controlled by a
possibly unprivileged caller.  The comparison as a signed int type
allows providing values that are treated as negative and cause the
subsequent allocation size calculation to wrap, or be truncated to 0.
By providing a size that's truncated to 0, kmalloc() will return
ZERO_SIZE_PTR.  It's also possible to provide a value smaller than the
slot count.  The subsequent loop ignores the allocation size when
copying data in, resulting in a heap overflow or write to ZERO_SIZE_PTR.

The fix changes the slot count type and comparison typecast to u64,
which prevents truncation or signedness errors, and also ensures that we
don't copy more data than we've allocated in the subsequent loop.  Note
that zero-size allocations are no longer possible since there is already
an explicit check for space_args.space_slots being 0 and truncation of
this value is no longer an issue.

Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
Signed-off-by: Josef Bacik <josef@redhat.com>
Reviewed-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

51788b1b
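
The two hazards described in the btrfs_ioctl_space_info() commit above, truncation of a 64-bit count and copying more than was allocated, can be sketched in user-space C. All names here (`slot_rec`, `low32()`, `clamp_slots()`, `copy_slots()`) are hypothetical and this is not the btrfs code. `low32()` models truncation with a well-defined unsigned conversion, since converting an out-of-range value to a signed int is implementation-defined in C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record type standing in for one space-info slot. */
struct slot_rec {
    uint64_t flags;
    uint64_t total_bytes;
    uint64_t used_bytes;
};

/* Well-defined model of truncation to 32 bits: keeps only the low word. */
static uint32_t low32(uint64_t v)
{
    return (uint32_t)v;
}

/* Clamp the caller-supplied count in full 64-bit unsigned arithmetic. */
static uint64_t clamp_slots(uint64_t requested, uint64_t available)
{
    return requested < available ? requested : available;
}

/* Copy records, never writing more than `allocated` entries into dst. */
static uint64_t copy_slots(struct slot_rec *dst, uint64_t allocated,
                           const struct slot_rec *src, uint64_t nsrc)
{
    uint64_t i, n = clamp_slots(nsrc, allocated);

    assert(dst != NULL && src != NULL);
    for (i = 0; i < n; i++)
        dst[i] = src[i];
    return n;
}
```

A request of `1ULL << 32` slots has a low word of 0, which is how a narrowed count can turn into a zero-size allocation. Comparing as 64-bit unsigned values and bounding the copy loop by the allocated count avoids both problems.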

Btrfs: Fix balance panic · 6848ad64

Committed by Yan, Zheng on Feb 14, 2011

Mark the cloned backref_node as checked in clone_backref_node()

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

6848ad64

Btrfs: don't release pages when we can't clear the uptodate bits · e3f24cc5

Committed by Chris Mason on Feb 14, 2011

Btrfs tracks uptodate state in an rbtree as well as in the
page bits.  This is supposed to enable us to use block sizes other than
the page size, but there are a few parts still missing before that
completely works.

But, our readpage routine trusts this additional range based tracking
of uptodateness, much in the same way the buffer head up to date bits
are trusted for the other filesystems.

The problem is that sometimes we need to allocate memory in order to
split records in the rbtree, even when we are just clearing bits.  This
can be difficult when our clearing function is called with GFP_ATOMIC, which
can happen in the releasepage path.

So, what happens today looks like this:

releasepage called with GFP_ATOMIC
btrfs_releasepage calls clear_extent_bit
clear_extent_bit fails to allocate ram, leaving the up to date bit set
btrfs_releasepage returns success

The end result is the page being gone, but btrfs thinking the range is
up to date.   Later on if someone tries to read that same page, the
btrfs readpage code will return immediately thinking the page is already
up to date.

This commit fixes things to fail the releasepage when we can't clear the
extent state bits.  It covers both data pages and metadata tree blocks.

Signed-off-by: Chris Mason <chris.mason@oracle.com>

e3f24cc5
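
The behaviour change described in the releasepage commit above can be modelled in a few lines of user-space C. The names (`range_state`, `clear_bits()`, `release_page()`) and the `can_alloc` flag are hypothetical. The flag simply simulates an atomic-context allocation failure, so this is a sketch of the rule "fail the release if the bits could not be cleared", not the btrfs implementation.

```c
#include <assert.h>
#include <stddef.h>

enum { BIT_UPTODATE = 1u };

/* Hypothetical stand-in for the range-based state tracking. */
struct range_state {
    unsigned int bits;
};

/*
 * Hypothetical clear function: returns -1 when it would have needed to
 * allocate memory but could not, leaving the bits untouched.
 */
static int clear_bits(struct range_state *st, unsigned int bits, int can_alloc)
{
    assert(st != NULL);
    if (!can_alloc)
        return -1;
    st->bits &= ~bits;
    return 0;
}

/* Returns 1 if the page may be released, 0 if it must be kept. */
static int release_page(struct range_state *st, int can_alloc)
{
    if (clear_bits(st, BIT_UPTODATE, can_alloc) < 0)
        return 0; /* tracking still says "up to date": keep the page */
    return 1;
}
```

With this rule the page and the range tracking cannot disagree: either the bit is cleared and the page goes away, or the release is refused and both stay.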

Btrfs: fix page->private races · eb14ab8e

Committed by Chris Mason on Feb 10, 2011

There is a race where btrfs_releasepage can drop the
page->private contents just as alloc_extent_buffer is setting
up pages for metadata.  Because of how the Btrfs page flags work,
this results in us skipping the crc on the page during IO.

This patch solves the race by waiting until after the extent buffer
is inserted into the radix tree before it sets page private.

Signed-off-by: Chris Mason <chris.mason@oracle.com>

eb14ab8e

14 Feb, 2011 11 commits

nfsd: break lease on unlink due to rename · 83f6b0c1

Committed by J. Bruce Fields on Feb 06, 2011

Commit 4795bb37 "nfsd: break lease on unlink,
link, and rename" only broke the lease on the file that was being
renamed, and didn't handle the case where the target path refers to an
already-existing file that will be unlinked by a rename. In that case
the target file should have any leases broken as well.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

83f6b0c1

nfsd4: acquire only one lease per file · acfdf5c3

Committed by J. Bruce Fields on Jan 31, 2011

Instead of acquiring one lease each time another client opens a file,
nfsd can acquire just one lease to represent all of them, and reference
count it to determine when to release it.

This fixes a regression introduced by
c45821d2 "locks: eliminate fl_mylease
callback": after that patch, only the struct file * is used to determine
who owns a given lease.  But since we recently converted the server to
share a single struct file per open, if we acquire multiple leases on
the same file from nfsd, it then becomes impossible on unlocking a lease
to determine which of those leases (all of whom share the same struct
file *) we meant to remove.

Thanks to Takashi Iwai <tiwai@suse.de> for catching a bug in a previous
version of this patch.

Tested-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

acfdf5c3
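
The "one lease per file, reference counted" idea from the commit above can be sketched as follows. `file_stub`, `lease_get()` and `lease_put()` are hypothetical names. The real nfsd code tracks this state in its own per-file structures, which the commit message does not show.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-file state: one lease shared by every opener. */
struct file_stub {
    int lease_held; /* 1 while the single lease is acquired */
    int lease_refs; /* number of users sharing that lease */
};

/* The first user acquires the lease; later users only take a reference. */
static void lease_get(struct file_stub *f)
{
    assert(f != NULL);
    if (f->lease_refs++ == 0)
        f->lease_held = 1;
}

/* The last user to drop its reference releases the lease. */
static void lease_put(struct file_stub *f)
{
    assert(f != NULL && f->lease_refs > 0);
    if (--f->lease_refs == 0)
        f->lease_held = 0;
}
```

Since there is only ever one lease per file, releasing it never requires telling apart several leases that share the same struct file pointer.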

nfsd4: modify fi_delegations under recall_lock · 5d926e8c

Committed by J. Bruce Fields on Feb 07, 2011

Modify fi_delegations only under the recall_lock, allowing us to use
that list on lease breaks.

Also some trivial cleanup to simplify later changes.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

5d926e8c

nfsd4: remove unused deleg dprintk's. · 65bc58f5

Committed by J. Bruce Fields on Feb 07, 2011

These aren't all that useful, and get in the way of the next steps.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

65bc58f5

nfsd4: split lease setting into separate function · edab9782

Committed by J. Bruce Fields on Jan 31, 2011

Split some code into a separate function, which we'll be adding
more to.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

edab9782

nfsd4: fix leak on allocation error · dd239cc0

Committed by J. Bruce Fields on Jan 31, 2011

Also share some common exit code.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

dd239cc0

nfsd4: add helper function for lease setup · 22d38c4c

Committed by J. Bruce Fields on Jan 31, 2011

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

22d38c4c

nfsd4: split up nfsd_break_deleg_cb · 6b57d9c8

Committed by J. Bruce Fields on Jan 31, 2011

We'll be adding some more code here soon.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

6b57d9c8

NFSD: memory corruption due to writing beyond the stat array · 3aa6e0aa

Committed by Konstantin Khorenko on Feb 01, 2011

If nfsd fails to find a file exported via NFS in the readahead cache, it
should increment the corresponding nfsdstats counter (ra_depth[10]), but due to a
bug it may instead write to ra_depth[11], corrupting the following field.

In a kernel with NFSDv4 compiled in the corruption takes the form of an
increment of a counter of the number of NFSv4 operation 0's received; since
there is no operation 0, this is harmless.

In a kernel with NFSDv4 disabled it corrupts whatever happens to be in the
memory beyond nfsdstats.

Signed-off-by: Konstantin Khorenko <khorenko@openvz.org>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

3aa6e0aa
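
How an index can land one past the end of an 11-entry counter array, as in the ra_depth commit above, can be shown with a small sketch. The commit message does not quote the arithmetic, so the `depth * 10 / size` mapping and the inflated depth used in the usage note are assumptions for illustration. `raw_index()` and `bucket_index()` are hypothetical helpers.

```c
#include <assert.h>

#define RA_BUCKETS 11 /* valid indices 0..10; index 10 counts cache misses */

/* Assumed mapping from a search depth to a histogram bucket. */
static unsigned int raw_index(unsigned int depth, unsigned int size)
{
    assert(size > 0);
    return depth * 10 / size;
}

/* Defensive variant: never return an index outside the array. */
static unsigned int bucket_index(unsigned int depth, unsigned int size)
{
    unsigned int idx = raw_index(depth, size);

    return idx < RA_BUCKETS ? idx : RA_BUCKETS - 1;
}
```

With `size` 20, a miss recorded as depth 20 maps to bucket 10, while a miss recorded as depth 22 maps to 11, one past the last valid element. That is the kind of off-by-one the commit message describes.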

NFSD: use nfserr for status after decode_cb_op_status · 0af3f814

Committed by Benny Halevy on Jan 13, 2011

Fix bugs introduced in 85a56480
"NFSD: Update XDR decoders in NFSv4 callback client".

Cc: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

0af3f814

nfsd: don't leak dentry count on mnt_want_write failure · 541ce98c

Committed by J. Bruce Fields on Jan 14, 2011

The exit cleanup isn't quite right here.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

541ce98c

12 Feb, 2011 6 commits

jbd2: call __jbd2_log_start_commit with j_state_lock write locked · e4471831

Committed by Theodore Ts'o on Feb 12, 2011

On an SMP ARM system running ext4, I've received a report that the
first J_ASSERT in jbd2_journal_commit_transaction has been triggering:

	J_ASSERT(journal->j_running_transaction != NULL);

While investigating possible causes for this problem, I noticed that
__jbd2_log_start_commit() is getting called with j_state_lock only
read-locked, in spite of the fact that it might modify
j_commit_request.  Fix this by grabbing the necessary information so
we can test to see if we need to start a new transaction before
dropping the read lock, and then calling jbd2_log_start_commit() which
will grab the write lock.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

e4471831

ext4: serialize unaligned asynchronous DIO · e9e3bcec

Committed by Eric Sandeen on Feb 12, 2011

ext4 has a data corruption case when doing non-block-aligned
asynchronous direct IO into a sparse file, as demonstrated
by xfstest 240.

The root cause is that while ext4 preallocates space in the
hole, mappings of that space still look "new" and 
dio_zero_block() will zero out the unwritten portions.  When
more than one AIO thread is going, they both find this "new"
block and race to zero out their portion; this is uncoordinated
and causes data corruption.

Dave Chinner fixed this for xfs by simply serializing all
unaligned asynchronous direct IO.  I've done the same here.
The difference is that we only wait on conversions, not all IO.
This is a very big hammer, and I'm not very pleased with
stuffing this into ext4_file_write().  But since ext4 is
DIO_LOCKING, we need to serialize it at this high level.

I tried to move this into ext4_ext_direct_IO, but by then
we have the i_mutex already, and we will wait on the
work queue to do conversions - which must also take the
i_mutex.  So that won't work.

This was originally exposed by qemu-kvm installing to
a raw disk image with a normal sector-63 alignment.  I've
tested a backport of this patch with qemu, and it does
avoid the corruption.  It is also quite a lot slower
(14 min for package installs, vs. 8 min for well-aligned)
but I'll take slow correctness over fast corruption any day.

Mingming suggested that we can track outstanding
conversions, and wait on those so that non-sparse
files won't be affected, and I've implemented that here;
unaligned AIO to nonsparse files won't take a perf hit.

[tytso@mit.edu: Keep the mutex as a hashed array instead
 of bloating the ext4 inode]

[tytso@mit.edu: Fix up namespace issues so that global
 variables are protected with an "ext4_" prefix.]

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

e9e3bcec
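
Whether an I/O counts as "unaligned" in the sense of the ext4 commit above comes down to a mask test against the filesystem block size. The helper below is a hypothetical user-space sketch of such a test, assuming a power-of-two block size. It is not the check used in the ext4 patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns nonzero if the I/O starting at `offset` with length `len`
 * does not start and end on a block boundary. `blocksize` must be a
 * power of two.
 */
static int is_unaligned(uint64_t offset, uint64_t len, uint32_t blocksize)
{
    uint64_t mask = (uint64_t)blocksize - 1;

    assert(blocksize != 0 && (blocksize & (blocksize - 1)) == 0);
    return ((offset | len) & mask) != 0;
}
```

For the sector-63 layout mentioned in the commit message, a write at byte offset 63 * 512 = 32256 is not a multiple of a 4096-byte block, so `is_unaligned(63 * 512, 4096, 4096)` is nonzero and such a write would take the serialized path.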

ext4: make grpinfo slab cache names static · 2892c15d

Committed by Eric Sandeen on Feb 12, 2011

In 2.6.37 I was running into oopses with repeated module
loads & unloads.  I tracked this down to:

fb1813f4 ext4: use dedicated slab caches for group_info structures

(this was in addition to the features advert unload problem)

The kstrdup & subsequent kfree of the cache name was causing
a double free.  In slub, at least, if I read it right it allocates
& frees the name itself, slab seems to do something different...
so in slub I think we were leaking -our- cachep->name, and double
freeing the one allocated by slub.

After getting lost in slab/slub/slob a bit, I just looked at other
sized-caches that get allocated.  jbd2, biovec, sgpool all do it
more or less the way jbd2 does.  Below patch follows the jbd2
method of dynamically allocating a cache at mount time from
a list of static names.

(This might also possibly fix a race creating the caches with
parallel mounts running).

[Folded in a fix from Dan Carpenter which fixed an off-by-one error in
the original patch]

Cc: stable@kernel.org
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

2892c15d

vfs: call rcu_barrier after ->kill_sb() · d863b50a

Committed by Boaz Harrosh on Feb 10, 2011

In commit fa0d7e3d ("fs: icache RCU free inodes"), we use rcu free
inode instead of freeing the inode directly.  It causes a crash when we
rmmod immediately after we umount the volume[1].

So we need to call rcu_barrier after we kill_sb so that the inode is
freed before we do rmmod.  The idea is inspired by Aneesh Kumar.
rcu_barrier will wait for all callbacks to end before proceeding.  The
original patch was done by Tao Ma, but synchronize_rcu() is not enough
here.

1. http://marc.info/?l=linux-fsdevel&m=129680863330185&w=2

Tested-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d863b50a

Fix possible filp_cachep memory corruption · 2dab5974

Committed by Linus Torvalds on Feb 11, 2011

In commit 31e6b01f ("fs: rcu-walk for path lookup") we started doing
path lookup using RCU, which then falls back to a careful non-RCU lookup
in case of problems (LOOKUP_REVAL).  So do_filp_open() has this "re-do
the lookup carefully" looping case.

However, that means that we must not release the open-intent file data
if we are going to loop around and use it once more!

Fix this by moving the release of the open-intent data to the function
that allocates it (do_filp_open() itself) rather than the helper
functions that can get called multiple times (finish_open() and
do_last()).  This makes the logic for the lifetime of that field much
more obvious, and avoids the possible double free.

Reported-by: J. R. Okajima <hooanon05@yahoo.co.jp>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2dab5974
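
The ownership rule from the filp_cachep commit above, "release the data in the function that allocates it, not in helpers that may run more than once", can be sketched in user-space C. `intent`, `try_lookup()` and `open_with_retry()` are hypothetical, and the fast-pass-fails behaviour is contrived to force the retry. This is a model of the lifetime rule, not the VFS code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the open-intent data. */
struct intent {
    int live;     /* 1 between allocation and release */
    int releases; /* how many times it has been released */
};

/* Helper that may run more than once; it must not release the intent. */
static int try_lookup(struct intent *it, int careful)
{
    assert(it != NULL && it->live); /* trips if an earlier pass released it */
    return careful ? 0 : -1;        /* contrived: only the careful pass succeeds */
}

/* Owner: allocates, retries if needed, and releases exactly once. */
static int open_with_retry(struct intent *it)
{
    int err;

    it->live = 1;                /* "allocate" */
    err = try_lookup(it, 0);     /* fast pass */
    if (err)
        err = try_lookup(it, 1); /* redo carefully with the same intent */
    it->live = 0;                /* single release point, in the owner */
    it->releases++;
    return err;
}
```

If the helper released the intent on its failure path, the second pass would operate on freed data and the owner's release would be a double free. Keeping the release in the owner makes the lifetime obvious.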

dlm: use single thread workqueues · 6b155c8f

Committed by David Teigland on Feb 11, 2011

The recent commit dcce240e, which switched the send and recv threads
to cmwq, introduced problems, apparently due to multiple workqueue
threads.  Single threads make the problems go away, so return to
that until we fully understand the concurrency issues with
multiple threads.

Signed-off-by: David Teigland <teigland@redhat.com>

6b155c8f

11 Feb, 2011 1 commit

cifs: don't always drop malformed replies on the floor (try ) · 71823baf

Committed by Jeff Layton on Feb 10, 2011

Slight revision to this patch...use min_t() instead of conditional
assignment. Also, remove the FIXME comment and replace it with the
explanation that Steve gave earlier.

After receiving a packet, we currently check the header. If it's no
good, then we toss it out and continue the loop, leaving the caller
waiting on that response.

In cases where the packet has length inconsistencies, but the MID is
valid, this leads to unneeded delays. That's especially problematic now
that the client waits indefinitely for responses.

Instead, don't immediately discard the packet if checkSMB fails. Try to
find a matching mid_q_entry, mark it as having a malformed response and
issue the callback.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>

71823baf

10 Feb, 2011 1 commit

cifs: clean up checks in cifs_echo_request · 195291e6

Committed by Jeff Layton on Feb 09, 2011

Follow-on patch to 7e90d705 which is already in Steve's tree...

The check for tcpStatus == CifsGood is not meaningful since it doesn't
indicate whether the NEGOTIATE request has been done. Also, clarify
why we're checking for maxBuf == 0.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>

195291e6

09 Feb, 2011 1 commit

[CIFS] Do not send SMBEcho requests on new sockets until SMBNegotiate · 7e90d705

Committed by Steve French on Feb 08, 2011

In order to determine whether an SMBEcho request can be sent
we need to know that the socket is established (server tcpStatus == CifsGood)
AND that an SMB NegotiateProtocol has been sent (server maxBuf != 0).
Without the second check we can send an Echo request during reconnection
before the server can accept it.

CC: JG <jg@cms.ac>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>

7e90d705

08 Feb, 2011 3 commits

Btrfs: Fix page count calculation · 3a90983d

Committed by Yan, Zheng on Jan 18, 2011

Take the offset of the start position into account when calculating the page count.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

3a90983d
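
Why the start offset matters for the page count in the commit above can be shown with a short calculation. The commit message does not quote the formula, so `pages_spanned()` below is the usual round-up expression, given as an assumption. `PAGE_SZ` and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SZ 4096u /* assumed page size for the example */

/* Naive count that ignores where in the first page the range starts. */
static uint64_t pages_naive(uint64_t len)
{
    return (len + PAGE_SZ - 1) / PAGE_SZ;
}

/* Count the pages touched by `len` bytes starting at file position `pos`. */
static uint64_t pages_spanned(uint64_t pos, uint64_t len)
{
    uint64_t offset = pos & (PAGE_SZ - 1); /* offset within the first page */

    if (len == 0)
        return 0;
    return (offset + len + PAGE_SZ - 1) / PAGE_SZ;
}
```

A 2-byte write at position 4095 touches the last byte of one page and the first byte of the next. The naive count gives 1 page, while the offset-aware count gives 2.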

ext4: Fix data corruption with multi-block writepages support · d50bdd5a

Committed by Curt Wohlgemuth on Feb 07, 2011

This fixes a corruption problem with the multi-block
writepages submittal change for ext4, from commit
bd2d0210 ("ext4: use bio
layer instead of buffer layer in mpage_da_submit_io").

(Note that this corruption is not present in 2.6.37 on
ext4, because the corruption was detected after the
feature was merged in 2.6.37-rc1, and so it was turned
off by adding a non-default mount option,
mblk_io_submit.  With this commit, which hopefully
fixes the last of the bugs with this feature, we'll be
able to turn on this performance feature by default in
2.6.38, and remove the mblk_io_submit option.)

The ext4 code path to bundle multiple pages for
writeback in ext4_bio_write_page() had a bug: we should
be clearing buffer head dirty flags *before* we submit
the bio, not in the completion routine.

The patch below was tested on 2.6.37 under KVM with the
postgresql script which was submitted by Jon Nelson as
documented in commit 1449032b.

Without the patch, I'd hit the corruption problem about
50-70% of the time.  With the patch, I executed the
script > 100 times with no corruption seen.

I also fixed a bug to make sure ext4_end_bio() doesn't
dereference the bio after the bio_put() call.

Reported-by: Jon Nelson <jnelson@jamponi.net>
Reported-by: Matthias Bayer <jackdachef@gmail.com>
Signed-off-by: Curt Wohlgemuth <curtw@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

d50bdd5a

cifs: remove checks for ses->status == CifsExiting · d402539b

Committed by Jeff Layton on Feb 07, 2011

ses->status is never set to CifsExiting, so these checks are
always false.

Tested-by: JG <jg@cms.ac>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>

d402539b

06 Feb, 2011 3 commits

btrfs: Drop __exit attribute on btrfs_exit_compress · 8e4eef7a

Committed by Alexey Charkov on Feb 02, 2011

As this function is called in some error paths while not
removing the module, the __exit attribute prevents the kernel
image from linking when btrfs is compiled in statically.

Signed-off-by: Alexey Charkov <alchark@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

8e4eef7a

btrfs: cleanup error handling in btrfs_unlink_inode() · 554233a6

Committed by Tsutomu Itoh on Feb 03, 2011

When btrfs_alloc_path() fails, btrfs_free_path() need not be called.
Therefore, change the branch target for that failure case.

Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

554233a6

Btrfs: exclude super blocks when we read in block groups · 3c14874a

Committed by Josef Bacik on Feb 02, 2011

This has been resulting in a BUG_ON(ret) after btrfs_reserve_extent in
btrfs_cow_file_range. The reason is we don't actually calculate the bytes_super
for a block group until we go to cache it, which means that the space_info can
hand out reservations for space that it doesn't actually have, and we can run
out of data space. This is also a problem if you are using space caching since
we don't ever calculate bytes_super for the block groups. So instead, every time
we read a block group, call exclude_super_stripes, which calculates the
bytes_super for the block group so it can be left out of the space_info. Then
whenever caching completes we just call free_excluded_extents so that the super
excluded extents are freed up. Also if we are unmounting and we hit any block
groups that haven't been cached we still need to call free_excluded_extents to
make sure things are cleaned up properly. Thanks,

Reported-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

3c14874a