Commits · 361821854b71fc3a53c9e17701538247bddbd4ba · openanolis / cloud-kernel

22 Feb, 2011 3 commits

Docbook: add fs/eventfd.c and fix typos in it · 36182185

Authored by Randy Dunlap on Feb 20, 2011

Add fs/eventfd.c to filesystems docbook.
Make typo corrections in fs/eventfd.c.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

36182185

[CIFS] update cifs version · eed9e830

Authored by Steve French on Feb 21, 2011

Update version to 1.71 so we can more easily spot modules with the last two fixes
Signed-off-by: Steve French <sfrench@us.ibm.com>

eed9e830

cifs: Fix regression in LANMAN (LM) auth code · 5e640927

Authored by Shirish Pargaonkar on Feb 17, 2011

LANMAN response length was changed to 16 bytes instead of 24 bytes.
Revert it back to 24 bytes.
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
CC: stable@kernel.org
Signed-off-by: Steve French <sfrench@us.ibm.com>

5e640927

20 Feb, 2011 1 commit

ceph: keep reference to parent inode on ceph_dentry · 97d79b40

Authored by Yehuda Sadeh on Jan 18, 2011

When creating a new dentry we now hold a reference to the parent
inode in the ceph_dentry.  This is required due to the new RCU
changes from 949854d0, which set dentry->d_parent to NULL in d_kill before
calling the ->release() callback.  If/when that behavior is changed, we can
revert this hack.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

97d79b40

18 Feb, 2011 1 commit

fs/partitions: Validate map_count in Mac partition tables · fa7ea87a

Authored by Timo Warns on Feb 17, 2011

Validate number of blocks in map and remove redundant variable.
Signed-off-by: Timo Warns <warns@pre-sense.de>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fa7ea87a

17 Feb, 2011 5 commits

cifs: fix handling of scopeid in cifs_convert_address · 96161256

Authored by Jeff Layton on Feb 16, 2011

The code finds the '%' sign in an ipv6 address and copies that to a
buffer allocated on the stack. It then ignores that buffer, and passes
'pct' to simple_strtoul(), which doesn't work right because we're
comparing 'endp' against a completely different string.

Fix it by passing the correct pointer. While we're at it, this is a
good candidate for conversion to strict_strtoul as well.

Cc: stable@kernel.org
Cc: David Howells <dhowells@redhat.com>
Reported-by: Björn JACKE <bj@sernet.de>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>

96161256
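
The pointer mix-up described in this commit can be illustrated with a small userspace sketch. It is not the kernel patch: `parse_scope_id()` is a hypothetical helper, and the standard `strtoul()` stands in for the kernel's `simple_strtoul()`/`strict_strtoul()`. The point it shows is that the string handed to the parser and the string `endp` is compared against must be the same buffer.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not the kernel function): parse the numeric scope id
 * that follows '%' in an IPv6 literal such as "fe80::1%3".
 * Returns 0 on success, -1 if there is no valid numeric scope id.
 */
int parse_scope_id(const char *addr, unsigned long *scope_id)
{
    char scope[13];     /* local copy of the text after '%' */
    char *endp;
    const char *pct;
    size_t len;

    assert(addr != NULL && scope_id != NULL);

    pct = strchr(addr, '%');
    if (pct == NULL)
        return -1;

    len = strlen(pct + 1);
    if (len == 0 || len >= sizeof(scope))
        return -1;
    memcpy(scope, pct + 1, len + 1);    /* copy includes the trailing NUL */

    /* strtoul() would accept a sign or leading blanks; reject them here. */
    if (!isdigit((unsigned char)scope[0]))
        return -1;

    /* Parse the copy, then compare endp against that same copy. */
    errno = 0;
    *scope_id = strtoul(scope, &endp, 10);
    if (errno != 0 || endp != scope + len)
        return -1;
    return 0;
}
```

A strict parser of this shape rejects interface names such as `eth0` outright, which is the behaviour a conversion to a strict variant would be expected to give.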

block: revert block_dev read-only check · e51900f7

Authored by Chuck Ebbert on Feb 16, 2011

This reverts commit 75f1dc0d ("block: check bdev_read_only() from
blkdev_get()").  That commit added stricter checking to make sure
devices that were being used read-only were actually opened in that
mode.

It turns out that the change breaks a bunch of kernel code that opens
block devices.  Affected systems include dm, md, and the loop device.
Because strict checking for read-only opens of block devices was not
done before this, the code that opens the devices was opening them
read-write even if they were being used read-only.  Auditing all that
code will take time, and new userspace packages for dm, mdadm, etc.
will also be required.
Signed-off-by: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

e51900f7

nfsd: correctly handle return value from nfsd_map_name_to_* · 47c85291

Authored by NeilBrown on Feb 16, 2011

These functions return an nfs status, not a host_err.  So don't
try to convert before returning.

This is a regression introduced by
3c726023; I fixed up two of the callers,
but missed these two.

Cc: stable@kernel.org
Reported-by: Herbert Poetzl <herbert@13thfloor.at>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

47c85291

vfs: fix BUG_ON() in fs/namei.c:1461 · 3abb17e8

Authored by Linus Torvalds on Feb 16, 2011

When Al moved the nameidata_dentry_drop_rcu_maybe() call into the
do_follow_link function in commit 844a3917 ("nothing in
do_follow_link() is going to see RCU"), he mistakenly left the

	BUG_ON(inode != path->dentry->d_inode);

behind.  Which would otherwise be ok, but that BUG_ON() really needs to
be _after_ dropping RCU, since the dentry isn't necessarily stable
otherwise.

So complete the code movement in that commit, and move the BUG_ON() into
do_follow_link() too.  This means that we need to pass in 'inode' as an
argument (just for this one use), but that's a small thing.  And
eventually we may be confident enough in our path lookup that we can
just remove the BUG_ON() and the unnecessary inode argument.
Reported-and-tested-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

3abb17e8

workqueue, freezer: unify spelling of 'freeze' + 'able' to 'freezable' · 58a69cb4

Authored by Tejun Heo on Feb 16, 2011

There are two spellings in use for 'freeze' + 'able' - 'freezable' and
'freezeable'.  The former is the more prominent one.  The latter is
mostly used by workqueue and in a few other odd places.  Unify the
spelling to 'freezable'.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: Dmitry Torokhov <dtor@mail.ru>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Alex Dubov <oakad@yahoo.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Steven Whitehouse <swhiteho@redhat.com>

58a69cb4

15 Feb, 2011 12 commits

s390: remove task_show_regs · 261cd298

Authored by Martin Schwidefsky on Feb 15, 2011

task_show_regs used to be a debugging aid in the early bringup days
of Linux on s390. /proc/<pid>/status is a world-readable file, so it
is not a good idea to show the registers of a process there. The only
correct fix is to remove task_show_regs.
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

261cd298

get rid of nameidata_dentry_drop_rcu() calling nameidata_drop_rcu() · 4e924a4f

Authored by Al Viro on Feb 15, 2011

can't happen anymore and didn't work right anyway
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

4e924a4f

drop out of RCU in return_reval · f60aef7e

Authored by Al Viro on Feb 15, 2011

... thus killing the need to handle drop-from-RCU in d_revalidate()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

f60aef7e

split do_revalidate() into RCU and non-RCU cases · f5e1c1c1

Authored by Al Viro on Feb 15, 2011

fixing oopsen in lookup_one_len()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

f5e1c1c1

in do_lookup() split RCU and non-RCU cases of need_revalidate · 24643087

Authored by Al Viro on Feb 15, 2011

and use unlikely() instead of gotos, for fsck sake...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

24643087

nothing in do_follow_link() is going to see RCU · 844a3917

Authored by Al Viro on Feb 15, 2011

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

844a3917

Btrfs: check return value of alloc_extent_map() · c26a9203

Authored by Tsutomu Itoh on Feb 14, 2011

I add a check on the return value of alloc_extent_map() in several places.
In addition, alloc_extent_map() returns only a valid address or NULL.
Therefore, checking with IS_ERR() is unnecessary, so I remove the IS_ERR() checks.
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

c26a9203

Btrfs - Fix memory leak in btrfs_init_new_device() · 67100f25

Authored by Ilya Dryomov on Feb 06, 2011

Memory allocated by calling kstrdup() should be freed.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

67100f25

btrfs: prevent heap corruption in btrfs_ioctl_space_info() · 51788b1b

Authored by Dan Rosenberg on Feb 14, 2011

Commit bf5fc093 refactored
btrfs_ioctl_space_info() and introduced several security issues.

space_args.space_slots is an unsigned 64-bit type controlled by a
possibly unprivileged caller.  The comparison as a signed int type
allows providing values that are treated as negative and cause the
subsequent allocation size calculation to wrap, or be truncated to 0.
By providing a size that's truncated to 0, kmalloc() will return
ZERO_SIZE_PTR.  It's also possible to provide a value smaller than the
slot count.  The subsequent loop ignores the allocation size when
copying data in, resulting in a heap overflow or write to ZERO_SIZE_PTR.

The fix changes the slot count type and comparison typecast to u64,
which prevents truncation or signedness errors, and also ensures that we
don't copy more data than we've allocated in the subsequent loop.  Note
that zero-size allocations are no longer possible since there is already
an explicit check for space_args.space_slots being 0 and truncation of
this value is no longer an issue.
Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
Signed-off-by: Josef Bacik <josef@redhat.com>
Reviewed-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

51788b1b
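
The signedness problem described in this commit can be sketched in plain C. This is not the btrfs code: `struct space_slot` and the three functions are hypothetical stand-ins, and the buggy variant relies on the usual compiler behaviour of reducing an out-of-range value modulo 2^32 when it is cast to `int`. The fixed variant clamps in the unsigned 64-bit domain, and the copy loop is bounded by what was actually allocated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one record copied back to the caller. */
struct space_slot {
    uint64_t flags;
    uint64_t total_bytes;
    uint64_t used_bytes;
};

/*
 * Buggy pattern: the caller-supplied 64-bit count is compared as a signed
 * int, so a huge value can look negative and the clamp is skipped.
 */
uint64_t slots_to_alloc_buggy(uint64_t requested, int available)
{
    if ((int)requested > available)
        return (uint64_t)available;
    return requested;
}

/* Fixed pattern: clamp in the unsigned 64-bit domain, with no narrowing. */
uint64_t slots_to_alloc_fixed(uint64_t requested, uint64_t available)
{
    return requested < available ? requested : available;
}

/* Copy at most `allocated` slots so the loop cannot overrun the buffer. */
size_t copy_slots(struct space_slot *dst, size_t allocated,
                  const struct space_slot *src, size_t src_count)
{
    size_t i;
    size_t n = src_count < allocated ? src_count : allocated;

    assert(dst != NULL && src != NULL);
    for (i = 0; i < n; i++)
        dst[i] = src[i];
    return n;
}
```

With four slots available, a request of `0xFFFFFFFF` passes through the buggy clamp unchanged, while the fixed clamp reduces it to four.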

Btrfs: Fix balance panic · 6848ad64

Authored by Yan, Zheng on Feb 14, 2011

Mark the cloned backref_node as checked in clone_backref_node()
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

6848ad64

Btrfs: don't release pages when we can't clear the uptodate bits · e3f24cc5

Authored by Chris Mason on Feb 14, 2011

Btrfs tracks uptodate state in an rbtree as well as in the
page bits.  This is supposed to enable us to use block sizes other than
the page size, but there are a few parts still missing before that
completely works.

But, our readpage routine trusts this additional range based tracking
of uptodateness, much in the same way the buffer head up to date bits
are trusted for the other filesystems.

The problem is that sometimes we need to allocate memory in order to
split records in the rbtree, even when we are just clearing bits.  This
can be difficult when our clearing function is called GFP_ATOMIC, which
can happen in the releasepage path.

So, what happens today looks like this:

releasepage called with GFP_ATOMIC
btrfs_releasepage calls clear_extent_bit
clear_extent_bit fails to allocate ram, leaving the up to date bit set
btrfs_releasepage returns success

The end result is the page being gone, but btrfs thinking the range is
up to date.   Later on if someone tries to read that same page, the
btrfs readpage code will return immediately thinking the page is already
up to date.

This commit fixes things to fail the releasepage when we can't clear the
extent state bits.  It covers both data pages and metadata tree blocks.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

e3f24cc5
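
The failure sequence listed in this commit, and the behaviour after the fix, can be modelled with a toy state machine. Everything here is hypothetical: `struct tracked_page`, `clear_range_uptodate()` and `release_page()` are illustrative names, and the `can_alloc` flag simply simulates whether the memory needed to split a record is available.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a page whose state is also tracked by range. */
struct tracked_page {
    bool range_uptodate;    /* state kept in the range-tracking tree */
    bool page_present;      /* whether the page is still cached */
};

/*
 * Stand-in for clearing the range bits. `can_alloc` simulates whether the
 * allocation needed to split a record succeeds.
 */
int clear_range_uptodate(struct tracked_page *p, bool can_alloc)
{
    if (!can_alloc)
        return -1;          /* the up-to-date bit stays set */
    p->range_uptodate = false;
    return 0;
}

/* Returns 1 if the page was released, 0 if the release was refused. */
int release_page(struct tracked_page *p, bool can_alloc)
{
    assert(p != NULL);
    if (clear_range_uptodate(p, can_alloc) != 0)
        return 0;           /* keep the page so both views stay consistent */
    p->page_present = false;
    return 1;
}
```

The invariant the sketch preserves is that a page is never dropped while its range is still marked up to date.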

Btrfs: fix page->private races · eb14ab8e

Authored by Chris Mason on Feb 10, 2011

There is a race where btrfs_releasepage can drop the
page->private contents just as alloc_extent_buffer is setting
up pages for metadata.  Because of how the Btrfs page flags work,
this results in us skipping the crc on the page during IO.

This patch solves the race by waiting until after the extent buffer
is inserted into the radix tree before it sets page private.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

eb14ab8e

14 Feb, 2011 11 commits

nfsd: break lease on unlink due to rename · 83f6b0c1

Authored by J. Bruce Fields on Feb 06, 2011

4795bb37 "nfsd: break lease on unlink,
link, and rename", only broke the lease on the file that was being
renamed, and didn't handle the case where the target path refers to an
already-existing file that will be unlinked by a rename--in that case
the target file should have any leases broken as well.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

83f6b0c1

nfsd4: acquire only one lease per file · acfdf5c3

Authored by J. Bruce Fields on Jan 31, 2011

Instead of acquiring one lease each time another client opens a file,
nfsd can acquire just one lease to represent all of them, and reference
count it to determine when to release it.

This fixes a regression introduced by
c45821d2 "locks: eliminate fl_mylease
callback": after that patch, only the struct file * is used to determine
who owns a given lease.  But since we recently converted the server to
share a single struct file per open, if we acquire multiple leases on
the same file from nfsd, it then becomes impossible on unlocking a lease
to determine which of those leases (all of whom share the same struct
file *) we meant to remove.

Thanks to Takashi Iwai <tiwai@suse.de> for catching a bug in a previous
version of this patch.
Tested-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

acfdf5c3
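
The "one lease, reference counted" idea from this commit can be sketched as follows. The names are hypothetical and the lease itself is reduced to a boolean; the real nfsd code manages a `struct file_lock` and does its own locking, none of which is modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-file state: one lease shared by all users of the file. */
struct file_lease_state {
    int  users;         /* how many opens currently rely on the lease */
    bool lease_held;    /* whether the single underlying lease exists */
};

/* The first user acquires the lease; later users only take a reference. */
void lease_get(struct file_lease_state *f)
{
    assert(f != NULL);
    if (f->users++ == 0)
        f->lease_held = true;
}

/* The last user to drop its reference releases the lease. */
void lease_put(struct file_lease_state *f)
{
    assert(f != NULL && f->users > 0);
    if (--f->users == 0)
        f->lease_held = false;
}
```

Because there is only ever one lease per file, releasing it no longer requires telling apart several leases that share the same `struct file *`.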

nfsd4: modify fi_delegations under recall_lock · 5d926e8c

Authored by J. Bruce Fields on Feb 07, 2011

Modify fi_delegations only under the recall_lock, allowing us to use
that list on lease breaks.

Also some trivial cleanup to simplify later changes.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

5d926e8c

nfsd4: remove unused deleg dprintk's. · 65bc58f5

Authored by J. Bruce Fields on Feb 07, 2011

These aren't all that useful, and get in the way of the next steps.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

65bc58f5

nfsd4: split lease setting into separate function · edab9782

Authored by J. Bruce Fields on Jan 31, 2011

Splitting some code into a separate function which we'll be adding some
more to.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

edab9782

nfsd4: fix leak on allocation error · dd239cc0

Authored by J. Bruce Fields on Jan 31, 2011

Also share some common exit code.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

dd239cc0

nfsd4: add helper function for lease setup · 22d38c4c

Authored by J. Bruce Fields on Jan 31, 2011

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

22d38c4c

nfsd4: split up nfsd_break_deleg_cb · 6b57d9c8

Authored by J. Bruce Fields on Jan 31, 2011

We'll be adding some more code here soon.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

6b57d9c8

NFSD: memory corruption due to writing beyond the stat array · 3aa6e0aa

Authored by Konstantin Khorenko on Feb 01, 2011

If nfsd fails to find a file exported via NFS in the readahead cache, it
should increment the corresponding nfsdstats counter (ra_depth[10]), but due to a
bug it may instead write to ra_depth[11], corrupting the following field.

In a kernel with NFSDv4 compiled in the corruption takes the form of an
increment of a counter of the number of NFSv4 operation 0's received; since
there is no operation 0, this is harmless.

In a kernel with NFSDv4 disabled it corrupts whatever happens to be in the
memory beyond nfsdstats.
Signed-off-by: Konstantin Khorenko <khorenko@openvz.org>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

3aa6e0aa
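
The off-by-one described in this commit can be illustrated with a small sketch. The commit message only states that a miss should be counted in `ra_depth[10]` and could land in `ra_depth[11]`; the bucket formula `depth * 10 / ra_size` and the clamp used below are assumptions made for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define RA_BUCKETS 11   /* valid indices 0..10; bucket 10 counts misses */

/* Hypothetical statistics block, loosely modelled on the description. */
struct ra_stats {
    unsigned int ra_size;                /* number of readahead cache entries */
    unsigned int ra_depth[RA_BUCKETS];   /* histogram of search depths */
};

/*
 * Map a search depth to a histogram bucket (assumed formula). The clamp
 * guarantees that an over-large depth still lands in the last bucket.
 */
unsigned int ra_bucket(const struct ra_stats *s, unsigned int depth)
{
    unsigned int idx;

    assert(s != NULL && s->ra_size > 0);
    idx = depth * 10 / s->ra_size;
    return idx < RA_BUCKETS ? idx : RA_BUCKETS - 1;
}

/* Record one lookup in the histogram. */
void ra_account(struct ra_stats *s, unsigned int depth)
{
    s->ra_depth[ra_bucket(s, depth)]++;
}
```

Under that assumed formula, a depth of `ra_size * 11 / 10` would yield index 11 without the clamp, which is one past the end of an 11-element array.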

NFSD: use nfserr for status after decode_cb_op_status · 0af3f814

Authored by Benny Halevy on Jan 13, 2011

Bugs introduced in 85a56480
"NFSD: Update XDR decoders in NFSv4 callback client"

Cc: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

0af3f814

nfsd: don't leak dentry count on mnt_want_write failure · 541ce98c

Authored by J. Bruce Fields on Jan 14, 2011

The exit cleanup isn't quite right here.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

541ce98c

12 Feb, 2011 6 commits

jbd2: call __jbd2_log_start_commit with j_state_lock write locked · e4471831

Authored by Theodore Ts'o on Feb 12, 2011

On an SMP ARM system running ext4, I've received a report that the
first J_ASSERT in jbd2_journal_commit_transaction has been triggering:

	J_ASSERT(journal->j_running_transaction != NULL);

While investigating possible causes for this problem, I noticed that
__jbd2_log_start_commit() is getting called with j_state_lock only
read-locked, in spite of the fact that it might modify
j_commit_request.  Fix this by grabbing the necessary information so
we can test to see if we need to start a new transaction before
dropping the read lock, and then calling jbd2_log_start_commit() which
will grab the write lock.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

e4471831

ext4: serialize unaligned asynchronous DIO · e9e3bcec

Authored by Eric Sandeen on Feb 12, 2011

ext4 has a data corruption case when doing non-block-aligned
asynchronous direct IO into a sparse file, as demonstrated
by xfstest 240.

The root cause is that while ext4 preallocates space in the
hole, mappings of that space still look "new" and 
dio_zero_block() will zero out the unwritten portions.  When
more than one AIO thread is going, they both find this "new"
block and race to zero out their portion; this is uncoordinated
and causes data corruption.

Dave Chinner fixed this for xfs by simply serializing all
unaligned asynchronous direct IO.  I've done the same here.
The difference is that we only wait on conversions, not all IO.
This is a very big hammer, and I'm not very pleased with
stuffing this into ext4_file_write().  But since ext4 is
DIO_LOCKING, we need to serialize it at this high level.

I tried to move this into ext4_ext_direct_IO, but by then
we have the i_mutex already, and we will wait on the
work queue to do conversions - which must also take the
i_mutex.  So that won't work.

This was originally exposed by qemu-kvm installing to
a raw disk image with a normal sector-63 alignment.  I've
tested a backport of this patch with qemu, and it does
avoid the corruption.  It is also quite a lot slower
(14 min for package installs, vs. 8 min for well-aligned)
but I'll take slow correctness over fast corruption any day.

Mingming suggested that we can track outstanding
conversions, and wait on those so that non-sparse
files won't be affected, and I've implemented that here;
unaligned AIO to nonsparse files won't take a perf hit.

[tytso@mit.edu: Keep the mutex as a hashed array instead
 of bloating the ext4 inode]

[tytso@mit.edu: Fix up namespace issues so that global
 variables are protected with an "ext4_" prefix.]
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

e9e3bcec
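
The notion of "unaligned" direct IO used in this commit can be made concrete with a short check. This is a sketch, not the ext4 implementation: `io_is_unaligned()` is a hypothetical helper that treats a request as unaligned when either its offset or its length is not a multiple of the filesystem block size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if the request does not both start and end on a
 * block boundary. block_size must be a non-zero power of two.
 */
bool io_is_unaligned(uint64_t offset, uint64_t len, uint32_t block_size)
{
    uint64_t mask;

    assert(block_size != 0 && (block_size & (block_size - 1)) == 0);
    mask = (uint64_t)block_size - 1;

    /* Any low bit set in the offset or the length means a partial block. */
    return ((offset | len) & mask) != 0;
}
```

For the sector-63 layout mentioned above, with 512-byte sectors and 4096-byte blocks, a partition starts at byte 32256, so even a 4096-byte write at the start of the partition straddles two filesystem blocks of the underlying image.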

ext4: make grpinfo slab cache names static · 2892c15d

Authored by Eric Sandeen on Feb 12, 2011

In 2.6.37 I was running into oopses with repeated module
loads & unloads.  I tracked this down to:

fb1813f4 ext4: use dedicated slab caches for group_info structures

(this was in addition to the features advert unload problem)

The kstrdup & subsequent kfree of the cache name was causing
a double free.  In slub, at least, if I read it right it allocates
& frees the name itself, slab seems to do something different...
so in slub I think we were leaking -our- cachep->name, and double
freeing the one allocated by slub.

After getting lost in slab/slub/slob a bit, I just looked at other
sized-caches that get allocated.  jbd2, biovec, sgpool all do it
more or less the way jbd2 does.  Below patch follows the jbd2
method of dynamically allocating a cache at mount time from
a list of static names.

(This might also possibly fix a race creating the caches with
parallel mounts running).

[Folded in a fix from Dan Carpenter which fixed an off-by-one error in
the original patch]

Cc: stable@kernel.org
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

2892c15d
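
The "list of static names" approach described in this commit can be sketched as a lookup table indexed by block size. The name strings, the table size, and `grpinfo_cache_name()` are illustrative assumptions rather than the patch itself. Because the names have static storage duration, nothing needs to be duplicated or freed, and the bounds check guards against the kind of off-by-one the message mentions.

```c
#include <assert.h>
#include <stddef.h>

#define MIN_BLOCK_LOG_SIZE 10   /* assumed smallest block size: 1 KiB */
#define NR_GRPINFO_CACHES  8    /* assumed range: 1 KiB .. 128 KiB */

/* Static names: never allocated with kstrdup(), so never freed either. */
static const char *const grpinfo_cache_names[NR_GRPINFO_CACHES] = {
    "ext4_groupinfo_1k",  "ext4_groupinfo_2k",
    "ext4_groupinfo_4k",  "ext4_groupinfo_8k",
    "ext4_groupinfo_16k", "ext4_groupinfo_32k",
    "ext4_groupinfo_64k", "ext4_groupinfo_128k",
};

/*
 * Return the cache name for a block size given as log2(bytes), or NULL if
 * the size falls outside the table.
 */
const char *grpinfo_cache_name(unsigned int blocksize_bits)
{
    unsigned int idx;

    if (blocksize_bits < MIN_BLOCK_LOG_SIZE)
        return NULL;
    idx = blocksize_bits - MIN_BLOCK_LOG_SIZE;
    if (idx >= NR_GRPINFO_CACHES)   /* '>=' rather than '>' avoids overrun */
        return NULL;
    assert(grpinfo_cache_names[idx] != NULL);
    return grpinfo_cache_names[idx];
}
```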

vfs: call rcu_barrier after ->kill_sb() · d863b50a

Authored by Boaz Harrosh on Feb 10, 2011

In commit fa0d7e3d ("fs: icache RCU free inodes"), we use rcu free
inode instead of freeing the inode directly.  It causes a crash when we
rmmod immediately after we umount the volume[1].

So we need to call rcu_barrier after we kill_sb so that the inode is
freed before we do rmmod.  The idea is inspired by Aneesh Kumar.
rcu_barrier will wait for all callbacks to end before proceeding.  The
original patch was done by Tao Ma, but synchronize_rcu() is not enough
here.

1. http://marc.info/?l=linux-fsdevel&m=129680863330185&w=2
Tested-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d863b50a

Fix possible filp_cachep memory corruption · 2dab5974

Authored by Linus Torvalds on Feb 11, 2011

In commit 31e6b01f ("fs: rcu-walk for path lookup") we started doing
path lookup using RCU, which then falls back to a careful non-RCU lookup
in case of problems (LOOKUP_REVAL).  So do_filp_open() has this "re-do
the lookup carefully" looping case.

However, that means that we must not release the open-intent file data
if we are going to loop around and use it once more!

Fix this by moving the release of the open-intent data to the function
that allocates it (do_filp_open() itself) rather than the helper
functions that can get called multiple times (finish_open() and
do_last()).  This makes the logic for the lifetime of that field much
more obvious, and avoids the possible double free.
Reported-by: J. R. Okajima <hooanon05@yahoo.co.jp>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2dab5974

dlm: use single thread workqueues · 6b155c8f

Authored by David Teigland on Feb 11, 2011

The recent commit to use cmwq for send and recv threads
dcce240e introduced problems,
apparently due to multiple workqueue threads.  Single threads
make the problems go away, so return to that until we fully
understand the concurrency issues with multiple threads.
Signed-off-by: David Teigland <teigland@redhat.com>

6b155c8f

11 Feb, 2011 1 commit

cifs: don't always drop malformed replies on the floor (try ) · 71823baf

Authored by Jeff Layton on Feb 10, 2011

Slight revision to this patch...use min_t() instead of conditional
assignment. Also, remove the FIXME comment and replace it with the
explanation that Steve gave earlier.

After receiving a packet, we currently check the header. If it's no
good, then we toss it out and continue the loop, leaving the caller
waiting on that response.

In cases where the packet has length inconsistencies, but the MID is
valid, this leads to unneeded delays. That's especially problematic now
that the client waits indefinitely for responses.

Instead, don't immediately discard the packet if checkSMB fails. Try to
find a matching mid_q_entry, mark it as having a malformed response and
issue the callback.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>

71823baf

openanolis / cloud-kernel synced successfully more than 1 year ago
