Commits · 97c68d00dbb2decda4b3dce79ce55df04246a834 · openeuler / Kernel

24 Apr, 2009 2 commits

check_unsafe_exec: s/lock_task_sighand/rcu_read_lock/ · 437f7fdb

Authored by Oleg Nesterov on Apr 24, 2009

write_lock(&current->fs->lock) guarantees we can't wrongly miss
LSM_UNSAFE_SHARE, this is what we care about. Use rcu_read_lock()
instead of ->siglock to iterate over the sub-threads. We must see
all CLONE_THREAD|CLONE_FS threads which didn't pass exit_fs(), it
takes fs->lock too.

With or without this patch we can miss the freshly cloned thread
and set LSM_UNSAFE_SHARE, we don't care.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Roland McGrath <roland@redhat.com>
[ Fixed lock/unlock typo - Hugh ]
Acked-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

do_execve() must not clear fs->in_exec if it was set by another thread · 8c652f96

Authored by Oleg Nesterov on Apr 24, 2009

If do_execve() fails after check_unsafe_exec(), it clears fs->in_exec
unconditionally. This is wrong if we race with our sub-thread which
also does do_execve:

	Two threads T1 and T2 and another process P, all share the same
	->fs.

	T1 starts do_execve(BAD_FILE). It calls check_unsafe_exec(), since
	->fs is shared, we set LSM_UNSAFE but not ->in_exec.

	P exits and decrements fs->users.

	T2 starts do_execve(), calls check_unsafe_exec(), now ->fs is not
	shared, we set fs->in_exec.

	T1 continues, open_exec(BAD_FILE) fails, we clear ->in_exec and
	return to the user-space.

	T1 does clone(CLONE_FS /* without CLONE_THREAD */).

	T2 continues without LSM_UNSAFE_SHARE while ->fs is shared with
	another process.

Change check_unsafe_exec() to return res = 1 if we set ->in_exec, and change
do_execve() to clear ->in_exec depending on res.

When do_execve() succeeds, it is safe to clear ->in_exec unconditionally.
It can be set only if we don't share ->fs with another process, and since
we already killed all sub-threads either ->in_exec == 0 or we are the
only user of this ->fs.

Also, we do not need fs->lock to clear fs->in_exec.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Roland McGrath <roland@redhat.com>
Acked-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
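The race described in this commit can be replayed in a small userspace model. This is only a sketch under stated assumptions: `struct fs_model`, `check_unsafe_exec_model()` and `exec_failed_model()` are hypothetical names, the model has no locking, and it keeps only the idea from the message that the failure path clears `in_exec` when, and only when, the same exec attempt set it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of the fields discussed above;
 * this is not the kernel's struct fs_struct. */
struct fs_model {
    int  users;    /* tasks currently sharing this fs */
    bool in_exec;  /* set by an exec that found the fs unshared */
};

/* Returns 1 only when this call set in_exec, 0 otherwise.
 * n_threads is the number of sharers inside the caller's thread group. */
static int check_unsafe_exec_model(struct fs_model *fs, int n_threads,
                                   bool *unsafe_share)
{
    assert(fs != NULL && unsafe_share != NULL);

    *unsafe_share = fs->users > n_threads;
    if (*unsafe_share || fs->in_exec)
        return 0;

    fs->in_exec = true;
    return 1;
}

/* Failure path: clear in_exec only if this exec attempt set it. */
static void exec_failed_model(struct fs_model *fs, int res)
{
    if (res > 0)
        fs->in_exec = false;
}
```

Replaying the T1/T2/P sequence with this model, T1's failed exec leaves the flag set by T2 untouched, because T1's own result was 0.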

22 Apr, 2009 4 commits

bio: use bio_kmalloc() in copy/map functions · a9e9dc24

Authored by Tejun Heo on Apr 15, 2009

Impact: remove possible deadlock condition

There is no reason to use mempool backed allocation for map functions.
Also, because kern mapping is used inside LLDs (e.g. for EH), using
mempool backed allocation can lead to deadlock under extreme
conditions (mempool already consumed by the time a request reached EH
and requests are blocked on EH).

Switch copy/map functions to bio_kmalloc().
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

bio: fix bio_kmalloc() · 451a9ebf

Authored by Tejun Heo on Apr 15, 2009

Impact: fix bio_kmalloc() and its destruction path

bio_kmalloc() was broken in two ways.

* bvec_alloc_bs() first allocates bvec using kmalloc() and then
  ignores it and allocates again like non-kmalloc bvecs.

* bio_kmalloc_destructor() didn't check for and free bio integrity
  data.

This patch fixes the above problems.  The kmalloc path is separated out
from bio_alloc_bioset() and allocates the requested number of bvecs as
inline bvecs.

* bio_alloc_bioset() no longer takes NULL @bs.  None other than
  bio_kmalloc() used it and outside users can't know how it was
  allocated anyway.

* Define and use BIO_POOL_NONE so that pool index check in
  bvec_free_bs() triggers if inline or kmalloc allocated bvec gets
  there.

* Relocate destructors on top of each allocation function so that it
  is clearer how they're used.

Jens Axboe suggested allocating bvecs inline.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
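The "inline bvecs" idea can be illustrated with a C flexible array member: the header and the requested number of vectors come from a single allocation and are released with a single free. This is a userspace sketch only; `struct bio_model`, `struct vec_model` and both functions are hypothetical stand-ins, and `calloc()`/`free()` take the place of the kernel allocator.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins, not the kernel's struct bio_vec / struct bio. */
struct vec_model {
    void        *page;
    unsigned int len;
    unsigned int offset;
};

struct bio_model {
    unsigned int      max_vecs;
    struct vec_model *io_vec;        /* points at inline_vecs below */
    struct vec_model  inline_vecs[]; /* flexible array member */
};

/* Allocate the header and nr_vecs vectors in one zeroed block. */
static struct bio_model *bio_alloc_inline_model(unsigned int nr_vecs)
{
    struct bio_model *bio;

    /* Reject counts whose size computation would overflow size_t. */
    if (nr_vecs > (SIZE_MAX - sizeof(*bio)) / sizeof(struct vec_model))
        return NULL;

    bio = calloc(1, sizeof(*bio) + (size_t)nr_vecs * sizeof(struct vec_model));
    if (!bio)
        return NULL;

    bio->max_vecs = nr_vecs;
    bio->io_vec = bio->inline_vecs;
    return bio;
}

/* One free releases header and vectors together. */
static void bio_free_inline_model(struct bio_model *bio)
{
    /* Sanity check: this destructor only handles inline vectors. */
    assert(bio == NULL || bio->io_vec == bio->inline_vecs);
    free(bio);
}
```

Because the vectors live inside the same block as the header, there is no second allocation for the destructor to track.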

hugetlbfs: return negative error code for bad mount option · c12ddba0

Authored by Akinobu Mita on Apr 21, 2009

This fixes the following BUG:

  # mount -o size=MM -t hugetlbfs none /huge
  hugetlbfs: Bad value 'MM' for mount option 'size=MM'
  ------------[ cut here ]------------
  kernel BUG at fs/super.c:996!

Due to

	BUG_ON(!mnt->mnt_sb);

in vfs_kern_mount().

Also, remove unused #include <linux/quotaops.h>

Cc: William Irwin <wli@holomorphy.com>
Cc: <stable@kernel.org>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
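The title's point, that a bad option must surface as a negative error code, can be sketched in userspace. The message does not spell out the mechanism, so treat the following as an assumption: callers that test `err < 0` would treat any non-negative return as success and carry on. `parse_size_option_model()` and `mount_model()` are hypothetical, and the parser accepts plain decimal byte counts only.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parser: accepts "size=<decimal bytes>" and nothing else. */
static int parse_size_option_model(const char *opt, unsigned long long *size)
{
    const char *val;
    char *end;

    assert(opt != NULL && size != NULL);

    if (strncmp(opt, "size=", 5) != 0)
        return -EINVAL;

    val = opt + 5;
    if (*val < '0' || *val > '9')
        return -EINVAL;              /* e.g. "size=MM" */

    errno = 0;
    *size = strtoull(val, &end, 10);
    if (errno != 0 || *end != '\0')
        return -EINVAL;              /* overflow or trailing junk */

    return 0;
}

/* Caller that treats only negative values as failure. */
static int mount_model(const char *opt)
{
    unsigned long long size = 0;
    int err = parse_size_option_model(opt, &size);

    if (err < 0)
        return err;
    return 0;
}
```

With a negative return, the `err < 0` test in the caller stops the mount attempt before any later step can run without a superblock.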

Btrfs: fix btrfs fallocate oops and deadlock · 546888da

Authored by Chris Mason on Apr 21, 2009

Btrfs fallocate was incorrectly starting a transaction with a lock held
on the extent_io tree for the file, which could deadlock. Strictly
speaking it was using join_transaction which would be safe, but it is better
to move the transaction outside of the lock.

When preallocated extents are overwritten, btrfs_mark_buffer_dirty was
being called on an unlocked buffer. This was triggering an assertion and
oops because the lock is supposed to be held.

The bug was calling btrfs_mark_buffer_dirty on a leaf after btrfs_del_item had
been run. btrfs_del_item takes care of dirtying things, so the solution is
to skip the btrfs_mark_buffer_dirty call in this case.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

21 Apr, 2009 22 commits

NFS: Fix the XDR iovec calculation in nfs3_xdr_setaclargs · 83404372

Authored by Trond Myklebust on Apr 20, 2009

Commit ae46141f (NFSv3: Fix posix ACL code)
introduces a bug in the calculation of the XDR header iovec. In the case
where we are inlining the acls, we need to adjust the length of the iovec
req->rq_svec, in addition to adjusting the total buffer length.
Tested-by: Leonardo Chiquitto <leonardo.lists@gmail.com>
Tested-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fs: Mark get_filesystem_list() as __init function. · 38e23c95

Authored by Tetsuo Handa on Apr 09, 2009

"int get_filesystem_list(char * buf)" is called only by
"static void __init get_fs_names(char *page)".
We can mark get_filesystem_list() as "__init".
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

kill vfs_stat_fd / vfs_lstat_fd · 2eae7a18

Authored by Christoph Hellwig on Apr 08, 2009

There's really no reason to keep vfs_stat_fd and vfs_lstat_fd with
Oleg's vfs_fstatat.  Use vfs_fstatat for the few cases having the
directory fd, and switch all others to vfs_stat / vfs_lstat.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Separate out common fstatat code into vfs_fstatat · 0112fc22

Authored by Oleg Drokin on Apr 08, 2009

This is a version incorporating Christoph's suggestion.

Separate out common *fstatat functionality into a single function
instead of duplicating it all over the code.
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ecryptfs: use memdup_user() · fd56d242

Authored by Li Zefan on Apr 08, 2009

Remove open-coded memdup_user().
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
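"Open-coded" here means an allocate-then-copy sequence written out by hand at each call site. A rough userspace analogue of folding that into one helper is sketched below. `memdup_model()` is a hypothetical name; the kernel's memdup_user() copies from user space and reports failure through an error pointer, while this model copies ordinary memory and returns NULL on failure.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Userspace analogue (hypothetical): duplicate len bytes from src into a
 * freshly allocated buffer. The caller owns the result and must free() it. */
static void *memdup_model(const void *src, size_t len)
{
    void *p;

    assert(src != NULL);

    p = malloc(len ? len : 1);  /* avoid the implementation-defined malloc(0) */
    if (!p)
        return NULL;

    memcpy(p, src, len);
    return p;
}
```

Each call site then shrinks to one call plus one error check, which is the cleanup the memdup_user() commits in this series apply.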

ncpfs: use memdup_user() · a9482ebc

Authored by Li Zefan on Apr 08, 2009

Remove open-coded memdup_user().
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

xfs: use memdup_user() · 0e639bde

Authored by Li Zefan on Apr 08, 2009

Remove open-coded memdup_user().
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

sysfs: use memdup_user() · 1c8542c7

Authored by Li Zefan on Apr 08, 2009

Remove open-coded memdup_user().
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

btrfs: use memdup_user() · dae7b665

Authored by Li Zefan on Apr 08, 2009

Remove open-coded memdup_user().

Note that this changes some GFP_NOFS to GFP_KERNEL: since copy_from_user()
may cause a page fault, it's pointless to pass GFP_NOFS to kmalloc().
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

xattr: use memdup_user() · 3939fcde

Authored by Li Zefan on Apr 08, 2009

Remove open-coded memdup_user().
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

autofs4: use memchr() in invalid_string() · 3eac8778

Authored by Al Viro on Apr 07, 2009

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
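The commit carries no description, so the following is only a guess at the shape of such a check: a fixed-size buffer counts as a valid string if a NUL byte occurs somewhere within its bounds, which `memchr()` can test without a hand-written loop. `invalid_str_model()` is a hypothetical name and is not taken from the autofs4 source.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Returns 0 if the buffer holds a NUL within its first size bytes,
 * -EINVAL otherwise. */
static int invalid_str_model(const char *str, size_t size)
{
    assert(str != NULL);

    if (memchr(str, '\0', size))
        return 0;
    return -EINVAL;
}
```

Since `memchr()` is bounded by `size`, the check never reads past the end of an unterminated buffer.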

Fix i_mutex vs. readdir handling in nfsd · 2f9092e1

Authored by David Woodhouse on Apr 20, 2009

Commit 14f7dd63 ("Copy XFS readdir hack into nfsd code") introduced a
bug to generic code which had been extant for a long time in the XFS
version -- it started to call through into lookup_one_len() and hence
into the file systems' ->lookup() methods without i_mutex held on the
directory.

This patch fixes it by locking the directory's i_mutex again before
calling the filldir functions. The original deadlocks which commit
14f7dd63 was designed to avoid are still avoided, because they were due
to fs-internal locking, not i_mutex.

While we're at it, fix the return type of nfsd_buffered_readdir() which
should be a __be32 not an int -- it's an NFS errno, not a Linux errno.
And return nfserrno(-ENOMEM) when allocation fails, not just -ENOMEM.
Sparse would have caught that, if it wasn't so busy bitching about
__cold__.

Commit 05f4f678 ("nfsd4: don't do lookup within readdir in recovery
code") introduced a similar problem with calling lookup_one_len()
without i_mutex, which this patch also addresses. To fix that, it was
necessary to fix the called functions so that they expect i_mutex to be
held; that part was done by J. Bruce Fields.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Umm-I-can-live-with-that-by: Al Viro <viro@zeniv.linux.org.uk>
Reported-by: J. R. Okajima <hooanon05@yahoo.co.jp>
Tested-by: J. Bruce Fields <bfields@citi.umich.edu>
LKML-Reference: <8036.1237474444@jrobl>
Cc: stable@kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

fs/compat_ioctl: fix build when !BLOCK · 1ba0c7db

Authored by Alexander Beregalov on Apr 20, 2009

In file included from fs/compat_ioctl.c:61:
include/linux/loop.h:59: error: field 'lo_bio_list' has incomplete type
Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Fix autofs_expire() · 117aff74

Authored by Al Viro on Apr 18, 2009

mnt should remain the same for all iterations through the list;
as it is, if we have a busy mount, mnt follows into it and isn't
restored for the next iteration.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

No need for crossing to mountpoint in audit_tag_tree() · 24b6f16e

Authored by Al Viro on Apr 18, 2009

is_under() will DTRT anyway.  And yes, is_subdir() behaviour
is intentional.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Safer nfsd_cross_mnt() · 1644ccc8

Authored by Al Viro on Apr 18, 2009

AFAICS, we have a subtle bug there: if we have crossed mountpoint
*and* it got mount --move'd away, we'll be holding only one
reference to fs containing dentry - exp->ex_path.mnt.  IOW, we
ought to dput() before exp_put().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Touch all affected namespaces on propagation of mount · e5d67f07

Authored by Al Viro on Apr 07, 2009

We shouldn't just touch the namespace of the current process.
Caught-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Fix AUTOFS_DEV_IOCTL_REQUESTER_CMD · cf2706a3

Authored by Al Viro on Apr 07, 2009

Missing conversion from kernel to userland dev_t; this sucker
breaks as soon as we get sufficiently many autofs mounts for
new_encode_dev(s_dev) != s_dev.

Note: this is the minimal fix.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
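Why the two values diverge only after "sufficiently many" mounts can be seen from the bit layouts. The sketch below assumes the kernel-internal layout of a 12-bit major above a 20-bit minor, and a userland encoding that keeps the low 8 minor bits in place, puts the major at bit 8 and moves the remaining minor bits up by 12. It also assumes the mounts in question use major 0. The `_model` names are hypothetical, and the layout should be checked against the kernel headers before relying on it.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_MINORBITS 20u
#define MODEL_MINORMASK ((1u << MODEL_MINORBITS) - 1)

/* Assumed kernel-internal encoding: major in the high 12 bits. */
static uint32_t mkdev_model(uint32_t major, uint32_t minor)
{
    assert(major < (1u << 12) && minor <= MODEL_MINORMASK);
    return (major << MODEL_MINORBITS) | minor;
}

/* Assumed userland encoding of the same device number. */
static uint32_t new_encode_dev_model(uint32_t dev)
{
    uint32_t major = dev >> MODEL_MINORBITS;
    uint32_t minor = dev & MODEL_MINORMASK;

    return (minor & 0xffu) | (major << 8) | ((minor & ~0xffu) << 12);
}
```

Under these assumptions, major 0 with minors 0 through 255 encodes to the same number in both forms, and the first mismatch appears at minor 256, where the userland form becomes 0x100000.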

Btrfs: use the right node in reada_for_balance · 8c594ea8

Authored by Chris Mason on Apr 20, 2009

reada_for_balance was using the wrong index into the path node array,
so it wasn't reading the right blocks.  We never directly used the
results of the read done by this function because the btree search is
started over at the end.

This fixes reada_for_balance to reada in the correct node and to
avoid searching past the last slot in the node.  It also makes sure to
hold the parent lock while we are finding the nodes to read.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: fix oops on page->mapping->host during writepage · 11c8349b

Authored by Chris Mason on Apr 20, 2009

The extent_io writepage call updates the writepage index in the inode
as it makes progress.  But, it was doing the update after unlocking the page,
which isn't legal because page->mapping can't be trusted once the page
is unlocked.

This led to an oops, especially common with compression turned on.  The
fix here is to update the writeback index before unlocking the page.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: add a priority queue to the async thread helpers · d313d7a3

Authored by Chris Mason on Apr 20, 2009

Btrfs is using WRITE_SYNC_PLUG to send down synchronous IOs with a
higher priority.  But, the checksumming helper threads prevent it
from being fully effective.

There are two problems.  First, a big queue of pending checksumming
will delay the synchronous IO behind other lower priority writes.  Second,
the checksumming uses an ordered async work queue.  The ordering makes sure
that IOs are sent to the block layer in the same order they are sent
to the checksumming threads.  Usually this gives us less seeky IO.

But, when we start mixing IO priorities, the lower priority IO can delay
the higher priority IO.

This patch solves both problems by adding a high priority list to the async
helper threads, and a new btrfs_set_work_high_prio(), which is used
to put a new async work item onto the higher priority list.

The ordering is still done on high priority IO, but all of the high
priority bios are ordered separately from the low priority bios.  This
ordering is purely an IO optimization, it is not involved in data
or metadata integrity.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
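The two-list scheme described in this commit can be sketched with a pair of FIFOs: work is ordered within each priority level, and the consumer drains the high-priority list before touching the normal one. This is a userspace illustration only. The `_model` types and functions are hypothetical, the queues are fixed-size arrays without wraparound, and there is no locking or threading.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_QUEUE_CAP 16

struct fifo_model {
    int    items[MODEL_QUEUE_CAP];
    size_t head, tail;
};

struct worker_queue_model {
    struct fifo_model high;    /* drained first */
    struct fifo_model normal;
};

static int fifo_push_model(struct fifo_model *q, int item)
{
    if (q->tail == MODEL_QUEUE_CAP)
        return -1;             /* full; this sketch does not wrap around */
    q->items[q->tail++] = item;
    return 0;
}

/* Queue an item on the list matching its priority. */
static int queue_work_model(struct worker_queue_model *wq, int item, int high_prio)
{
    assert(wq != NULL);
    return fifo_push_model(high_prio ? &wq->high : &wq->normal, item);
}

/* Returns 1 and stores the next item, or 0 if both lists are empty. */
static int next_work_model(struct worker_queue_model *wq, int *item)
{
    struct fifo_model *q;

    assert(wq != NULL && item != NULL);

    q = (wq->high.head < wq->high.tail) ? &wq->high : &wq->normal;
    if (q->head == q->tail)
        return 0;

    *item = q->items[q->head++];
    return 1;
}
```

Queueing normal items 1 and 2 followed by high-priority items 10 and 11 yields 10, 11, 1, 2 on the consumer side: submission order is preserved within each level, but a backlog of normal work no longer delays the high-priority items.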

Btrfs: use WRITE_SYNC for synchronous writes · ffbd517d

Authored by Chris Mason on Apr 20, 2009

Part of reducing fsync/O_SYNC/O_DIRECT latencies is using WRITE_SYNC for
writes we plan on waiting on in the near future.  This patch
mirrors recent changes in other filesystems and the generic code to
use WRITE_SYNC when WB_SYNC_ALL is passed and to use WRITE_SYNC for
other latency critical writes.

Btrfs uses async worker threads for checksumming before the write is done,
and then again to actually submit the bios.  The bio submission code just
runs a per-device list of bios that need to be sent down the pipe.

This list is split into low priority and high priority lists so the
WRITE_SYNC IO happens first.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

20 Apr, 2009 3 commits

GFS2: Fix page_mkwrite() return code · e56985da

Authored by Steven Whitehouse on Apr 20, 2009

This allows for the possibility of returning VM_FAULT_OOM as
well as VM_FAULT_SIGBUS. This ensures that the correct action
is taken.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
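A minimal sketch of the mapping this commit implies, assuming the handler ends up with a negative errno and must translate it into a fault code: out-of-memory becomes the OOM code and any other error becomes SIGBUS. The `MODEL_FAULT_*` values and `fault_code_model()` are hypothetical; the kernel defines its own VM_FAULT_* constants.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical values standing in for the kernel's VM_FAULT_* codes. */
enum {
    MODEL_FAULT_NONE   = 0,
    MODEL_FAULT_OOM    = 1,
    MODEL_FAULT_SIGBUS = 2
};

/* Translate 0 or a negative errno into a fault code. */
static int fault_code_model(int err)
{
    assert(err <= 0);

    if (err == -ENOMEM)
        return MODEL_FAULT_OOM;
    if (err)
        return MODEL_FAULT_SIGBUS;
    return MODEL_FAULT_NONE;
}
```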

GFS2: Clear dirty bit at end of inode glock sync · 52fcd11c

Authored by Steven Whitehouse on Apr 20, 2009

The dirty bit can get set during the inode glock sync. It's too
complicated to change that at the moment, so this is the quick
fix - to clear the bit again at the end of the function.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

Don't set relatime when noatime is specified · 613cbe3d

Authored by Andi Kleen on Apr 19, 2009

Since commit 0a1c01c9 ("Make relatime default"), when a file system is
mounted explicitly with noatime it gets both the MNT_RELATIME and
MNT_NOATIME bits set.

This shows up like this in /proc/mounts:

  /dev/xxx /yyy ext3 rw,noatime,relatime,errors=continue,data=writeback 0 0

That looks strange.  The VFS uses noatime in this case, but both flags
are set.  So it's more a cosmetic issue, but still better to fix.

Cc: mjg@redhat.com
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

18 Apr, 2009 3 commits
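The intended flag logic can be sketched as "apply the relatime default only when noatime was not requested". The constants and `mnt_flags_model()` below are hypothetical stand-ins with made-up bit values; they are not the kernel's MS_* or MNT_* definitions.

```c
#include <assert.h>

/* Hypothetical bit values for illustration only. */
#define MODEL_MS_NOATIME   0x1u  /* requested by the mount caller */
#define MODEL_MNT_NOATIME  0x1u  /* resulting per-mount flag */
#define MODEL_MNT_RELATIME 0x2u  /* resulting per-mount flag */

/* Derive per-mount flags from the requested mount options. */
static unsigned int mnt_flags_model(unsigned int ms_flags)
{
    unsigned int mnt_flags = 0;

    if (ms_flags & MODEL_MS_NOATIME)
        mnt_flags |= MODEL_MNT_NOATIME;
    else
        mnt_flags |= MODEL_MNT_RELATIME;  /* default only without noatime */

    /* The two bits are never set together in this model. */
    assert((mnt_flags & (MODEL_MNT_NOATIME | MODEL_MNT_RELATIME)) !=
           (MODEL_MNT_NOATIME | MODEL_MNT_RELATIME));
    return mnt_flags;
}
```

With this shape, a noatime mount would list only `noatime` in the option string instead of `noatime,relatime`.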

cifs: when renaming don't try to unlink negative dentry · fc6f3943

Authored by Jeff Layton on Apr 17, 2009

When attempting to rename a file on a read-only share, the kernel can
call cifs_unlink on a negative dentry, which causes an oops. Only try
to unlink the file if it's a positive dentry.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Tested-by: Shirish Pargaonkar <shirishp@us.ibm.com>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Steve French <sfrench@us.ibm.com>

AFS: Guard afs_file_readpage_read_complete() definition with CONFIG_AFS_FSCACHE · 6566abdb

Authored by Matt Kraai on Apr 17, 2009

If CONFIG_AFS_FSCACHE is not defined, the following warning is displayed when
fs/afs/file.c is compiled:

 fs/afs/file.c:111: warning: ‘afs_file_readpage_read_complete’ defined but not used

This occurs because all calls to this function are guarded by
CONFIG_AFS_FSCACHE.  Thus, guard its definition as well.
Signed-off-by: Matt Kraai <kraai@ftbfs.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

vfat: Note the NLS requirement · d29a2e94

Authored by Alan Cox on Apr 17, 2009

Close bug #4754. Stop people getting into a situation where they can't
get their FAT filesystems to mount as they expect.
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

17 Apr, 2009 6 commits

splice: fix new kernel-doc warnings · b80901bb

Authored by Randy Dunlap on Apr 16, 2009

splice: fix kernel-doc warnings

  Warning(fs/splice.c:617): bad line:
  Warning(fs/splice.c:722): No description found for parameter 'sd'
  Warning(fs/splice.c:722): Excess function parameter 'pipe' description in 'splice_from_pipe_begin'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cifs: remove unneeded bcc_ptr update in CIFSTCon · 22c9d52b

Authored by Jeff Layton on Apr 16, 2009

This pointer isn't used again after this point. It's also not updated in
the ascii case, so there's no need to update it here.
Pointed-out-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>

cifs: add cFYI messages with some of the saved strings from ssetup/tcon · 313fecfa

Authored by Jeff Layton on Apr 16, 2009

...to make it easier to find problems in this area in the future.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>

cifs: fix buffer size for tcon->nativeFileSystem field · f083def6

Authored by Jeff Layton on Apr 16, 2009

The buffer for this was resized recently to fix a bug. It's still
possible however that a malicious server could overflow this field
by sending characters in it that are >2 bytes in the local charset.
Double the size of the buffer to account for this possibility.

Also get rid of some really strange and seemingly pointless NULL
termination. It's NULL terminating the string in the source buffer,
but by the time that happens, we've already copied the string.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>

cifs: fix unicode string area word alignment in session setup · 27b87fe5

Authored by Jeff Layton on Apr 14, 2009

The handling of unicode string area alignment is wrong.
decode_unicode_ssetup improperly assumes that it will always be preceded
by a pad byte. This isn't the case if the string area is already
word-aligned.

This problem, combined with the bad buffer sizing for the serverDomain
string, can cause memory corruption. The bad alignment can shift the
characters off their proper boundaries, so that they translate to
characters that are greater than 2 bytes each. If this happens we can
overflow the allocation.

Fix this by fixing the alignment in CIFS_SessSetup instead so we can
verify it against the head of the response. Also, clean up the
workaround for improperly terminated strings by checking for
odd-length unicode buffers and then forcibly terminating them.

Finally, resize the buffer for serverDomain. Now that we've fixed
the alignment, it's probably fine, but a malicious server could
overflow it.

A better solution for handling these strings is still needed, but
this should be a suitable bandaid.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Steve French <sfrench@us.ibm.com>
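The alignment rule at issue is that a pad byte precedes the unicode string area only when the area would otherwise start at an odd offset. A minimal sketch, assuming offsets are counted from the start of the response header: `string_area_start_model()` is a hypothetical helper and does not reproduce the CIFS code.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the offset at which the 2-byte-aligned string area starts. */
static size_t string_area_start_model(size_t offset_from_header)
{
    /* Skip one pad byte only when the offset is odd. */
    size_t start = offset_from_header + (offset_from_header & 1u);

    assert(start % 2 == 0);
    return start;
}
```

Unconditionally skipping a byte would move an already aligned area by one, which is the misalignment the message describes.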

[CIFS] Fix build break caused by change to new current_umask helper function · 88dd47ff

Authored by Steve French on Apr 15, 2009

Signed-off-by: Steve French <sfrench@us.ibm.com>

openeuler / Kernel synced successfully about 1 year ago