Commits · 261758b5c3dfeac73ca364c47ed538f5ce4250ee · openanolis / cloud-kernel

29 Apr, 2009 · 12 commits

NFSD: Stricter buffer size checking in write_versions() · 261758b5

Authored by Chuck Lever on Apr 23, 2009

While it's not likely today that there are enough NFS versions to
overflow the output buffer in write_versions(), we should be more
careful about detecting the end of the buffer.

The number of NFS versions will only increase as NFSv4 minor versions
are added.

Note that this API doesn't behave the same as portlist.  Here we
attempt to display as many versions as will fit in the buffer, and do
not provide any indication that an overflow would have occurred.  I
don't have any good rationale for that.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
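The behaviour described in this commit message (show as many versions as fit, and stay silent about an overflow) can be sketched in userspace C. This is only an illustration under assumptions, not the kernel's write_versions(): the helper name `format_versions` and its parameters are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: append "+N " or "-N " for each version until the
 * next entry would not fit, then stop without reporting an overflow.
 * Returns the number of bytes written, excluding the terminating NUL.
 */
static size_t format_versions(char *buf, size_t size,
                              const int *versions, const int *enabled,
                              size_t count)
{
    size_t used = 0;

    if (size == 0)
        return 0;
    buf[0] = '\0';

    for (size_t i = 0; i < count; i++) {
        size_t remaining = size - used;
        int len = snprintf(buf + used, remaining, "%c%d ",
                           enabled[i] ? '+' : '-', versions[i]);

        /* snprintf returns the length it wanted; >= remaining means truncation. */
        if (len < 0 || (size_t)len >= remaining) {
            buf[used] = '\0';   /* drop the partial entry */
            break;
        }
        used += (size_t)len;
    }
    return used;
}
```

The key point is that every append is checked against the space actually left, so a growing list of versions can never write past the end of the buffer.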

NFSD: Stricter buffer size checking in write_recoverydir() · 3d72ab8f

Authored by Chuck Lever on Apr 23, 2009

While it's not likely a pathname will be longer than
SIMPLE_TRANSACTION_SIZE, we should be more careful about just
plopping it into the output buffer without bounds checking.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

SUNRPC: pass buffer size to svc_sock_names() · 8435d34d

Authored by Chuck Lever on Apr 23, 2009

Adjust the synopsis of svc_sock_names() to pass in the size of the
output buffer.  Add a documenting comment.

This is a cosmetic change for now.  A subsequent patch will make sure
the buffer length is passed to one_sock_name(), where the length will
actually be useful.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

SUNRPC: pass buffer size to svc_addsock() · bfba9ab4

Authored by Chuck Lever on Apr 23, 2009

Adjust the synopsis of svc_addsock() to pass in the size of the output
buffer.  Add a documenting comment.

This is a cosmetic change for now.  A subsequent patch will make sure
the buffer length is passed to one_sock_name(), where the length will
actually be useful.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

NFSD: Prevent a buffer overflow in svc_xprt_names() · 335c54bd

Authored by Chuck Lever on Apr 23, 2009

The svc_xprt_names() function can overflow its buffer if it's so near
the end of the passed in buffer that the "name too long" string still
doesn't fit.  Of course, it could never tell if it was near the end
of the passed in buffer, since its only caller passes in zero as the
buffer length.

Let's make this API a little safer.

Change svc_xprt_names() so it *always* checks for a buffer overflow,
and change its only caller to pass in the correct buffer length.

If svc_xprt_names() does overflow its buffer, it now fails with an
ENAMETOOLONG errno, instead of trying to write a message at the end
of the buffer.  I don't like this much, but I can't figure out a clean
way that's always safe to return some of the names, *and* an
indication that the buffer was not long enough.

The displayed error when doing a 'cat /proc/fs/nfsd/portlist' is
"File name too long".
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
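The all-or-nothing policy this commit adopts for svc_xprt_names() can be illustrated with a small userspace sketch. It is not the kernel function: `list_names` and its arguments are hypothetical, and only the idea (always check the remaining space, fail with ENAMETOOLONG instead of emitting a partial list) is taken from the message above.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: write one name per line into buf.
 * Returns the total length on success, or -ENAMETOOLONG if any name
 * would not fit; no partial listing is reported in that case.
 */
static int list_names(char *buf, size_t buflen,
                      const char *const *names, size_t count)
{
    size_t pos = 0;

    if (buflen == 0)
        return -ENAMETOOLONG;
    buf[0] = '\0';

    for (size_t i = 0; i < count; i++) {
        int len = snprintf(buf + pos, buflen - pos, "%s\n", names[i]);

        /* A return value >= the space left means the entry was truncated. */
        if (len < 0 || (size_t)len >= buflen - pos) {
            buf[0] = '\0';
            return -ENAMETOOLONG;
        }
        pos += (size_t)len;
    }
    return (int)pos;
}
```

On glibc, strerror(ENAMETOOLONG) is "File name too long", which is the text the commit message says a reader of the portlist file sees.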

NFSD: move lockd_up() before svc_addsock() · ea068bad

Authored by Chuck Lever on Apr 23, 2009

Clean up.

A couple of years ago, a series of commits, finishing with commit
5680c446, swapped the order of the lockd_up() and svc_addsock() calls
in __write_ports().  At that time lockd_up() needed to know the
transport protocol of the passed-in socket to start a listener on the
same transport protocol.

These days, lockd_up() doesn't take a protocol argument; it always
starts both a UDP and TCP listener.  It's now more straightforward to
try the lockd_up() first, then do a lockd_down() if the svc_addsock()
fails.

Careful review of this code shows that the svc_sock_names() call is
used only to close the just-opened socket in case lockd_up() fails.
So it is no longer needed if lockd_up() is done first.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

NFSD: Finish refactoring __write_ports() · 0a5372d8

Authored by Chuck Lever on Apr 23, 2009

Clean up: Refactor transport name listing out of __write_ports() to
make it easier to understand and maintain.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

NFSD: Note an additional requirement when passing TCP sockets to portlist · c71206a7

Authored by Chuck Lever on Apr 23, 2009

User space must call listen(3) on SOCK_STREAM sockets passed into
/proc/fs/nfsd/portlist, otherwise that listener is ignored.  Document
this.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
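From the user-space side, the requirement documented by this commit might look like the sketch below. The helper names are hypothetical, and the assumption that the descriptor is handed over as a decimal number written to /proc/fs/nfsd/portlist is stated here for illustration only; the sketch does not write to that file.

```c
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a TCP socket, bind it to the given port on
 * all IPv4 addresses, and put it into the listening state.
 * Returns the descriptor, or -1 on error.
 */
static int make_listening_socket(unsigned short port)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    /* Per the commit message, a SOCK_STREAM socket that was never
     * passed to listen() is ignored as a listener. */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 64) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Assumed request format: the descriptor number as decimal text. */
static int format_portlist_request(char *buf, size_t size, int fd)
{
    return snprintf(buf, size, "%d\n", fd);
}
```

A caller would create the socket with `make_listening_socket(2049)` and then write the formatted request to the portlist file with sufficient privileges.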

NFSD: Refactor socket creation out of __write_ports() · 0b7c2f6f

Authored by Chuck Lever on Apr 23, 2009

Clean up: Refactor the socket creation logic out of __write_ports() to
make it easier to understand and maintain.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

NFSD: Refactor portlist socket closing into a helper · 82d56591

Authored by Chuck Lever on Apr 23, 2009

Clean up: Refactor the socket closing logic out of __write_ports() to
make it easier to understand and maintain.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

NFSD: Refactor transport addition out of __write_ports() · 4eb68c26

Authored by Chuck Lever on Apr 23, 2009

Clean up: Refactor transport addition out of __write_ports() to make
it easier to understand and maintain.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

NFSD: Refactor transport removal out of __write_ports() · 4cd5dc75

Authored by Chuck Lever on Apr 23, 2009

Clean up: Refactor transport removal out of __write_ports() to make it
easier to understand and maintain.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

25 Apr, 2009 · 2 commits

nfsd4: distinguish expired from stale stateids · 78155ed7

Authored by Bian Naimeng on Apr 22, 2009

If we encode the time of client creation into the stateid instead of the
time of server boot, then we can determine whether that stateid is from
a previous instance of the server, or from a client that has expired,
and return an appropriate error to the client.
Signed-off-by: Bian Naimeng <biannm@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
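The distinction drawn in this commit message can be sketched as a small classifier. Everything here is an assumption made for illustration (the enum, the function name, and the use of `time_t`); the real nfsd code and its error codes are not reproduced.

```c
#include <stdbool.h>
#include <time.h>

enum stateid_status {
    STATEID_OK,
    STATEID_STALE,    /* issued by a previous server instance */
    STATEID_EXPIRED   /* issued by this instance to a client that has since expired */
};

/*
 * Hypothetical classifier. stamp is the client creation time carried in
 * the stateid, boot_time is when this server instance started, and
 * client_known says whether the owning client is still registered.
 */
static enum stateid_status classify_stateid(time_t stamp, time_t boot_time,
                                            bool client_known)
{
    if (stamp < boot_time)
        return STATEID_STALE;
    if (!client_known)
        return STATEID_EXPIRED;
    return STATEID_OK;
}
```

Because a client can only be created after the server has booted, a stamp older than the boot time must come from an earlier instance, while a newer stamp with no matching client points to expiry.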

lockd: call locks_release_private to cleanup per-filesystem state · a9e61e25

Authored by Felix Blyakher on Mar 31, 2009

For every lock request lockd creates a new file_lock object
in nlmsvc_setgrantargs() by copying the passed in file_lock with
locks_copy_lock(). A filesystem can attach its own lock_operations
vector to the file_lock. It has to be cleaned up at the end of the
file_lock's life. However, lockd doesn't do it today, yet it
asserts in nlmclnt_release_lockargs() that the per-filesystem
state is clean.
This patch fixes it by exporting locks_release_private() and adding
it to nlmsvc_freegrantargs(), to be symmetrical to creating a
file_lock in nlmsvc_setgrantargs().
Signed-off-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

24 Apr, 2009 · 1 commit

rpcgss: remove redundant test on unsigned · 80492e7d

Authored by Roel Kluin on Apr 21, 2009

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

22 Apr, 2009 · 2 commits

hugetlbfs: return negative error code for bad mount option · c12ddba0

Authored by Akinobu Mita on Apr 21, 2009

This fixes the following BUG:

  # mount -o size=MM -t hugetlbfs none /huge
  hugetlbfs: Bad value 'MM' for mount option 'size=MM'
  ------------[ cut here ]------------
  kernel BUG at fs/super.c:996!

Due to

	BUG_ON(!mnt->mnt_sb);

in vfs_kern_mount().

Also, remove unused #include <linux/quotaops.h>

Cc: William Irwin <wli@holomorphy.com>
Cc: <stable@kernel.org>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Btrfs: fix btrfs fallocate oops and deadlock · 546888da

Authored by Chris Mason on Apr 21, 2009

Btrfs fallocate was incorrectly starting a transaction with a lock held
on the extent_io tree for the file, which could deadlock. Strictly
speaking it was using join_transaction which would be safe, but it is better
to move the transaction outside of the lock.

When preallocated extents are overwritten, btrfs_mark_buffer_dirty was
being called on an unlocked buffer. This was triggering an assertion and
oops because the lock is supposed to be held.

The bug was calling btrfs_mark_buffer_dirty on a leaf after btrfs_del_item had
been run. btrfs_del_item takes care of dirtying things, so the solution is
to skip the btrfs_mark_buffer_dirty call in this case.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

21 Apr, 2009 · 22 commits

NFS: Fix the XDR iovec calculation in nfs3_xdr_setaclargs · 83404372

Authored by Trond Myklebust on Apr 20, 2009

Commit ae46141f (NFSv3: Fix posix ACL code)
introduces a bug in the calculation of the XDR header iovec. In the case
where we are inlining the acls, we need to adjust the length of the iovec
req->rq_svec, in addition to adjusting the total buffer length.
Tested-by: Leonardo Chiquitto <leonardo.lists@gmail.com>
Tested-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fs: Mark get_filesystem_list() as __init function. · 38e23c95

Authored by Tetsuo Handa on Apr 09, 2009

"int get_filesystem_list(char * buf)" is called by only
"static void __init get_fs_names(char *page)".
We can mark get_filesystem_list() as "__init".
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

kill vfs_stat_fd / vfs_lstat_fd · 2eae7a18

Authored by Christoph Hellwig on Apr 08, 2009

There's really no reason to keep vfs_stat_fd and vfs_lstat_fd with
Oleg's vfs_fstatat.  Use vfs_fstatat for the few cases having the
directory fd, and switch all others to vfs_stat / vfs_lstat.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Separate out common fstatat code into vfs_fstatat · 0112fc22

Authored by Oleg Drokin on Apr 08, 2009

This is a version incorporating Christoph's suggestion.

Separate out common *fstatat functionality into a single function
instead of duplicating it all over the code.
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ecryptfs: use memdup_user() · fd56d242

Authored by Li Zefan on Apr 08, 2009

Remove open-coded memdup_user().
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
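For readers unfamiliar with the pattern being removed here, the open-coded sequence that memdup_user() wraps is allocate, copy from user space, and free on failure. The following is a userspace analogue only: `fake_copy_from_user` and `memdup_user_sketch` are hypothetical stand-ins, and the sketch returns NULL on failure whereas the kernel helper reports errors through an ERR_PTR-encoded pointer.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Stand-in for copy_from_user(): returns the number of bytes NOT copied.
 * In this sketch the copy always succeeds.
 */
static unsigned long fake_copy_from_user(void *to, const void *from,
                                         unsigned long n)
{
    memcpy(to, from, n);
    return 0;
}

/* Userspace analogue of memdup_user(): allocate, copy, clean up on failure. */
static void *memdup_user_sketch(const void *src, size_t len)
{
    void *p = malloc(len);

    if (!p)
        return NULL;
    if (fake_copy_from_user(p, src, len)) {
        free(p);        /* the step open-coded callers tend to forget */
        return NULL;
    }
    return p;
}
```

Folding the three steps into one helper removes duplicated error handling from each caller, which is the point of this series of "use memdup_user()" commits.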

ncpfs: use memdup_user() · a9482ebc

Authored by Li Zefan on Apr 08, 2009

Remove open-coded memdup_user()
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

xfs: use memdup_user() · 0e639bde

Authored by Li Zefan on Apr 08, 2009

Remove open-coded memdup_user()
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

sysfs: use memdup_user() · 1c8542c7

Authored by Li Zefan on Apr 08, 2009

Remove open-coded memdup_user().
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

btrfs: use memdup_user() · dae7b665

Authored by Li Zefan on Apr 08, 2009

Remove open-coded memdup_user().

Note this changes some GFP_NOFS to GFP_KERNEL, since copy_from_user() may
cause pagefault, it's pointless to pass GFP_NOFS to kmalloc().
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

xattr: use memdup_user() · 3939fcde

Authored by Li Zefan on Apr 08, 2009

Remove open-coded memdup_user()
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

autofs4: use memchr() in invalid_string() · 3eac8778

Authored by Al Viro on Apr 07, 2009

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Fix i_mutex vs. readdir handling in nfsd · 2f9092e1

Authored by David Woodhouse on Apr 20, 2009

Commit 14f7dd63 ("Copy XFS readdir hack into nfsd code") introduced a
bug to generic code which had been extant for a long time in the XFS
version -- it started to call through into lookup_one_len() and hence
into the file systems' ->lookup() methods without i_mutex held on the
directory.

This patch fixes it by locking the directory's i_mutex again before
calling the filldir functions. The original deadlocks which commit
14f7dd63 was designed to avoid are still avoided, because they were due
to fs-internal locking, not i_mutex.

While we're at it, fix the return type of nfsd_buffered_readdir() which
should be a __be32 not an int -- it's an NFS errno, not a Linux errno.
And return nfserrno(-ENOMEM) when allocation fails, not just -ENOMEM.
Sparse would have caught that, if it wasn't so busy bitching about
__cold__.

Commit 05f4f678 ("nfsd4: don't do lookup within readdir in recovery
code") introduced a similar problem with calling lookup_one_len()
without i_mutex, which this patch also addresses. To fix that, it was
necessary to fix the called functions so that they expect i_mutex to be
held; that part was done by J. Bruce Fields.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Umm-I-can-live-with-that-by: Al Viro <viro@zeniv.linux.org.uk>
Reported-by: J. R. Okajima <hooanon05@yahoo.co.jp>
Tested-by: J. Bruce Fields <bfields@citi.umich.edu>
LKML-Reference: <8036.1237474444@jrobl>
Cc: stable@kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

fs/compat_ioctl: fix build when !BLOCK · 1ba0c7db

Authored by Alexander Beregalov on Apr 20, 2009

In file included from fs/compat_ioctl.c:61:
include/linux/loop.h:59: error: field 'lo_bio_list' has incomplete type
Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Fix autofs_expire() · 117aff74

Authored by Al Viro on Apr 18, 2009

mnt should remain the same for all iterations through the list;
as it is, if we have a busy mount, mnt follows into it and isn't
restored for the next iteration.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

No need for crossing to mountpoint in audit_tag_tree() · 24b6f16e

Authored by Al Viro on Apr 18, 2009

is_under() will DTRT anyway.  And yes, is_subdir() behaviour
is intentional.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Safer nfsd_cross_mnt() · 1644ccc8

Authored by Al Viro on Apr 18, 2009

AFAICS, we have a subtle bug there: if we have crossed mountpoint
*and* it got mount --move'd away, we'll be holding only one
reference to fs containing dentry - exp->ex_path.mnt.  IOW, we
ought to dput() before exp_put().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Touch all affected namespaces on propagation of mount · e5d67f07

Authored by Al Viro on Apr 07, 2009

We shouldn't just touch the namespace of the current process.
Caught-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Fix AUTOFS_DEV_IOCTL_REQUESTER_CMD · cf2706a3

Authored by Al Viro on Apr 07, 2009

Missing conversion from kernel to userland dev_t; this sucker
breaks as soon as we get sufficiently many autofs mounts for
new_encode_dev(s_dev) != s_dev.

Note: this is the minimal fix.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
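The condition new_encode_dev(s_dev) != s_dev in this commit message can be checked with a small userspace re-implementation of the two device-number layouts. The function names `mkdev` and `encode_dev` are local to this sketch, and the assumption that autofs mounts sit on anonymous devices with major number 0 is mine, not stated in the message.

```c
#include <stdint.h>

#define MINORBITS 20
#define MINORMASK ((1U << MINORBITS) - 1)

/* Kernel-internal layout: major in the high bits, 20-bit minor below. */
static uint32_t mkdev(uint32_t major, uint32_t minor)
{
    return (major << MINORBITS) | minor;
}

/* Userland layout, following the kernel's new_encode_dev(). */
static uint32_t encode_dev(uint32_t dev)
{
    uint32_t major = dev >> MINORBITS;
    uint32_t minor = dev & MINORMASK;

    return (minor & 0xff) | (major << 8) | ((minor & ~0xffU) << 12);
}
```

Under that assumption the two encodings agree while the minor number stays below 256 and diverge from 256 onwards, which would explain why the bug only shows up once enough mounts exist.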

Btrfs: use the right node in reada_for_balance · 8c594ea8

Authored by Chris Mason on Apr 20, 2009

reada_for_balance was using the wrong index into the path node array,
so it wasn't reading the right blocks.  We never directly used the
results of the read done by this function because the btree search is
started over at the end.

This fixes reada_for_balance to reada in the correct node and to
avoid searching past the last slot in the node.  It also makes sure to
hold the parent lock while we are finding the nodes to read.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: fix oops on page->mapping->host during writepage · 11c8349b

Authored by Chris Mason on Apr 20, 2009

The extent_io writepage call updates the writepage index in the inode
as it makes progress.  But, it was doing the update after unlocking the page,
which isn't legal because page->mapping can't be trusted once the page
is unlocked.

This led to an oops, especially common with compression turned on.  The
fix here is to update the writeback index before unlocking the page.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: add a priority queue to the async thread helpers · d313d7a3

Authored by Chris Mason on Apr 20, 2009

Btrfs is using WRITE_SYNC_PLUG to send down synchronous IOs with a
higher priority.  But, the checksumming helper threads prevent it
from being fully effective.

There are two problems.  First, a big queue of pending checksumming
will delay the synchronous IO behind other lower priority writes.  Second,
the checksumming uses an ordered async work queue.  The ordering makes sure
that IOs are sent to the block layer in the same order they are sent
to the checksumming threads.  Usually this gives us less seeky IO.

But, when we start mixing IO priorities, the lower priority IO can delay
the higher priority IO.

This patch solves both problems by adding a high priority list to the async
helper threads, and a new btrfs_set_work_high_prio(), which is used
to put a new async work item onto the higher priority list.

The ordering is still done on high priority IO, but all of the high
priority bios are ordered separately from the low priority bios.  This
ordering is purely an IO optimization, it is not involved in data
or metadata integrity.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
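The two-list scheme described in this commit message can be sketched in plain C. This is not the Btrfs async-thread code: the structures and the functions `set_high_prio`, `queue_work`, and `next_work` are hypothetical, and locking and threading are left out entirely.

```c
#include <stddef.h>

struct work {
    int id;
    int high_prio;        /* set by the hypothetical set_high_prio() */
    struct work *next;
};

struct work_list {
    struct work *head, *tail;
};

struct worker_queue {
    struct work_list prio;     /* high priority items, FIFO */
    struct work_list pending;  /* normal items, FIFO */
};

static void list_push(struct work_list *l, struct work *w)
{
    w->next = NULL;
    if (l->tail)
        l->tail->next = w;
    else
        l->head = w;
    l->tail = w;
}

static struct work *list_pop(struct work_list *l)
{
    struct work *w = l->head;

    if (!w)
        return NULL;
    l->head = w->next;
    if (!l->head)
        l->tail = NULL;
    return w;
}

static void set_high_prio(struct work *w)
{
    w->high_prio = 1;
}

static void queue_work(struct worker_queue *q, struct work *w)
{
    list_push(w->high_prio ? &q->prio : &q->pending, w);
}

/* Drain the high priority list before touching the normal one. */
static struct work *next_work(struct worker_queue *q)
{
    struct work *w = list_pop(&q->prio);

    return w ? w : list_pop(&q->pending);
}
```

Each list keeps its own FIFO order, so high priority items are ordered among themselves and separately from the low priority ones, as the message describes.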

Btrfs: use WRITE_SYNC for synchronous writes · ffbd517d

Authored by Chris Mason on Apr 20, 2009

Part of reducing fsync/O_SYNC/O_DIRECT latencies is using WRITE_SYNC for
writes we plan on waiting on in the near future.  This patch
mirrors recent changes in other filesystems and the generic code to
use WRITE_SYNC when WB_SYNC_ALL is passed and to use WRITE_SYNC for
other latency critical writes.

Btrfs uses async worker threads for checksumming before the write is done,
and then again to actually submit the bios.  The bio submission code just
runs a per-device list of bios that need to be sent down the pipe.

This list is split into low priority and high priority lists so the
WRITE_SYNC IO happens first.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

20 Apr, 2009 · 1 commit

GFS2: Fix page_mkwrite() return code · e56985da

Authored by Steven Whitehouse on Apr 20, 2009

This allows for the possibility of returning VM_FAULT_OOM as
well as VM_FAULT_SIGBUS. This ensures that the correct action
is taken.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
