Commits · ac55fdc408039b425a2fa3cbcaed7444e5339f9a · openeuler / raspberrypi-kernel

13 Nov, 2012 · 9 commits

nfsd: move the confirmed and unconfirmed hlists to a rbtree · ac55fdc4

Committed by Jeff Layton on Nov 12, 2012

The current code requires that we md5 hash the name in order to store
the client in the confirmed and unconfirmed trees. Change it instead
to store the clients in a pair of rbtrees, and simply compare the
cl_names directly instead of hashing them. This also necessitates that
we add a new flag to the clp->cl_flags field to indicate which tree
the client is currently in.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

ac55fdc4
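The idea of ordering clients by their raw names instead of by an md5 hash can be illustrated in userspace. This is a minimal sketch, not the kernel code: `struct client`, `name_cmp`, `tree_insert` and `tree_find` are hypothetical names, and a plain unbalanced binary search tree stands in for the kernel rbtree.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a client record keyed by its opaque name. */
struct client {
    const unsigned char *name;
    size_t len;
    struct client *left, *right;
};

/* Total order on names: shorter names sort first, ties are broken bytewise. */
static int name_cmp(const unsigned char *a, size_t alen,
                    const unsigned char *b, size_t blen)
{
    if (alen != blen)
        return alen < blen ? -1 : 1;
    return memcmp(a, b, alen);
}

/* Insert clp into an unbalanced search tree (assumption: the real code
 * uses a balanced rbtree). Returns the node already holding that name,
 * or clp itself when it was inserted. */
static struct client *tree_insert(struct client **root, struct client *clp)
{
    struct client **link = root;

    while (*link) {
        int c = name_cmp(clp->name, clp->len, (*link)->name, (*link)->len);
        if (c == 0)
            return *link;
        link = c < 0 ? &(*link)->left : &(*link)->right;
    }
    clp->left = clp->right = NULL;
    *link = clp;
    return clp;
}

/* Look a client up by name; no hashing step is needed. */
static struct client *tree_find(struct client *root,
                                const unsigned char *name, size_t len)
{
    while (root) {
        int c = name_cmp(name, len, root->name, root->len);
        if (c == 0)
            return root;
        root = c < 0 ? root->left : root->right;
    }
    return NULL;
}
```

Two such trees (one for confirmed, one for unconfirmed clients) plus a flag on each record saying which tree it lives in would mirror the layout the commit message describes.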

nfsd: don't search for client by hash on legacy reboot recovery gracedone · 0ce0c2b5

Committed by Jeff Layton on Nov 12, 2012

When nfsd starts, the legacy reboot recovery code creates a tracking
struct for each directory in the v4recoverydir. When the grace period
ends, it basically does a "readdir" on the directory again, and matches
each dentry in there to an existing client id to see if it should be
removed or not. If the matching client doesn't exist, or hasn't
reclaimed its state then it will remove that dentry.

This is pretty inefficient since it involves doing a lot of hash-bucket
searching. It also means that we have to keep relying on being able to
search for a nfs4_client by md5 hashed cl_recdir name.

Instead, add a pointer to the nfs4_client that indicates the association
between the nfs4_client_reclaim and nfs4_client. When a reclaim operation
comes in, we set the pointer to make that association. On gracedone, the
legacy client tracker will keep the recdir around iff:

1/ there is a reclaim record for the directory

...and...

2/ there's an association between the reclaim record and a client record
-- that is, a create or check operation was performed on the client that
matches that directory.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

0ce0c2b5
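The two-part "keep the recdir" rule from this commit message can be written as a single predicate. This is a sketch under assumptions: `struct reclaim_record`, its fields and `keep_recdir` are hypothetical names chosen for illustration, not the kernel's definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical client record. */
struct client {
    int id;
};

/* Hypothetical reclaim record: one per directory found at startup. */
struct reclaim_record {
    const char *recdir;     /* directory name */
    struct client *client;  /* set when a create or check operation matches */
};

/* At gracedone, keep the directory only if (1) a reclaim record exists
 * for it and (2) that record has been associated with a client. */
static bool keep_recdir(const struct reclaim_record *rec)
{
    return rec != NULL && rec->client != NULL;
}
```

With the pointer in place, the gracedone pass only has to look at each record once and no per-directory client search is required.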

nfsd: make nfs4_client_to_reclaim return a pointer to the reclaim record · 772a9bbb

Committed by Jeff Layton on Nov 12, 2012

Later callers will need to make changes to the record.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

772a9bbb

nfsd: break out reclaim record removal into separate function · ce30e539

Committed by Jeff Layton on Nov 12, 2012

We'll need to be able to call this from nfs4recover.c eventually.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

ce30e539

nfsd: have nfsd4_find_reclaim_client take a char * argument · 278c931c

Committed by Jeff Layton on Nov 12, 2012

Currently, it takes a client pointer, but later we're going to need to
search for these records without knowing whether a matching client even
exists.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

278c931c

nfsd: warn about impending removal of nfsdcld upcall · 8b0554e9

Committed by Jeff Layton on Nov 12, 2012

Let's shoot for removing the nfsdcld upcall in 3.10. Most likely,
no one is actually using it so I don't expect this warning to
fire often (except maybe on misconfigured systems).
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

8b0554e9

nfsd: pass info about the legacy recoverydir in environment variables · f3aa7e24

Committed by Jeff Layton on Nov 12, 2012

The usermodehelper upcall program can then decide to use this info as
a (one-way) transition mechanism to the new scheme. When a "check"
upcall occurs and the client doesn't exist in the database, we can
look to see whether the directory exists. If it does, then we'd add
the client to the database, remove the legacy recdir, and return
success to the kernel to allow the recovery to proceed.

For gracedone, we simply pass the v4recovery "topdir" so that the
upcall can clean it out prior to returning to the kernel.

A module parm is also added to disable the legacy conversion if
the admin chooses.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

f3aa7e24

nfsd: change heuristic for selecting the client_tracking_ops · 2d77bf0a

Committed by Jeff Layton on Nov 12, 2012

First, try to use the new usermodehelper upcall. It should succeed or
fail quickly, so there's little cost to doing so.

If it fails, and the legacy tracking dir exists, use that. If it
doesn't exist then fall back to using nfsdcld.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

2d77bf0a
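The selection order described in this commit reads naturally as a small decision function. The sketch below is illustrative only: the enum, the function name and the two boolean inputs are hypothetical stand-ins for the real initialization probes.

```c
#include <stdbool.h>

/* Hypothetical labels for the three tracking back ends. */
enum tracker {
    TRACKER_UMH,     /* new usermodehelper upcall */
    TRACKER_LEGACY,  /* legacy recovery directory */
    TRACKER_CLD      /* nfsdcld upcall */
};

/* umh_init_ok: did the usermodehelper upcall initialize successfully?
 * legacy_dir_exists: is the legacy tracking directory present?
 * Both are assumptions standing in for the real checks. */
static enum tracker choose_tracker(bool umh_init_ok, bool legacy_dir_exists)
{
    if (umh_init_ok)
        return TRACKER_UMH;      /* preferred: cheap to try first */
    if (legacy_dir_exists)
        return TRACKER_LEGACY;   /* keep using what is already there */
    return TRACKER_CLD;          /* last resort */
}
```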

nfsd: add a usermodehelper upcall for NFSv4 client ID tracking · 2873d214

Committed by Jeff Layton on Nov 12, 2012

Add a new client tracker upcall type that uses call_usermodehelper to
call out to a program. This seems to be the preferred method of
calling out to usermode these days for seldom-called upcalls. It's
simple and doesn't require a running daemon, so it should "just work"
as long as the binary is installed.

The client tracking exit operation is also changed to check for a
NULL pointer before running. The UMH upcall doesn't need to do anything
at module teardown time.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

2873d214

11 Nov, 2012 · 2 commits

nfsd: remove unused argument to nfs4_has_reclaimed_state · a0af710a

Committed by Jeff Layton on Nov 09, 2012

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

a0af710a

nfsd: fix error handling in nfsd4_remove_clid_dir · 698d8d87

Committed by Jeff Layton on Nov 09, 2012

If the credential save fails, then we'll leak our mnt_want_write_file
reference.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

698d8d87

08 Nov, 2012 · 13 commits

nfsd4: backchannel should use client-provided security flavor · 12fc3e92

Committed by J. Bruce Fields on Nov 05, 2012

For now this only adds support for AUTH_NULL.  (Previously we assumed
AUTH_UNIX.)  We'll also need AUTH_GSS, which is trickier.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

12fc3e92

nfsd4: common helper to initialize callback work · 57725155

Committed by J. Bruce Fields on Nov 05, 2012

I've found it confusing having the only references to
nfsd4_do_callback_rpc() in a different file.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

57725155

nfsd4: implement backchannel_ctl operation · cb73a9f4

Committed by J. Bruce Fields on Nov 01, 2012

This operation is mandatory for servers to implement.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

cb73a9f4

nfsd4: use callback security parameters in create_session · c6bb3ca2

Committed by J. Bruce Fields on Nov 01, 2012

We're currently ignoring the callback security parameters specified in
create_session, and just assuming the client wants auth_sys, because
that's all the current linux client happens to care about.  But this
could cause callbacks to fail to a client that wanted something
different.

For now, all we're doing is no longer ignoring the uid and gid passed in
the auth_sys case.  Further patches will add support for auth_null and
gss (and possibly use more of the auth_sys information; the spec wants
us to use exactly the credential we're passed, though it's hard to
imagine why a client would care).
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

c6bb3ca2

nfsd4: clean up callback security parsing · acb2887e

Committed by J. Bruce Fields on Mar 27, 2012

Move the callback parsing into a separate function.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

acb2887e

nfsd: use vfs_fsync_range(), not O_SYNC, for stable writes · face1502

Committed by J. Bruce Fields on Oct 26, 2012

NFSv4 shares the same struct file across multiple writes.  (And we'd
like NFSv2 and NFSv3 to do that as well some day.)

So setting O_SYNC on the struct file as a way to request a synchronous
write doesn't work.

Instead, do a vfs_fsync_range() in that case.
Reported-by: Peter Staubach <pstaubach@exagrid.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

face1502
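A userspace analogue may help show why a per-write sync is preferable to a flag on a shared file object. This is only a sketch: `write_maybe_stable` is a hypothetical helper, and `fsync()` syncs the whole file, whereas the kernel's `vfs_fsync_range()` can limit the sync to the byte range just written.

```c
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Write len bytes at offset off. When the caller asks for a stable
 * write, flush to storage afterwards instead of relying on O_SYNC
 * being set on the (possibly shared) open file.
 * Returns the byte count written, or -1 on error. */
static ssize_t write_maybe_stable(int fd, const void *buf, size_t len,
                                  off_t off, bool stable)
{
    ssize_t n = pwrite(fd, buf, len, off);

    if (n < 0)
        return -1;
    if (stable && fsync(fd) < 0)
        return -1;
    return n;
}
```

Because the decision is made per call, two writers sharing the same descriptor can request different stability levels without interfering with each other.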

nfsd: assume writeable exportabled filesystems have f_sync · fae5096a

Committed by J. Bruce Fields on Oct 26, 2012

I don't really see how you could claim to support nfsd and not support
fsync somehow.

And in practice a quick look through the exportable filesystems suggests
the only ones without an ->fsync are read-only (efs, isofs, squashfs) or
in-memory (shmem).

Also, performing a write and then returning an error if the sync fails
(as we would do here in the wgather case) seems unhelpful to clients.

Also remove an incorrect comment.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

fae5096a

nfsd4: don't BUG in delegation break callback · 7fa10cd1

Committed by J. Bruce Fields on Oct 16, 2012

These conditions would indeed indicate bugs in the code, but if we want
to hear about them we're likely better off warning and returning than
immediately dying while holding file_lock_lock.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

7fa10cd1

nfsd4: remove unused init_session return · 7c1f8b65

Committed by J. Bruce Fields on Nov 01, 2012

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

7c1f8b65

nfsd4: helper function for getting mounted_on ino · ae7095a7

Committed by J. Bruce Fields on Oct 01, 2012

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

ae7095a7

nfs: fix wrong object type in lockowner_slab · 3c40794b

Committed by Yanchuan Nian on Oct 24, 2012

The object type in the cache of lockowner_slab is wrong, and it is
better to fix it.

Cc: stable@vger.kernel.org
Signed-off-by: Yanchuan Nian <ycnian@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

3c40794b

nfsd4: remove unused variable in nfsd4_delegreturn() · 01f6c8fd

Committed by Wei Yongjun on Oct 18, 2012

The variable inode is initialized but never used
otherwise, so remove the unused variable.

dpatch engine is used to auto generate this patch.
(https://github.com/weiyj/dpatch)
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

01f6c8fd

exportfs: add FILEID_INVALID to indicate invalid fid_type · 216b6cbd

Committed by Namjae Jeon on Aug 29, 2012

This commit adds FILEID_INVALID = 0xff in fid_type to
indicate an invalid fid_type.

It avoids using the magic number 255.
Signed-off-by: Namjae Jeon <linkinjeon@gmail.com>
Signed-off-by: Vivek Trivedi <vtrivedi018@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

216b6cbd
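To show what replacing the magic number buys, here is a minimal sketch. Only the `FILEID_INVALID = 0xff` value comes from the commit message; the enum is abbreviated, and `encode_fid` is a hypothetical encoder, not a kernel function.

```c
#include <stddef.h>

/* Abbreviated file-ID type enum; the full list is longer. */
enum fid_type {
    FILEID_ROOT = 0,
    FILEID_INO32_GEN = 1,
    FILEID_INVALID = 0xff   /* named constant instead of a bare 255 */
};

/* Hypothetical encoder: stores an inode number and generation into fh
 * when there is room for two words, otherwise reports failure with the
 * named constant. */
static int encode_fid(unsigned int *fh, size_t max_words,
                      unsigned int ino, unsigned int gen)
{
    if (max_words < 2)
        return FILEID_INVALID;
    fh[0] = ino;
    fh[1] = gen;
    return FILEID_INO32_GEN;
}
```

Callers can then compare against `FILEID_INVALID` by name, which makes the failure path easier to find when reading or searching the code.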

20 Oct, 2012 · 1 commit

hold task->mempolicy while numa_maps scans. · 9e781440

Committed by KAMEZAWA Hiroyuki on Oct 19, 2012

  /proc/<pid>/numa_maps scans vma and show mempolicy under
  mmap_sem. It sometimes accesses task->mempolicy which can
  be freed without mmap_sem and numa_maps can show some
  garbage while scanning.

This patch tries to take reference count of task->mempolicy at reading
numa_maps before calling get_vma_policy(). By this, task->mempolicy
will not be freed until numa_maps reaches its end.

V2->v3
  -  updated comments to be more verbose.
  -  removed task_lock() in numa_maps code.
V1->V2
  -  access task->mempolicy only once and remember it.  Because kernel/exit.c
     can overwrite it.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

9e781440

19 Oct, 2012 · 1 commit

fs, xattr: fix bug when removing a name not in xattr list · 43385846

Committed by David Rientjes on Oct 17, 2012

Commit 38f38657 ("xattr: extract simple_xattr code from tmpfs") moved
some code from tmpfs but introduced a subtle bug along the way.

If the name passed to simple_xattr_remove() does not exist in the list of
xattrs, then it is possible to call kfree(new_xattr) when new_xattr is
actually initialized to itself on the stack via uninitialized_var().

This causes a BUG() since the memory was not allocated via the slab
allocator and was not bypassed through to the page allocator because it
was too large.

Initialize the local variable to NULL so the kfree() never takes place.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Hugh Dickins <hughd@google.com>
Acked-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

43385846
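The class of bug fixed here (freeing a pointer that was never assigned) has a simple userspace analogue. The sketch below is not the kernel function: `struct xattr` and `list_remove` are hypothetical, and `free()` stands in for `kfree()`. The point it illustrates is that starting the pointer at NULL makes the unconditional free safe when the name is not found.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical list node; name is borrowed, not owned. */
struct xattr {
    struct xattr *next;
    const char *name;
};

/* Unlink and free the entry called name.
 * Returns 0 if it was removed, -1 if no such entry exists. */
static int list_remove(struct xattr **head, const char *name)
{
    struct xattr *victim = NULL;  /* must start as NULL: free(NULL) is a no-op */

    for (struct xattr **link = head; *link; link = &(*link)->next) {
        if (strcmp((*link)->name, name) == 0) {
            victim = *link;
            *link = victim->next;
            break;
        }
    }

    int found = victim != NULL;
    free(victim);                 /* safe on both the found and not-found paths */
    return found ? 0 : -1;
}
```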

17 Oct, 2012 · 3 commits

NLM: nlm_lookup_file() may return NLMv4-specific error codes · cd0b16c1

Committed by Trond Myklebust on Oct 13, 2012

If the filehandle is stale, or open access is denied for some reason,
nlm_fopen() may return one of the NLMv4-specific error codes nlm4_stale_fh
or nlm4_failed. These get passed right through nlm_lookup_file(),
and so when nlmsvc_retrieve_args() calls the latter, it needs to filter
the result through the cast_status() machinery.

Failure to do so will trigger the BUG_ON() in encode_nlm_stat...
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Reported-by: Larry McVoy <lm@bitmover.com>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

cd0b16c1

mm, mempolicy: fix printing stack contents in numa_maps · 32f8516a

Committed by David Rientjes on Oct 16, 2012

When reading /proc/pid/numa_maps, it's possible to return the contents of
the stack where the mempolicy string should be printed if the policy gets
freed from beneath us.

This happens because mpol_to_str() may return an error, and the
stack-allocated buffer is then printed without ever having been written.

There are two possible error conditions in mpol_to_str():

 - if the buffer allocated is insufficient for the string to be stored,
   and

 - if the mempolicy has an invalid mode.

The first error condition is not triggered in any of the callers to
mpol_to_str(): at least 50 bytes is always allocated on the stack and this
is sufficient for the string to be written.  A future patch should convert
this into BUILD_BUG_ON() since we know the maximum strlen possible, but
that's not -rc material.

The second error condition is possible if a race occurs in dropping a
reference to a task's mempolicy causing it to be freed during the read().
The slab poison value is then used for the mode and mpol_to_str() returns
-EINVAL.

This race is only possible because get_vma_policy() believes that
mm->mmap_sem protects task->mempolicy, which isn't true.  The exit path
does not hold mm->mmap_sem when dropping the reference or setting
task->mempolicy to NULL: it uses task_lock(task) instead.

Thus, it's required for the caller of a task mempolicy to hold
task_lock(task) while grabbing the mempolicy and reading it.  Callers with
a vma policy store their mempolicy earlier and can simply increment the
reference count so it's guaranteed not to be freed.
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

32f8516a

fix a leak in replace_fd() users · 45525b26

Committed by Al Viro on Oct 16, 2012

replace_fd() began with "eats a reference, tries to insert into
descriptor table" semantics; at some point I'd switched it to
much saner current behaviour ("try to insert into descriptor
table, grabbing a new reference if inserted; caller should do
fput() in any case"), but forgot to update the callers.
Mea culpa...

[Spotted by Pavel Roskin, who has really weird system with pipe-fed
coredumps as part of what he considers a normal boot ;-)]
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

45525b26
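The ownership rule quoted in this commit ("try to insert, grabbing a new reference if inserted; caller should do fput() in any case") can be modelled with a toy reference count. Everything here is hypothetical: `struct file`, `get_file`, `put_file`, `install` and `caller` are simplified stand-ins, and a real `put` would free the object at zero.

```c
#include <stddef.h>

/* Toy refcounted object. */
struct file {
    int refs;
};

static void get_file(struct file *f) { f->refs++; }
static void put_file(struct file *f) { f->refs--; } /* real code frees at 0 */

#define TABLE_SIZE 4
static struct file *table[TABLE_SIZE];

/* Insert f into a slot. On success the table takes its OWN reference;
 * the caller's reference is never consumed, on success or failure. */
static int install(int slot, struct file *f)
{
    if (slot < 0 || slot >= TABLE_SIZE)
        return -1;
    if (table[slot])
        put_file(table[slot]);  /* drop the reference of the old occupant */
    get_file(f);
    table[slot] = f;
    return slot;
}

/* Correct caller pattern: f arrives with one reference held by us, and
 * we drop it whether or not the insert worked. Skipping the put on the
 * success path is the kind of leak the commit describes. */
static int caller(int slot, struct file *f)
{
    int ret = install(slot, f);

    put_file(f);
    return ret;
}
```

After a successful call the only remaining reference belongs to the table; after a failed call the count returns to zero, so nothing leaks in either case.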

13 Oct, 2012 · 9 commits

procfs: don't need a PATH_MAX allocation to hold a string representation of an int · f81700bd

Committed by Jeff Layton on Oct 10, 2012

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

f81700bd
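The commit title's point, that a decimal int needs only a few bytes, can be made concrete. This is a sketch, not the kernel change: `INT_STR_MAX` and `format_int` are hypothetical names, and the bound assumes 8-bit bytes (at most three decimal digits per byte, plus a sign and a terminating NUL).

```c
#include <stdio.h>

/* Upper bound on the bytes needed to print an int in decimal:
 * 3 digits per byte is a safe overestimate, +1 for '-', +1 for NUL. */
#define INT_STR_MAX (3 * sizeof(int) + 2)

/* Format value into buf. Returns the string length, or -1 if it would
 * not fit in size bytes. */
static int format_int(int value, char *buf, size_t size)
{
    int n = snprintf(buf, size, "%d", value);

    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

On a platform with a 4-byte int this gives a 14-byte buffer, which is far smaller than a PATH_MAX-sized allocation and can live on the stack.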

vfs: embed struct filename inside of names_cache allocation if possible · 7950e385

Committed by Jeff Layton on Oct 10, 2012

In the common case where a name is much smaller than PATH_MAX, an extra
allocation for struct filename is unnecessary. Before allocating a
separate one, try to embed the struct filename inside the buffer first. If
it turns out that that's not long enough, then fall back to allocating a
separate struct filename and redoing the copy.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

7950e385
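The embed-then-fall-back allocation strategy can be sketched in userspace. This is an illustration under assumptions, not the kernel implementation: `struct fname`, `fname_get`, `fname_put` and the tiny `BUF_SIZE` are hypothetical, and the sketch checks the length up front, whereas the commit message describes copying first and redoing the copy when the embedded layout turns out to be too short.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define BUF_SIZE 64  /* hypothetical stand-in for a PATH_MAX-sized buffer */

struct fname {
    const char *name;
    bool separate;   /* true when the header has its own allocation */
};

static struct fname *fname_get(const char *src)
{
    size_t len = strlen(src);
    struct fname *fn;
    char *buf;

    if (len >= BUF_SIZE)
        return NULL;                     /* name too long for the buffer */
    buf = malloc(BUF_SIZE);
    if (!buf)
        return NULL;

    if (sizeof(*fn) + len + 1 <= BUF_SIZE) {
        /* Common case: header at the front, string right behind it,
         * so a single allocation serves both. */
        char *dst = buf + sizeof(*fn);

        fn = (struct fname *)buf;
        memcpy(dst, src, len + 1);
        fn->name = dst;
        fn->separate = false;
        return fn;
    }

    /* Fallback: the string needs the whole buffer, so allocate the
     * header separately. */
    fn = malloc(sizeof(*fn));
    if (!fn) {
        free(buf);
        return NULL;
    }
    memcpy(buf, src, len + 1);
    fn->name = buf;
    fn->separate = true;
    return fn;
}

static void fname_put(struct fname *fn)
{
    if (!fn)
        return;
    if (fn->separate)
        free((char *)fn->name);          /* string buffer was separate */
    free(fn);                            /* header (and embedded string) */
}
```

Short names, which are the common case, cost one allocation; only names close to the buffer limit pay for the second one.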

audit: make audit_inode take struct filename · adb5c247

Committed by Jeff Layton on Oct 10, 2012

Keep a pointer to the audit_names "slot" in struct filename.

Have all of the audit_inode callers pass a struct filename pointer to
audit_inode instead of a string pointer. If the aname field is already
populated, then we can skip walking the list altogether and just use it
directly.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

adb5c247

vfs: make path_openat take a struct filename pointer · 669abf4e

Committed by Jeff Layton on Oct 10, 2012

...and fix up the callers. For do_file_open_root, just declare a
struct filename on the stack and fill out the .name field. For
do_filp_open, make it also take a struct filename pointer, and fix up its
callers to call it appropriately.

For filp_open, add a variant that takes a struct filename pointer and turn
filp_open into a wrapper around it.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

669abf4e

vfs: turn do_path_lookup into wrapper around struct filename variant · 873f1eed

Committed by Jeff Layton on Oct 10, 2012

...and make the user_path callers use that variant instead.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

873f1eed

audit: allow audit code to satisfy getname requests from its names_list · 7ac86265

Committed by Jeff Layton on Oct 10, 2012

Currently, if we call getname() on a userland string more than once,
we'll get multiple copies of the string and multiple audit_names
records.

Add a function that will allow the audit_names code to satisfy getname
requests using info from the audit_names list, avoiding a new allocation
and audit_names records.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

7ac86265

vfs: define struct filename and have getname() return it · 91a27b2a

Committed by Jeff Layton on Oct 10, 2012

getname() is intended to copy pathname strings from userspace into a
kernel buffer. The result is just a string in kernel space. It would
however be quite helpful to be able to attach some ancillary info to
the string.

For instance, we could attach some audit-related info to reduce the
amount of audit-related processing needed. When auditing is enabled,
we could also call getname() on the string more than once and not
need to recopy it from userspace.

This patchset converts the getname()/putname() interfaces to return
a struct instead of a string. For now, the struct just tracks the
string in kernel space and the original userland pointer for it.

Later, we'll add other information to the struct as it becomes
convenient.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

91a27b2a

btrfs: Fix compilation with user namespace support enabled · e9069f47

Committed by Eric W. Biederman on Oct 12, 2012

When compiling with user namespace support btrfs fails like:

fs/btrfs/tree-log.c: In function ‘fill_inode_item’:
fs/btrfs/tree-log.c:2955:2: error: incompatible type for argument 3 of ‘btrfs_set_inode_uid’
fs/btrfs/ctree.h:2026:1: note: expected ‘u32’ but argument is of type ‘kuid_t’
fs/btrfs/tree-log.c:2956:2: error: incompatible type for argument 3 of ‘btrfs_set_inode_gid’
fs/btrfs/ctree.h:2027:1: note: expected ‘u32’ but argument is of type ‘kgid_t’

Fix this by using i_uid_read and i_gid_read in

Cc: Chris Mason <chris.mason@fusionio.com>
Cc: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

e9069f47

userns: Fix posix_acl_file_xattr_userns gid conversion · ea1fd777

Committed by Eric W. Biederman on Oct 09, 2012

The code needs to be from_kgid(make_kgid(...)...) not
from_kuid(make_kgid(...)...). Doh!
Reported-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

ea1fd777
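Mix-ups between user and group IDs are easier to catch when the two are distinct types. The sketch below reuses the kernel's function names for readability, but the signatures are simplified assumptions: the real helpers also take a user namespace argument, which is omitted here.

```c
/* Distinct wrapper types: a kuid_t cannot be passed where a kgid_t is
 * expected, even though both hold an unsigned int. */
typedef struct { unsigned int val; } kuid_t;
typedef struct { unsigned int val; } kgid_t;

/* Simplified conversions (assumption: no namespace mapping applied). */
static kuid_t make_kuid(unsigned int uid) { kuid_t k = { uid }; return k; }
static kgid_t make_kgid(unsigned int gid) { kgid_t k = { gid }; return k; }

static unsigned int from_kuid(kuid_t k) { return k.val; }
static unsigned int from_kgid(kgid_t k) { return k.val; }

/* Round trips must pair the matching helpers:
 *   from_kuid(make_kuid(x))  and  from_kgid(make_kgid(x)).
 * With these struct types, from_kuid(make_kgid(x)) does not compile. */
static unsigned int roundtrip_gid(unsigned int gid)
{
    return from_kgid(make_kgid(gid));
}

static unsigned int roundtrip_uid(unsigned int uid)
{
    return from_kuid(make_kuid(uid));
}
```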

12 Oct, 2012 · 2 commits

vfs: unexport getname and putname symbols · 8e377d15

Committed by Jeff Layton on Oct 10, 2012

I see no callers in module code.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

8e377d15

audit: overhaul __audit_inode_child to accomodate retrying · 4fa6b5ec

Committed by Jeff Layton on Oct 10, 2012

In order to accommodate retrying path-based syscalls, we need to add a
new "type" argument to audit_inode_child. This will tell us whether
we're looking for a child entry that represents a create or a delete.

If we find a parent, don't automatically assume that we need to create a
new entry. Instead, use the information we have to try to find an
existing entry first. Update it if one is found and create a new one if
not.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

4fa6b5ec