Commits · db6e182c17cb1a7069f7f8924721ce58ac05d9a3 · openanolis / cloud-kernel

11 Dec, 2012 · 5 commits

nfsd: pass net to nfsd_init_socks() · db6e182c

Authored by Stanislav Kinsbursky on Dec 10, 2012

Precursor patch. The hard-coded "init_net" will be replaced by the proper one in
the future.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd: use "init_net" for portmapper · f7fb86c6

Authored by Stanislav Kinsbursky on Dec 10, 2012

NFSd could be started in one network namespace but stopped in another one.
This will trigger a kernel panic, because the RPCBIND client is stored in per-net
NFSd data, and will be NULL on NFSd shutdown.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd: avoid permission checks on EXCLUSIVE_CREATE replay · 7007c90f

Authored by Neil Brown on Dec 07, 2012

With NFSv4, if we create a file then open it we explicitly avoid checking
the permissions on the file during the open because the fact that we
created it ensures we should be allowed to open it (the create and the
open should appear to be a single operation).

However if the reply to an EXCLUSIVE create gets lost and the client
resends the create, the current code will perform the permission check -
because it doesn't realise that it did the open already.

This patch should fix this.

Note that I haven't actually seen this cause a problem.  I was just
looking at the code trying to figure out a different EXCLUSIVE open
related issue, and this looked wrong.

(Fix confirmed with pynfs 4.0 test OPEN4--bfields)

Cc: stable@kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>
[bfields: use OWNER_OVERRIDE and update for 4.1]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

SUNRPC: remove redundant "linux/nsproxy.h" includes · 756933ee

Authored by Stanislav Kinsbursky on Dec 04, 2012

This is a cleanup patch.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd: make NFSv4 recovery client tracking options per net · 9a9c6478

Authored by Stanislav Kinsbursky on Dec 04, 2012

The pointer to the client tracking operations, client_tracking_ops, has to be
containerized, because different environments can support different trackers
(for example, the legacy tracker is currently not supported in a container).

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

04 Dec, 2012 · 6 commits

nfsd4: lockt, release_lockowner should renew clients · 9b2ef62b

Authored by J. Bruce Fields on Dec 03, 2012

Fix nfsd4_lockt and release_lockowner to look up the referenced client,
so that it can renew it, or correctly return "expired", as appropriate.

Also share some code while we're here.

Reported-by: Frank Filz <ffilzlnx@us.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

svcrpc: support multiple-fragment rpc's · 836fbadb

Authored by J. Bruce Fields on Dec 03, 2012

Over TCP, RPC's are preceded by a single 4-byte field telling you how
long the rpc is (in bytes). The spec also allows you to send an RPC in
multiple such records (the high bit of the length field is used to tell
you whether this is the final record).

We've survived for years without supporting this because in practice the
clients we care about don't use it. But the userland rpc libraries do,
and every now and then an experimental client will run into this. (Most
recently I noticed it while trying to write a pynfs check.) And we're
really on the wrong side of the spec here--let's fix this.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
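
The record marking described in this commit message can be sketched in user-space C. This is only an illustration, not the kernel code: the function name `rpc_message_length` and the macro names are hypothetical, and the sketch assumes the whole byte stream is already in one buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names: top bit marks the final fragment, low 31 bits are the length. */
#define LAST_FRAGMENT_BIT 0x80000000u
#define FRAGMENT_LEN_MASK 0x7fffffffu
#define MARKER_LEN 4

/*
 * Walk a buffer of concatenated fragments and return the total number of
 * RPC data bytes once the final fragment has been seen, or -1 if the
 * buffer ends before the message is complete.
 */
int64_t rpc_message_length(const uint8_t *buf, size_t len)
{
    size_t pos = 0;
    int64_t total = 0;

    while (len - pos >= MARKER_LEN) {
        /* The marker is big-endian on the wire; decode it byte by byte. */
        uint32_t marker = ((uint32_t)buf[pos] << 24) |
                          ((uint32_t)buf[pos + 1] << 16) |
                          ((uint32_t)buf[pos + 2] << 8) |
                          (uint32_t)buf[pos + 3];
        uint32_t frag_len = marker & FRAGMENT_LEN_MASK;

        pos += MARKER_LEN;
        if (len - pos < frag_len)
            return -1;          /* fragment body not fully received */
        pos += frag_len;
        total += frag_len;
        if (marker & LAST_FRAGMENT_BIT)
            return total;       /* final fragment: message complete */
    }
    return -1;                  /* ran out of data before the final fragment */
}
```

A 2-byte fragment followed by a final 1-byte fragment gives a total of 3 data bytes; a buffer that stops before the final fragment gives -1.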

svcrpc: track rpc data length separately from sk_tcplen · 8af345f5

Authored by J. Bruce Fields on Dec 03, 2012

Keep a separate field, sk_datalen, that tracks only the data contained
in a fragment, not including the fragment header.

For now, this is always just max(0, sk_tcplen - 4), but after we allow
multiple fragments sk_datalen will accumulate the total rpc data size
while sk_tcplen only tracks progress receiving the current fragment.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
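
As a small illustration of the `max(0, sk_tcplen - 4)` relationship for the single-fragment case, here is a user-space sketch. The function name is hypothetical and an unsigned length type is assumed; with unsigned arithmetic a plain `tcplen - 4` would wrap around for short reads, so the sketch compares before subtracting.

```c
#include <stddef.h>

#define FRAGMENT_HEADER_LEN 4   /* size of the record marker */

/* Bytes of RPC data received so far, excluding the 4-byte fragment header. */
size_t fragment_data_len(size_t tcplen)
{
    return tcplen > FRAGMENT_HEADER_LEN ? tcplen - FRAGMENT_HEADER_LEN : 0;
}
```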

svcrpc: fix off-by-4 error in "incomplete TCP record" dprintk · 6a72ae2e

Authored by J. Bruce Fields on Dec 03, 2012

The full reclen doesn't include the fragment header, but sk_tcplen does.
Fix this to make it an apples-to-apples comparison.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

svcrpc: delay minimum-rpc-size check till later · ad46ccf0

Authored by J. Bruce Fields on Dec 03, 2012

Soon we want to support multiple fragments, in which case it may be
legal for a single fragment to be smaller than 8 bytes, so we'll want to
delay this check till we've reached the last fragment.

Also fix an outdated comment.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

svcrpc: don't byte-swap sk_reclen in place · cc248d4b

Authored by J. Bruce Fields on Dec 03, 2012

Byte-swapping in place is always a little dubious.

Let's instead define this field to always be big-endian, and do the
swapping on demand where we need it.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
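
The "always big-endian, swap on demand" idea can be sketched as follows. This is a user-space illustration with hypothetical struct, field, and helper names, not the kernel's definitions: the stored field is never modified, and every reader goes through an accessor that converts it.

```c
#include <stdbool.h>
#include <stdint.h>
#include <arpa/inet.h>   /* htonl, ntohl */

#define LAST_FRAGMENT_BIT 0x80000000u
#define FRAGMENT_LEN_MASK 0x7fffffffu

/* Hypothetical per-socket state: the marker stays in network byte order. */
struct sock_state {
    uint32_t reclen_be;
};

/* Fragment length in host byte order, converted at the point of use. */
uint32_t sock_reclen(const struct sock_state *s)
{
    return ntohl(s->reclen_be) & FRAGMENT_LEN_MASK;
}

/* True when the marker flags the final fragment of the record. */
bool sock_final_rec(const struct sock_state *s)
{
    return (ntohl(s->reclen_be) & LAST_FRAGMENT_BIT) != 0;
}
```

Because the accessors take a `const` pointer, the stored value has a single, fixed meaning; no code path can observe it half-converted.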

03 Dec, 2012 · 10 commits

NFSD: Forget state for a specific client · 6c1e82a4

Authored by Bryan Schumaker on Nov 29, 2012

Write the client's IP address to any state file and all appropriate
state for that client will be forgotten.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

NFSD: Add a custom file operations structure for fault injection · d7cc431e

Authored by Bryan Schumaker on Nov 29, 2012

Controlling the read and write functions allows me to add in "forget
client w.x.y.z", since we won't be limited to reading and writing only
u64 values.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

NFSD: Reading a fault injection file prints a state count · 184c1847

Authored by Bryan Schumaker on Nov 29, 2012

I also log basic information that I can figure out about the type of
state (such as number of locks for each client IP address).  This can be
useful for checking that state was actually dropped and later for
checking if the client was able to recover.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

NFSD: Fault injection operations take a per-client forget function · 8ce54e0d

Authored by Bryan Schumaker on Nov 29, 2012

The eventual goal is to forget state based on IP address, so it makes
sense to call this function in a for-each-client loop until the correct
amount of state is forgotten. I also use this patch as an opportunity
to rename the forget function from "func()" to "forget()".

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

NFSD: Clean up forgetting and recalling delegations · 269de30f

Authored by Bryan Schumaker on Nov 29, 2012

Once I have a client, I can easily use its delegation list rather than
searching the file hash table for delegations to remove.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

NFSD: Clean up forgetting openowners · 4dbdbda8

Authored by Bryan Schumaker on Nov 29, 2012

Using "forget_n_state()" forces me to implement the code needed to
forget a specific client's openowners.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

NFSD: Clean up forgetting locks · fc29171f

Authored by Bryan Schumaker on Nov 29, 2012

I use the new "forget_n_state()" function to iterate through each client
first when searching for locks.  This may slow down forgetting locks a
little bit, but it implements most of the code needed to forget a
specified client's locks.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

NFSD: Clean up forgetting clients · 44e34da6

Authored by Bryan Schumaker on Nov 29, 2012

I added in a generic for-each loop that takes a pass over the client_lru
list for the current net namespace and calls some function. The next few
patches will update other operations to use this function as well. A value
of 0 still means "forget everything that is found".

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
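
The loop described in this commit message can be sketched roughly as below. It is a user-space illustration under stated assumptions, not the nfsd implementation: clients sit in a plain array rather than the client_lru list, and `struct client`, `forget_locks`, and `for_each_client` are hypothetical names. A limit of 0 means "forget everything that is found".

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a client and the state it holds. */
struct client {
    uint64_t nr_locks;
};

/* A per-client forget function: drops up to `limit` items (0 = all). */
typedef uint64_t (*forget_fn)(struct client *clp, uint64_t limit);

uint64_t forget_locks(struct client *clp, uint64_t limit)
{
    uint64_t n = clp->nr_locks;

    if (limit != 0 && n > limit)
        n = limit;
    clp->nr_locks -= n;
    return n;                   /* how much state was actually forgotten */
}

/* Call `fn` on each client until `max` items are forgotten (0 = no cap). */
uint64_t for_each_client(struct client *clients, size_t nr,
                         uint64_t max, forget_fn fn)
{
    uint64_t count = 0;

    for (size_t i = 0; i < nr; i++) {
        /* count < max here whenever max != 0, so this never wraps. */
        uint64_t remaining = max ? max - count : 0;

        count += fn(&clients[i], remaining);
        if (max != 0 && count >= max)
            break;
    }
    return count;
}
```

Passing the per-client function as a pointer is what lets the same loop serve locks, openowners, and delegations in turn.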

NFSD: Lock state before calling fault injection function · 04395839

Authored by Bryan Schumaker on Nov 29, 2012

Each function touches state in some way, so getting the lock earlier
can help simplify code.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: discard some unused nfsd4_verify xdr code · e5f95703

Authored by J. Bruce Fields on Nov 30, 2012

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

29 Nov, 2012 · 1 commit

NFSD: Fold fault_inject.h into state.h · f3c7521f

Authored by Bryan Schumaker on Nov 27, 2012

There were only a small number of functions in this file and since they
all affect stored state I think it makes sense to put them in state.h
instead.  I also dropped most static inline declarations since there are
no callers when fault injection is not enabled.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

28 Nov, 2012 · 13 commits

nfsd: make NFSv4 grace time per net · 5284b44e

Authored by Stanislav Kinsbursky on Nov 27, 2012

Grace time is a part of the NFSv4 state engine, which is constructed per network
namespace.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd: make NFSv4 lease time per net · 3d733711

Authored by Stanislav Kinsbursky on Nov 27, 2012

Lease time is a part of the NFSv4 state engine, which is constructed per network
namespace.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd: remove redundant declarations · 864aee5c

Authored by Stanislav Kinsbursky on Nov 27, 2012

This is a cleanup patch. Functions nfsd_pool_stats_open() and
nfsd_pool_stats_release() are declared in fs/nfsd/nfsd.h.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd: recovery - make in_grace per net · f141f79d

Authored by Stanislav Kinsbursky on Nov 26, 2012

The in_grace flag is a part of client tracking state, which is network namespace
aware. So let's replace the global static variable with a per-net one.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd: recovery - make rec_file per net · 3a073369

Authored by Stanislav Kinsbursky on Nov 26, 2012

Opening and closing of this file is done in client tracking init and exit
operations.
Client tracking is done in network namespace context already. So let's make
this file opened and closed per network context - this will simplify its
management.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd: call state init and shutdown twice · f252bc68

Authored by Stanislav Kinsbursky on Nov 26, 2012

Split NFSv4 state init and shutdown into two different calls: a per-net one and
a generic one.
The per-net init/shutdown pair has to be called for any namespace, the generic
pair only once, on NFSd kthreads start and shutdown respectively.

Refresh of diff-nfsd-call-state-init-twice

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd: cleanup NFSd state start a bit · d85ed443

Authored by Stanislav Kinsbursky on Nov 26, 2012

This patch renames nfs4_state_start_net() into nfs4_state_create_net(), where
get_net() is now performed.
It also introduces a new nfs4_state_start_net(), which is now responsible for
state creation and initializing all per-net data and which is now called from
nfs4_state_start().

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd: cleanup NFSd state shutdown a bit · 4dce0ac9

Authored by Stanislav Kinsbursky on Nov 26, 2012

This patch renames __nfs4_state_shutdown_net() into nfs4_state_shutdown_net(),
__nfs4_state_shutdown() into nfs4_state_shutdown_net() and moves all network
related shutdown operations to nfs4_state_shutdown_net().

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd: make delegations shutdown network namespace aware · 4e37a7c2

Authored by Stanislav Kinsbursky on Nov 26, 2012

NFSv4 delegations are stored in a global list. But they are nfs4_client
dependent, which is network namespace aware already.
State shutdown and laundromat are done per network namespace as well.
So, delegation unhashing has to be done in network namespace context.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd: make client_lock per net · c9a49628

Authored by Stanislav Kinsbursky on Nov 26, 2012

This lock protects the client lru list and session hash table, which are
allocated per network namespace already.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: remove state lock from nfs4_state_shutdown · ec28e02c

Authored by Stanislav Kinsbursky on Nov 21, 2012

Protection of __nfs4_state_shutdown() with nfs4_lock_state() looks redundant.

This function is called by the last NFSd thread on its exit, and the state lock
actually protects two functions (del_recall_lru is protected by recall_lock):
1) nfsd4_client_tracking_exit
2) __nfs4_state_shutdown_net

"nfsd4_client_tracking_exit" doesn't require state lock protection, because its
state can be modified only by tracker callbacks.
Here they are:
1) create: is called only from nfsd4_proc_compound.
2) remove: is called from either nfsd4_proc_compound or nfs4_laundromat.
3) check: is called only from nfsd4_proc_compound.
4) grace_done: is called only from nfs4_laundromat.

nfsd4_proc_compound is called only by an NFSd kthread, which is exiting right
now.
nfs4_laundromat is called by laundry_wq. But laundromat_work was canceled
already.

"__nfs4_state_shutdown_net" also doesn't require state lock protection,
because all NFSd kthreads are dead, and no race can happen with NFSd start,
because the "nfsd_up" flag is still set.
Moreover, all NFSd shutdown is protected by the global nfsd_mutex.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: remove state lock from nfsd4_load_reboot_recovery_data · dba88ba5

Authored by J. Bruce Fields on Nov 16, 2012

That function is only called under nfsd_mutex: we know that because the
only caller is nfsd_svc, via

        nfsd_svc
          nfsd_startup
            nfs4_state_start
              nfsd4_client_tracking_init
                client_tracking_ops->init == nfsd4_load_reboot_recovery_data

The shared state accessed here includes:

        - user_recovery_dirname: used here, modified only by
          nfs4_reset_recoverydir, which can be verified to only be
          called under nfsd_mutex.
        - filesystem state, protected by i_mutex (handwaving slightly
          here)
        - rec_file, reclaim_str_hashtbl, reclaim_str_hashtbl_size: other
          than here, used only from code called from nfsd or laundromat
          threads, both of which should be started only after this runs
          (see nfsd_svc) and stopped before this could run again (see
          nfsd_shutdown, called from nfsd_last_thread).

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: return badname, not inval, on "." or "..", or "/" · a36b1725

Authored by J. Bruce Fields on Nov 25, 2012

The spec requires badname, not inval, in these cases.

Some callers want us to return enoent, but I can see no justification
for that.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

26 Nov, 2012 · 5 commits

nfsd4: downgrade some fs/nfsd/nfs4state.c BUG's · 063b0fb9

Authored by J. Bruce Fields on Nov 25, 2012

Linus has pointed out that indiscriminate use of BUG's can make it
harder to diagnose bugs because they can bring a machine down, often
before we manage to get any useful debugging information to the logs.
(Consider, for example, a BUG() that fires in a workqueue, or while
holding a spinlock).

Most of these BUG's won't do much more than kill an nfsd thread, but it
would still probably be safer to get out the warning without dying.

There's still more of this to do in nfsd/.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: delay filling in write iovec array till after xdr decoding · ffe1137b

Authored by J. Bruce Fields on Nov 15, 2012

Our server rejects compounds containing more than one write operation.
It's unclear whether this is really permitted by the spec; with 4.0,
it's possibly OK, with 4.1 (which has clearer limits on compound
parameters), it's probably not OK. No client that we're aware of has
ever done this, but in theory it could be useful.

The source of the limitation: we need an array of iovecs to pass to the
write operation. In the worst case that array of iovecs could have
hundreds of elements (the maximum rwsize divided by the page size), so
it's too big to put on the stack, or in each compound op. So we instead
keep a single such array in the compound argument.

We fill in that array at the time we decode the xdr operation.

But we decode every op in the compound before executing any of them. So
once we've used that array we can't decode another write.

If we instead delay filling in that array till the time we actually
perform the write, we can reuse it.

Another option might be to switch to decoding compound ops one at a
time. I considered doing that, but it has a number of other side
effects, and I'd rather fix just this one problem for now.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
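
The "fill the array late so it can be reused" idea might look roughly like this in user-space C. Everything here is an assumption for illustration: `struct write_args`, `fill_iovec`, the 4096-byte page size, and the 256-entry array are hypothetical and do not mirror the nfsd data structures. At decode time only a pointer and a length are recorded per write; the single shared iovec array is populated just before each write runs.

```c
#include <stddef.h>
#include <sys/uio.h>   /* struct iovec */

#define PAGE_SZ  4096  /* assumed page size */
#define MAX_VECS 256   /* assumed capacity of the one shared array */

/* What decoding records for each write op: where the data is, and how much. */
struct write_args {
    char *data;
    size_t len;
};

/*
 * Populate the shared iovec array for one write, one page-sized chunk per
 * entry. Returns the number of entries used; data beyond max_vecs pages is
 * not covered by this sketch.
 */
size_t fill_iovec(const struct write_args *w, struct iovec *vec, size_t max_vecs)
{
    size_t n = 0, off = 0;

    while (off < w->len && n < max_vecs) {
        size_t chunk = w->len - off;

        if (chunk > PAGE_SZ)
            chunk = PAGE_SZ;
        vec[n].iov_base = w->data + off;
        vec[n].iov_len = chunk;
        off += chunk;
        n++;
    }
    return n;
}
```

Two decoded writes can then take turns with the same array: call `fill_iovec` for the first, perform that write, then call it again for the second.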

nfsd4: move more write parameters into xdr argument · 70cc7f75

Authored by J. Bruce Fields on Nov 16, 2012

In preparation for moving some of this elsewhere.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: reorganize write decoding · 5a80a54d

Authored by J. Bruce Fields on Nov 16, 2012

In preparation for moving some of it elsewhere.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: simplify reading of opnum · 8a61b18c

Authored by J. Bruce Fields on Nov 16, 2012

The comment here is totally bogus:
	- OP_WRITE + 1 is RELEASE_LOCKOWNER.  Maybe there was some older
	  version of the spec in which that served as a sort of
	  OP_ILLEGAL?  No idea, but it's clearly wrong now.
	- In any case, I can't see that the spec says anything about
	  what to do if the client sends us fewer ops than promised.
	  It's clearly nutty client behavior, and we should do
	  whatever's easiest: returning an xdr error (even though it
	  won't be consistent with the error on the last op returned)
	  seems fine to me.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

openanolis / cloud-kernel · synced successfully almost 2 years ago