Commits · 79f77bf9a4e3dd5ead006b8f17e7c4ff07d8374e · openanolis / cloud-kernel

18 Dec, 2012 1 commit

Authored by J. Bruce Fields on Dec 10, 2012

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

3a28e331

11 Dec, 2012 1 commit

SUNRPC: remove redundant "linux/nsproxy.h" includes · 756933ee

Authored by Stanislav Kinsbursky on Dec 04, 2012

This is a cleanup patch.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

756933ee

04 Dec, 2012 5 commits

svcrpc: support multiple-fragment rpc's · 836fbadb

Authored by J. Bruce Fields on Dec 03, 2012

Over TCP, RPC's are preceded by a single 4-byte field telling you how
long the rpc is (in bytes). The spec also allows you to send an RPC in
multiple such records (the high bit of the length field is used to tell
you whether this is the final record).

We've survived for years without supporting this because in practice the
clients we care about don't use it. But the userland rpc libraries do,
and every now and then an experimental client will run into this. (Most
recently I noticed it while trying to write a pynfs check.) And we're
really on the wrong side of the spec here--let's fix this.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

836fbadb
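
The record layout described in the commit above (a 4-byte big-endian field whose high bit marks the final record and whose remaining bits carry the length) can be illustrated with a minimal userspace sketch. The names `RM_LAST_FRAGMENT`, `RM_LENGTH_MASK`, `rm_length`, `rm_is_last` and `rpc_data_length` are hypothetical and are not taken from the kernel source; the sketch also assumes the whole message is already in one buffer, which the real server cannot assume.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* ntohl */

/* Hypothetical names for this sketch; not copied from the kernel source. */
#define RM_LAST_FRAGMENT 0x80000000u   /* high bit: this is the final record */
#define RM_LENGTH_MASK   0x7fffffffu   /* low 31 bits: record length in bytes */

/* Length of one record, given its 4-byte marker in network byte order. */
uint32_t rm_length(uint32_t marker_be)
{
    return ntohl(marker_be) & RM_LENGTH_MASK;
}

/* Nonzero if the marker flags the final record of the RPC. */
int rm_is_last(uint32_t marker_be)
{
    return (ntohl(marker_be) & RM_LAST_FRAGMENT) != 0;
}

/*
 * Walk a buffer of back-to-back records and return the total RPC data
 * length (headers excluded), or -1 if the buffer ends before the final
 * record has been seen in full.
 */
long rpc_data_length(const unsigned char *buf, size_t len)
{
    size_t off = 0;
    long total = 0;

    assert(buf != NULL);
    while (len - off >= 4) {
        uint32_t marker_be, fraglen;

        memcpy(&marker_be, buf + off, sizeof(marker_be));
        off += sizeof(marker_be);
        fraglen = rm_length(marker_be);
        if (fraglen > len - off)
            return -1;            /* record is truncated */
        total += fraglen;
        off += fraglen;
        if (rm_is_last(marker_be))
            return total;
    }
    return -1;                    /* ran out of data before the final record */
}
```

A two-record message (2 bytes, then a final record of 3 bytes) yields a total of 5 data bytes; a buffer cut off after the first record yields -1.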

svcrpc: track rpc data length separately from sk_tcplen · 8af345f5

Authored by J. Bruce Fields on Dec 03, 2012

Keep a separate field, sk_datalen, that tracks only the data contained
in a fragment, not including the fragment header.

For now, this is always just max(0, sk_tcplen - 4), but after we allow
multiple fragments sk_datalen will accumulate the total rpc data size
while sk_tcplen only tracks progress receiving the current fragment.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

8af345f5

svcrpc: fix off-by-4 error in "incomplete TCP record" dprintk · 6a72ae2e

Authored by J. Bruce Fields on Dec 03, 2012

The full reclen doesn't include the fragment header, but sk_tcplen does.
Fix this to make it an apples-to-apples comparison.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

6a72ae2e

svcrpc: delay minimum-rpc-size check till later · ad46ccf0

Authored by J. Bruce Fields on Dec 03, 2012

Soon we want to support multiple fragments, in which case it may be
legal for a single fragment to be smaller than 8 bytes, so we'll want to
delay this check till we've reached the last fragment.

Also fix an outdated comment.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

ad46ccf0

svcrpc: don't byte-swap sk_reclen in place · cc248d4b

Authored by J. Bruce Fields on Dec 03, 2012

Byte-swapping in place is always a little dubious.

Let's instead define this field to always be big-endian, and do the
swapping on demand where we need it.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

cc248d4b
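
The "always big-endian, swap on demand" idea from the commit above can be sketched in userspace as follows. This is an illustration under assumptions: `struct rec_state`, `rec_len` and `rec_is_last` are hypothetical stand-ins, not the kernel's structures or helpers. The point is that the stored field is never rewritten, so it cannot be read in the wrong byte order depending on whether a swap has already happened.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <arpa/inet.h>   /* ntohl, htonl */

/* Hypothetical stand-in for per-socket state; the field is ALWAYS big-endian. */
struct rec_state {
    uint32_t reclen_be;   /* raw record marker exactly as received */
};

/* Convert on demand; the stored field is never modified. */
uint32_t rec_len(const struct rec_state *st)
{
    assert(st != NULL);
    return ntohl(st->reclen_be) & 0x7fffffffu;
}

/* Nonzero if the high bit (final record) is set in the marker. */
int rec_is_last(const struct rec_state *st)
{
    assert(st != NULL);
    return (ntohl(st->reclen_be) & 0x80000000u) != 0;
}
```

Calling `rec_len()` any number of times returns the same value, which is not true of a helper that swaps the field in place. In kernel code such a field would typically be typed `__be32` so that sparse can flag a missing conversion.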

08 Nov, 2012 1 commit

svcrpc: demote some printks to a dprintk · 7032a3dd

Authored by J. Bruce Fields on Oct 09, 2012

In general I'd rather that random bad behavior on the network not trigger a
printk.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

7032a3dd

18 Oct, 2012 1 commit

SUNRPC: Prevent kernel stack corruption on long values of flush · 212ba906

Authored by Sasha Levin on Jul 17, 2012

The buffer size in read_flush() is too small for the longest possible values
for it. This can lead to a kernel stack corruption:

[   43.047329] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff833e64b4
[   43.047329]
[   43.049030] Pid: 6015, comm: trinity-child18 Tainted: G        W    3.5.0-rc7-next-20120716-sasha #221
[   43.050038] Call Trace:
[   43.050435]  [<ffffffff836c60c2>] panic+0xcd/0x1f4
[   43.050931]  [<ffffffff833e64b4>] ? read_flush.isra.7+0xe4/0x100
[   43.051602]  [<ffffffff810e94e6>] __stack_chk_fail+0x16/0x20
[   43.052206]  [<ffffffff833e64b4>] read_flush.isra.7+0xe4/0x100
[   43.052951]  [<ffffffff833e6500>] ? read_flush_pipefs+0x30/0x30
[   43.053594]  [<ffffffff833e652c>] read_flush_procfs+0x2c/0x30
[   43.053596]  [<ffffffff812b9a8c>] proc_reg_read+0x9c/0xd0
[   43.053596]  [<ffffffff812b99f0>] ? proc_reg_write+0xd0/0xd0
[   43.053596]  [<ffffffff81250d5b>] do_loop_readv_writev+0x4b/0x90
[   43.053596]  [<ffffffff81250fd6>] do_readv_writev+0xf6/0x1d0
[   43.053596]  [<ffffffff812510ee>] vfs_readv+0x3e/0x60
[   43.053596]  [<ffffffff812511b8>] sys_readv+0x48/0xb0
[   43.053596]  [<ffffffff8378167d>] system_call_fastpath+0x1a/0x1f
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

212ba906
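
The sizing problem in the commit above can be illustrated with a short sketch. It assumes the value is an `unsigned long` printed with a `"%lu\n"` format; `FLUSH_BUF_LEN` and `format_flush` are hypothetical names, and this is not the kernel's read_flush(). With a 64-bit `unsigned long` the longest output is 20 digits plus a newline plus the terminating NUL, so 22 bytes are needed, and a bounded formatter never writes past the buffer even when the estimate is wrong.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Worst case for "%lu\n" with a 64-bit unsigned long:
 * 20 digits + '\n' + terminating NUL = 22 bytes. */
#define FLUSH_BUF_LEN 22

/*
 * Format value into buf without ever writing more than buflen bytes.
 * Returns the number of characters the full output needs (excluding the
 * NUL), so a return value >= buflen means the output was truncated.
 */
int format_flush(char *buf, size_t buflen, unsigned long value)
{
    assert(buf != NULL && buflen > 0);
    return snprintf(buf, buflen, "%lu\n", value);
}
```

A caller can compare the return value against the buffer size to detect truncation instead of silently overrunning the stack.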

02 Oct, 2012 4 commits

SUNRPC: Introduce rpc_clone_client_set_auth() · ba9b584c

Authored by Chuck Lever on Sep 14, 2012

A ULP is supposed to be able to replace a GSS rpc_auth object with
another GSS rpc_auth object using rpcauth_create().  However,
rpcauth_create() in 3.5 reliably fails with -EEXIST in this case.
This is because when gss_create() attempts to create the upcall pipes,
sometimes they are already there.  For example if a pipe FS mount
event occurs, or a previous GSS flavor was in use for this rpc_clnt.

It turns out that's not the only problem here.  While working on a
fix for the above problem, we noticed that replacing an rpc_clnt's
rpc_auth is not safe, since dereferencing the cl_auth field is not
protected in any way.

So we're deprecating the ability of rpcauth_create() to switch an
rpc_clnt's security flavor during normal operation.  Instead, let's
add a fresh API that clones an rpc_clnt and gives the clone a new
flavor before it's used.

This makes immediate use of the new __rpc_clone_client() helper.

This can be used in a similar fashion to rpcauth_create() when a
client is hunting for the correct security flavor.  Instead of
replacing an rpc_clnt's security flavor in a loop, the ULP replaces
the whole rpc_clnt.

To fix the -EEXIST problem, any ULP logic that relies on replacing
an rpc_clnt's rpc_auth with rpcauth_create() must be changed to use
this API instead.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

ba9b584c

SUNRPC: Refactor rpc_clone_client() · 1b63a751

Authored by Chuck Lever on Sep 14, 2012

rpc_clone_client() does most of the same tasks as rpc_new_client(),
so there is an opportunity for code re-use.  Create a generic helper
that makes it easy to clone an RPC client while replacing any of the
clnt's parameters.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

1b63a751

SUNRPC: Use __func__ in dprintk() in auth_gss.c · 632f0d05

Authored by Chuck Lever on Sep 14, 2012

Clean up: Some function names have changed, but debugging messages
were never updated.  Automate the construction of the function name
in debugging messages.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

632f0d05

SUNRPC: Clean up dprintk messages in rpc_pipe.c · d8af9bc1

Authored by Chuck Lever on Sep 14, 2012

Clean up: The blank space in front of the message must be spaces.
Tabs show up on the console as a graphical character.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

d8af9bc1

29 Sep, 2012 3 commits

SUNRPC: Limit the rpciod workqueue concurrency · 9b96ce71

Authored by Trond Myklebust on Sep 28, 2012

We shouldn't need more than 1 worker thread per cpu, since rpciod
is designed to run without sleeping in most cases.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

9b96ce71

SUNRPC: Get rid of the redundant xprt->shutdown bit field · d19751e7

Authored by Trond Myklebust on Sep 11, 2012

It is only set after everyone has dereferenced the transport,
and serves no useful purpose: setting it is racy, so all the
socket code, etc still needs to be able to cope with the cases
where they miss reading it.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

d19751e7

SUNRPC: Optimise away unnecessary data moves in xdr_align_pages · a11a2bf4

Authored by Trond Myklebust on Aug 02, 2012

We only have to call xdr_shrink_pagelen() if the remaining RPC
message does not fit in the page buffer length that we supplied
to xdr_align_pages().
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

a11a2bf4

27 Sep, 2012 1 commit

SUNRPC: Fix the return value of xdr_align_pages() · 8a9a8b83

Authored by Trond Myklebust on Aug 01, 2012

The callers of xdr_align_pages() expect it to return the number of bytes
of actual XDR data remaining in the pages.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

8a9a8b83

25 Sep, 2012 1 commit

SUNRPC: Set alloc_slot for backchannel tcp ops · 84e28a30

Authored by Bryan Schumaker on Sep 24, 2012

f39c1bfb (SUNRPC: Fix a UDP transport
regression) introduced the "alloc_slot" function for xprt operations,
but never created one for the backchannel operations.  This patch fixes
a null pointer dereference when mounting NFS over v4.1.

Call Trace:
 [<ffffffffa0207957>] ? xprt_reserve+0x47/0x50 [sunrpc]
 [<ffffffffa02023a4>] call_reserve+0x34/0x60 [sunrpc]
 [<ffffffffa020e280>] __rpc_execute+0x90/0x400 [sunrpc]
 [<ffffffffa020e61a>] rpc_async_schedule+0x2a/0x40 [sunrpc]
 [<ffffffff81073589>] process_one_work+0x139/0x500
 [<ffffffff81070e70>] ? alloc_worker+0x70/0x70
 [<ffffffffa020e5f0>] ? __rpc_execute+0x400/0x400 [sunrpc]
 [<ffffffff81073d1e>] worker_thread+0x15e/0x460
 [<ffffffff8145c839>] ? preempt_schedule+0x49/0x70
 [<ffffffff81073bc0>] ? rescuer_thread+0x230/0x230
 [<ffffffff81079603>] kthread+0x93/0xa0
 [<ffffffff81465d04>] kernel_thread_helper+0x4/0x10
 [<ffffffff81079570>] ? kthread_freezable_should_stop+0x70/0x70
 [<ffffffff81465d00>] ? gs_change+0x13/0x13
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

84e28a30

20 Sep, 2012 1 commit

SUNRPC: Ensure that the TCP socket is closed when in CLOSE_WAIT · a519fc7a

Authored by Trond Myklebust on Sep 12, 2012

Instead of doing a shutdown() call, we need to do an actual close().
Ditto if/when the server is sending us junk RPC headers.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Tested-by: Simon Kirby <sim@hostway.ca>
Cc: stable@vger.kernel.org

a519fc7a

10 Sep, 2012 1 commit

nfsd: remove unused listener-removal interfaces · eccf50c1

Authored by J. Bruce Fields on Aug 15, 2012

You can use nfsd/portlist to give nfsd additional sockets to listen on.
In theory you can also remove listening sockets this way.  But nobody's
ever done that as far as I can tell.

Also this was partially broken in 2.6.25, by
a217813f "knfsd: Support adding
transports by writing portlist file".

(Note that we decide whether to take the "delfd" case by checking for a
digit--but what's actually expected in that case is something made by
svc_one_sock_name(), which won't begin with a digit.)

So, let's just rip out this stuff.
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

eccf50c1

07 Sep, 2012 1 commit

SUNRPC: Fix a UDP transport regression · f39c1bfb

Authored by Trond Myklebust on Sep 07, 2012

Commit 43cedbf0 (SUNRPC: Ensure that
we grab the XPRT_LOCK before calling xprt_alloc_slot) is causing
hangs in the case of NFS over UDP mounts.

Since neither the UDP nor the RDMA transport mechanism uses dynamic slot
allocation, we can skip grabbing the socket lock for those transports.
Add a new rpc_xprt_op to allow switching between the TCP and UDP/RDMA
case.

Note that the NFSv4.1 back channel assigns the slot directly
through rpc_run_bc_task, so we can ignore that case.
Reported-by: Dick Streefland <dick.streefland@altium.nl>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org [>= 3.1]

f39c1bfb

22 Aug, 2012 12 commits

svcrpc: split up svc_handle_xprt · 65b2e665

Authored by J. Bruce Fields on Aug 18, 2012

Move initialization of newly accepted socket into a helper.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

65b2e665

svcrpc: break up svc_recv · 6797fa5a

Authored by J. Bruce Fields on Aug 18, 2012

Matter of taste, I suppose, but svc_recv breaks up naturally into:

	allocate pages and setup arg
	dequeue (wait for, if necessary) next socket
	do something with that socket

And I find it easier to read when it doesn't go on for pages and pages.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

6797fa5a

svcrpc: make svc_xprt_received static · 6741019c

Authored by J. Bruce Fields on Aug 17, 2012

Note this isn't used outside svc_xprt.c.

May as well move it so we don't need a declaration while we're here.

Also remove an outdated comment.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

6741019c

svcrpc: make xpo_recvfrom return only >=0 · 9f9d2ebe

Authored by J. Bruce Fields on Aug 17, 2012

The only errors returned from xpo_recvfrom have been -EAGAIN and
-EAFNOSUPPORT.  The latter was removed by a previous patch.  That leaves
only -EAGAIN, which is treated just like 0 by the caller (svc_recv).

So, just ditch -EAGAIN and return 0 instead.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

9f9d2ebe

svcrpc: don't bother checking bad svc_addr_len result · af6d5721

Authored by J. Bruce Fields on Aug 21, 2012

None of the callers should see an unsupported address family (only one
of them even bothers to check for that case), so just check for the
buggy case in svc_addr_len and don't bother elsewhere.
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

af6d5721

svcrpc: minor udp code cleanup · f23abfdb

Authored by J. Bruce Fields on Aug 17, 2012

Order the code in a more boring way.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

f23abfdb

svcrpc: share some setup of listening sockets · 39b55301

Authored by J. Bruce Fields on Aug 14, 2012

There's some duplicate code here.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

39b55301

svcrpc: make svc_create_xprt enqueue on clearing XPT_BUSY · c3341966

Authored by J. Bruce Fields on Aug 14, 2012

Whenever we clear XPT_BUSY we should call svc_xprt_enqueue().  Without
that we may fail to notice any events (such as new connections) that
arrived while XPT_BUSY was set.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

c3341966

workqueue: make deferrable delayed_work initializer names consistent · 203b42f7

Authored by Tejun Heo on Aug 21, 2012

Initializers for deferrable delayed_work are confused.

* __DEFERRED_WORK_INITIALIZER()
* DECLARE_DEFERRED_WORK()
* INIT_DELAYED_WORK_DEFERRABLE()

Rename them to

* __DEFERRABLE_WORK_INITIALIZER()
* DECLARE_DEFERRABLE_WORK()
* INIT_DEFERRABLE_WORK()

This patch doesn't cause any functional changes.
Signed-off-by: Tejun Heo <tj@kernel.org>

203b42f7

svcrpc: clean up control flow · a8e10078

Authored by J. Bruce Fields on Aug 13, 2012

Mainly, use the kernel standard

	err = -ERROR;
	if (something_bad)
		goto out;
	normal case;

rather than

	if (something_bad)
		err = -ERROR;
	else {
		normal case;
	}
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

a8e10078
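
The preferred shape from the commit above can be shown on a small, self-contained function. This is a userspace sketch, not code from the patch: `dup_nonempty` is a hypothetical example chosen only because it has two distinct failure cases.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: duplicate a non-empty string into *out. */
int dup_nonempty(const char *src, char **out)
{
    int err;
    char *copy;

    assert(out != NULL);   /* a NULL out is a programming error, not a runtime failure */

    /* Set the error first, then bail out if something is wrong. */
    err = -EINVAL;
    if (src == NULL || src[0] == '\0')
        goto out;

    err = -ENOMEM;
    copy = malloc(strlen(src) + 1);
    if (copy == NULL)
        goto out;

    /* The normal case stays at the top indentation level. */
    strcpy(copy, src);
    *out = copy;
    err = 0;
out:
    return err;
}
```

Each failure path is a two-line check, and the success path reads straight down without nesting inside an `else` branch.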

svcrpc: standardize svc_setup_socket return convention · 72c35376

Authored by J. Bruce Fields on Aug 13, 2012

Use the kernel-standard ptr-or-error return convention instead of
passing a pointer to the error.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

72c35376
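
The ptr-or-error convention mentioned in the commit above encodes a negative errno in the returned pointer, so no separate `int *err` out-parameter is needed. The sketch below imitates the kernel's ERR_PTR()/PTR_ERR()/IS_ERR() helpers in userspace; the lowercase helpers, `struct conn` and `conn_create` are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace imitations of the kernel helpers. The top MAX_ERRNO
 * addresses are reserved to encode negative errno values. */
#define MAX_ERRNO 4095

static inline void *err_ptr(long error) { return (void *)error; }
static inline long ptr_err(const void *ptr) { return (long)ptr; }
static inline int is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct conn { int fd; };   /* hypothetical object */

/* Return a new object, or an encoded errno on failure. */
struct conn *conn_create(int fd)
{
    struct conn *c;

    if (fd < 0)
        return err_ptr(-EBADF);
    c = malloc(sizeof(*c));
    if (c == NULL)
        return err_ptr(-ENOMEM);
    c->fd = fd;
    return c;
}
```

A caller checks `is_err()` on the result and, if it is set, recovers the errno with `ptr_err()`.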

svcrpc: fix xpt_list traversal locking on shutdown · 719f8bcc

Authored by J. Bruce Fields on Aug 13, 2012

Server threads are not running at this point, but svc_age_temp_xprts
still may be, so we need this locking.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

719f8bcc

21 Aug, 2012 3 commits

svcrpc: fix svc_xprt_enqueue/svc_recv busy-looping · d10f27a7

Authored by J. Bruce Fields on Aug 17, 2012

The rpc server tries to ensure that there will be room to send a reply
before it receives a request.

It does this by tracking, in xpt_reserved, an upper bound on the total
size of the replies that it has already committed to for the socket.

Currently it is adding in the estimate for a new reply *before* it
checks whether there is space available.  If it finds that there is not
space, it then subtracts the estimate back out.

This may lead the subsequent svc_xprt_enqueue to decide that there is
space after all.

The result is a svc_recv() that will repeatedly return -EAGAIN, causing
server threads to loop without doing any actual work.

Cc: stable@vger.kernel.org
Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Tested-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

d10f27a7
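
The accounting problem described in the commit above comes from adding the estimate before checking for space and subtracting it again on failure. One way to avoid that transient state is to check first and reserve only on success, as in the simplified sketch below. This is not the actual patch: `struct xprt` and `xprt_try_reserve` are hypothetical, the code is single-threaded, and real kernel code would need atomic updates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified transport; not the kernel's struct svc_xprt. */
struct xprt {
    long space;      /* send-buffer space currently available */
    long reserved;   /* upper bound on replies already committed to */
};

/*
 * Check first, reserve only on success. A failed attempt leaves
 * `reserved` untouched, so it cannot change a later has-space decision.
 */
bool xprt_try_reserve(struct xprt *x, long estimate)
{
    assert(x != NULL && estimate >= 0);
    if (x->reserved + estimate > x->space)
        return false;
    x->reserved += estimate;
    return true;
}
```

With 100 bytes of space and 80 already reserved, a 30-byte estimate is refused and `reserved` stays at 80, while a 20-byte estimate succeeds.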

svcrpc: sends on closed socket should stop immediately · f06f00a2

Authored by J. Bruce Fields on Aug 20, 2012

svc_tcp_sendto sets XPT_CLOSE if we fail to transmit the entire reply.
However, the XPT_CLOSE won't be acted on immediately.  Meanwhile other
threads could send further replies before the socket is really shut
down.  This can manifest as data corruption: for example, if a truncated
read reply is followed by another rpc reply, that second reply will look
to the client like further read data.

Symptoms were data corruption preceded by svc_tcp_sendto logging
something like

	kernel: rpc-srv/tcp: nfsd: sent only 963696 when sending 1048708 bytes - shutting down socket

Cc: stable@vger.kernel.org
Reported-by: Malahal Naineni <malahal@us.ibm.com>
Tested-by: Malahal Naineni <malahal@us.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

f06f00a2

svcrpc: fix BUG() in svc_tcp_clear_pages · be1e4444

Authored by J. Bruce Fields on Aug 09, 2012

Examination of svc_tcp_clear_pages shows that it assumes sk_tcplen is
consistent with sk_pages[] (in particular, sk_pages[n] can't be NULL if
sk_tcplen would lead us to expect n pages of data).

svc_tcp_restore_pages zeroes out sk_pages[] while leaving sk_tcplen.
This is OK, since both functions are serialized by XPT_BUSY.  However,
that means the inconsistency must be repaired before dropping XPT_BUSY.

Therefore we should be ensuring that svc_tcp_save_pages repairs the
problem before exiting svc_tcp_recv_record on error.

Symptoms were a BUG() in svc_tcp_clear_pages.

Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

be1e4444

01 Aug, 2012 1 commit

nfs: enable swap on NFS · a564b8f0

Authored by Mel Gorman on Jul 31, 2012

Implement the new swapfile a_ops for NFS and hook up ->direct_IO.  This
will set the NFS socket to SOCK_MEMALLOC and run socket reconnect under
PF_MEMALLOC as well as reset SOCK_MEMALLOC before engaging the protocol
->connect() method.

PF_MEMALLOC should allow the allocation of struct socket and related
objects and the early (re)setting of SOCK_MEMALLOC should allow us to
receive the packets required for the TCP connection buildup.

[jlayton@redhat.com: Restore PF_MEMALLOC task flags in all cases]
[dfeng@redhat.com: Fix handling of multiple swap files]
[a.p.zijlstra@chello.nl: Original patch]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric B Munson <emunson@mgebm.net>
Cc: Eric Paris <eparis@redhat.com>
Cc: James Morris <jmorris@namei.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Neil Brown <neilb@suse.de>
Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Xiaotian Feng <dfeng@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

a564b8f0

31 Jul, 2012 3 commits

SUNRPC: return negative value in case rpcbind client creation error · caea33da

Authored by Stanislav Kinsbursky on Jul 20, 2012

Without this patch, the kernel will panic on LockD start, because lockd_up() checks
the lockd_up_net() result for a negative value.
From my point of view, it's better to return a negative value from the rpcbind routines
instead of replacing all such checks, like the one in lockd_up().
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org [>= 3.0]

caea33da

nfs: skip commit in releasepage if we're freeing memory for fs-related reasons · 5cf02d09

Authored by Jeff Layton on Jul 23, 2012

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    #10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    #11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    #12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    #13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    #14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    #15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    #16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    #17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    #18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    #19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    #20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    #21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    #22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    #23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    #24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    #25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org

5cf02d09

sunrpc: clarify comments on rpc_make_runnable · 506026c3

Authored by Jeff Layton on Jul 23, 2012

rpc_make_runnable is not generally called with the queue lock held, unless
it's waking up a task that has been sitting on a waitqueue. This is safe
when the task has not entered the FSM yet, but the comments don't really
spell this out.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

506026c3

openanolis / cloud-kernel · synced successfully over 1 year ago
