Commits · 06ea0bfe6e6043cb56a78935a19f6f8ebc636226 · openanolis / cloud-kernel

11 Feb, 2014 · 1 commit

SUNRPC: Fix races in xs_nospace() · 06ea0bfe

Committed by Trond Myklebust on Feb 11, 2014

When a send failure occurs due to the socket being out of buffer space,
we call xs_nospace() in order to have the RPC task wait until the
socket has drained enough to make it worthwhile trying again.
The current patch fixes a race in which the socket is drained before
we get round to setting up the machinery in xs_nospace(), and which
is reported to cause hangs.

Link: http://lkml.kernel.org/r/20140210170315.33dfc621@notabene.brown
Fixes: a9a6b52e (SUNRPC: Don't start the retransmission timer...)
Reported-by: Neil Brown <neilb@suse.com>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

15 Jan, 2014 · 1 commit

net: replace macros net_random and net_srandom with direct calls to prandom · 63862b5b

Committed by Aruna-Hewapathirane on Jan 11, 2014

This patch removes the net_random and net_srandom macros and replaces
them with direct calls to the prandom ones. As new commits only seem to
use prandom_u32 there is no use to keep them around.
This change makes it easier to grep for users of prandom_u32.

Signed-off-by: Aruna-Hewapathirane <aruna.hewapathirane@gmail.com>
Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

01 Jan, 2014 · 2 commits

SUNRPC: Add tracepoint for socket errors · e8353c76

Committed by Trond Myklebust on Dec 31, 2013

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

SUNRPC: Report connection error values to rpc_tasks on the pending queue · 2118071d

Committed by Trond Myklebust on Dec 31, 2013

Currently we only report EAGAIN, which is not descriptive enough for
softconn tasks.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

11 Dec, 2013 · 1 commit

sunrpc: fix some typos · 28303ca3

Committed by Weng Meiling on Nov 30, 2013

Signed-off-by: Weng Meiling <wengmeiling.weng@huawei.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

13 Nov, 2013 · 1 commit

sunrpc: comment typo fix · f06c3d2b

Committed by J. Bruce Fields on Sep 17, 2013

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

09 Nov, 2013 · 1 commit

SUNRPC: Fix a data corruption issue when retransmitting RPC calls · a6b31d18

Committed by Trond Myklebust on Nov 08, 2013

The following scenario can cause silent data corruption when doing
NFS writes. It has mainly been observed when doing database writes
using O_DIRECT.

1) The RPC client uses sendpage() to do zero-copy of the page data.
2) Due to networking issues, the reply from the server is delayed,
   and so the RPC client times out.
3) The client issues a second sendpage of the page data as part of
   an RPC call retransmission.
4) The reply to the first transmission arrives from the server
   _before_ the client hardware has emptied the TCP socket send
   buffer.
5) After processing the reply, the RPC state machine rules the
   call to be done, and triggers the completion callbacks.
6) The application notices the RPC call is done, and reuses the
   pages to store something else (e.g. a new write).
7) The client NIC drains the TCP socket send buffer. Since the
   page data has now changed, it reads a corrupted version of the
   initial RPC call, and puts it on the wire.

This patch fixes the problem in the following manner:

The ordering guarantees of TCP ensure that when the server sends a
reply, then we know that the _first_ transmission has completed. Using
zero-copy in that situation is therefore safe.
If a time out occurs, we then send the retransmission using sendmsg()
(i.e. no zero-copy). We then know that the socket contains a full copy of
the data, and so it will retransmit a faithful reproduction even if the
RPC call completes, and the application reuses the O_DIRECT buffer in
the meantime.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
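
The failure mode and the fix described in this commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct queued_send`, `queue_send()` and `drain_to_wire()` are hypothetical names invented for the illustration and are not part of the SUNRPC code. The real patch chooses between sendpage() and sendmsg() inside the transport, which is not reproduced here.

```c
#include <stdbool.h>
#include <string.h>

#define PAGE_LEN 16

/* Hypothetical model of data sitting in a socket send buffer. */
struct queued_send {
    const char *page_ref;   /* zero-copy: points at the caller's page */
    char copy[PAGE_LEN];    /* copying send: private snapshot of the page */
    bool zero_copy;
};

/*
 * Queue a page for transmission. Follows the policy from the commit
 * message: the first transmission may reference the page directly,
 * while a retransmission always takes a private copy.
 */
static void queue_send(struct queued_send *q, const char *page,
                       bool is_retransmit)
{
    q->zero_copy = !is_retransmit;
    q->page_ref = page;
    if (is_retransmit)
        memcpy(q->copy, page, PAGE_LEN);
}

/* Model the NIC draining the send buffer onto the wire, possibly late. */
static void drain_to_wire(const struct queued_send *q, char *wire)
{
    memcpy(wire, q->zero_copy ? q->page_ref : q->copy, PAGE_LEN);
}
```

In this model, overwriting the page after a retransmission has been queued does not change what `drain_to_wire()` produces, whereas a zero-copy send reflects whatever the page holds at drain time.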

31 Oct, 2013 · 2 commits

SUNRPC: Cleanup xs_destroy() · a1311d87

Committed by Trond Myklebust on Oct 31, 2013

There is no longer any need for a separate xs_local_destroy() helper.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: close a rare race in xs_tcp_setup_socket. · 93dc41bd

Committed by NeilBrown on Oct 31, 2013

We have one report of a crash in xs_tcp_setup_socket.
The call path to the crash is:

  xs_tcp_setup_socket -> inet_stream_connect -> lock_sock_nested.

The 'sock' passed to that last function is NULL.

The only way I can see this happening is a concurrent call to
xs_close:

  xs_close -> xs_reset_transport -> sock_release -> inet_release

inet_release sets:
   sock->sk = NULL;
inet_stream_connect calls
   lock_sock(sock->sk);
which gets NULL.

All calls to xs_close are protected by XPRT_LOCKED as are most
activations of the workqueue which runs xs_tcp_setup_socket.
The exception is xs_tcp_schedule_linger_timeout.

So presumably the timeout queued by the latter fires exactly when some
other code runs xs_close().

To protect against this we can move the cancel_delayed_work_sync()
call from xs_destroy() to xs_close().

As xs_close is never called from the worker scheduled on
->connect_worker, this can never deadlock.

Signed-off-by: NeilBrown <neilb@suse.de>
[Trond: Make it safe to call cancel_delayed_work_sync() on AF_LOCAL sockets]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

29 Oct, 2013 · 1 commit

sunrpc: comment typo fix · e3bfab18

Committed by J. Bruce Fields on Oct 02, 2013

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

02 Oct, 2013 · 2 commits

SUNRPC: Only update the TCP connect cookie on a successful connect · 8b71798c

Committed by Trond Myklebust on Sep 26, 2013

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Enable the keepalive option for TCP sockets · 7f260e85

Committed by Trond Myklebust on Sep 24, 2013

For NFSv4 we want to avoid retransmitting RPC calls unless the TCP
connection breaks. However we still want to detect TCP connection
breakage as soon as possible. Do this by setting the keepalive option
with the idle timeout and count set to the 'timeo' and 'retrans' mount
options.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
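
As a rough user-space illustration of the socket options involved, the sketch below enables keepalive on a TCP socket with the standard Linux `setsockopt()` options. The helper name `enable_keepalive()` and its parameters are hypothetical. How the kernel derives its values from 'timeo' and 'retrans' is not spelled out in the commit message and is not reproduced here, and reusing the idle time as the probe interval is an assumption of this sketch.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: turn on TCP keepalive for fd (Linux options).
 * idle_secs - seconds of idleness before the first probe; also used
 *             here, as an assumption, for the interval between probes
 * probes    - unanswered probes before the connection is dropped
 * Returns 0 on success, -1 if any setsockopt() call fails.
 */
static int enable_keepalive(int fd, int idle_secs, int probes)
{
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE,
                   &idle_secs, sizeof(idle_secs)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL,
                   &idle_secs, sizeof(idle_secs)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT,
                   &probes, sizeof(probes)) < 0)
        return -1;
    return 0;
}
```

A caller would create a TCP socket with `socket(AF_INET, SOCK_STREAM, 0)` and pass the descriptor to the helper before connecting.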

05 Sep, 2013 · 1 commit

SUNRPC: Add tracepoints to help debug socket connection issues · 40b5ea0c

Committed by Trond Myklebust on Sep 04, 2013

Add client side debugging to help trace socket connection/disconnection
and unexpected state change issues.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

25 Jul, 2013 · 1 commit

net: add sk_stream_is_writeable() helper · 64dc6130

Committed by Eric Dumazet on Jul 22, 2013

Several call sites use the following hardcoded condition:

sk_stream_wspace(sk) >= sk_stream_min_wspace(sk)

Let's use a helper because TCP_NOTSENT_LOWAT support will change this
condition for TCP sockets.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
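
To make the condition concrete, here is a minimal user-space sketch of wrapping it in a single helper. `struct model_sock` and the `model_*` functions are hypothetical stand-ins, and the definitions of the two space calculations are simplifying assumptions of this sketch rather than quotations of the kernel code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the few socket fields used below. */
struct model_sock {
    int sndbuf;        /* send buffer limit, in bytes */
    int wmem_queued;   /* bytes currently queued for sending */
};

/* Free space left in the send buffer (assumed definition). */
static int model_stream_wspace(const struct model_sock *sk)
{
    return sk->sndbuf - sk->wmem_queued;
}

/* Minimum free space wanted before waking writers (assumed definition). */
static int model_stream_min_wspace(const struct model_sock *sk)
{
    return sk->wmem_queued >> 1;
}

/* The open-coded condition from the commit message, behind one helper. */
static bool model_stream_is_writeable(const struct model_sock *sk)
{
    return model_stream_wspace(sk) >= model_stream_min_wspace(sk);
}
```

Keeping the comparison in one place means a later change to the condition only has to touch the helper, not every call site.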

13 Jun, 2013 · 1 commit

net: Convert uses of typedef ctl_table to struct ctl_table · fe2c6338

Committed by Joe Perches on Jun 11, 2013

Reduce the uses of this unnecessary typedef.

Done via perl script:

$ git grep --name-only -w ctl_table net | \
  xargs perl -p -i -e '\
	sub trim { my ($local) = @_; $local =~ s/(^\s+|\s+$)//g; return $local; } \
        s/\b(?<!struct\s)ctl_table\b(\s*\*\s*|\s+\w+)/"struct ctl_table " . trim($1)/ge'

Reflow the modified lines that now exceed 80 columns.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

15 May, 2013 · 1 commit

sunrpc: server back channel needs no rpcbind method · 2fccbd9c

Committed by J. Bruce Fields on Sep 24, 2012

XPRT_BOUND is set on server backchannel xprts by xs_setup_bc_tcp()
(using xprt_set_bound()), and is never cleared, so ->rpcbind() will
never need to be called.

Reported-by: "Myklebust, Trond" <Trond.Myklebust@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

26 Apr, 2013 · 1 commit

SUNRPC: attempt AF_LOCAL connect on setup · 7073ea87

Committed by J. Bruce Fields on Feb 21, 2013

In the gss-proxy case, setup time is when I know I'll have the right
namespace for the connect.

In other cases, it might be useful to get any connection errors
earlier--though actually in practice it doesn't make any difference for
rpcbind.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

15 Apr, 2013 · 1 commit

SUNRPC: Allow rpc_create() to request that TCP slots be unlimited · b7993ceb

Committed by Trond Myklebust on Apr 14, 2013

This is mainly for use by NFSv4.1, where the session negotiation
ultimately wants to decide how many RPC slots we can fill.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

26 Mar, 2013 · 1 commit

SUNRPC: Report network/connection errors correctly for SOFTCONN rpc tasks · 3ed5e2a2

Committed by Trond Myklebust on Mar 04, 2013

In the case of a SOFTCONN rpc task, we really want to ensure that it
reports errors like ENETUNREACH back to the caller. Currently, only
some of these errors are being reported back (connect errors are not),
and they are being converted by the RPC layer into EIO.

Reported-by: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

10 Mar, 2013 · 1 commit

sunrpc: don't attempt to cancel unitialized work · 190b1ecf

Committed by J. Bruce Fields on Mar 08, 2013

As of dc107402 "SUNRPC: make AF_LOCAL connect synchronous", we no longer
initialize connect_worker in the AF_LOCAL case, resulting in warnings like:

    WARNING: at lib/debugobjects.c:261 debug_print_object+0x8c/0xb0()
    Hardware name: Bochs
    ODEBUG: assert_init not available (active state 0) object type: timer_list hint: stub_timer+0x0/0x20
    Modules linked in: iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi nfsd auth_rpcgss nfs_acl lockd sunrpc
    Pid: 4816, comm: nfsd Tainted: G        W    3.8.0-rc2-00049-gdc107402 #801
    Call Trace:
     [<ffffffff8156ec00>] ? free_obj_work+0x60/0xa0
     [<ffffffff81046aaf>] warn_slowpath_common+0x7f/0xc0
     [<ffffffff81046ba6>] warn_slowpath_fmt+0x46/0x50
     [<ffffffff8156eccc>] debug_print_object+0x8c/0xb0
     [<ffffffff81055030>] ? timer_debug_hint+0x10/0x10
     [<ffffffff8156f7e3>] debug_object_assert_init+0xe3/0x120
     [<ffffffff81057ebb>] del_timer+0x2b/0x80
     [<ffffffff8109c4e6>] ? mark_held_locks+0x86/0x110
     [<ffffffff81065a29>] try_to_grab_pending+0xd9/0x150
     [<ffffffff81065b57>] __cancel_work_timer+0x27/0xc0
     [<ffffffff81065c03>] cancel_delayed_work_sync+0x13/0x20
     [<ffffffffa0007067>] xs_destroy+0x27/0x80 [sunrpc]
     [<ffffffffa00040d8>] xprt_destroy+0x78/0xa0 [sunrpc]
     [<ffffffffa0006241>] xprt_put+0x21/0x30 [sunrpc]
     [<ffffffffa00030cf>] rpc_free_client+0x10f/0x1a0 [sunrpc]
     [<ffffffffa0002ff3>] ? rpc_free_client+0x33/0x1a0 [sunrpc]
     [<ffffffffa0002f7e>] rpc_release_client+0x6e/0xb0 [sunrpc]
     [<ffffffffa000325d>] rpc_shutdown_client+0xfd/0x1b0 [sunrpc]
     [<ffffffffa0017196>] rpcb_put_local+0x106/0x130 [sunrpc]
    ...

Acked-by: "Myklebust, Trond" <Trond.Myklebust@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

01 Mar, 2013 · 1 commit

SUNRPC: make AF_LOCAL connect synchronous · dc107402

Committed by J. Bruce Fields on Feb 20, 2013

It doesn't appear that anyone actually needs to connect asynchronously.

Also, using a workqueue for the connect means we lose the namespace
information from the original process.  This is a problem since there's
no way to explicitly pass in a filesystem namespace for resolution of an
AF_LOCAL address.

Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

05 Feb, 2013 · 1 commit

sunrpc: move address copy/cmp/convert routines and prototypes from clnt.h to addr.h · 5976687a

Committed by Jeff Layton on Feb 04, 2013

These routines are used by server and client code, so having them in a
separate header would be best.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

01 Feb, 2013 · 4 commits

SUNRPC: Pass pointers to struct rpc_xprt to the congestion window · 6a24dfb6

Committed by Trond Myklebust on Jan 08, 2013

Avoid access to task->tk_xprt

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Fix an RCU dereference in xs_local_rpcbind · 3dc0da27

Committed by Trond Myklebust on Jan 08, 2013

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Pass a pointer to struct rpc_xprt to the connect callback · 1b092092

Committed by Trond Myklebust on Jan 08, 2013

Avoid another RCU dereference by passing the pointer to struct rpc_xprt
from the caller.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Eliminate task->tk_xprt accesses that bypass rcu_dereference() · a4f0835c

Committed by Trond Myklebust on Jan 08, 2013

tk_xprt is just a shortcut for tk_client->cl_xprt, however cl_xprt is
defined as an __rcu variable. Replace dereferences of tk_xprt with
non-rcu dereferences where it is safe to do so.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

16 Dec, 2012 · 2 commits

SUNRPC: variable 'svsk' is unused in function bc_send_request · 1efc2878

Committed by Trond Myklebust on Dec 15, 2012

Silence a compile time warning.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Handle ECONNREFUSED in xs_local_setup_socket · 4a20a988

Committed by Trond Myklebust on Dec 15, 2012

Silence the unnecessary warning "unhandled error (111) connecting to..."
and convert it to a dprintk for debugging purposes.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

05 Nov, 2012 · 2 commits

SUNRPC: remove BUG_ON from bc_malloc · b8a13d03

Committed by Weston Andros Adamson on Oct 23, 2012

Replace BUG_ON() with WARN_ON_ONCE() and NULL return - the caller will handle
this like a memory allocation failure.

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
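
The pattern described in this commit message, warning once and reporting failure instead of halting, can be sketched in user space as follows. The function `model_bc_malloc()`, the `MODEL_MAX_ALLOC` limit and the static flag standing in for WARN_ON_ONCE() are all hypothetical and are not taken from the kernel source.

```c
#include <stdio.h>
#include <stdlib.h>

#define MODEL_MAX_ALLOC 4096   /* hypothetical size limit for the sketch */

/*
 * Hypothetical allocator illustrating "warn once and fail" instead of
 * aborting. An oversized request is reported as an allocation failure.
 */
static void *model_bc_malloc(size_t size)
{
    static int warned;   /* user-space stand-in for WARN_ON_ONCE() */

    if (size > MODEL_MAX_ALLOC) {
        if (!warned) {
            warned = 1;
            fprintf(stderr,
                    "model_bc_malloc: oversized request (%zu bytes)\n",
                    size);
        }
        return NULL;     /* caller treats this like out-of-memory */
    }
    return malloc(size);
}
```

Because callers already have to cope with a NULL return from an allocator, the sanity check no longer needs to bring the whole system down.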

SUNRPC: remove BUG_ONs from *_reclassify_socket* · 1b7a1819

Committed by Weston Andros Adamson on Oct 23, 2012

Replace multiple BUG_ON() calls with WARN_ON_ONCE() and early return when
sanity checking socket ownership (lock). The bind call will fail if the
socket was unsuccessfully reclassified.

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

24 Oct, 2012 · 4 commits

SUNRPC: Get rid of the xs_error_report socket callback · f878b657

Committed by Trond Myklebust on Oct 22, 2012

Chris Perl reports that we're seeing races between the wakeup call in
xs_error_report and the connect attempts. Basically, Chris has shown
that in certain circumstances, the call to xs_error_report causes the
rpc_task that is responsible for reconnecting to wake up early, thus
triggering a disconnect and retry.

Since the sk->sk_error_report() calls in the socket layer are always
followed by a tcp_done() in the cases where we care about waking up
the rpc_tasks, just let the state_change callbacks take responsibility
for those wake ups.

Reported-by: Chris Perl <chris.perl@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
Tested-by: Chris Perl <chris.perl@gmail.com>

SUNRPC: Prevent races in xs_abort_connection() · 4bc1e68e

Committed by Trond Myklebust on Oct 23, 2012

The call to xprt_disconnect_done() that is triggered by a successful
connection reset will trigger another automatic wakeup of all tasks
on the xprt->pending rpc_wait_queue. In particular it will cause an
early wake up of the task that called xprt_connect().

All we really want to do here is clear all the socket-specific state
flags, so we split that functionality out of xs_sock_mark_closed()
into a helper that can be called by xs_abort_connection().

Reported-by: Chris Perl <chris.perl@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
Tested-by: Chris Perl <chris.perl@gmail.com>

Revert "SUNRPC: Ensure we close the socket on EPIPE errors too..." · b9d2bb2e

Committed by Trond Myklebust on Oct 23, 2012

This reverts commit 55420c24.
Now that we clear the connected flag when entering TCP_CLOSE_WAIT,
the deadlock described in this commit is no longer possible.
Instead, the resulting call to xs_tcp_shutdown() can interfere
with pending reconnection attempts.

Reported-by: Chris Perl <chris.perl@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
Tested-by: Chris Perl <chris.perl@gmail.com>

SUNRPC: Clear the connect flag when socket state is TCP_CLOSE_WAIT · d0bea455

Committed by Trond Myklebust on Oct 23, 2012

This is needed to ensure that we call xprt_connect() upon the next
call to call_connect().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
Tested-by: Chris Perl <chris.perl@gmail.com>

29 Sep, 2012 · 1 commit

SUNRPC: Get rid of the redundant xprt->shutdown bit field · d19751e7

Committed by Trond Myklebust on Sep 11, 2012

It is only set after everyone has dereferenced the transport,
and serves no useful purpose: setting it is racy, so all the
socket code, etc still needs to be able to cope with the cases
where they miss reading it.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

25 Sep, 2012 · 1 commit

SUNRPC: Set alloc_slot for backchannel tcp ops · 84e28a30

Committed by Bryan Schumaker on Sep 24, 2012

f39c1bfb (SUNRPC: Fix a UDP transport regression) introduced the
"alloc_slot" function for xprt operations, but never created one for the
backchannel operations.  This patch fixes a null pointer dereference when
mounting NFS over v4.1.

Call Trace:
 [<ffffffffa0207957>] ? xprt_reserve+0x47/0x50 [sunrpc]
 [<ffffffffa02023a4>] call_reserve+0x34/0x60 [sunrpc]
 [<ffffffffa020e280>] __rpc_execute+0x90/0x400 [sunrpc]
 [<ffffffffa020e61a>] rpc_async_schedule+0x2a/0x40 [sunrpc]
 [<ffffffff81073589>] process_one_work+0x139/0x500
 [<ffffffff81070e70>] ? alloc_worker+0x70/0x70
 [<ffffffffa020e5f0>] ? __rpc_execute+0x400/0x400 [sunrpc]
 [<ffffffff81073d1e>] worker_thread+0x15e/0x460
 [<ffffffff8145c839>] ? preempt_schedule+0x49/0x70
 [<ffffffff81073bc0>] ? rescuer_thread+0x230/0x230
 [<ffffffff81079603>] kthread+0x93/0xa0
 [<ffffffff81465d04>] kernel_thread_helper+0x4/0x10
 [<ffffffff81079570>] ? kthread_freezable_should_stop+0x70/0x70
 [<ffffffff81465d00>] ? gs_change+0x13/0x13

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

20 Sep, 2012 · 1 commit

SUNRPC: Ensure that the TCP socket is closed when in CLOSE_WAIT · a519fc7a

Committed by Trond Myklebust on Sep 12, 2012

Instead of doing a shutdown() call, we need to do an actual close().
Ditto if/when the server is sending us junk RPC headers.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Tested-by: Simon Kirby <sim@hostway.ca>
Cc: stable@vger.kernel.org

07 Sep, 2012 · 1 commit

SUNRPC: Fix a UDP transport regression · f39c1bfb

Committed by Trond Myklebust on Sep 07, 2012

Commit 43cedbf0 (SUNRPC: Ensure that we grab the XPRT_LOCK before
calling xprt_alloc_slot) is causing hangs in the case of NFS over UDP
mounts.

Since neither the UDP nor the RDMA transport mechanism uses dynamic slot
allocation, we can skip grabbing the socket lock for those transports.
Add a new rpc_xprt_op to allow switching between the TCP and UDP/RDMA
case.

Note that the NFSv4.1 back channel assigns the slot directly
through rpc_run_bc_task, so we can ignore that case.

Reported-by: Dick Streefland <dick.streefland@altium.nl>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org [>= 3.1]

01 Aug, 2012 · 1 commit

nfs: enable swap on NFS · a564b8f0

Committed by Mel Gorman on Jul 31, 2012

Implement the new swapfile a_ops for NFS and hook up ->direct_IO.  This
will set the NFS socket to SOCK_MEMALLOC and run socket reconnect under
PF_MEMALLOC as well as reset SOCK_MEMALLOC before engaging the protocol
->connect() method.

PF_MEMALLOC should allow the allocation of struct socket and related
objects and the early (re)setting of SOCK_MEMALLOC should allow us to
receive the packets required for the TCP connection buildup.

[jlayton@redhat.com: Restore PF_MEMALLOC task flags in all cases]
[dfeng@redhat.com: Fix handling of multiple swap files]
[a.p.zijlstra@chello.nl: Original patch]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric B Munson <emunson@mgebm.net>
Cc: Eric Paris <eparis@redhat.com>
Cc: James Morris <jmorris@namei.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Neil Brown <neilb@suse.de>
Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Xiaotian Feng <dfeng@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

31 Jul, 2012 · 1 commit

nfs: skip commit in releasepage if we're freeing memory for fs-related reasons · 5cf02d09

Committed by Jeff Layton on Jul 23, 2012

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    #10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    #11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    #12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    #13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    #14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    #15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    #16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    #17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    #18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    #19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    #20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    #21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    #22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    #23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    #24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    #25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
