Commits · 917937025a955e239e5cdcc62b6ca9a5ef9e5e48 · openeuler / raspberrypi-kernel

26 Nov, 2014 7 commits

xprtrdma: Display async errors · 7ff11de1

Committed by Chuck Lever on Nov 08, 2014

An async error upcall is a hard error, and should be reported in
the system log.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Enable pad optimization · d5440e27

Committed by Chuck Lever on Nov 08, 2014

The Linux NFS/RDMA server used to reject NFSv3 WRITE requests when
pad optimization was enabled. That bug was fixed by commit
e560e3b5 ("svcrdma: Add zero padding if the client doesn't send
it").

We can now enable pad optimization on the client, which helps
performance and is supported now by both Linux and Solaris servers.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Re-write rpcrdma_flush_cqs() · 5c166bef

Committed by Chuck Lever on Nov 08, 2014

Currently rpcrdma_flush_cqs() attempts to avoid code duplication,
and simply invokes rpcrdma_recvcq_upcall and rpcrdma_sendcq_upcall.
This has two problems:

1. rpcrdma_flush_cqs() can run concurrently with provider upcalls.
   Both flush_cqs() and the upcalls were invoking ib_poll_cq() in
   different threads using the same wc buffers (ep->rep_recv_wcs
   and ep->rep_send_wcs), added by commit 1c00dd07 ("xprtrmda:
   Reduce calls to ib_poll_cq() in completion handlers").

   During transport disconnect processing, this sometimes resulted
   in the same reply getting added to the rpcrdma_tasklets_g list
   more than once, which corrupted the list.

2. The upcall functions drain only a limited number of CQEs,
   thanks to the poll budget added by commit 8301a2c0
   ("xprtrdma: Limit work done by completion handler").
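One way to avoid both problems is to drain each CQ with a private, on-stack work-completion buffer and without a poll budget. The following is a minimal user-space sketch of that idea, not the kernel code: `sk_cq`, `sk_wc`, `sk_poll_cq()` and `sk_flush_cq()` are hypothetical stand-ins for the verbs objects and for ib_poll_cq().

```c
/* Hypothetical stand-ins for a completion queue and a work completion. */
struct sk_cq { int pending; };
struct sk_wc { int id; };

/* Mimics the shape of ib_poll_cq(): returns how many entries were reaped. */
static int sk_poll_cq(struct sk_cq *cq, int num, struct sk_wc *wc)
{
	int n = 0;

	while (n < num && cq->pending > 0) {
		wc[n].id = cq->pending--;
		n++;
	}
	return n;
}

/*
 * Drain the whole queue. The wc buffer lives on this caller's stack, so
 * it cannot be shared with a concurrently running completion upcall,
 * and no budget stops the loop before the queue is empty.
 */
static int sk_flush_cq(struct sk_cq *cq)
{
	struct sk_wc wc;
	int drained = 0;

	while (sk_poll_cq(cq, 1, &wc) > 0)
		drained++;
	return drained;
}
```

Polling one entry at a time is slower than batching, but a flush only runs during disconnect, where simplicity matters more than throughput.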

Fixes: a7bc211a ("xprtrdma: On disconnect, don't ignore ... ")
BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=276
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Refactor tasklet scheduling · f1a03b76

Committed by Chuck Lever on Nov 08, 2014

Restore the separate function that schedules the reply handling
tasklet. I need to call it from two different paths.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: unmap all FMRs during transport disconnect · 467c9674

Committed by Chuck Lever on Nov 08, 2014

When using RPCRDMA_MTHCAFMR memory registration, after a few
transport disconnect / reconnect cycles, ib_map_phys_fmr() starts to
return EINVAL because the provider has exhausted its map pool.

Make sure that all FMRs are unmapped during transport disconnect,
and that ->send_request remarshals them during an RPC retransmit.
This resets the transport's MRs to ensure that none are leaked
during a disconnect.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Cap req_cqinit · e7104a2a

Committed by Chuck Lever on Nov 08, 2014

Recent work made FRMR registration and invalidation completions
unsignaled. This greatly reduces the adapter interrupt rate.

Every so often, however, a posted send Work Request is allowed to
signal. Otherwise, the provider's Work Queue will wrap and the
workload will hang.

The number of Work Requests that are allowed to remain unsignaled is
determined by the value of req_cqinit. Currently, this is set to the
size of the send Work Queue divided by two, minus 1.

For FRMR, the send Work Queue is the maximum number of concurrent
RPCs (currently 32) times the maximum number of Work Requests an
RPC might use (currently 7, though some adapters may need more).

For mlx4, this is 224 entries. This leaves completion signaling
disabled for 111 send Work Requests.

Some providers hold back dispatching Work Requests until a CQE is
generated.  If completions are disabled, then no CQEs are generated
for quite some time, and that can stall the Work Queue.

I've seen this occur running xfstests generic/113 over NFSv4, where
eventually, posting a FAST_REG_MR Work Request fails with -ENOMEM
because the Work Queue has overflowed. The connection is dropped
and re-established.

Cap the rep_cqinit setting so completions are not left turned off
for too long.
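The arithmetic above can be checked with a small sketch. The cap value `SK_MAX_UNSIGNALED_SENDS` and both helper names are assumptions made for illustration; the commit message does not say which limit is chosen.

```c
/* Hypothetical cap: the commit message does not state the actual value. */
#define SK_MAX_UNSIGNALED_SENDS 32u

/* Current behaviour: half the send Work Queue size, minus one. */
static unsigned int sk_cqinit_uncapped(unsigned int max_send_wr)
{
	if (max_send_wr < 2)
		return 0;	/* avoid unsigned wrap-around on tiny queues */
	return max_send_wr / 2 - 1;
}

/* Proposed behaviour: the same value, but never above the cap. */
static unsigned int sk_cqinit_capped(unsigned int max_send_wr)
{
	unsigned int n = sk_cqinit_uncapped(max_send_wr);

	return n > SK_MAX_UNSIGNALED_SENDS ? SK_MAX_UNSIGNALED_SENDS : n;
}
```

With 32 concurrent RPCs and 7 Work Requests each, the queue holds 224 entries and the uncapped value is 111, matching the mlx4 figures quoted above.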

BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=269
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Return an errno from rpcrdma_register_external() · 92b98361

Committed by Chuck Lever on Nov 08, 2014

The RPC/RDMA send_request method and the chunk registration code
expect an errno from the registration function. This allows
the upper layers to distinguish between a recoverable failure
(for example, temporary memory exhaustion) and a hard failure
(for example, a bug in the registration logic).
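As a rough illustration of how a caller might use such an errno, the sketch below sorts return codes into retryable and fatal. The function name and the choice of which codes count as recoverable are assumptions, not taken from the patch.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical classifier: treat resource shortages as recoverable and
 * everything else as a hard failure. The real upper layers may draw the
 * line differently.
 */
static bool sk_registration_error_is_retryable(int rc)
{
	switch (rc) {
	case -ENOMEM:	/* temporary memory exhaustion */
	case -EAGAIN:
		return true;
	default:	/* for example -EIO from a logic bug */
		return false;
	}
}
```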
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

25 Nov, 2014 1 commit

sunrpc: eliminate RPC_DEBUG · f895b252

Committed by Jeff Layton on Nov 17, 2014

It's always set to whatever CONFIG_SUNRPC_DEBUG is, so just use that.
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

30 Sep, 2014 1 commit

svcrdma: advertise the correct max payload · 7e5be288

Committed by Steve Wise on Sep 23, 2014

Svcrdma currently advertises 1MB, which is too large. The correct value
is the minimum of RPCSVC_MAXPAYLOAD and the max scatter-gather allowed
in an NFSRDMA IO chunk * the host page size. This bug is usually benign
because the Linux X64 NFSRDMA client correctly limits the payload size to
the correct value (64*4096 = 256KB). But if the Linux client is PPC64
with a 64KB page size, then the client will indeed use a payload size
that will overflow the server.
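The calculation described above can be sketched as follows. `SK_RPCSVC_MAXPAYLOAD` and `SK_MAX_SGE` are illustrative constants taken from the numbers in the message (1MB and 64); the real kernel symbols and where the limit is applied are not shown here.

```c
#include <stddef.h>

#define SK_RPCSVC_MAXPAYLOAD (1024u * 1024u)	/* 1MB, as in the message */
#define SK_MAX_SGE 64u				/* from the 64*4096 example */

/* Advertise the smaller of the RPC limit and what one IO chunk can map. */
static size_t sk_svc_max_payload(size_t page_size)
{
	size_t by_sge = (size_t)SK_MAX_SGE * page_size;

	return by_sge < SK_RPCSVC_MAXPAYLOAD ? by_sge : SK_RPCSVC_MAXPAYLOAD;
}
```

With 4KB pages the result is 256KB, so a server on such a host should not advertise the full 1MB.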
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

25 Sep, 2014 1 commit

NFS/SUNRPC: Remove other deadlock-avoidance mechanisms in nfs_release_page() · 1aff5256

Committed by NeilBrown on Sep 24, 2014

Now that nfs_release_page() doesn't block indefinitely, other deadlock
avoidance mechanisms aren't needed.
 - it doesn't hurt for kswapd to block occasionally.  If it doesn't
   want to block it would clear __GFP_WAIT.  The current_is_kswapd()
   was only added to avoid deadlocks and we have a new approach for
   that.
 - memory allocation in the SUNRPC layer can very rarely try to
   ->releasepage() a page it is trying to handle.  The deadlock
   is removed as nfs_release_page() doesn't block indefinitely.

So we don't need to set PF_FSTRANS for sunrpc network operations any
more.
Signed-off-by: NeilBrown <neilb@suse.de>
Acked-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

06 Aug, 2014 1 commit

svcrdma: remove rdma_create_qp() failure recovery logic · d1e458fe

Committed by Steve Wise on Jul 31, 2014

In svc_rdma_accept(), if rdma_create_qp() fails, there is useless
logic to try and call rdma_create_qp() again with reduced sge depths.
The assumption, I guess, was that perhaps the initial sge depths
chosen were too big.  However, the initial depths are selected based
on the rdma device attribute max_sge returned from ib_query_device().
If rdma_create_qp() fails, it would not be because the max_send_sge and
max_recv_sge values passed in exceed the device's max.  So just remove
this code.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

01 Aug, 2014 21 commits

xprtrdma: Handle additional connection events · 8079fb78

Committed by Chuck Lever on Jul 29, 2014

Commit 38ca83a5 added RDMA_CM_EVENT_TIMEWAIT_EXIT. But that status
is relevant only for consumers that re-use their QPs on new
connections. xprtrdma creates a fresh QP on reconnection, so that
event should be explicitly ignored.

Squelch the alarming "unexpected CM event" message.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Remove RPCRDMA_PERSISTENT_REGISTRATION macro · a779ca5f

Committed by Chuck Lever on Jul 29, 2014

Clean up.

RPCRDMA_PERSISTENT_REGISTRATION was a compile-time switch between
RPCRDMA_REGISTER mode and RPCRDMA_ALLPHYSICAL mode.  Since
RPCRDMA_REGISTER has been removed, there's no need for the extra
conditional compilation.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Make rpcrdma_ep_disconnect() return void · 282191cb

Committed by Chuck Lever on Jul 29, 2014

Clean up: The return code is used only for dprintk's that are
already redundant.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Schedule reply tasklet once per upcall · bb96193d

Committed by Chuck Lever on Jul 29, 2014

Minor optimization: grab rpcrdma_tk_lock_g and disable hard IRQs
just once after clearing the receive completion queue.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Allocate each struct rpcrdma_mw separately · 2e84522c

Committed by Chuck Lever on Jul 29, 2014

Currently rpcrdma_buffer_create() allocates struct rpcrdma_mw's as
a single contiguous area of memory. It amounts to quite a bit of
memory, and there's no requirement for these to be carved from a
single piece of contiguous memory.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Rename frmr_wr · f590e878

Committed by Chuck Lever on Jul 29, 2014

Clean up: Name frmr_wr after the opcode of the Work Request,
consistent with the send and local invalidation paths.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Disable completions for LOCAL_INV Work Requests · dab7e3b8

Committed by Chuck Lever on Jul 29, 2014

Instead of relying on a completion to change the state of an FRMR
to FRMR_IS_INVALID, set it in advance. If an error occurs, a completion
will fire anyway and mark the FRMR FRMR_IS_STALE.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Disable completions for FAST_REG_MR Work Requests · 05055722

Committed by Chuck Lever on Jul 29, 2014

Instead of relying on a completion to change the state of an FRMR
to FRMR_IS_VALID, set it in advance. If an error occurs, a completion
will fire anyway and mark the FRMR FRMR_IS_STALE.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Don't post a LOCAL_INV in rpcrdma_register_frmr_external() · 440ddad5

Committed by Chuck Lever on Jul 29, 2014

Any FRMR arriving in rpcrdma_register_frmr_external() is now
guaranteed to be either invalid, or to be targeted by a queued
LOCAL_INV that will invalidate it before the adapter processes
the FAST_REG_MR being built here.

The problem with the current arrangement of chaining a LOCAL_INV to the
FAST_REG_MR is that if the transport is not connected, both the
LOCAL_INV and the FAST_REG_MR are flushed. This leaves the FRMR
valid with the old rkey. But rpcrdma_register_frmr_external() has
already bumped the in-memory rkey.

Next time through rpcrdma_register_frmr_external(), a LOCAL_INV and
FAST_REG_MR are attempted again because the FRMR is still valid. But
the rkey no longer matches the hardware's rkey, and a memory
management operation error occurs.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Reset FRMRs after a flushed LOCAL_INV Work Request · ddb6bebc

Committed by Chuck Lever on Jul 29, 2014

When a LOCAL_INV Work Request is flushed, it leaves an FRMR in the
VALID state. This FRMR can be returned by rpcrdma_buffer_get(), and
must be knocked down in rpcrdma_register_frmr_external() before it
can be re-used.

Instead, capture these in rpcrdma_buffer_get(), and reset them.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Reset FRMRs when FAST_REG_MR is flushed by a disconnect · 9f9d802a

Committed by Chuck Lever on Jul 29, 2014

FAST_REG_MR Work Requests update a Memory Region's rkey. Rkeys are
used to block unwanted access to the memory controlled by an MR. The
rkey is passed to the receiver (the NFS server, in our case), and is
also used by xprtrdma to invalidate the MR when the RPC is complete.

When a FAST_REG_MR Work Request is flushed after a transport
disconnect, xprtrdma cannot tell whether the WR actually hit the
adapter or not. So it is indeterminate at that point whether the
existing rkey is still valid.

After the transport connection is re-established, the next
FAST_REG_MR or LOCAL_INV Work Request against that MR can sometimes
fail because the rkey value does not match what xprtrdma expects.

The only reliable way to recover in this case is to deregister and
register the MR before it is used again. These operations can be
done only in a process context, so handle it in the transport
connect worker.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Properly handle exhaustion of the rb_mws list · c2922c02

Committed by Chuck Lever on Jul 29, 2014

If the rb_mws list is exhausted, clean up and return NULL so that
call_allocate() will delay and try again.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Chain together all MWs in same buffer pool · 3111d72c

Committed by Chuck Lever on Jul 29, 2014

During connection loss recovery, xprtrdma needs to visit every MW in a
buffer pool. Any MW that is in use by an RPC will not be on the
rb_mws list.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Back off rkey when FAST_REG_MR fails · c93e986a

Committed by Chuck Lever on Jul 29, 2014

If posting a FAST_REG_MR Work Request fails, revert the rkey update
to avoid subsequent IB_WC_MW_BIND_ERR completions.
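A minimal sketch of the back-off, assuming the consumer-owned part of the rkey is its low-order 8 bits. All names here are hypothetical, and `post` stands in for whatever posts the Work Request; this is not the driver code.

```c
#include <stdint.h>

/* Replace the low-order 8 bits of an rkey with a new key value. */
static uint32_t sk_rkey_with_key(uint32_t rkey, uint8_t key)
{
	return (rkey & 0xffffff00u) | key;
}

/* Stub posting functions used only to exercise both outcomes. */
static int sk_post_ok(uint32_t rkey)   { (void)rkey; return 0; }
static int sk_post_fail(uint32_t rkey) { (void)rkey; return -1; }

/*
 * Bump the in-memory rkey, try to post, and restore the old rkey if the
 * post fails, so memory and hardware stay in agreement.
 */
static int sk_register_mr(uint32_t *rkey, int (*post)(uint32_t))
{
	uint32_t old = *rkey;
	uint8_t key = (uint8_t)(old & 0xffu);
	int rc;

	*rkey = sk_rkey_with_key(old, (uint8_t)(key + 1));
	rc = post(*rkey);
	if (rc)
		*rkey = old;	/* back off the update */
	return rc;
}
```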
Suggested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Unclutter struct rpcrdma_mr_seg · 0dbb4108

Committed by Chuck Lever on Jul 29, 2014

Clean ups:
 - make it obvious that the rl_mw field is a pointer -- allocated
   separately, not as part of struct rpcrdma_mr_seg
 - promote "struct {} frmr;" to a named type
 - promote the state enum to a named type
 - name the MW state field the same way other fields in
   rpcrdma_mw are named
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Don't invalidate FRMRs if registration fails · 539431a4

Committed by Chuck Lever on Jul 29, 2014

If FRMR registration fails, it's likely to transition the QP to the
error state. Or, registration may have failed because the QP is
_already_ in ERROR.

Thus calling rpcrdma_deregister_external() in
rpcrdma_create_chunks() is useless in FRMR mode: the LOCAL_INVs just
get flushed.

It is safe to leave existing registrations: when FRMR registration
is tried again, rpcrdma_register_frmr_external() checks if each FRMR
is already/still VALID, and knocks it down first if it is.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: On disconnect, don't ignore pending CQEs · a7bc211a

Committed by Chuck Lever on Jul 29, 2014

xprtrdma is currently throwing away queued completions during
a reconnect. RPC replies posted just before connection loss, or
successful completions that change the state of an FRMR, can be
missed.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Update rkeys after transport reconnect · 6ab59945

Committed by Chuck Lever on Jul 29, 2014

Various reports of:

  rpcrdma_qp_async_error_upcall: QP error 3 on device mlx4_0
		ep ffff8800bfd3e848

Ensure that rkeys in already-marshalled RPC/RDMA headers are
refreshed after the QP has been replaced by a reconnect.

BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=249
Suggested-by: Selvin Xavier <Selvin.Xavier@Emulex.Com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Limit data payload size for ALLPHYSICAL · 43e95988

Committed by Chuck Lever on Jul 29, 2014

When the client uses physical memory registration, each page in the
payload gets its own array entry in the RPC/RDMA header's chunk list.

Therefore, don't advertise a maximum payload size that would require
more array entries than can fit in the RPC buffer where RPC/RDMA
headers are built.
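The limit can be pictured with a small calculation. Every parameter below (buffer size, fixed header size, per-entry size) is a hypothetical input chosen for illustration; the real sizes come from the RPC/RDMA header layout and the transport's buffer settings.

```c
#include <stddef.h>

/*
 * One chunk-list entry is needed per page, so the largest payload is
 * the number of entries that fit in the header buffer times the page
 * size.
 */
static size_t sk_allphysical_max_payload(size_t hdr_buf_bytes,
					 size_t fixed_hdr_bytes,
					 size_t entry_bytes,
					 size_t page_size)
{
	if (entry_bytes == 0 || hdr_buf_bytes <= fixed_hdr_bytes)
		return 0;
	return (hdr_buf_bytes - fixed_hdr_bytes) / entry_bytes * page_size;
}
```

For example, a 1024-byte buffer with 64 bytes of fixed header and 16-byte entries holds 60 entries, which limits the payload to 60 pages.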

BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=248
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Protect ia->ri_id when unmapping/invalidating MRs · 73806c88

Committed by Chuck Lever on Jul 29, 2014

Ensure ia->ri_id remains valid while invoking dma_unmap_page() or
posting LOCAL_INV during a transport reconnect. Otherwise,
ia->ri_id->device or ia->ri_id->qp is NULL, which triggers a panic.

BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=259
Fixes: ec62f40d 'xprtrdma: Ensure ia->ri_id->qp is not NULL when reconnecting'
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Fix panic in rpcrdma_register_frmr_external() · 5fc83f47

Committed by Chuck Lever on Jul 29, 2014

seg1->mr_nsegs is not yet initialized when it is used to unmap
segments during an error exit. Use the same unmapping logic for
all error exits.

"if (frmr_wr.wr.fast_reg.length < len) {" used to be a BUG_ON check.
The broken code will never be executed under normal operation.

Fixes: c977dea2 (xprtrdma: Remove BUG_ON() call sites)
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

23 Jul, 2014 2 commits

svcrdma: Add zero padding if the client doesn't send it · e560e3b5

Committed by Chuck Lever on Jul 22, 2014

See RFC 5666 section 3.7: clients don't have to send zero XDR
padding.
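XDR items are aligned to 4-byte boundaries, so the number of zero bytes the server has to supply when the client omits them follows directly from the data length. The helper names below are made up for this sketch.

```c
#include <stddef.h>

/* Zero bytes needed to round len up to the next multiple of 4. */
static size_t sk_xdr_pad_len(size_t len)
{
	return (4 - (len & 3)) & 3;
}

/* Total on-the-wire length once the padding has been added. */
static size_t sk_xdr_padded_len(size_t len)
{
	return len + sk_xdr_pad_len(len);
}
```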

BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=246
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

xprtrdma: Fix DMA-API-DEBUG warning by checking dma_map result · bf858ab0

Committed by Yan Burman on Jun 19, 2014

Fix the following warning when DMA-API debug is enabled by checking ib_dma_map_single result:
[ 1455.345548] ------------[ cut here ]------------
[ 1455.346863] WARNING: CPU: 3 PID: 3929 at /home/yanb/kernel/net-next/lib/dma-debug.c:1140 check_unmap+0x4e5/0x990()
[ 1455.349350] mlx4_core 0000:00:07.0: DMA-API: device driver failed to check map error[device address=0x000000007c9f2090] [size=2656 bytes] [mapped as single]
[ 1455.349350] Modules linked in: xprtrdma netconsole configfs nfsv3 nfs_acl ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm autofs4 auth_rpcgss oid_registry nfsv4 nfs fscache lockd sunrpc dm_mirror dm_region_hash dm_log microcode pcspkr mlx4_ib ib_sa ib_mad ib_core ib_addr mlx4_en ipv6 ptp pps_core vxlan mlx4_core virtio_balloon cirrus ttm drm_kms_helper drm sysimgblt sysfillrect syscopyarea i2c_piix4 i2c_core button ext3 jbd virtio_blk virtio_net virtio_pci virtio_ring virtio uhci_hcd ata_generic ata_piix libata
[ 1455.349350] CPU: 3 PID: 3929 Comm: mount.nfs Not tainted 3.15.0-rc1-dbg+ #13
[ 1455.349350] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[ 1455.349350]  0000000000000474 ffff880069dcf628 ffffffff8151c341 ffffffff817b69d8
[ 1455.349350]  ffff880069dcf678 ffff880069dcf668 ffffffff8105b5fc 0000000069dcf658
[ 1455.349350]  ffff880069dcf778 ffff88007b0c9f00 ffffffff8255ec40 0000000000000a60
[ 1455.349350] Call Trace:
[ 1455.349350]  [<ffffffff8151c341>] dump_stack+0x52/0x81
[ 1455.349350]  [<ffffffff8105b5fc>] warn_slowpath_common+0x8c/0xc0
[ 1455.349350]  [<ffffffff8105b6e6>] warn_slowpath_fmt+0x46/0x50
[ 1455.349350]  [<ffffffff812e6305>] check_unmap+0x4e5/0x990
[ 1455.349350]  [<ffffffff81521fb0>] ? _raw_spin_unlock_irq+0x30/0x60
[ 1455.349350]  [<ffffffff812e6a0a>] debug_dma_unmap_page+0x5a/0x60
[ 1455.349350]  [<ffffffffa0389583>] rpcrdma_deregister_internal+0xb3/0xd0 [xprtrdma]
[ 1455.349350]  [<ffffffffa038a639>] rpcrdma_buffer_destroy+0x69/0x170 [xprtrdma]
[ 1455.349350]  [<ffffffffa03872ff>] xprt_rdma_destroy+0x3f/0xb0 [xprtrdma]
[ 1455.349350]  [<ffffffffa04a95ff>] xprt_destroy+0x6f/0x80 [sunrpc]
[ 1455.349350]  [<ffffffffa04a9625>] xprt_put+0x15/0x20 [sunrpc]
[ 1455.349350]  [<ffffffffa04a899a>] rpc_free_client+0x8a/0xe0 [sunrpc]
[ 1455.349350]  [<ffffffffa04a8a58>] rpc_release_client+0x68/0xa0 [sunrpc]
[ 1455.349350]  [<ffffffffa04a9060>] rpc_shutdown_client+0xb0/0xc0 [sunrpc]
[ 1455.349350]  [<ffffffffa04a8f5d>] ? rpc_ping+0x5d/0x70 [sunrpc]
[ 1455.349350]  [<ffffffffa04a91ab>] rpc_create_xprt+0xbb/0xd0 [sunrpc]
[ 1455.349350]  [<ffffffffa04a9273>] rpc_create+0xb3/0x160 [sunrpc]
[ 1455.349350]  [<ffffffff81129749>] ? __probe_kernel_read+0x69/0xb0
[ 1455.349350]  [<ffffffffa053851c>] nfs_create_rpc_client+0xdc/0x100 [nfs]
[ 1455.349350]  [<ffffffffa0538cfa>] nfs_init_client+0x3a/0x90 [nfs]
[ 1455.349350]  [<ffffffffa05391c8>] nfs_get_client+0x478/0x5b0 [nfs]
[ 1455.349350]  [<ffffffffa0538e50>] ? nfs_get_client+0x100/0x5b0 [nfs]
[ 1455.349350]  [<ffffffff81172c6d>] ? kmem_cache_alloc_trace+0x24d/0x260
[ 1455.349350]  [<ffffffffa05393f3>] nfs_create_server+0xf3/0x4c0 [nfs]
[ 1455.349350]  [<ffffffffa0545ff0>] ? nfs_request_mount+0xf0/0x1a0 [nfs]
[ 1455.349350]  [<ffffffffa031c0c3>] nfs3_create_server+0x13/0x30 [nfsv3]
[ 1455.349350]  [<ffffffffa0546293>] nfs_try_mount+0x1f3/0x230 [nfs]
[ 1455.349350]  [<ffffffff8108ea21>] ? get_parent_ip+0x11/0x50
[ 1455.349350]  [<ffffffff812d6343>] ? __this_cpu_preempt_check+0x13/0x20
[ 1455.349350]  [<ffffffff810d632b>] ? try_module_get+0x6b/0x190
[ 1455.349350]  [<ffffffffa05449f7>] nfs_fs_mount+0x187/0x9d0 [nfs]
[ 1455.349350]  [<ffffffffa0545940>] ? nfs_clone_super+0x140/0x140 [nfs]
[ 1455.349350]  [<ffffffffa0543b20>] ? nfs_auth_info_match+0x40/0x40 [nfs]
[ 1455.349350]  [<ffffffff8117e360>] mount_fs+0x20/0xe0
[ 1455.349350]  [<ffffffff811a1c16>] vfs_kern_mount+0x76/0x160
[ 1455.349350]  [<ffffffff811a29a8>] do_mount+0x428/0xae0
[ 1455.349350]  [<ffffffff811a30f0>] SyS_mount+0x90/0xe0
[ 1455.349350]  [<ffffffff8152af52>] system_call_fastpath+0x16/0x1b
[ 1455.349350] ---[ end trace f1f31572972e211d ]---
Signed-off-by: Yan Burman <yanb@mellanox.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

18 Jul, 2014 1 commit

svcrdma: Select NFSv4.1 backchannel transport based on forward channel · 3c45ddf8

Committed by Chuck Lever on Jul 16, 2014

The current code always selects XPRT_TRANSPORT_BC_TCP for the back
channel, even when the forward channel was not TCP (e.g., RDMA). When
a 4.1 mount is attempted with RDMA, the server panics in the TCP BC
code when trying to send CB_NULL.

Instead, construct the transport protocol number from the forward
channel transport or'd with XPRT_TRANSPORT_BC. Transports that do
not support bi-directional RPC will not have registered a "BC"
transport, causing create_backchannel_client() to fail immediately.
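The selection rule can be sketched in a few lines. The numeric values below are assumptions recalled from the kernel's transport identifiers and should be checked against the headers; the registry lookup is a hypothetical stand-in in which only a TCP backchannel has been registered.

```c
#include <stdbool.h>

/* Assumed values; verify against the kernel headers before relying on them. */
#define SK_XPRT_TRANSPORT_BC	(1u << 31)
#define SK_XPRT_TRANSPORT_TCP	6u	/* IPPROTO_TCP */
#define SK_XPRT_TRANSPORT_RDMA	256u

/* Backchannel ident is the forward channel ident with the BC bit set. */
static unsigned int sk_backchannel_ident(unsigned int forward_ident)
{
	return forward_ident | SK_XPRT_TRANSPORT_BC;
}

/* Hypothetical registry: only the TCP backchannel transport exists. */
static bool sk_transport_registered(unsigned int ident)
{
	return ident == (SK_XPRT_TRANSPORT_TCP | SK_XPRT_TRANSPORT_BC);
}

/* Fails immediately when no matching "BC" transport was registered. */
static int sk_create_backchannel(unsigned int forward_ident)
{
	return sk_transport_registered(sk_backchannel_ident(forward_ident)) ? 0 : -1;
}
```

In this sketch an RDMA forward channel yields an ident that nothing has registered, so creation fails cleanly instead of falling into the TCP code path.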

Fixes: https://bugzilla.linux-nfs.org/show_bug.cgi?id=265
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

12 Jul, 2014 1 commit

svcrdma: send_write() must not overflow the device's max sge · 25594290

Committed by Steve Wise on Jul 09, 2014

Function send_write() must stop creating sges when it reaches the device
max and return the amount sent in the RDMA Write to the caller.
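The required behaviour is a bounded loop that reports how far it got, so the caller can issue another RDMA Write for the remainder. This is a simplified sketch with hypothetical names and a fixed per-sge size, not the server code.

```c
#include <stddef.h>

/*
 * Build at most max_sge scatter-gather entries of up to sge_bytes each.
 * Returns the number of bytes covered; *used reports how many entries
 * were consumed.
 */
static size_t sk_build_sges(size_t bytes_left, size_t sge_bytes,
			    unsigned int max_sge, unsigned int *used)
{
	size_t sent = 0;
	unsigned int n = 0;

	while (bytes_left > 0 && n < max_sge) {
		size_t len = bytes_left < sge_bytes ? bytes_left : sge_bytes;

		sent += len;
		bytes_left -= len;
		n++;
	}
	*used = n;
	return sent;
}
```

A caller would keep invoking this with the unsent remainder until the whole write chunk has been covered.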
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

07 Jun, 2014 2 commits

svcrdma: Fence LOCAL_INV work requests · 83710fc7

Committed by Steve Wise on Jun 05, 2014

Fencing forces the invalidate to only happen after all prior send
work requests have been completed.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reported-by: Devesh Sharma <Devesh.Sharma@Emulex.Com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

svcrdma: refactor marshalling logic · 0bf48289

Committed by Steve Wise on May 28, 2014

This patch refactors the NFSRDMA server marshalling logic to
remove the intermediary map structures.  It also fixes an existing bug
where the NFSRDMA server was not minding the device fast register page
list length limitations.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>

04 Jun, 2014 2 commits

xprtrdma: Disconnect on registration failure · c93c6223

Committed by Chuck Lever on May 28, 2014

If rpcrdma_register_external() fails during request marshaling, the
current RPC request is killed. Instead, this RPC should be retried
after reconnecting the transport instance.

The most likely reason for registration failure with FRMR is a
failed post_send, which would be due to a remote transport
disconnect or memory exhaustion. These issues can be recovered
by a retry.

Problems encountered in the marshaling logic itself will not be
corrected by trying again, so these should still kill a request.

Now that we've added a clean exit for marshaling errors, take the
opportunity to defang some BUG_ON's.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Remove BUG_ON() call sites · c977dea2

Committed by Chuck Lever on May 28, 2014

If an error occurs in the marshaling logic, fail the RPC request
being processed, but leave the client running.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>