- 06 Feb 2016, 3 commits
-
-
Authored by Trond Myklebust
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Authored by Trond Myklebust
This is a pre-patch for the RPC multipath code. It sets up the storage in struct rpc_clnt for the multipath code. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Authored by Trond Myklebust
In order to support multipathing/trunking we will need the ability to track multiple transports. This patch sets up a basic structure for doing so. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
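For orientation only, the kind of container this series is heading toward can be pictured roughly as below. This is an illustrative sketch with invented names (demo_xprt_switch and its fields), not the structure the patch itself adds.

#include <linux/kref.h>
#include <linux/list.h>
#include <linux/spinlock.h>

/* Illustrative sketch: a ref-counted "switch" that tracks the set of
 * transports a client could spread RPC traffic across. */
struct demo_xprt_switch {
        struct kref             kref;           /* lifetime of the group */
        spinlock_t              lock;           /* protects the list     */
        unsigned int            nr_xprts;       /* transports tracked    */
        struct list_head        xprt_list;      /* entries, one per xprt */
};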
-
- 01 Feb 2016, 4 commits
-
-
Authored by Trond Myklebust
Have it call kfree_rcu() to ensure that we can use it on rcu-protected lists. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
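A minimal sketch of why kfree_rcu() matters for list users (a generic example, not the rpc_xprt code itself): the object is only kfree()d after an RCU grace period, so lockless readers still traversing the list never touch freed memory.

#include <linux/rculist.h>
#include <linux/rcupdate.h>
#include <linux/slab.h>

struct demo_obj {
        struct list_head        list;   /* linked into an RCU-protected list */
        struct rcu_head         rcu;    /* needed by kfree_rcu()             */
};

static void demo_obj_release(struct demo_obj *obj)
{
        list_del_rcu(&obj->list);       /* readers may still be walking obj  */
        kfree_rcu(obj, rcu);            /* kfree() only after a grace period */
}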
-
Authored by Trond Myklebust
Also allow callers to pass NULL arguments to xprt_get() and xprt_put(). Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
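The NULL-tolerant get/put convention reads roughly like the sketch below. This is a kref-based illustration with placeholder names, not the actual xprt.c implementation, which may use a different refcounting primitive.

#include <linux/kernel.h>
#include <linux/kref.h>
#include <linux/slab.h>

struct demo_xprt {
        struct kref kref;
        /* ... transport state ... */
};

static void demo_xprt_free(struct kref *kref)
{
        kfree(container_of(kref, struct demo_xprt, kref));
}

/* Both helpers accept NULL, so callers need no conditional boilerplate. */
static struct demo_xprt *demo_xprt_get(struct demo_xprt *xprt)
{
        if (xprt)
                kref_get(&xprt->kref);
        return xprt;
}

static void demo_xprt_put(struct demo_xprt *xprt)
{
        if (xprt)
                kref_put(&xprt->kref, demo_xprt_free);
}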
-
Authored by Trond Myklebust
Try to group all the data required by the waitqueues, their timers and timer callbacks into the same cachelines for performance. With this reordering, "pahole" reports the following structure on x86_64:

struct rpc_task {
        atomic_t                    tk_count;          /*   0    4 */
        int                         tk_status;         /*   4    4 */
        struct list_head            tk_task;           /*   8   16 */
        void (*tk_callback)(struct rpc_task *);        /*  24    8 */
        void (*tk_action)(struct rpc_task *);          /*  32    8 */
        long unsigned int           tk_timeout;        /*  40    8 */
        long unsigned int           tk_runstate;       /*  48    8 */
        struct rpc_wait_queue *     tk_waitqueue;      /*  56    8 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        union {
                struct work_struct  tk_work;           /*        64 */
                struct rpc_wait     tk_wait;           /*        56 */
        } u;                                           /*  64   64 */
        /* --- cacheline 2 boundary (128 bytes) --- */
        struct rpc_message          tk_msg;            /* 128   32 */
        void *                      tk_calldata;       /* 160    8 */
        const struct rpc_call_ops * tk_ops;            /* 168    8 */
        struct rpc_clnt *           tk_client;         /* 176    8 */
        struct rpc_rqst *           tk_rqstp;          /* 184    8 */
        /* --- cacheline 3 boundary (192 bytes) --- */
        struct workqueue_struct *   tk_workqueue;      /* 192    8 */
        ktime_t                     tk_start;          /* 200    8 */
        pid_t                       tk_owner;          /* 208    4 */
        short unsigned int          tk_flags;          /* 212    2 */
        short unsigned int          tk_timeouts;       /* 214    2 */
        short unsigned int          tk_pid;            /* 216    2 */
        unsigned char               tk_priority:2;     /* 218: 6 1 */
        unsigned char               tk_garb_retry:2;   /* 218: 4 1 */
        unsigned char               tk_cred_retry:2;   /* 218: 2 1 */
        unsigned char               tk_rebind_retry:2; /* 218: 0 1 */

        /* size: 224, cachelines: 4, members: 24 */
        /* padding: 5 */
        /* last cacheline: 32 bytes */
};

whereas on i386, it reports everything fitting into the 1st cacheline. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Authored by Trond Myklebust
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
- 20 Jan 2016, 8 commits
-
-
Authored by Christoph Hellwig
We now always have a per-PD local_dma_lkey available. Make use of that fact in svc_rdma and stop registering our own MR. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Acked-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
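The gist of the simplification, sketched with an illustrative helper (not the actual svc_rdma change): SGEs for DMA-mapped buffers can reference the PD's local_dma_lkey directly instead of the lkey of a privately registered MR.

#include <rdma/ib_verbs.h>

static void demo_fill_sge(struct ib_sge *sge, struct ib_pd *pd,
                          u64 dma_addr, u32 len)
{
        sge->addr   = dma_addr;
        sge->length = len;
        sge->lkey   = pd->local_dma_lkey;       /* was: our own MR's lkey */
}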
-
Authored by Chuck Lever
To support the server side of an NFSv4.1 backchannel on RDMA connections, add a transport class that enables backward direction messages on an existing forward channel connection. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: Bruce Fields <bfields@fieldses.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Authored by Chuck Lever
Extra resources for handling backchannel requests have to be pre-allocated when a transport instance is created. Set up additional fields in svcxprt_rdma to track these resources. The max_requests fields are elements of the RPC-over-RDMA protocol, so they should be u32. To ensure that unsigned arithmetic is used everywhere, some other fields in the svcxprt_rdma struct are updated. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: Bruce Fields <bfields@fieldses.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Authored by Chuck Lever
Pre-requisite for using map_xdr in the backchannel code. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: Bruce Fields <bfields@fieldses.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Authored by Chuck Lever
svc_rdma_post_recv() allocates pages for receive buffers on demand. It uses GFP_KERNEL, so the allocator tries hard and may sleep, but I'm about to add a call to svc_rdma_post_recv() from a function that may not sleep. Since all svc_rdma_post_recv() call sites can tolerate its failure, allow it to fail if the page allocator returns nothing. Longer term, receive buffers, being a finite per-connection resource, should be pre-allocated and re-used. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: Bruce Fields <bfields@fieldses.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
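The "allow it to fail" idea boils down to something like the sketch below; names are placeholders and the real svc_rdma_post_recv() has considerably more to do. The caller supplies the gfp flags and an allocation failure is simply propagated.

#include <linux/errno.h>
#include <linux/gfp.h>

struct demo_rdma_xprt;                          /* stand-in for svcxprt_rdma */

static int demo_post_recv(struct demo_rdma_xprt *xprt, gfp_t flags)
{
        struct page *page = alloc_page(flags);  /* flags chosen by the caller */

        if (!page)
                return -ENOMEM;                 /* callers tolerate failure */
        /* ... DMA-map the page and post the receive work request ... */
        return 0;
}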
-
Authored by Chuck Lever
Clean up. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: Bruce Fields <bfields@fieldses.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Authored by Chuck Lever
To ensure this allocation cannot fail and will not sleep, pre-allocate the req_map structures per connection. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: Bruce Fields <bfields@fieldses.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Authored by Chuck Lever
When the maximum payload size of NFS READ and WRITE was increased by commit cc9a903d ("svcrdma: Change maximum server payload back to RPCSVC_MAXPAYLOAD"), the size of struct svc_rdma_op_ctxt increased to over 6KB (on x86_64). That makes allocating one of these from a kmem_cache more likely to fail in situations when system memory is exhausted. Since I'm about to add a caller where this allocation must always work _and_ it cannot sleep, pre-allocate ctxts for each connection. Another motivation for this change is that NFSv4.x servers are required by specification not to drop NFS requests. Pre-allocating memory resources reduces the likelihood of a drop. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: Bruce Fields <bfields@fieldses.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
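The pre-allocation pattern behind this commit and the req_map commit above, reduced to a sketch (names are invented, not the actual svcrdma code): resources are carved out at connection setup and recycled through a per-connection free list, so the hot path neither sleeps nor fails.

#include <linux/list.h>
#include <linux/spinlock.h>

struct demo_ctxt {
        struct list_head free;          /* linkage on the free list */
        /* ... per-request state ... */
};

struct demo_conn {
        spinlock_t       ctxt_lock;
        struct list_head ctxt_free;     /* primed at accept time */
};

static struct demo_ctxt *demo_ctxt_get(struct demo_conn *c)
{
        struct demo_ctxt *ctxt = NULL;

        spin_lock(&c->ctxt_lock);
        if (!list_empty(&c->ctxt_free)) {
                ctxt = list_first_entry(&c->ctxt_free, struct demo_ctxt, free);
                list_del(&ctxt->free);
        }
        spin_unlock(&c->ctxt_lock);
        return ctxt;                    /* no allocation, no sleeping */
}

static void demo_ctxt_put(struct demo_conn *c, struct demo_ctxt *ctxt)
{
        spin_lock(&c->ctxt_lock);
        list_add(&ctxt->free, &c->ctxt_free);
        spin_unlock(&c->ctxt_lock);
}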
-
- 23 Dec 2015, 1 commit
-
-
Authored by Scott Mayhew
Add a function svc_age_temp_xprts_now() to close temporary transports whose xpt_local matches the address passed in server_addr immediately, instead of waiting for them to be closed by the timer function. The function is intended to be used by notifier_blocks that will be added to nfsd and lockd and that will run when an IP address is deleted. This will eliminate the ACK storms and client hangs that occur in HA-NFS configurations where nfsd and lockd are left running on the cluster nodes all the time and the NFS 'service' is migrated back and forth within a short timeframe. Signed-off-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
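A hedged sketch of the intended consumer, loosely modeled on the notifier blocks nfsd and lockd later gain: when an IPv4 address goes away, close the matching temporary transports right away. "demo_serv" and the registration details are placeholders, not code from the patch.

#include <linux/in.h>
#include <linux/inetdevice.h>
#include <linux/notifier.h>
#include <linux/sunrpc/svc_xprt.h>

static struct svc_serv *demo_serv;      /* assumed to be set up elsewhere */

static int demo_inetaddr_event(struct notifier_block *nb,
                               unsigned long event, void *ptr)
{
        struct in_ifaddr *ifa = ptr;
        struct sockaddr_in sin;

        if (event != NETDEV_DOWN || !demo_serv)
                return NOTIFY_DONE;

        sin.sin_family = AF_INET;
        sin.sin_addr.s_addr = ifa->ifa_local;
        /* close temporary transports bound to the vanished address now */
        svc_age_temp_xprts_now(demo_serv, (struct sockaddr *)&sin);
        return NOTIFY_DONE;
}

static struct notifier_block demo_inetaddr_notifier = {
        .notifier_call = demo_inetaddr_event,
};
/* registered elsewhere via register_inetaddr_notifier(&demo_inetaddr_notifier) */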
-
- 25 Nov 2015, 1 commit
-
-
Authored by J. Bruce Fields
The principal name on a gss cred is used to set up the NFSv4.0 callback, which has to have a client principal name to authenticate to. That code wants the name to be in the form servicetype@hostname. rpc.svcgssd passes down such names (and passes down no principal name at all in the case the principal isn't a service principal). gss-proxy always passes down the principal name, and passes it down in the form servicetype/hostname@REALM. So we've been munging the name gss-proxy passes down into the format the NFSv4.0 callback code expects, or throwing away the name if we can't. Since the introduction of the MACH_CRED enforcement in NFSv4.1, we've also been using the principal name to verify that certain operations are done as the same principal as was used on the original EXCHANGE_ID call. For that application, the original name passed down by gss-proxy is also useful. Lack of that name in some cases was causing some kerberized NFSv4.1 mount failures in an Active Directory environment. This fix only works in the gss-proxy case. The fix for legacy rpc.svcgssd would be more involved, and rpc.svcgssd already has other problems in the AD case. Reported-and-tested-by: James Ralston <ralston@pobox.com> Acked-by: Simo Sorce <simo@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
- 03 Nov 2015, 3 commits
-
-
Authored by Chuck Lever
Forechannel transports get their own "bc_up" method to create an endpoint for the backchannel service. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> [Anna Schumaker: Add forward declaration of struct net to xprt.h] Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Authored by Chuck Lever
On NFSv4.1 mount points, the Linux NFS client uses this transport endpoint to receive backward direction calls and route replies back to the NFSv4.1 server. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: "J. Bruce Fields" <bfields@fieldses.org> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Tested-By: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Authored by Chuck Lever
xprt_{setup,destroy}_backchannel() won't be adequate for RPC/RDMA bi-direction. In particular, receive buffers have to be pre-registered and posted in order to receive incoming backchannel requests. Add a virtual function call to allow the insertion of appropriate backchannel setup and destruction methods for each transport. In addition, freeing a backchannel request is a little different for RPC/RDMA. Introduce an rpc_xprt_op to handle the difference. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Tested-By: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
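Taken with the "bc_up" commit above, the per-transport hooks end up looking roughly like the excerpt below. The prototypes are paraphrased and may not match a given kernel exactly; struct rpc_xprt_ops in include/linux/sunrpc/xprt.h is the authoritative version.

struct rpc_xprt;
struct rpc_rqst;
struct svc_serv;
struct net;

/* Approximate shape of the backchannel-related transport methods. */
struct demo_xprt_bc_ops {
        int     (*bc_setup)(struct rpc_xprt *xprt, unsigned int reqs);   /* pre-allocate/register buffers */
        int     (*bc_up)(struct svc_serv *serv, struct net *net);        /* create the svc endpoint       */
        void    (*bc_free_rqst)(struct rpc_rqst *rqst);                  /* transport-specific freeing    */
        void    (*bc_destroy)(struct rpc_xprt *xprt, unsigned int reqs); /* tear the resources back down  */
};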
-
- 29 Oct 2015, 1 commit
-
-
Authored by Sagi Grimberg
Instead of maintaining a fastreg page list, keep an sg table and convert an array of pages to a sg list. Then call ib_map_mr_sg and construct an ib_reg_wr. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Acked-by: Christoph Hellwig <hch@lst.de> Tested-by: Steve Wise <swise@opengridcomputing.com> Tested-by: Selvin Xavier <selvin.xavier@avagotech.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
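For orientation, the new registration flow looks roughly like the sketch below. This is not the patch's exact code, and ib_map_mr_sg()'s signature has changed across kernel versions (later kernels take an extra sg_offset argument), so treat it as an outline rather than a drop-in.

#include <linux/errno.h>
#include <rdma/ib_verbs.h>

static int demo_fastreg(struct ib_qp *qp, struct ib_mr *mr,
                        struct scatterlist *sg, int sg_nents)
{
        struct ib_reg_wr reg_wr = { };
        struct ib_send_wr *bad_wr;
        int n;

        /* convert the sg list into the MR's page list */
        n = ib_map_mr_sg(mr, sg, sg_nents, PAGE_SIZE);
        if (n != sg_nents)
                return -EIO;

        reg_wr.wr.opcode = IB_WR_REG_MR;
        reg_wr.mr        = mr;
        reg_wr.key       = mr->rkey;
        reg_wr.access    = IB_ACCESS_LOCAL_WRITE | IB_ACCESS_REMOTE_WRITE;

        return ib_post_send(qp, &reg_wr.wr, &bad_wr);
}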
-
- 24 Oct 2015, 1 commit
-
-
Authored by Neil Brown
The caches used to store sunrpc authentication information can be flushed by writing a timestamp to a file in /proc. This timestamp has a one-second resolution and any entry in cache that was last_refreshed *before* that time is treated as expired. This is problematic as it is not possible to reliably flush the cache without interrupting NFS service. If the current time is written to the "flush" file, any entry that was added since the current second started will still be treated as valid. If one second beyond the current time is written to the file then no entries can be valid until the second ticks over. This will mean that no NFS request will be handled for up to 1 second. To resolve this issue we make two changes: 1/ treat an entry as expired if the timestamp when it was last_refreshed is before *or the same as* the expiry time. This means that current code which writes out the current time will now flush the cache reliably. 2/ when a new entry is added to the cache, set the last_refresh timestamp to 1 second *beyond* the current flush time, when that is not in the past. This ensures that newly added entries will always be valid. Now that we have a very reliable way to flush the cache, and also since we are using "since-boot" timestamps which are monotonic, change cache_purge() to set the smallest future flush_time which will work, and leave it there: don't revert to '1'. Also disable the setting of the 'flush_time' far into the future. That has never been useful and is now awkward as it would cause last_refresh times to be strange. Finally: if a request is made to set the 'flush_time' to the current second, assume the intent is to flush the cache and advance it, if necessary, to 1 second beyond the current 'flush_time' so that all active entries will be deemed to be expired. As part of this we need to add a 'cache_detail' arg to cache_init() and cache_fresh_locked() so they can find the current ->flush_time. Signed-off-by: NeilBrown <neilb@suse.com> Reported-by: Olaf Kirch <okir@suse.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
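The two rules condensed into pseudo-helpers (illustrative only; the real logic lives in net/sunrpc/cache.c and uses seconds_since_boot() for its monotonic timestamps):

#include <linux/types.h>

static bool demo_entry_expired(time_t last_refresh, time_t flush_time)
{
        /* rule 1: "<=" rather than "<", so writing the current second
         * to the flush file reliably expires everything added so far */
        return last_refresh <= flush_time;
}

static time_t demo_new_entry_refresh(time_t now, time_t flush_time)
{
        /* rule 2: stamp new entries just past flush_time (when that is
         * not in the past) so they are never born already expired */
        return (now <= flush_time) ? flush_time + 1 : now;
}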
-
- 08 Oct 2015, 1 commit
-
-
Authored by Trond Myklebust
Stream protocols such as TCP can often build up a backlog of data to be read due to ordering. Combine this with the fact that some workloads, such as NFS read()-intensive workloads, need to receive a lot of data per RPC call, and it turns out that receiving the data from inside a softirq context can cause starvation. The following patch moves the TCP data receive into a workqueue context. We still end up calling tcp_read_sock(), but we do so from a process context, meaning that softirqs are enabled for most of the time. With this patch, I see a doubling of read bandwidth when running a multi-threaded iozone workload between a virtual client and server setup. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
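The shape of the change, heavily simplified with placeholder names (the real code is in net/sunrpc/xprtsock.c): ->sk_data_ready only queues work, and the tcp_read_sock() loop then runs from a workqueue in process context, where softirqs stay enabled.

#include <linux/workqueue.h>
#include <net/sock.h>

struct demo_sock {
        struct sock             *sk;
        struct work_struct      recv_worker;
};

static void demo_data_ready(struct sock *sk)            /* softirq context */
{
        struct demo_sock *ds = sk->sk_user_data;

        if (ds)
                queue_work(system_wq, &ds->recv_worker);
}

static void demo_recv_workfn(struct work_struct *work)  /* process context */
{
        struct demo_sock *ds = container_of(work, struct demo_sock, recv_worker);

        /* the tcp_read_sock(ds->sk, ...) receive loop runs here, outside softirq */
}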
-
- 18 Sep 2015, 1 commit
-
-
Authored by Trond Myklebust
Commit 718ba5b8 moved the responsibility for unlocking the socket to xs_tcp_setup_socket, meaning that the socket will be unlocked before we know that it has finished trying to connect. The following patch is based on an initial patch by Russell King to ensure that we delay clearing the XPRT_CONNECTING flag until we either know that we failed to initiate a connection attempt, or the connection attempt itself failed. Fixes: 718ba5b8 ("SUNRPC: Add helpers to prevent socket create from racing") Reported-by: Russell King <linux@arm.linux.org.uk> Reported-by: Russell King <rmk+kernel@arm.linux.org.uk> Tested-by: Russell King <rmk+kernel@arm.linux.org.uk> Tested-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
- 29 Aug 2015, 1 commit
-
-
Authored by Steve Wise
Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
- 18 Aug 2015, 3 commits
-
-
Authored by Trond Myklebust
Reported-by: kbuild test robot <fengguang.wu@intel.com> Fixes: 7b0ce60c ("SUNRPC: Drop double-underscores from rpc_cmp_addr{4|6}()") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
Authored by Anna Schumaker
This function is to help determine if two sockaddrs are really the same socket. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
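The comparison is essentially the following simplified sketch; the real rpc_cmp_addr() helpers in the sunrpc headers also deal with details such as IPv6 scope IDs.

#include <linux/in.h>
#include <linux/in6.h>
#include <net/ipv6.h>

static bool demo_cmp_addr(const struct sockaddr *a, const struct sockaddr *b)
{
        if (a->sa_family != b->sa_family)
                return false;

        switch (a->sa_family) {
        case AF_INET:
                return ((const struct sockaddr_in *)a)->sin_addr.s_addr ==
                       ((const struct sockaddr_in *)b)->sin_addr.s_addr;
        case AF_INET6:
                return ipv6_addr_equal(&((const struct sockaddr_in6 *)a)->sin6_addr,
                                       &((const struct sockaddr_in6 *)b)->sin6_addr);
        }
        return false;
}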
-
Authored by Anna Schumaker
I'm planning on using these functions inside the client, so remove the underscores to make it feel like I'm using a public interface. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
- 13 Aug 2015, 3 commits
-
-
Authored by Kinglong Mee
Switch to using a list_head for cache_head in cache_detail; this makes it possible to remove a cache_head entry directly from cache_detail. v8: use a hash list, not a head list. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
Authored by Kinglong Mee
nfsd has implemented a set of seq_operations functions just like sunrpc's cache code. Just export sunrpc's code and remove nfsd's redundant copies. v8: same as v6. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
Authored by Jeff Layton
The current limit of 32 bytes artificially limits the name string that we end up stuffing into NFSv4.x client ID blobs. If you have multiple hosts with long hostnames that only differ near the end, then this can cause NFSv4 client ID collisions. Linux nodenames are actually limited to __NEW_UTS_LEN bytes (64), so use that as the limit instead. Also, use XDR_QUADLEN to specify the slack length, just for clarity and in case someone in the future changes this to something not evenly divisible by 4. Reported-by: Michael Skralivetsky <michael.skralivetsky@primarydata.com> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
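The sizing arithmetic, for reference (a sketch using the real macros; the exact buffer definition in the patch may differ, and DEMO_CLNAME_SLACK is an invented name): __NEW_UTS_LEN is 64 and XDR_QUADLEN rounds a byte count up to 4-byte XDR words.

#include <linux/sunrpc/xdr.h>           /* XDR_QUADLEN() */
#include <linux/utsname.h>              /* __NEW_UTS_LEN */

/* slack reserved in the client-ID XDR buffer for the node name */
#define DEMO_CLNAME_SLACK       XDR_QUADLEN(__NEW_UTS_LEN)     /* 64 / 4 = 16 words */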
-
- 11 Aug 2015, 8 commits
-
-
Authored by Jeff Layton
In later patches, we'll want to be able to allocate and free svc_rqst structures without monkeying with the serv->sv_nrthreads refcount. Factor those pieces out of their respective functions. Signed-off-by: Shirley Ma <shirley.ma@oracle.com> Acked-by: Jeff Layton <jlayton@primarydata.com> Tested-by: Shirley Ma <shirley.ma@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
Authored by Jeff Layton
In later patches, we're going to need to allow code external to svc.c to figure out what pool_mode is in use. Move these definitions into svc.h to prepare for that. Also, make the svc_pool_map object available and exported so that other modules can peek in there to get insight into what pool mode is in use. Likewise, export the svc_pool_map_get/put functions to make it safe to do so. Signed-off-by: Shirley Ma <shirley.ma@oracle.com> Acked-by: Jeff Layton <jlayton@primarydata.com> Tested-by: Shirley Ma <shirley.ma@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
Authored by Jeff Layton
Add an operation that will do setup of the service. In the case of a classic thread-based service that means starting up threads. In the case of a workqueue-based service, the setup will do something different. Signed-off-by: Shirley Ma <shirley.ma@oracle.com> Acked-by: Jeff Layton <jlayton@primarydata.com> Tested-by: Shirley Ma <shirliey.ma@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
Authored by Jeff Layton
For now, all services use svc_xprt_do_enqueue, but once we add workqueue-based service support, we'll need to do something different. Signed-off-by: Shirley Ma <shirley.ma@oracle.com> Acked-by: Jeff Layton <jlayton@primarydata.com> Tested-by: Shirley Ma <shirley.ma@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
Authored by Jeff Layton
...not technically an operation, but it's more convenient and cleaner to pass the module pointer in this struct. Signed-off-by: Shirley Ma <shirley.ma@oracle.com> Acked-by: Jeff Layton <jlayton@primarydata.com> Tested-by: Shirley Ma <shirley.ma@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
Authored by Jeff Layton
Since we now have a container for holding svc_serv operations, move the sv_function into it as well. Signed-off-by: Shirley Ma <shirley.ma@oracle.com> Acked-by: Jeff Layton <jlayton@primarydata.com> Tested-by: Shirley Ma <shirley.ma@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
Authored by Jeff Layton
In later patches we'll need to abstract out more operations on a per-service level, besides sv_shutdown and sv_function. Declare a new svc_serv_ops struct to hold these operations, and move sv_shutdown into this struct. Signed-off-by: Shirley Ma <shirley.ma@oracle.com> Acked-by: Jeff Layton <jlayton@primarydata.com> Tested-by: Shirley Ma <shirley.ma@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
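Taken together, the commits above build the operations container up into something close to the following paraphrase; see struct svc_serv_ops in include/linux/sunrpc/svc.h for the exact definition in a given kernel.

struct svc_serv;
struct svc_pool;
struct svc_xprt;
struct net;
struct module;

/* Approximate shape of the per-service operations vector. */
struct demo_serv_ops {
        void    (*svo_shutdown)(struct svc_serv *serv, struct net *net);  /* called when the last thread exits */
        int     (*svo_function)(void *data);                              /* thread (or work) body             */
        void    (*svo_enqueue_xprt)(struct svc_xprt *xprt);               /* queue a transport for servicing   */
        int     (*svo_setup)(struct svc_serv *serv, struct svc_pool *pool,
                             int nr);                                     /* start the execution contexts      */
        struct module *svo_module;                                        /* owner, for thread refcounting     */
};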
-
Authored by Chuck Lever
Both commit 0380a3f3 ("svcrdma: Add a separate "max data segs" macro for svcrdma") and commit 7e5be288 ("svcrdma: advertise the correct max payload") are incorrect. This commit reverts both changes, restoring the server's maximum payload size to 1MB. Commit 7e5be288 based the server's maximum payload on the _client's_ RPCRDMA_MAX_DATA_SEGS value. That was wrong. Commit 0380a3f3 tried to fix this so that the client maximum payload size could be raised without affecting the server, but managed to confuse matters more on the server side. More importantly, limiting the advertised maximum payload size was meant to be a workaround, not the actual fix. We need to revisit https://bugzilla.linux-nfs.org/show_bug.cgi?id=270 A Linux client on a platform with 64KB pages can overrun and crash an x86_64 NFS/RDMA server when the r/wsize is 1MB. An x86_64 Linux client seems to work fine using 1MB reads and writes when the Linux server's maximum payload size is restored to 1MB. BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=270 Fixes: 0380a3f3 ("svcrdma: Add a separate "max data segs" macro") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
- 06 Aug 2015, 1 commit
-
-
Authored by Chuck Lever
In preparation for similar increases on NFS/RDMA servers, bump the advertised credit limit for RPC/RDMA to 128. This allocates some extra resources, but the client will continue to allow only the number of RPCs in flight that the server requests via its advertised credit limit. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-By: Sagi Grimberg <sagig@mellanox.com> Tested-by: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-