- 04 6月, 2014 9 次提交
-
-
由 Chuck Lever 提交于
Clean up: All remaining callers of rpcrdma_deregister_external() pass NULL as the last argument, so remove that argument. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Tested-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
If the selected memory registration mode is not supported by the underlying provider/HCA, the NFS mount command reports that there was an invalid mount option, and fails. This is misleading. Reporting a problem allocating memory is a lot closer to the truth. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
An audit of in-kernel RDMA providers that do not support the FRMR memory registration shows that several of them support MTHCAFMR. Prefer MTHCAFMR when FRMR is not supported. If MTHCAFMR is not supported, only then choose ALLPHYSICAL. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
All kernel RDMA providers except amso1100 support either MTHCAFMR or FRMR, both of which are faster than REGISTER. amso1100 can continue to use ALLPHYSICAL. The only other ULP consumer in the kernel that uses the reg_phys_mr verb is Lustre. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
The MEMWINDOWS and MEMWINDOWS_ASYNC memory registration modes were intended as stop-gap modes before the introduction of FRMR. They are now considered obsolete. MEMWINDOWS_ASYNC is also considered unsafe because it can leave client memory registered and exposed for an indeterminant time after each I/O. At this point, the MEMWINDOWS modes add needless complexity, so remove them. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Tested-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
Clean up: This memory registration mode is slow and was never meant for use in production environments. Remove it to reduce implementation complexity. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Tested-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
An IB provider can invoke rpcrdma_conn_func() in an IRQ context, thus rpcrdma_conn_func() cannot be allowed to directly invoke generic RPC functions like xprt_wake_pending_tasks(). Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Tested-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Allen Andrews 提交于
Two memory region leaks were found during testing: 1. rpcrdma_buffer_create: While allocating RPCRDMA_FRMR's ib_alloc_fast_reg_mr is called and then ib_alloc_fast_reg_page_list is called. If ib_alloc_fast_reg_page_list returns an error it bails out of the routine dropping the last ib_alloc_fast_reg_mr frmr region creating a memory leak. Added code to dereg the last frmr if ib_alloc_fast_reg_page_list fails. 2. rpcrdma_buffer_destroy: While cleaning up, the routine will only free the MR's on the rb_mws list if there are rb_send_bufs present. However, in rpcrdma_buffer_create while the rb_mws list is being built if one of the MR allocation requests fail after some MR's have been allocated on the rb_mws list the routine never gets to create any rb_send_bufs but instead jumps to the rpcrdma_buffer_destroy routine which will never free the MR's on rb_mws list because the rb_send_bufs were never created. This leaks all the MR's on the rb_mws list that were created prior to one of the MR allocations failing. Issue(2) was seen during testing. Our adapter had a finite number of MR's available and we created enough connections to where we saw an MR allocation failure on our Nth NFS connection request. After the kernel cleaned up the resources it had allocated for the Nth connection we noticed that FMR's had been leaked due to the coding error described above. Issue(1) was seen during a code review while debugging issue(2). Signed-off-by: NAllen Andrews <allen.andrews@emulex.com> Reviewed-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Steve Wise 提交于
Some rdma devices don't support a fast register page list depth of at least RPCRDMA_MAX_DATA_SEGS. So xprtrdma needs to chunk its fast register regions according to the minimum of the device max supported depth or RPCRDMA_MAX_DATA_SEGS. Signed-off-by: NSteve Wise <swise@opengridcomputing.com> Reviewed-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
- 29 3月, 2014 3 次提交
-
-
由 Jeff Layton 提交于
The xdr_off value in dma_map_xdr gets passed to ib_dma_map_page as the offset into the page to be mapped. This calculation does not correctly take into account the case where the data starts at some offset into the page. Increment the xdr_off by the page_base to ensure that it is respected. Cc: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Jeff Layton 提交于
There are two entirely separate modules under xprtrdma/ and there's no reason that enabling one should automatically enable the other. Add config options for each one so they can be enabled/disabled separately. Signed-off-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Tom Tucker 提交于
The server regression was caused by the addition of rq_next_page (afc59400). There were a few places that were missed with the update of the rq_respages array. Signed-off-by: NTom Tucker <tom@ogc.us> Tested-by: NSteve Wise <swise@ogc.us> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 28 3月, 2014 1 次提交
-
-
由 Jeff Layton 提交于
It retries in 1s, not 1000 jiffies. Signed-off-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 18 3月, 2014 2 次提交
-
-
由 Chuck Lever 提交于
The use of KERN_INFO causes garbage characters to appear when debugging is enabled. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Chuck Lever 提交于
After commit a11a2bf4, "SUNRPC: Optimise away unnecessary data moves in xdr_align_pages", Thu Aug 2 13:21:43 2012, READs larger than a few hundred bytes via NFS/RDMA no longer work. This commit exposed a long-standing bug in rpcrdma_inline_fixup(). I reproduce this with an rsize=4096 mount using the cthon04 basic tests. Test 5 fails with an EIO error. For my reproducer, kernel log shows: NFS: server cheating in read reply: count 4096 > recvd 0 rpcrdma_inline_fixup() is zeroing the xdr_stream::page_len field, and xdr_align_pages() is now returning that value to the READ XDR decoder function. That field is set up by xdr_inline_pages() by the READ XDR encoder function. As far as I can tell, it is supposed to be left alone after that, as it describes the dimensions of the reply xdr_stream, not the contents of that stream. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=68391Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 15 7月, 2013 1 次提交
-
-
由 Dan Carpenter 提交于
My static checker marks everything from ntohl() as untrusted and it complains we could have an underflow problem doing: return (u32 *)&ary->wc_array[nchunks]; Also on 32 bit systems the upper bound check could overflow. Cc: stable@vger.kernel.org Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 13 6月, 2013 1 次提交
-
-
由 Joe Perches 提交于
Reduce the uses of this unnecessary typedef. Done via perl script: $ git grep --name-only -w ctl_table net | \ xargs perl -p -i -e '\ sub trim { my ($local) = @_; $local =~ s/(^\s+|\s+$)//g; return $local; } \ s/\b(?<!struct\s)ctl_table\b(\s*\*\s*|\s+\w+)/"struct ctl_table " . trim($1)/ge' Reflow the modified lines that now exceed 80 columns. Signed-off-by: NJoe Perches <joe@perches.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 2月, 2013 1 次提交
-
-
由 Shani Michaeli 提交于
This patch enhances the IB core support for Memory Windows (MWs). MWs allow an application to have better/flexible control over remote access to memory. Two types of MWs are supported, with the second type having two flavors: Type 1 - associated with PD only Type 2A - associated with QPN only Type 2B - associated with PD and QPN Applications can allocate a MW once, and then repeatedly bind the MW to different ranges in MRs that are associated to the same PD. Type 1 windows are bound through a verb, while type 2 windows are bound by posting a work request. The 32-bit memory key is composed of a 24-bit index and an 8-bit key. The key is changed with each bind, thus allowing more control over the peer's use of the memory key. The changes introduced are the following: * add memory window type enum and a corresponding parameter to ib_alloc_mw. * type 2 memory window bind work request support. * create a struct that contains the common part of the bind verb struct ibv_mw_bind and the bind work request into a single struct. * add the ib_inc_rkey helper function to advance the tag part of an rkey. Consumer interface details: * new device capability flags IB_DEVICE_MEM_WINDOW_TYPE_2A and IB_DEVICE_MEM_WINDOW_TYPE_2B are added to indicate device support for these features. Devices can set either IB_DEVICE_MEM_WINDOW_TYPE_2A or IB_DEVICE_MEM_WINDOW_TYPE_2B if it supports type 2A or type 2B memory windows. It can set neither to indicate it doesn't support type 2 windows at all. * modify existing provides and consumers code to the new param of ib_alloc_mw and the ib_mw_bind_info structure Signed-off-by: NHaggai Eran <haggaie@mellanox.com> Signed-off-by: NShani Michaeli <shanim@mellanox.com> Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 05 2月, 2013 1 次提交
-
-
由 Jeff Layton 提交于
These routines are used by server and client code, so having them in a separate header would be best. Signed-off-by: NJeff Layton <jlayton@redhat.com> Acked-by: NTrond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 01 2月, 2013 2 次提交
-
-
由 Trond Myklebust 提交于
Avoid another RCU dereference by passing the pointer to struct rpc_xprt from the caller. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
tk_xprt is just a shortcut for tk_client->cl_xprt, however cl_xprt is defined as an __rcu variable. Replace dereferences of tk_xprt with non-rcu dereferences where it is safe to do so. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 18 12月, 2012 1 次提交
-
-
由 J. Bruce Fields 提交于
It may be a matter of personal taste, but I find this makes the code clearer. Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 29 9月, 2012 1 次提交
-
-
由 Trond Myklebust 提交于
It is only set after everyone has dereferenced the transport, and serves no useful purpose: setting it is racy, so all the socket code, etc still needs to be able to cope with the cases where they miss reading it. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 07 9月, 2012 1 次提交
-
-
由 Trond Myklebust 提交于
Commit 43cedbf0 (SUNRPC: Ensure that we grab the XPRT_LOCK before calling xprt_alloc_slot) is causing hangs in the case of NFS over UDP mounts. Since neither the UDP or the RDMA transport mechanism use dynamic slot allocation, we can skip grabbing the socket lock for those transports. Add a new rpc_xprt_op to allow switching between the TCP and UDP/RDMA case. Note that the NFSv4.1 back channel assigns the slot directly through rpc_run_bc_task, so we can ignore that case. Reported-by: NDick Streefland <dick.streefland@altium.nl> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org [>= 3.1]
-
- 22 8月, 2012 1 次提交
-
-
由 J. Bruce Fields 提交于
Note this isn't used outside svc_xprt.c. May as well move it so we don't need a declaration while we're here. Also remove an outdated comment. Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 31 7月, 2012 1 次提交
-
-
由 Jeff Layton 提交于
We've had some reports of a deadlock where rpciod ends up with a stack trace like this: PID: 2507 TASK: ffff88103691ab40 CPU: 14 COMMAND: "rpciod/14" #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9 #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs] #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8 #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs] #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs] #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670 #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271 #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638 #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f #10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e #11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f #12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad #13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942 #14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a #15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9 #16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b #17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808 #18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c #19 [ffff8810343bfce8] inet_create at ffffffff81483ba6 #20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7 #21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc] #22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc] #23 [ffff8810343bfe38] worker_thread at ffffffff810887d0 #24 [ffff8810343bfee8] kthread at ffffffff8108dd96 #25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca rpciod is trying to allocate memory for a new socket to talk to the server. The VM ends up calling ->releasepage to get more memory, and it tries to do a blocking commit. That commit can't succeed however without a connected socket, so we deadlock. Fix this by setting PF_FSTRANS on the workqueue task prior to doing the socket allocation, and having nfs_release_page check for that flag when deciding whether to do a commit call. Also, set PF_FSTRANS unconditionally in rpc_async_schedule since that function can also do allocations sometimes. Signed-off-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org
-
- 21 3月, 2012 2 次提交
-
-
由 Tom Tucker 提交于
The xprtrdma FRMR mapping logic assumes that a segment is <= PAGE_SIZE. This is not true for NFS4. Signed-off-by: NTom Tucker <tom@ogc.us> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Tom Tucker 提交于
The client side RDMA transport will bug check if it receives a duplicate reply, instead we should simply drop the duplicate reply. Signed-off-by: NTom Tucker <tom@ogc.us> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 20 3月, 2012 1 次提交
-
-
由 Cong Wang 提交于
Signed-off-by: NCong Wang <amwang@redhat.com>
-
- 18 2月, 2012 1 次提交
-
-
由 Tom Tucker 提交于
The svcrdma transport was un-marshalling requests in-place. This resulted in sparse warnings due to __beXX data containing both NBO and HBO data. The code has been restructured to do byte-swapping as the header is parsed instead of when the header is validated immediately after receipt. Also moved extern declarations for the workqueue and memory pools to the private header file. Signed-off-by: NTom Tucker <tom@ogc.us> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 07 12月, 2011 1 次提交
-
-
由 Stanislav Kinsbursky 提交于
This patch makes svc_xprt inherit network namespace link from its socket. Signed-off-by: NStanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 01 11月, 2011 1 次提交
-
-
由 Paul Gortmaker 提交于
These files are non modular, but need to export symbols using the macros now living in export.h -- call out the include so that things won't break when we remove the implicit presence of module.h from everywhere. Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
-
- 27 7月, 2011 1 次提交
-
-
由 Arun Sharma 提交于
This allows us to move duplicated code in <asm/atomic.h> (atomic_inc_not_zero() for now) to <linux/atomic.h> Signed-off-by: NArun Sharma <asharma@fb.com> Reviewed-by: NEric Dumazet <eric.dumazet@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: NMike Frysinger <vapier@gentoo.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 26 7月, 2011 1 次提交
-
-
由 Steve Dickson 提交于
Our performance team has noticed that increasing RPCRDMA_MAX_DATA_SEGS from 8 to 64 significantly increases throughput when using the RDMA transport. Signed-off-by: NSteve Dickson <steved@redhat.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 18 7月, 2011 2 次提交
-
-
由 Trond Myklebust 提交于
Allow the number of available slots to grow with the TCP window size. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
This throttles the allocation of new slots when the socket is busy reconnecting and/or is out of buffer space. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 07 6月, 2011 1 次提交
-
-
由 Alexey Dobriyan 提交于
* remove interrupt.g inclusion from netdevice.h -- not needed * fixup fallout, add interrupt.h and hardirq.h back where needed. Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 5月, 2011 1 次提交
-
-
由 Sean Hefty 提交于
The RDMA CM currently infers the QP type from the port space selected by the user. In the future (eg with RDMA_PS_IB or XRC), there may not be a 1-1 correspondence between port space and QP type. For netlink export of RDMA CM state, we want to export the QP type to userspace, so it is cleaner to explicitly associate a QP type to an ID. Modify rdma_create_id() to allow the user to specify the QP type, and use it to make our selections of datagram versus connected mode. Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <roland@purestorage.com>
-
- 10 5月, 2011 1 次提交
-
-
由 Justin P. Mattock 提交于
- kenrel -> kernel - whetehr -> whether - ttt -> tt - sss -> ss Signed-off-by: NJustin P. Mattock <justinmattock@gmail.com> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-
- 18 3月, 2011 1 次提交
-
-
由 Jesper Juhl 提交于
We leak the memory allocated to 'ctxt' when we return after 'ib_dma_mapping_error()' returns !=0. Signed-off-by: NJesper Juhl <jj@chaosbits.net> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-