- 08 10月, 2015 1 次提交
-
-
由 Christoph Hellwig 提交于
This patch split up struct ib_send_wr so that all non-trivial verbs use their own structure which embedds struct ib_send_wr. This dramaticly shrinks the size of a WR for most common operations: sizeof(struct ib_send_wr) (old): 96 sizeof(struct ib_send_wr): 48 sizeof(struct ib_rdma_wr): 64 sizeof(struct ib_atomic_wr): 96 sizeof(struct ib_ud_wr): 88 sizeof(struct ib_fast_reg_wr): 88 sizeof(struct ib_bind_mw_wr): 96 sizeof(struct ib_sig_handover_wr): 80 And with Sagi's pending MR rework the fast registration WR will also be down to a reasonable size: sizeof(struct ib_fastreg_wr): 64 Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> [srp, srpt] Reviewed-by: Chuck Lever <chuck.lever@oracle.com> [sunrpc] Tested-by: NHaggai Eran <haggaie@mellanox.com> Tested-by: NSagi Grimberg <sagig@mellanox.com> Tested-by: NSteve Wise <swise@opengridcomputing.com>
-
- 25 9月, 2015 1 次提交
-
-
由 Chuck Lever 提交于
The core API has changed so that devices that do not have a global DMA lkey automatically create an mr, per-PD, and make that lkey available. The global DMA lkey interface is going away in favor of the per-PD DMA lkey. The per-PD DMA lkey is always available. Convert xprtrdma to use the device's per-PD DMA lkey for regbufs, no matter which memory registration scheme is in use. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NSagi Grimberg <sagig@mellanox.com> Cc: linux-nfs <linux-nfs@vger.kernel.org> Acked-by: NAnna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 31 8月, 2015 1 次提交
-
-
由 Sagi Grimberg 提交于
Signed-off-by: NSagi Grimberg <sagig@mellanox.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 06 8月, 2015 1 次提交
-
-
由 Chuck Lever 提交于
Untangle the end of rpcrdma_ia_open() by moving DMA MR set-up, which is different for each registration method, to the .ro_open functions. This is refactoring only. No behavior change is expected. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Tested-by: NDevesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
- 13 6月, 2015 6 次提交
-
-
由 Chuck Lever 提交于
Reduce resource consumption per-transport to make way for increasing the credit limit and maximum r/wsize. Pre-allocate fewer MRs. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NSteve Wise <swise@opengridcomputing.com> Reviewed-by: NSagi Grimberg <sagig@mellanox.com> Reviewed-by: NDevesh Sharma <devesh.sharma@avagotech.com> Tested-By: NDevesh Sharma <devesh.sharma@avagotech.com> Reviewed-by: NDoug Ledford <dledford@redhat.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
/proc/lock_stat showed contention between rpcrdma_buffer_get/put and the MR allocation functions during I/O intensive workloads. Now that MRs are no longer allocated in rpcrdma_buffer_get(), there's no reason the rb_mws list has to be managed using the same lock as the send/receive buffers. Split that lock. The new lock does not need to disable interrupts because buffer get/put is never called in an interrupt context. struct rpcrdma_buffer is re-arranged to ensure rb_mwlock and rb_mws are always in a different cacheline than rb_lock and the buffer pointers. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NSteve Wise <swise@opengridcomputing.com> Reviewed-by: NSagi Grimberg <sagig@mellanox.com> Tested-By: NDevesh Sharma <devesh.sharma@avagotech.com> Reviewed-by: NDoug Ledford <dledford@redhat.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
An RPC can exit at any time. When it does so, xprt_rdma_free() is called, and it calls ->op_unmap(). If ->ro_reset() is running due to a transport disconnect, the two methods can race while processing the same rpcrdma_mw. The results are unpredictable. Because of this, in previous patches I've altered ->ro_map() to handle MR reset. ->ro_reset() is no longer needed and can be removed. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NSteve Wise <swise@opengridcomputing.com> Reviewed-by: NSagi Grimberg <sagig@mellanox.com> Reviewed-by: NDevesh Sharma <devesh.sharma@avagotech.com> Tested-By: NDevesh Sharma <devesh.sharma@avagotech.com> Reviewed-by: NDoug Ledford <dledford@redhat.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
Acquiring 64 MRs in rpcrdma_buffer_get() while holding the buffer pool lock is expensive, and unnecessary because most modern adapters can transfer 100s of KBs of payload using just a single MR. Instead, acquire MRs one-at-a-time as chunks are registered, and return them to rb_mws immediately during deregistration. Note: commit 539431a4 ("xprtrdma: Don't invalidate FRMRs if registration fails") is reverted: There is now a valid case where registration can fail (with -ENOMEM) but the QP is still in RTS. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NSteve Wise <swise@opengridcomputing.com> Tested-By: NDevesh Sharma <devesh.sharma@avagotech.com> Reviewed-by: NDoug Ledford <dledford@redhat.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
After a transport disconnect, FRMRs can be left in an undetermined state. In particular, the MR's rkey is no good. Currently, FRMRs are fixed up by the transport connect worker, but that can race with ->ro_unmap if an RPC happens to exit while the transport connect worker is running. A better way of dealing with broken FRMRs is to detect them before they are re-used by ->ro_map. Such FRMRs are either already invalid or are owned by the sending RPC, and thus no race with ->ro_unmap is possible. Introduce a mechanism for handing broken FRMRs to a workqueue to be reset in a context that is appropriate for allocating resources (ie. an ib_alloc_fast_reg_mr() API call). This mechanism is not yet used, but will be in subsequent patches. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NSteve Wise <swise@opengridcomputing.com> Reviewed-By: NDevesh Sharma <devesh.sharma@avagotech.com> Tested-By: NDevesh Sharma <devesh.sharma@avagotech.com> Reviewed-by: NDoug Ledford <dledford@redhat.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
The connect worker can replace ri_id, but prevents ri_id->device from changing during the lifetime of a transport instance. The old ID is kept around until a new ID is created and the ->device is confirmed to be the same. Cache a copy of ri_id->device in rpcrdma_ia and in rpcrdma_rep. The cached copy can be used safely in code that does not serialize with the connect worker. Other code can use it to save an extra address generation (one pointer dereference instead of two). Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NSteve Wise <swise@opengridcomputing.com> Tested-By: NDevesh Sharma <devesh.sharma@avagotech.com> Reviewed-by: NDoug Ledford <dledford@redhat.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
- 19 5月, 2015 1 次提交
-
-
由 Sagi Grimberg 提交于
Reviewed-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NSagi Grimberg <sagig@mellanox.com> Signed-off-by: NAnna Schumaker <anna.schumaker@netapp.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 31 3月, 2015 10 次提交
-
-
由 Chuck Lever 提交于
These functions are called in a loop for each page transferred via RDMA READ or WRITE. Extract loop invariants and inline them to reduce CPU overhead. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Tested-by: NDevesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: NMeghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: NVeeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
Allow each memory registration mode to plug in a callout that handles the completion of a memory registration operation. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NSagi Grimberg <sagig@mellanox.com> Tested-by: NDevesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: NMeghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: NVeeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
The open op determines the size of various transport data structures based on device capabilities and memory registration mode. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Tested-by: NDevesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: NMeghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: NVeeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
Memory Region objects associated with a transport instance are destroyed before the instance is shutdown and destroyed. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NSagi Grimberg <sagig@mellanox.com> Tested-by: NDevesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: NMeghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: NVeeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
This method is invoked when a transport instance is about to be reconnected. Each Memory Region object is reset to its initial state. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NSagi Grimberg <sagig@mellanox.com> Tested-by: NDevesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: NMeghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: NVeeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
This method is used when setting up a new transport instance to create a pool of Memory Region objects that will be used to register memory during operation. Memory Regions are not needed for "physical" registration, since ->prepare and ->release are no-ops for that mode. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NSagi Grimberg <sagig@mellanox.com> Tested-by: NDevesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: NMeghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: NVeeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
There is very little common processing among the different external memory deregistration functions. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Tested-by: NDevesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: NMeghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: NVeeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
There is very little common processing among the different external memory registration functions. Have rpcrdma_create_chunks() call the registration method directly. This removes a stack frame and a switch statement from the external registration path. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Tested-by: NDevesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: NMeghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: NVeeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
The max_payload computation is generalized to ensure that the payload maximum is the lesser of RPC_MAX_DATA_SEGS and the number of data segments that can be transmitted in an inline buffer. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NSagi Grimberg <sagig@mellanox.com> Tested-by: NDevesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: NMeghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: NVeeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
Instead of employing switch() statements, let's use the typical Linux kernel idiom for handling behavioral variation: virtual functions. Start by defining a vector of operations for each supported memory registration mode, and by adding a source file for each mode. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NSagi Grimberg <sagig@mellanox.com> Tested-by: NDevesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: NMeghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: NVeeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-