- 14 August 2008, 1 commit
-
By Tom Tucker
RDMA_READ completions are kept on a separate queue from the general I/O request queue. Since a separate lock is used to protect the RDMA_READ completion queue, a race exists between the dto_tasklet and the svc_rdma_recvfrom thread: the dto_tasklet sets the XPT_DATA bit and adds I/O to the read-completion queue, while the recvfrom thread concurrently checks the generic queue, finds it empty, and resets the XPT_DATA bit. A subsequent svc_xprt_enqueue will then fail to enqueue the transport for I/O, causing the transport to "stall". The fix is to protect both lists with the same lock and to set the XPT_DATA bit with this lock held.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
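The fix is mostly a matter of where the bit is set relative to the lock. A minimal sketch of the pattern, assuming the svcxprt_rdma field names mentioned in this log (sc_rq_dto_lock, sc_read_complete_q) and simplified context handling; this is not the literal patch:

```c
/* Sketch: one lock now covers both completion lists, and XPT_DATA is set
 * while that lock is held, so recvfrom cannot observe an empty generic
 * queue and clear the bit between the two steps. */
static void rdma_read_complete_add(struct svcxprt_rdma *xprt,
                                   struct svc_rdma_op_ctxt *ctxt)
{
        spin_lock_bh(&xprt->sc_rq_dto_lock);
        list_add_tail(&ctxt->dto_q, &xprt->sc_read_complete_q);
        set_bit(XPT_DATA, &xprt->sc_xprt.xpt_flags);    /* under the lock */
        spin_unlock_bh(&xprt->sc_rq_dto_lock);
        svc_xprt_enqueue(&xprt->sc_xprt);
}
```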
-
- 03 July 2008, 7 commits
-
By Tom Tucker
Change the WR context pool to be shared across mount points. This significantly reduces the RDMA transport's memory footprint, since idle mounts no longer consume WR context memory.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
-
By Tom Tucker
When adapters have differing IRD limits, the RDMA transport will fail to connect properly. The RDMA transport should use the client's advertised inbound read limit when computing its outbound read limit. For iWARP transports there is currently no standard for exchanging IRD/ORD during connection establishment, so the 'responder_resources' field in the connect event is the local device's limit. The RDMA transport can be configured to use a smaller ORD by writing the desired number to the /proc/sys/sunrpc/svc_rdma/max_outbound_read_requests file.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
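As a rough illustration of the limit computation (the helper name is hypothetical; svcrdma_ord stands for the value written to /proc/sys/sunrpc/svc_rdma/max_outbound_read_requests):

```c
/* Sketch only: clamp the server's outbound read depth to the smaller of
 * the configured maximum and the client's advertised inbound limit. */
static unsigned int svc_rdma_compute_ord(unsigned int svcrdma_ord,
                                         unsigned int client_ird)
{
        return client_ird < svcrdma_ord ? client_ird : svcrdma_ord;
}
```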
-
By Tom Tucker
At the time __svc_rdma_free is called, we are guaranteed that all references to this transport are gone. There is, therefore, no need to protect the resource lists with a spin lock.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
-
By Tom Tucker
Add a DMA map count in order to verify that all DMA mapping resources have been freed when the transport is closed.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
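The accounting amounts to a single atomic counter; a sketch of the idea (the sc_dma_used field name is an assumption here):

```c
/* Bump after every successful ib_dma_map_*() ... */
atomic_inc(&xprt->sc_dma_used);

/* ... drop after every matching ib_dma_unmap_*() ... */
atomic_dec(&xprt->sc_dma_used);

/* ... and verify at transport teardown that every mapping was released. */
WARN_ON(atomic_read(&xprt->sc_dma_used) != 0);
```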
-
By Tom Tucker
Separate DMA unmapping from context destruction and perform the DMA unmapping in the SQ/RQ CQ reap functions. This is necessary to support software-based RDMA implementations that actually copy the data in their ib_dma_unmap callback functions, and architectures that do not have cache-coherent I/O buses.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
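A sketch of what "unmap in the CQ reap path" means in practice; the context fields (count, sge), the DMA direction, and the helper name are illustrative, not the exact patch:

```c
/* For each reaped send completion, undo the DMA mappings recorded in the
 * WR's context before the context itself is recycled. */
static void svc_rdma_reap_send_ctxt(struct svcxprt_rdma *xprt,
                                    struct svc_rdma_op_ctxt *ctxt)
{
        int i;

        for (i = 0; i < ctxt->count; i++)
                ib_dma_unmap_page(xprt->sc_cm_id->device,
                                  ctxt->sge[i].addr,
                                  ctxt->sge[i].length,
                                  DMA_TO_DEVICE);
        svc_rdma_put_context(ctxt, 0);
}
```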
-
By Tom Tucker
Use the new svc_rdma_req_map data type for mapping the client-side memory to the server-side memory. Move the DMA mapping to the context pointed to by each WR individually so that it is unmapped after the WR completes.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
-
By Tom Tucker
Create a new data structure to hold the mapping from the remote client's address space to the local server's address space.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
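The log does not show the structure's layout, so the following is only an illustrative guess at the shape of such a mapping object (name and fields hypothetical): it pairs the client-advertised chunks with the local buffers they land in.

```c
/* Hypothetical sketch of a client-to-server address mapping object. */
struct svc_rdma_req_map_sketch {
        unsigned long count;                    /* number of valid entries  */
        struct kvec sge[RPCSVC_MAXPAGES];       /* local base/len per chunk */
};
```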
-
- 19 May 2008, 17 commits
-
By Tom Tucker
The svc_rdma_send_error function is called when an RPCRDMA protocol error is detected. This function attempts to post an error reply message. Since an error while posting to a transport that is already in error is ignored, change the return type to void.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
-
By Tom Tucker
This race was found by inspection. Messages can be received from the peer immediately following the rdma_accept call; however, the CQs have not yet been armed and the transport address has not yet been set. Set the transport address in the connect request handler and arm the CQs prior to calling rdma_accept.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
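The ordering matters more than the specific calls; a sketch of the connect-request handler with the fix applied (the field names such as sc_sq_cq/sc_rq_cq follow the svcrdma naming style but are assumptions here, and the IPv4 address length is only for illustration):

```c
/* Set the peer address and arm both CQs *before* accepting, so traffic
 * that arrives immediately after rdma_accept() cannot be missed. */
svc_xprt_set_remote(&newxprt->sc_xprt,
                    (struct sockaddr *)&newxprt->sc_cm_id->route.addr.dst_addr,
                    sizeof(struct sockaddr_in));
ib_req_notify_cq(newxprt->sc_sq_cq, IB_CQ_NEXT_COMP);
ib_req_notify_cq(newxprt->sc_rq_cq, IB_CQ_NEXT_COMP);
ret = rdma_accept(newxprt->sc_cm_id, &conn_param);
```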
-
By Tom Tucker
Use the ib_verbs version of the dma_unmap service in the svc_rdma_put_context function. This should support providers using software RDMA.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
-
By Tom Tucker
When the transport is closing, the DTO tasklet may queue data that never gets processed. Clean up the resources associated with this I/O.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
-
By Tom Tucker
Move the destruction of the QP and CM_ID to the free path so that the QP cleanup code does not race with the dto_tasklet handling flushed WRs. The QP reference is not needed because we now have a reference for every WR. Also add a guard in the SQ and RQ completion handlers to ignore calls generated by some providers when the QP is destroyed.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
-
By Tom Tucker
Add a reference on the transport for every outstanding WR.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
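The pattern is symmetric: take a reference before a WR is posted and drop it when the WR's completion is reaped, or immediately if the post fails. A sketch under that assumption (the helper name is hypothetical):

```c
static int svc_rdma_post_with_ref(struct svcxprt_rdma *xprt,
                                  struct ib_send_wr *wr)
{
        struct ib_send_wr *bad_wr;
        int ret;

        svc_xprt_get(&xprt->sc_xprt);           /* pin the transport per WR  */
        ret = ib_post_send(xprt->sc_qp, wr, &bad_wr);
        if (ret)
                svc_xprt_put(&xprt->sc_xprt);   /* post failed, undo the pin */
        return ret;
        /* The matching svc_xprt_put() for a successful post happens when the
         * WR's completion (success or flush) is reaped from the CQ. */
}
```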
-
By Tom Tucker
Some providers may wait while destroying adapter resources. Since it is possible that the last reference is put in the dto_tasklet, the actual destroy must be scheduled as a work item.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
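A sketch of the deferral in the style of the svcrdma code; the sc_work field and the exact teardown calls are assumptions. The point is that the sleeping teardown always runs from a workqueue, never from the tasklet that may drop the last reference:

```c
static void __svc_rdma_free(struct work_struct *work)
{
        struct svcxprt_rdma *rdma =
                container_of(work, struct svcxprt_rdma, sc_work);

        /* Process context: providers that block during teardown are safe. */
        if (rdma->sc_qp && !IS_ERR(rdma->sc_qp))
                ib_destroy_qp(rdma->sc_qp);
        if (rdma->sc_cm_id)
                rdma_destroy_id(rdma->sc_cm_id);
        kfree(rdma);
}

static void svc_rdma_free(struct svc_xprt *xprt)
{
        struct svcxprt_rdma *rdma =
                container_of(xprt, struct svcxprt_rdma, sc_xprt);

        /* May be reached from the dto_tasklet; hand off to a workqueue. */
        INIT_WORK(&rdma->sc_work, __svc_rdma_free);
        schedule_work(&rdma->sc_work);
}
```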
-
By Tom Tucker
The rq_cq_reap function is only called from the dto_tasklet. The only resource shared with other threads is the sc_rq_dto_q. Move the spin lock so that it protects only this list.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
-
By Tom Tucker
Replace the one-off linked-list implementation used for the context cache with the standard Linux list_head lists. Add a context counter to catch resource leaks; a WARN_ON will be added later to ensure that all contexts have been freed.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
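A sketch of what the list_head-based cache looks like, with the counter that the later WARN_ON will check; field names are assumed and the allocate-on-empty path is omitted:

```c
static struct svc_rdma_op_ctxt *svc_rdma_get_context(struct svcxprt_rdma *xprt)
{
        struct svc_rdma_op_ctxt *ctxt = NULL;

        spin_lock_bh(&xprt->sc_ctxt_lock);
        if (!list_empty(&xprt->sc_ctxt_free)) {
                ctxt = list_entry(xprt->sc_ctxt_free.next,
                                  struct svc_rdma_op_ctxt, free_list);
                list_del_init(&ctxt->free_list);
                xprt->sc_ctxt_used++;   /* leak counter checked at free time */
        }
        spin_unlock_bh(&xprt->sc_ctxt_lock);
        return ctxt;
}
```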
-
By Tom Tucker
An NFS_WRITE requires a set of RDMA_READ requests to fetch the write data from the client. There are two principal pieces of data that need to be tracked: the list of pages that comprise the completed RPC, and the SGE of DMA-mapped pages that refers to this list of pages. Previously this was managed as a linked list of contexts, with the context containing the page list buried in that list. This patch simplifies the processing by keeping, instead of a linked list, only a pointer from the last submitted RDMA_READ's context to the context that maps the set of pages describing the RPC. This significantly simplifies the code path. SGE contexts are now cleaned up inline in the DTO path instead of at read-completion time.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
-
By Tom Tucker
The RDMACTXT_F_READ_DONE bit is no longer used. Remove it.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
-
By Tom Tucker
A listening endpoint isn't known to the generic transport switch until the svc_create_xprt function returns without error. Calling svc_xprt_put within the xpo_create function therefore causes the module reference count to be erroneously decremented.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
-
By Tom Tucker
If there is an error posting the recv WR to the RQ, free the context associated with the WR. Previously, a context would be leaked when asynchronous errors occurred on the transport while concurrent threads were processing their RPCs.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
-
By Tom Tucker
The svcrdma transport takes a reference when it gets the ESTABLISHED event from the provider. This reference is supposed to be removed when the DISCONNECT event is received; however, the call to svc_xprt_put was missing from the switch statement. This results in the memory associated with the transport never being freed.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
-
By Tom Tucker
Fix the return value on close to -ENOTCONN so the caller knows to free the context. Also, if a thread is waiting for free SQ space, check for close when waking to avoid posting a WR to a closing transport.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
-
By Tom Tucker
The svc_rdma_send function will attempt to reap SQ WRs to make room for a new request if it finds the SQ full. This function races with the dto_tasklet, which also reaps SQ WRs. To avoid polling and arming the CQ unnecessarily, move the test_and_clear_bit of the RDMAXPRT_SQ_PENDING flag and the arming of the CQ into the sq_cq_reap function. Also refactor the rq_cq_reap function to match sq_cq_reap so that the code is easier to follow.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
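A sketch of the resulting sq_cq_reap shape; RDMAXPRT_SQ_PENDING and the field names come from the commit text, while the body here is simplified:

```c
static void sq_cq_reap(struct svcxprt_rdma *xprt)
{
        struct ib_wc wc;

        /* Only one caller (svc_rdma_send or the dto_tasklet) wins the flag. */
        if (!test_and_clear_bit(RDMAXPRT_SQ_PENDING, &xprt->sc_flags))
                return;

        /* Re-arm before polling so no completion slips through unnotified. */
        ib_req_notify_cq(xprt->sc_sq_cq, IB_CQ_NEXT_COMP);

        while (ib_poll_cq(xprt->sc_sq_cq, 1, &wc) > 0) {
                /* unmap, drop the per-WR transport reference, wake SQ waiters */
        }
}
```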
-
By Tom Tucker
The svcrdma transport provider currently allocates receive buffers for the RQ through the xpo_release_rqst method. This approach is overly complicated, since it means that the rqstp's rq_xprt_ctxt has to be selectively set based on whether the RPC is going to be processed immediately or deferred. Instead, just post the receive buffer in the send_reply function, when we are certain that we are replying.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
-
- 24 April 2008, 1 commit
-
By Tom Tucker
SVCRDMA: Add check for XPT_CLOSE in svc_rdma_send
The svcrdma transport can crash if a send is waiting for an empty SQ slot and the connection is closed due to an asynchronous error. The crash is caused when svc_rdma_send attempts to send on a deleted QP.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
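The guard is a short check before touching the QP; a fragment-style sketch (field names follow the svcrdma style, and the -ENOTCONN return matches the convention noted in the May 2008 commit above):

```c
/* In svc_rdma_send(), before posting: the transport may have been closed
 * by an asynchronous error while this thread waited for an SQ slot. */
if (test_bit(XPT_CLOSE, &xprt->sc_xprt.xpt_flags))
        return -ENOTCONN;
ret = ib_post_send(xprt->sc_qp, wr, &bad_wr);
```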
-
- 13 March 2008, 1 commit
-
By Tom Tucker
RDMA connection shutdown on an SMP machine can cause a kernel crash due to the transport close path racing with the I/O tasklet. Additional transport references were added as follows:
- A reference while the transport is on the DTO queue, to avoid having it deleted while queued for I/O.
- A reference while there is a QP able to generate events.
- A reference until the DISCONNECTED event is received on the CM ID.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 02 February 2008, 1 commit
-
By Tom Tucker
This file implements the core transport data management and I/O path. The I/O path for RDMA involves receiving callbacks in interrupt context. Since all the svc transport locks are _bh locks, we enqueue the transport on a list and schedule a tasklet to dequeue data indications from the RDMA completion queue. The tasklet in turn takes the _bh locks and enqueues the receive data indications on a list for the transport. The svc_rdma_recvfrom transport function then dequeues data from this list in an NFSD thread context.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
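A condensed sketch of that interrupt-to-thread handoff, using the 2008-era tasklet API; the list and field names (dto_xprt_q, sc_dto_q) are illustrative rather than a copy of the file:

```c
static DEFINE_SPINLOCK(dto_lock);       /* protects the global dto list */
static LIST_HEAD(dto_xprt_q);           /* transports with pending CQEs */

/* Softirq context: drain dto_xprt_q; for each transport, move receive
 * completions onto its own list under _bh locks, set XPT_DATA, and call
 * svc_xprt_enqueue() so an nfsd thread runs svc_rdma_recvfrom(). */
static void dto_tasklet_func(unsigned long data)
{
}

static DECLARE_TASKLET(dto_tasklet, dto_tasklet_func, 0UL);

/* Interrupt context: do no real work, just remember the transport and
 * kick the tasklet. */
static void rq_comp_handler(struct ib_cq *cq, void *cq_context)
{
        struct svcxprt_rdma *xprt = cq_context;
        unsigned long flags;

        spin_lock_irqsave(&dto_lock, flags);
        if (list_empty(&xprt->sc_dto_q)) {
                svc_xprt_get(&xprt->sc_xprt);   /* pin while queued */
                list_add_tail(&xprt->sc_dto_q, &dto_xprt_q);
        }
        spin_unlock_irqrestore(&dto_lock, flags);
        tasklet_schedule(&dto_tasklet);
}
```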
-