- 13 7月, 2017 3 次提交
-
-
由 Chuck Lever 提交于
Clean up: Now that the svc_rdma_recvfrom path uses the rdma_rw API, the details of Read sink buffer registration are dealt with by the kernel's RDMA core. This cache is no longer used, and can be removed. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Chuck Lever 提交于
Clean up: The generic RDMA R/W API conversion of svc_rdma_recvfrom replaced the Register, Read, and Invalidate completion handlers. Remove the old ones, which are no longer used. These handlers shared some helper code with svc_rdma_wc_send. Fold the wc_common helper back into the one remaining completion handler. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Chuck Lever 提交于
The current svcrdma recvfrom code path has a lot of detail about registration mode and the type of port (iWARP, IB, etc). Instead, use the RDMA core's generic R/W API. This shares code with other RDMA-enabled ULPs that manages the gory details of buffer registration and the posting of RDMA Read Work Requests. Since the Read list marshaling code is being replaced, I took the opportunity to replace C structure-based XDR encoding code with more portable code that uses pointer arithmetic. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 26 4月, 2017 5 次提交
-
-
由 Chuck Lever 提交于
req_maps are no longer used by the send path and can thus be removed. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Chuck Lever 提交于
Clean up. All RDMA Write completions are now handled by svc_rdma_wc_write_ctx. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Chuck Lever 提交于
The current svcrdma sendto code path posts one RDMA Write WR at a time. Each of these Writes typically carries a small number of pages (for instance, up to 30 pages for mlx4 devices). That means a 1MB NFS READ reply requires 9 ib_post_send() calls for the Write WRs, and one for the Send WR carrying the actual RPC Reply message. Instead, use the new rdma_rw API. The details of Write WR chain construction and memory registration are taken care of in the RDMA core. svcrdma can focus on the details of the RPC-over-RDMA protocol. This gives three main benefits: 1. All Write WRs for one RDMA segment are posted in a single chain. As few as one ib_post_send() for each Write chunk. 2. The Write path can now use FRWR to register the Write buffers. If the device's maximum page list depth is large, this means a single Write WR is needed for each RPC's Write chunk data. 3. The new code introduces support for RPCs that carry both a Write list and a Reply chunk. This combination can be used for an NFSv4 READ where the data payload is large, and thus is removed from the Payload Stream, but the Payload Stream is still larger than the inline threshold. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Chuck Lever 提交于
The plan is to replace the local bespoke code that constructs and posts RDMA Read and Write Work Requests with calls to the rdma_rw API. This shares code with other RDMA-enabled ULPs that manages the gory details of buffer registration and posting Work Requests. Some design notes: o The structure of RPC-over-RDMA transport headers is flexible, allowing multiple segments per Reply with arbitrary alignment, each with a unique R_key. Write and Send WRs continue to be built and posted in separate code paths. However, one whole chunk (with one or more RDMA segments apiece) gets exactly one ib_post_send and one work completion. o svc_xprt reference counting is modified, since a chain of rdma_rw_ctx structs generates one completion, no matter how many Write WRs are posted. o The current code builds the transport header as it is construct- ing Write WRs. I've replaced that with marshaling of transport header data items in a separate step. This is because the exact structure of client-provided segments may not align with the components of the server's reply xdr_buf, or the pages in the page list. Thus parts of each client-provided segment may be written at different points in the send path. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Chuck Lever 提交于
The Send Queue depth is temporarily reduced to 1 SQE per credit. The new rdma_rw API does an internal computation, during QP creation, to increase the depth of the Send Queue to handle RDMA Read and Write operations. This change has to come before the NFSD code paths are updated to use the rdma_rw API. Without this patch, rdma_rw_init_qp() increases the size of the SQ too much, resulting in memory allocation failures during QP creation. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 29 3月, 2017 1 次提交
-
-
由 Chuck Lever 提交于
Same change as Kinglong Mee's fix for the TCP backchannel service. Fixes: 5283b03e ("nfs/nfsd/sunrpc: enforce transport...") Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 25 2月, 2017 1 次提交
-
-
由 Jeff Layton 提交于
NFSv4 requires a transport protocol with congestion control in most cases. On an IP network, that means that NFSv4 over UDP should be forbidden. The situation with RDMA is a bit more nuanced, but most RDMA transports are suitable for this. For now, we assume that all RDMA transports are suitable, but we may need to revise that at some point. Signed-off-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 09 2月, 2017 4 次提交
-
-
由 Chuck Lever 提交于
svcrdma calls svc_xprt_put() in its completion handlers, which currently run in IRQ context. However, svc_xprt_put() is meant to be invoked in process context, not in IRQ context. After the last transport reference is gone, it directly calls a transport release function that expects to run in process context. Change the CQ polling modes to IB_POLL_WORKQUEUE so that svcrdma invokes svc_xprt_put() only in process context. As an added benefit, bottom half-disabled spin locking can be eliminated from I/O paths. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Chuck Lever 提交于
Clean up: The free list and the dto_q list fields are never used at the same time. Reduce the size of struct svc_rdma_op_ctxt by combining these fields. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Chuck Lever 提交于
Clean up. Commit be99bb11 ("svcrdma: Use new CQ API for RPC-over-RDMA server send CQs") removed code that used the sc_dto_q field, but neglected to remove sc_dto_q at the same time. Fixes: be99bb11 ("svcrdma: Use new CQ API for RPC-over- ...") Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Chuck Lever 提交于
Replace C structure-based XDR decoding with pointer arithmetic. Pointer arithmetic is considered more portable, and is used throughout the kernel's existing XDR encoders. The gcc optimizer generates similar assembler code either way. Byte-swapping before a memory store on x86 typically results in an instruction pipeline stall. Avoid byte-swapping when encoding a new header. svcrdma currently doesn't alter a connection's credit grant value after the connection has been accepted, so it is effectively a constant. Cache the byte-swapped value in a separate field. Christoph suggested pulling the header encoding logic into the only function that uses it. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 14 1月, 2017 1 次提交
-
-
由 Peter Zijlstra 提交于
Since we need to change the implementation, stop exposing internals. Provide kref_read() to read the current reference count; typically used for debug messages. Kills two anti-patterns: atomic_read(&kref->refcount) kref->refcount.counter Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 01 12月, 2016 4 次提交
-
-
由 Chuck Lever 提交于
The current code results in: Nov 7 14:50:19 klimt kernel: svcrdma: newxprt->sc_cm_id=ffff88085590c800, newxprt->sc_pd=ffff880852a7ce00#012 cm_id->device=ffff88084dd20000, sc_pd->device=ffff88084dd20000#012 cap.max_send_wr = 272#012 cap.max_recv_wr = 34#012 cap.max_send_sge = 32#012 cap.max_recv_sge = 32 Nov 7 14:50:19 klimt kernel: svcrdma: new connection ffff880855908000 accepted with the following attributes:#12 local_ip : 10.0.0.5#012 local_port#011 : 20049#012 remote_ip : 10.0.0.2#012 remote_port : 59909#012 max_sge : 32#012 max_sge_rd : 30#012 sq_depth : 272#012 max_requests : 32#012 ord : 16 Split up the output over multiple dprintks and take the opportunity to fix the display of IPv6 addresses. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Chuck Lever 提交于
Clean up: Completion status is already reported in the individual completion handlers. Save a few bytes in struct svc_rdma_op_ctxt. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Chuck Lever 提交于
Clean up: sc_dma_used is not required for correct operation. It is simply a debugging tool to report when svcrdma has leaked DMA maps. However, manipulating an atomic has a measurable CPU cost, and DMA map accounting specific to svcrdma will be meaningless once svcrdma is converted to use the new generic r/w API. A similar kind of debug accounting can be done simply by enabling the IOMMU or by using CONFIG_DMA_API_DEBUG, CONFIG_IOMMU_DEBUG, and CONFIG_IOMMU_LEAK. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Chuck Lever 提交于
svcrdma's current SQ accounting algorithm takes sc_lock and disables bottom-halves while posting all RDMA Read, Write, and Send WRs. This is relatively heavyweight serialization. And note that Write and Send are already fully serialized by the xpt_mutex. Using a single atomic_t should be all that is necessary to guarantee that ib_post_send() is called only when there is enough space on the send queue. This is what the other RDMA-enabled storage targets do. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 14 11月, 2016 1 次提交
-
-
由 Scott Mayhew 提交于
This fixes the following panic that can occur with NFSoRDMA. general protection fault: 0000 [#1] SMP Modules linked in: rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx5_ib ib_core intel_powerclamp coretemp kvm_intel kvm sg ioatdma ipmi_devintf ipmi_ssif dcdbas iTCO_wdt iTCO_vendor_support pcspkr irqbypass sb_edac shpchp dca crc32_pclmul ghash_clmulni_intel edac_core lpc_ich aesni_intel lrw gf128mul glue_helper ablk_helper mei_me mei ipmi_si cryptd wmi ipmi_msghandler acpi_pad acpi_power_meter nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt ahci fb_sys_fops ttm libahci mlx5_core tg3 crct10dif_pclmul drm crct10dif_common ptp i2c_core libata crc32c_intel pps_core fjes dm_mirror dm_region_hash dm_log dm_mod CPU: 1 PID: 120 Comm: kworker/1:1 Not tainted 3.10.0-514.el7.x86_64 #1 Hardware name: Dell Inc. PowerEdge R320/0KM5PX, BIOS 2.4.2 01/29/2015 Workqueue: events check_lifetime task: ffff88031f506dd0 ti: ffff88031f584000 task.ti: ffff88031f584000 RIP: 0010:[<ffffffff8168d847>] [<ffffffff8168d847>] _raw_spin_lock_bh+0x17/0x50 RSP: 0018:ffff88031f587ba8 EFLAGS: 00010206 RAX: 0000000000020000 RBX: 20041fac02080072 RCX: ffff88031f587fd8 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 20041fac02080072 RBP: ffff88031f587bb0 R08: 0000000000000008 R09: ffffffff8155be77 R10: ffff880322a59b00 R11: ffffea000bf39f00 R12: 20041fac02080072 R13: 000000000000000d R14: ffff8800c4fbd800 R15: 0000000000000001 FS: 0000000000000000(0000) GS:ffff880322a40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f3c52d4547e CR3: 00000000019ba000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Stack: 20041fac02080002 ffff88031f587bd0 ffffffff81557830 20041fac02080002 ffff88031f587c78 ffff88031f587c40 ffffffff8155ae08 000000010157df32 0000000800000001 ffff88031f587c20 ffffffff81096acb ffffffff81aa37d0 Call Trace: [<ffffffff81557830>] lock_sock_nested+0x20/0x50 [<ffffffff8155ae08>] sock_setsockopt+0x78/0x940 [<ffffffff81096acb>] ? lock_timer_base.isra.33+0x2b/0x50 [<ffffffff8155397d>] kernel_setsockopt+0x4d/0x50 [<ffffffffa0386284>] svc_age_temp_xprts_now+0x174/0x1e0 [sunrpc] [<ffffffffa03b681d>] nfsd_inetaddr_event+0x9d/0xd0 [nfsd] [<ffffffff81691ebc>] notifier_call_chain+0x4c/0x70 [<ffffffff810b687d>] __blocking_notifier_call_chain+0x4d/0x70 [<ffffffff810b68b6>] blocking_notifier_call_chain+0x16/0x20 [<ffffffff815e8538>] __inet_del_ifa+0x168/0x2d0 [<ffffffff815e8cef>] check_lifetime+0x25f/0x270 [<ffffffff810a7f3b>] process_one_work+0x17b/0x470 [<ffffffff810a8d76>] worker_thread+0x126/0x410 [<ffffffff810a8c50>] ? rescuer_thread+0x460/0x460 [<ffffffff810b052f>] kthread+0xcf/0xe0 [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140 [<ffffffff81696418>] ret_from_fork+0x58/0x90 [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140 Code: ca 75 f1 5d c3 0f 1f 80 00 00 00 00 eb d9 66 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb e8 7e 04 a0 ff b8 00 00 02 00 <f0> 0f c1 03 89 c2 c1 ea 10 66 39 c2 75 03 5b 5d c3 83 e2 fe 0f RIP [<ffffffff8168d847>] _raw_spin_lock_bh+0x17/0x50 RSP <ffff88031f587ba8> Signed-off-by: NScott Mayhew <smayhew@redhat.com> Fixes: c3d4879e ("sunrpc: Add a function to close temporary transports immediately") Reviewed-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 24 9月, 2016 1 次提交
-
-
由 Christoph Hellwig 提交于
Instead of exposing ib_get_dma_mr to ULPs and letting them use it more or less unchecked, this moves the capability of creating a global rkey into the RDMA core, where it can be easily audited. It also prints a warning everytime this feature is used as well. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NJason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 23 9月, 2016 3 次提交
-
-
由 Chuck Lever 提交于
Support Remote Invalidation. A private message is exchanged with the client upon RDMA transport connect that indicates whether Send With Invalidation may be used by the server to send RPC replies. The invalidate_rkey is arbitrarily chosen from among rkeys present in the RPC-over-RDMA header's chunk lists. Send With Invalidate improves performance only when clients can recognize, while processing an RPC reply, that an rkey has already been invalidated. That has been submitted as a separate change. In the future, the RPC-over-RDMA protocol might support Remote Invalidation properly. The protocol needs to enable signaling between peers to indicate when Remote Invalidation can be used for each individual RPC. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Chuck Lever 提交于
Prepare to receive an RDMA-CM private message when handling a new connection attempt, and send a similar message as part of connection acceptance. Both sides can communicate their various implementation limits. Implementations that don't support this sideband protocol ignore it. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Chuck Lever 提交于
The ctxt's count field is overloaded to mean the number of pages in the ctxt->page array and the number of SGEs in the ctxt->sge array. Typically these two numbers are the same. However, when an inline RPC reply is constructed from an xdr_buf with a tail iovec, the head and tail often occupy the same page, but each are DMA mapped independently. In that case, ->count equals the number of pages, but it does not equal the number of SGEs. There's one more SGE, for the tail iovec. Hence there is one more DMA mapping than there are pages in the ctxt->page array. This isn't a real problem until the server's iommu is enabled. Then each RPC reply that has content in that iovec orphans a DMA mapping that consists of real resources. krb5i and krb5p always populate that tail iovec. After a couple million sent krb5i/p RPC replies, the NFS server starts behaving erratically. Reboot is needed to clear the problem. Fixes: 9d11b51c ("svcrdma: Fix send_reply() scatter/gather set-up") Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 14 5月, 2016 3 次提交
-
-
由 Chuck Lever 提交于
If the server has forced a disconnect, the associated QP has not been moved to the Error state, and thus Receives are still posted. Ensure Receives (and any other outstanding WRs) are drained to release resources that can be freed during teardown of the svcrdma_xprt. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Chuck Lever 提交于
Since backward direction support was added, the rq_depth was increased to accommodate both forward and backward Receives. But only forward Receives need to be posted after a connection has been accepted. Receives for backward replies are posted as needed by svc_rdma_bc_sendto(). This doesn't break anything, but it means some resources are wasted. Fixes: 03fe9931 ('svcrdma: Define maximum number of ...') Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Shirley Ma 提交于
Allow both IPv4 and IPv6 to bind same port at the same time, restricts use of the IPv6 socket to IPv6 communication. Changes from v1: - Check rdma_set_afonly return value (suggested by Leon Romanovsky) Changes from v2: - Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: NShirley Ma <shirley.ma@oracle.com> Acked-by: NLeon Romanovsky <leonro@mellanox.com> Reviewed-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 02 3月, 2016 4 次提交
-
-
由 Chuck Lever 提交于
Calling ib_poll_cq() to sort through WCs during a completion is a common pattern amongst RDMA consumers. Since commit 14d3a3b2 ("IB: add a proper completion queue abstraction"), WC sorting can be handled by the IB core. By converting to this new API, svcrdma is made a better neighbor to other RDMA consumers, as it allows the core to schedule the delivery of completions more fairly amongst all active consumers. This new API also aims each completion at a function that is specific to the WR's opcode. Thus the ctxt->wr_op field and the switch in process_context is replaced by a set of methods that handle each completion type. Because each ib_cqe carries a pointer to a completion method, the core can now post operations on a consumer's QP, and handle the completions itself. The server's rdma_stat_sq_poll and rdma_stat_sq_prod metrics are no longer updated. As a clean up, the cq_event_handler, the dto_tasklet, and all associated locking is removed, as they are no longer referenced or used. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Tested-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Chuck Lever 提交于
Calling ib_poll_cq() to sort through WCs during a completion is a common pattern amongst RDMA consumers. Since commit 14d3a3b2 ("IB: add a proper completion queue abstraction"), WC sorting can be handled by the IB core. By converting to this new API, svcrdma is made a better neighbor to other RDMA consumers, as it allows the core to schedule the delivery of completions more fairly amongst all active consumers. Because each ib_cqe carries a pointer to a completion method, the core can now post operations on a consumer's QP, and handle the completions itself. svcrdma receive completions no longer use the dto_tasklet. Each polled Receive WC is now handled individually in soft IRQ context. The server transport's rdma_stat_rq_poll and rdma_stat_rq_prod metrics are no longer updated. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Chuck Lever 提交于
Fix several issues with svc_rdma_send_error(): - Post a receive buffer to replace the one that was consumed by the incoming request - Posting a send should use DMA_TO_DEVICE, not DMA_FROM_DEVICE - No need to put_page _and_ free pages in svc_rdma_put_context - Make sure the sge is set up completely in case the error path goes through svc_rdma_unmap_dma() - Replace the use of ENOSYS, which has a reserved meaning Related fixes in svc_rdma_recvfrom(): - Don't leak the ctxt associated with the incoming request - Don't close the connection after sending an error reply - Let svc_rdma_send_error() figure out the right header error code As a last clean up, move svc_rdma_send_error() to svc_rdma_sendto.c with other similar functions. There is some common logic in these functions that could someday be combined to reduce code duplication. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NDevesh Sharma <devesh.sharma@broadcom.com> Tested-by: NDevesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Chuck Lever 提交于
Clean up: Most svc_rdma_post_recv() call sites close the transport connection when a receive cannot be posted. Wrap that in a common helper. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NDevesh Sharma <devesh.sharma@broadcom.com> Tested-by: NDevesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 20 1月, 2016 9 次提交
-
-
由 Christoph Hellwig 提交于
We now alwasy have a per-PD local_dma_lkey available. Make use of that fact in svc_rdma and stop registering our own MR. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NSagi Grimberg <sagig@mellanox.com> Reviewed-by: NJason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NSteve Wise <swise@opengridcomputing.com> Acked-by: NJ. Bruce Fields <bfields@redhat.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Chuck Lever 提交于
To support the server-side of an NFSv4.1 backchannel on RDMA connections, add a transport class that enables backward direction messages on an existing forward channel connection. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Acked-by: NBruce Fields <bfields@fieldses.org> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Chuck Lever 提交于
Extra resources for handling backchannel requests have to be pre-allocated when a transport instance is created. Set up additional fields in svcxprt_rdma to track these resources. The max_requests fields are elements of the RPC-over-RDMA protocol, so they should be u32. To ensure that unsigned arithmetic is used everywhere, some other fields in the svcxprt_rdma struct are updated. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Acked-by: NBruce Fields <bfields@fieldses.org> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Chuck Lever 提交于
Clean up. These functions can otherwise fail, so check for page allocation failures too. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Acked-by: NBruce Fields <bfields@fieldses.org> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Chuck Lever 提交于
svc_rdma_post_recv() allocates pages for receive buffers on-demand. It uses GFP_KERNEL so the allocator tries hard, and may sleep. But I'm about to add a call to svc_rdma_post_recv() from a function that may not sleep. Since all svc_rdma_post_recv() call sites can tolerate its failure, allow it to fail if the page allocator returns nothing. Longer term, receive buffers, being a finite resource per-connection, should be pre-allocated and re-used. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Acked-by: NBruce Fields <bfields@fieldses.org> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Chuck Lever 提交于
To ensure this allocation cannot fail and will not sleep, pre-allocate the req_map structures per-connection. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Acked-by: NBruce Fields <bfields@fieldses.org> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Chuck Lever 提交于
When the maximum payload size of NFS READ and WRITE was increased by commit cc9a903d ("svcrdma: Change maximum server payload back to RPCSVC_MAXPAYLOAD"), the size of struct svc_rdma_op_ctxt increased to over 6KB (on x86_64). That makes allocating one of these from a kmem_cache more likely to fail in situations when system memory is exhausted. Since I'm about to add a caller where this allocation must always work _and_ it cannot sleep, pre-allocate ctxts for each connection. Another motivation for this change is that NFSv4.x servers are required by specification not to drop NFS requests. Pre-allocating memory resources reduces the likelihood of a drop. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Acked-by: NBruce Fields <bfields@fieldses.org> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Chuck Lever 提交于
Be sure the completed ctxt is put in every path. The xprt enqueue can take a while, so put the completed ctxt back in circulation _before_ enqueuing the xprt. Remove/disable debugging. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Acked-by: NBruce Fields <bfields@fieldses.org> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Chuck Lever 提交于
kzalloc is used here, so setting the atomic fields to zero is unnecessary. sc_ord is set again in handle_connect_req. The other fields are re-initialized in svc_rdma_accept(). Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Acked-by: NBruce Fields <bfields@fieldses.org> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-