- 26 4月, 2017 7 次提交
-
-
由 Chuck Lever 提交于
Now that svc_rdma_sendto has been renovated, svc_rdma_send_error can be refactored to reduce code duplication and remove C structure- based XDR encoding. It is also relocated to the source file that contains its only caller. This is a refactoring change only. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Chuck Lever 提交于
The current svcrdma sendto code path posts one RDMA Write WR at a time. Each of these Writes typically carries a small number of pages (for instance, up to 30 pages for mlx4 devices). That means a 1MB NFS READ reply requires 9 ib_post_send() calls for the Write WRs, and one for the Send WR carrying the actual RPC Reply message. Instead, use the new rdma_rw API. The details of Write WR chain construction and memory registration are taken care of in the RDMA core. svcrdma can focus on the details of the RPC-over-RDMA protocol. This gives three main benefits: 1. All Write WRs for one RDMA segment are posted in a single chain. As few as one ib_post_send() for each Write chunk. 2. The Write path can now use FRWR to register the Write buffers. If the device's maximum page list depth is large, this means a single Write WR is needed for each RPC's Write chunk data. 3. The new code introduces support for RPCs that carry both a Write list and a Reply chunk. This combination can be used for an NFSv4 READ where the data payload is large, and thus is removed from the Payload Stream, but the Payload Stream is still larger than the inline threshold. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Chuck Lever 提交于
The plan is to replace the local bespoke code that constructs and posts RDMA Read and Write Work Requests with calls to the rdma_rw API. This shares code with other RDMA-enabled ULPs that manages the gory details of buffer registration and posting Work Requests. Some design notes: o The structure of RPC-over-RDMA transport headers is flexible, allowing multiple segments per Reply with arbitrary alignment, each with a unique R_key. Write and Send WRs continue to be built and posted in separate code paths. However, one whole chunk (with one or more RDMA segments apiece) gets exactly one ib_post_send and one work completion. o svc_xprt reference counting is modified, since a chain of rdma_rw_ctx structs generates one completion, no matter how many Write WRs are posted. o The current code builds the transport header as it is construct- ing Write WRs. I've replaced that with marshaling of transport header data items in a separate step. This is because the exact structure of client-provided segments may not align with the components of the server's reply xdr_buf, or the pages in the page list. Thus parts of each client-provided segment may be written at different points in the send path. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Chuck Lever 提交于
The Send Queue depth is temporarily reduced to 1 SQE per credit. The new rdma_rw API does an internal computation, during QP creation, to increase the depth of the Send Queue to handle RDMA Read and Write operations. This change has to come before the NFSD code paths are updated to use the rdma_rw API. Without this patch, rdma_rw_init_qp() increases the size of the SQ too much, resulting in memory allocation failures during QP creation. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Chuck Lever 提交于
Introduce a helper to DMA-map a reply's transport header before sending it. This will in part replace the map vector cache. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Chuck Lever 提交于
Clean up: Move the ib_send_wr off the stack, and move common code to post a Send Work Request into a helper. This is a refactoring change only. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 J. Bruce Fields 提交于
A client can append random data to the end of an NFSv2 or NFSv3 RPC call without our complaining; we'll just stop parsing at the end of the expected data and ignore the rest. Encoded arguments and replies are stored together in an array of pages, and if a call is too large it could leave inadequate space for the reply. This is normally OK because NFS RPC's typically have either short arguments and long replies (like READ) or long arguments and short replies (like WRITE). But a client that sends an incorrectly long reply can violate those assumptions. This was observed to cause crashes. So, insist that the argument not be any longer than we expect. Also, several operations increment rq_next_page in the decode routine before checking the argument size, which can leave rq_next_page pointing well past the end of the page array, causing trouble later in svc_free_pages. As followup we may also want to rewrite the encoding routines to check more carefully that they aren't running off the end of the page array. Reported-by: NTuomas Haanpää <thaan@synopsys.com> Reported-by: NAri Kauppi <ari@synopsys.com> Cc: stable@vger.kernel.org Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 02 3月, 2017 1 次提交
-
-
由 Ingo Molnar 提交于
sched/headers: Prepare to move signal wakeup & sigpending methods from <linux/sched.h> into <linux/sched/signal.h> Fix up affected files that include this signal functionality via sched.h. Acked-by: NLinus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 25 2月, 2017 3 次提交
-
-
由 Jeff Layton 提交于
NFSv4 requires a transport "that is specified to avoid network congestion" (RFC 7530, section 3.1, paragraph 2). In practical terms, that means that you should not run NFSv4 over UDP. The server has never enforced that requirement, however. This patchset fixes this by adding a new flag to the svc_version that states that it has these transport requirements. With that, we can check that the transport has XPT_CONG_CTRL set before processing an RPC. If it doesn't we reject it with RPC_PROG_MISMATCH. Signed-off-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Jeff Layton 提交于
NFSv4 requires a transport protocol with congestion control in most cases. On an IP network, that means that NFSv4 over UDP should be forbidden. The situation with RDMA is a bit more nuanced, but most RDMA transports are suitable for this. For now, we assume that all RDMA transports are suitable, but we may need to revise that at some point. Signed-off-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Jeff Layton 提交于
It's just simpler to read this way, IMO. Also, no need to explicitly set vs_hidden to false in the nfsacl ones. Signed-off-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 22 2月, 2017 2 次提交
-
-
由 Trond Myklebust 提交于
Create a helper function that decodes a xdr string object, allocates a memory buffer and then store it as a NUL terminated string. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com> Reviewed-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Trond Myklebust 提交于
Add some generic helpers for encoding/decoding opaque structures and basic u32/u64. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com> Reviewed-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
- 10 2月, 2017 3 次提交
-
-
由 Trond Myklebust 提交于
Set the timeout for TCP connections to be 1 lease period to ensure that we don't lose our lease due to a faulty TCP connection. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Trond Myklebust 提交于
When the NFSv4 server tells us the lease period, we usually want to adjust down the timeout parameters on the TCP connection to ensure that we don't miss lease renewals due to a faulty connection. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Trond Myklebust 提交于
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
- 09 2月, 2017 8 次提交
-
-
由 Kinglong Mee 提交于
Don't found any place using the cr_magic. Signed-off-by: NKinglong Mee <kinglongmee@gmail.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Kinglong Mee 提交于
NFS_NGROUPS has been move to sunrpc, rename to UNX_NGROUPS. Signed-off-by: NKinglong Mee <kinglongmee@gmail.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Kinglong Mee 提交于
Record flush/channel/content entries is useless, remove them. Signed-off-by: NKinglong Mee <kinglongmee@gmail.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Kinglong Mee 提交于
When the first time pynfs runs after rpc/nfsd startup, always get the warning, "Got error: Connection closed" I found the problem is caused by, 1. A new startup of nfsd, rpc.mountd, etc, 2. A rpc request from client (pynfs test, or normal mounting), 3. An ip_map cache is created but invalid, so upcall to rpc.mountd, 4. rpc.mountd process the ip_map upcall, before write the valid data to nfsd, do auth_reload(), and check_useipaddr(), 5. For the first time, old_use_ipaddr = -1, it causes rpc.mountd do write_flush that doing cache_clean, 6. The ip_map cache will be treat as expired and clean, 7. When rpc.mountd write the valid data to nfsd, a new ip_map is created and updated, the cache_check of old ip_map(doing the upcall) will return -ETIMEDOUT. 8. RPC layer return SVC_CLOSE and close the xprt after commit 4d712ef1 "svcauth_gss: Close connection when dropping an incoming message" NeilBrown suggest in another email, "If CACHE_VALID is not set, then there is no data in the cache item, so there is nothing to expire. So it would be nice if cache items that don't have CACHE_VALID are never treated as expired." v3, change the order of the two patches v2, change the checking of CACHE_PENDING to CACHE_VALID Reviewed-by: NNeilBrown <neilb@suse.com> Signed-off-by: NKinglong Mee <kinglongmee@gmail.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Chuck Lever 提交于
Clean up: The free list and the dto_q list fields are never used at the same time. Reduce the size of struct svc_rdma_op_ctxt by combining these fields. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Chuck Lever 提交于
Clean up. Commit be99bb11 ("svcrdma: Use new CQ API for RPC-over-RDMA server send CQs") removed code that used the sc_dto_q field, but neglected to remove sc_dto_q at the same time. Fixes: be99bb11 ("svcrdma: Use new CQ API for RPC-over- ...") Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Chuck Lever 提交于
Replace C structure-based XDR decoding with pointer arithmetic. Pointer arithmetic is considered more portable, and is used throughout the kernel's existing XDR encoders. The gcc optimizer generates similar assembler code either way. Byte-swapping before a memory store on x86 typically results in an instruction pipeline stall. Avoid byte-swapping when encoding a new header. svcrdma currently doesn't alter a connection's credit grant value after the connection has been accepted, so it is effectively a constant. Cache the byte-swapped value in a separate field. Christoph suggested pulling the header encoding logic into the only function that uses it. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Chuck Lever 提交于
Commit 5fdca653 ("svcrdma: Renovate sendto chunk list parsing") missed a spot. svc_rdma_xdr_get_reply_hdr_len() also assumes the Write list has only one Write chunk. There's no harm in making this code more general. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 01 2月, 2017 1 次提交
-
-
由 Neil Brown 提交于
We currently handle a client PROC_DESTROY request by turning it CACHE_NEGATIVE, setting the expired time to now, and then waiting for cache_clean to clean it up later. Since we forgot to set the cache's nextcheck value, that could take up to 30 minutes. Also, though there's probably no real bug in this case, setting CACHE_NEGATIVE directly like this probably isn't a great idea in general. So let's just remove the entry from the cache directly, and move this bit of cache manipulation to a helper function. Signed-off-by: NNeil Brown <neilb@suse.com> Reported-by: NAndy Adamson <andros@netapp.com> Signed-off-by: NAndy Adamson <andros@netapp.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 31 1月, 2017 1 次提交
-
-
由 Joe Perches 提交于
Allow line continuations to work properly with KERN_CONT. Signed-off-by: NJoe Perches <joe@perches.com> [Anna: Add fallback dprintk_cont() for when CONFIG_SUNRPC_DEBUG=n] Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
- 25 1月, 2017 1 次提交
-
-
由 Kinglong Mee 提交于
After removing sunrpc module, I get many kmemleak information as, unreferenced object 0xffff88003316b1e0 (size 544): comm "gssproxy", pid 2148, jiffies 4294794465 (age 4200.081s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffffb0cfb58a>] kmemleak_alloc+0x4a/0xa0 [<ffffffffb03507fe>] kmem_cache_alloc+0x15e/0x1f0 [<ffffffffb0639baa>] ida_pre_get+0xaa/0x150 [<ffffffffb0639cfd>] ida_simple_get+0xad/0x180 [<ffffffffc06054fb>] nlmsvc_lookup_host+0x4ab/0x7f0 [lockd] [<ffffffffc0605e1d>] lockd+0x4d/0x270 [lockd] [<ffffffffc06061e5>] param_set_timeout+0x55/0x100 [lockd] [<ffffffffc06cba24>] svc_defer+0x114/0x3f0 [sunrpc] [<ffffffffc06cbbe7>] svc_defer+0x2d7/0x3f0 [sunrpc] [<ffffffffc06c71da>] rpc_show_info+0x8a/0x110 [sunrpc] [<ffffffffb044a33f>] proc_reg_write+0x7f/0xc0 [<ffffffffb038e41f>] __vfs_write+0xdf/0x3c0 [<ffffffffb0390f1f>] vfs_write+0xef/0x240 [<ffffffffb0392fbd>] SyS_write+0xad/0x130 [<ffffffffb0d06c37>] entry_SYSCALL_64_fastpath+0x1a/0xa9 [<ffffffffffffffff>] 0xffffffffffffffff I found, the ida information (dynamic memory) isn't cleanup. Signed-off-by: NKinglong Mee <kinglongmee@gmail.com> Fixes: 2f048db4 ("SUNRPC: Add an identifier for struct rpc_clnt") Cc: stable@vger.kernel.org # v3.12+ Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 14 1月, 2017 1 次提交
-
-
由 Peter Zijlstra 提交于
Since we need to change the implementation, stop exposing internals. Provide kref_read() to read the current reference count; typically used for debug messages. Kills two anti-patterns: atomic_read(&kref->refcount) kref->refcount.counter Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 13 1月, 2017 1 次提交
-
-
由 Scott Mayhew 提交于
The inet6addr_chain is an atomic notifier chain, so we can't call anything that might sleep (like lock_sock)... instead of closing the socket from svc_age_temp_xprts_now (which is called by the notifier function), just have the rpc service threads do it instead. Cc: stable@vger.kernel.org Fixes: c3d4879e "sunrpc: Add a function to close..." Signed-off-by: NScott Mayhew <smayhew@redhat.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 01 12月, 2016 4 次提交
-
-
由 Chuck Lever 提交于
Clean up: Completion status is already reported in the individual completion handlers. Save a few bytes in struct svc_rdma_op_ctxt. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Chuck Lever 提交于
Clean up: sc_dma_used is not required for correct operation. It is simply a debugging tool to report when svcrdma has leaked DMA maps. However, manipulating an atomic has a measurable CPU cost, and DMA map accounting specific to svcrdma will be meaningless once svcrdma is converted to use the new generic r/w API. A similar kind of debug accounting can be done simply by enabling the IOMMU or by using CONFIG_DMA_API_DEBUG, CONFIG_IOMMU_DEBUG, and CONFIG_IOMMU_LEAK. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Chuck Lever 提交于
svcrdma's current SQ accounting algorithm takes sc_lock and disables bottom-halves while posting all RDMA Read, Write, and Send WRs. This is relatively heavyweight serialization. And note that Write and Send are already fully serialized by the xpt_mutex. Using a single atomic_t should be all that is necessary to guarantee that ib_post_send() is called only when there is enough space on the send queue. This is what the other RDMA-enabled storage targets do. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Chuck Lever 提交于
The current sendto code appears to support clients that provide only one of a Read list, a Write list, or a Reply chunk. My reading of that code is that it doesn't support the following cases: - Read list + Write list - Read list + Reply chunk - Write list + Reply chunk - Read list + Write list + Reply chunk The protocol allows more than one Read or Write chunk in those lists. Some clients do send a Read list and Reply chunk simultaneously. NFSv4 WRITE uses a Read list for the data payload, and a Reply chunk because the GETATTR result in the reply can contain a large object like an ACL. Generalize one of the sendto code paths needed to support all of the above cases, and attempt to ensure that only one pass is done through the RPC Call's transport header to gather chunk list information for building the reply. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 14 11月, 2016 1 次提交
-
-
由 Scott Mayhew 提交于
This fixes the following panic that can occur with NFSoRDMA. general protection fault: 0000 [#1] SMP Modules linked in: rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx5_ib ib_core intel_powerclamp coretemp kvm_intel kvm sg ioatdma ipmi_devintf ipmi_ssif dcdbas iTCO_wdt iTCO_vendor_support pcspkr irqbypass sb_edac shpchp dca crc32_pclmul ghash_clmulni_intel edac_core lpc_ich aesni_intel lrw gf128mul glue_helper ablk_helper mei_me mei ipmi_si cryptd wmi ipmi_msghandler acpi_pad acpi_power_meter nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt ahci fb_sys_fops ttm libahci mlx5_core tg3 crct10dif_pclmul drm crct10dif_common ptp i2c_core libata crc32c_intel pps_core fjes dm_mirror dm_region_hash dm_log dm_mod CPU: 1 PID: 120 Comm: kworker/1:1 Not tainted 3.10.0-514.el7.x86_64 #1 Hardware name: Dell Inc. PowerEdge R320/0KM5PX, BIOS 2.4.2 01/29/2015 Workqueue: events check_lifetime task: ffff88031f506dd0 ti: ffff88031f584000 task.ti: ffff88031f584000 RIP: 0010:[<ffffffff8168d847>] [<ffffffff8168d847>] _raw_spin_lock_bh+0x17/0x50 RSP: 0018:ffff88031f587ba8 EFLAGS: 00010206 RAX: 0000000000020000 RBX: 20041fac02080072 RCX: ffff88031f587fd8 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 20041fac02080072 RBP: ffff88031f587bb0 R08: 0000000000000008 R09: ffffffff8155be77 R10: ffff880322a59b00 R11: ffffea000bf39f00 R12: 20041fac02080072 R13: 000000000000000d R14: ffff8800c4fbd800 R15: 0000000000000001 FS: 0000000000000000(0000) GS:ffff880322a40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f3c52d4547e CR3: 00000000019ba000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Stack: 20041fac02080002 ffff88031f587bd0 ffffffff81557830 20041fac02080002 ffff88031f587c78 ffff88031f587c40 ffffffff8155ae08 000000010157df32 0000000800000001 ffff88031f587c20 ffffffff81096acb ffffffff81aa37d0 Call Trace: [<ffffffff81557830>] lock_sock_nested+0x20/0x50 [<ffffffff8155ae08>] sock_setsockopt+0x78/0x940 [<ffffffff81096acb>] ? lock_timer_base.isra.33+0x2b/0x50 [<ffffffff8155397d>] kernel_setsockopt+0x4d/0x50 [<ffffffffa0386284>] svc_age_temp_xprts_now+0x174/0x1e0 [sunrpc] [<ffffffffa03b681d>] nfsd_inetaddr_event+0x9d/0xd0 [nfsd] [<ffffffff81691ebc>] notifier_call_chain+0x4c/0x70 [<ffffffff810b687d>] __blocking_notifier_call_chain+0x4d/0x70 [<ffffffff810b68b6>] blocking_notifier_call_chain+0x16/0x20 [<ffffffff815e8538>] __inet_del_ifa+0x168/0x2d0 [<ffffffff815e8cef>] check_lifetime+0x25f/0x270 [<ffffffff810a7f3b>] process_one_work+0x17b/0x470 [<ffffffff810a8d76>] worker_thread+0x126/0x410 [<ffffffff810a8c50>] ? rescuer_thread+0x460/0x460 [<ffffffff810b052f>] kthread+0xcf/0xe0 [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140 [<ffffffff81696418>] ret_from_fork+0x58/0x90 [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140 Code: ca 75 f1 5d c3 0f 1f 80 00 00 00 00 eb d9 66 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb e8 7e 04 a0 ff b8 00 00 02 00 <f0> 0f c1 03 89 c2 c1 ea 10 66 39 c2 75 03 5b 5d c3 83 e2 fe 0f RIP [<ffffffff8168d847>] _raw_spin_lock_bh+0x17/0x50 RSP <ffff88031f587ba8> Signed-off-by: NScott Mayhew <smayhew@redhat.com> Fixes: c3d4879e ("sunrpc: Add a function to close temporary transports immediately") Reviewed-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 01 10月, 2016 1 次提交
-
-
由 Frank Sorenson 提交于
Currently, a single hash algorithm is used to hash the auth_cred for the credcache for all rpc_auth types. Add a hash_cred() function to the rpc_authops struct to allow a hash function specific to each auth flavor. Signed-off-by: NFrank Sorenson <sorenson@redhat.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
- 23 9月, 2016 3 次提交
-
-
由 Chuck Lever 提交于
Support Remote Invalidation. A private message is exchanged with the client upon RDMA transport connect that indicates whether Send With Invalidation may be used by the server to send RPC replies. The invalidate_rkey is arbitrarily chosen from among rkeys present in the RPC-over-RDMA header's chunk lists. Send With Invalidate improves performance only when clients can recognize, while processing an RPC reply, that an rkey has already been invalidated. That has been submitted as a separate change. In the future, the RPC-over-RDMA protocol might support Remote Invalidation properly. The protocol needs to enable signaling between peers to indicate when Remote Invalidation can be used for each individual RPC. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Chuck Lever 提交于
Introduce data structure used by both client and server to exchange implementation details during RDMA/CM connection establishment. This is an experimental out-of-band exchange between Linux RPC-over-RDMA Version One implementations, replacing the deprecated CCP (see RFC 5666bis). The purpose of this extension is to enable prototyping of features that might be introduced in a subsequent version of RPC-over-RDMA. Suggested by Christoph Hellwig and Devesh Sharma. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Chuck Lever 提交于
The ctxt's count field is overloaded to mean the number of pages in the ctxt->page array and the number of SGEs in the ctxt->sge array. Typically these two numbers are the same. However, when an inline RPC reply is constructed from an xdr_buf with a tail iovec, the head and tail often occupy the same page, but each are DMA mapped independently. In that case, ->count equals the number of pages, but it does not equal the number of SGEs. There's one more SGE, for the tail iovec. Hence there is one more DMA mapping than there are pages in the ctxt->page array. This isn't a real problem until the server's iommu is enabled. Then each RPC reply that has content in that iovec orphans a DMA mapping that consists of real resources. krb5i and krb5p always populate that tail iovec. After a couple million sent krb5i/p RPC replies, the NFS server starts behaving erratically. Reboot is needed to clear the problem. Fixes: 9d11b51c ("svcrdma: Fix send_reply() scatter/gather set-up") Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 20 9月, 2016 2 次提交
-
-
由 Chuck Lever 提交于
The Version One default inline threshold is still 1KB. But allow testing with thresholds up to 64KB. This maximum is somewhat arbitrary. There's no fundamental architectural limit I'm aware of, but it's good to keep the size of Receive buffers reasonable. Now that Send can use a s/g list, a Send buffer is only as large as each RPC requires. Receive buffers are always the size of the inline threshold, however. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
Send an RDMA-CM private message on connect, and look for one during a connection-established event. Both sides can communicate their various implementation limits. Implementations that don't support this sideband protocol ignore it. Once the client knows the server's inline threshold maxima, it can adjust the use of Reply chunks, and eliminate most use of Position Zero Read chunks. Moderately-sized I/O can be done using a pure inline RDMA Send instead of RDMA operations that require memory registration. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-