- 17 1月, 2019 1 次提交
-
-
由 Vasily Averin 提交于
commit d4b09acf924b84bae77cad090a9d108e70b43643 upstream. if node have NFSv41+ mounts inside several net namespaces it can lead to use-after-free in svc_process_common() svc_process_common() /* Setup reply header */ rqstp->rq_xprt->xpt_ops->xpo_prep_reply_hdr(rqstp); <<< HERE svc_process_common() can use incorrect rqstp->rq_xprt, its caller function bc_svc_process() takes it from serv->sv_bc_xprt. The problem is that serv is global structure but sv_bc_xprt is assigned per-netnamespace. According to Trond, the whole "let's set up rqstp->rq_xprt for the back channel" is nothing but a giant hack in order to work around the fact that svc_process_common() uses it to find the xpt_ops, and perform a couple of (meaningless for the back channel) tests of xpt_flags. All we really need in svc_process_common() is to be able to run rqstp->rq_xprt->xpt_ops->xpo_prep_reply_hdr() Bruce J Fields points that this xpo_prep_reply_hdr() call is an awfully roundabout way just to do "svc_putnl(resv, 0);" in the tcp case. This patch does not initialiuze rqstp->rq_xprt in bc_svc_process(), now it calls svc_process_common() with rqstp->rq_xprt = NULL. To adjust reply header svc_process_common() just check rqstp->rq_prot and calls svc_tcp_prep_reply_hdr() for tcp case. To handle rqstp->rq_xprt = NULL case in functions called from svc_process_common() patch intruduces net namespace pointer svc_rqst->rq_bc_net and adjust SVC_NET() definition. Some other function was also adopted to properly handle described case. Signed-off-by: NVasily Averin <vvs@virtuozzo.com> Cc: stable@vger.kernel.org Fixes: 23c20ecd ("NFS: callback up - users counting cleanup") Signed-off-by: NJ. Bruce Fields <bfields@redhat.com> v2: added lost extern svc_tcp_prep_reply_hdr() Signed-off-by: NVasily Averin <vvs@virtuozzo.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 10 1月, 2019 1 次提交
-
-
由 Deepa Dinamani 提交于
[ Upstream commit 3a0ed3e9619738067214871e9cb826fa23b2ddb9 ] Al Viro mentioned (Message-ID <20170626041334.GZ10672@ZenIV.linux.org.uk>) that there is probably a race condition lurking in accesses of sk_stamp on 32-bit machines. sock->sk_stamp is of type ktime_t which is always an s64. On a 32 bit architecture, we might run into situations of unsafe access as the access to the field becomes non atomic. Use seqlocks for synchronization. This allows us to avoid using spinlocks for readers as readers do not need mutual exclusion. Another approach to solve this is to require sk_lock for all modifications of the timestamps. The current approach allows for timestamps to have their own lock: sk_stamp_lock. This allows for the patch to not compete with already existing critical sections, and side effects are limited to the paths in the patch. The addition of the new field maintains the data locality optimizations from commit 9115e8cd ("net: reorganize struct sock for better data locality") Note that all the instances of the sk_stamp accesses are either through the ioctl or the syscall recvmsg. Signed-off-by: NDeepa Dinamani <deepa.kernel@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 04 4月, 2018 2 次提交
-
-
由 Chuck Lever 提交于
TP_printk defines a format string that is passed to user space for converting raw trace event records to something human-readable. My user space's printf (Oracle Linux 7), however, does not have a %pI format specifier. The result is that what is supposed to be an IP address in the output of "trace-cmd report" is just a string that says the field couldn't be displayed. To fix this, adopt the same approach as the client: maintain a pre- formated presentation address for occasions when %pI is not available. The location of the trace_svc_send trace point is adjusted so that rqst->rq_xprt is not NULL when the trace event is recorded. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Chuck Lever 提交于
Clean up: Instead of returning a value that is used to set or clear a bit, just make ->xpo_secure_port mangle that bit, and return void. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 13 2月, 2018 1 次提交
-
-
由 Denys Vlasenko 提交于
Changes since v1: Added changes in these files: drivers/infiniband/hw/usnic/usnic_transport.c drivers/staging/lustre/lnet/lnet/lib-socket.c drivers/target/iscsi/iscsi_target_login.c drivers/vhost/net.c fs/dlm/lowcomms.c fs/ocfs2/cluster/tcp.c security/tomoyo/network.c Before: All these functions either return a negative error indicator, or store length of sockaddr into "int *socklen" parameter and return zero on success. "int *socklen" parameter is awkward. For example, if caller does not care, it still needs to provide on-stack storage for the value it does not need. None of the many FOO_getname() functions of various protocols ever used old value of *socklen. They always just overwrite it. This change drops this parameter, and makes all these functions, on success, return length of sockaddr. It's always >= 0 and can be differentiated from an error. Tests in callers are changed from "if (err)" to "if (err < 0)", where needed. rpc_sockname() lost "int buflen" parameter, since its only use was to be passed to kernel_getsockname() as &buflen and subsequently not used in any way. Userspace API is not changed. text data bss dec hex filename 30108430 2633624 873672 33615726 200ef6e vmlinux.before.o 30108109 2633612 873672 33615393 200ee21 vmlinux.o Signed-off-by: NDenys Vlasenko <dvlasenk@redhat.com> CC: David S. Miller <davem@davemloft.net> CC: linux-kernel@vger.kernel.org CC: netdev@vger.kernel.org CC: linux-bluetooth@vger.kernel.org CC: linux-decnet-user@lists.sourceforge.net CC: linux-wireless@vger.kernel.org CC: linux-rdma@vger.kernel.org CC: linux-sctp@vger.kernel.org CC: linux-nfs@vger.kernel.org CC: linux-x25@vger.kernel.org Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 2月, 2018 1 次提交
-
-
由 Christoph Hellwig 提交于
Setting values in struct sock directly is the usual method. Remove the long dead code using set_fs() and the related comment. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 03 12月, 2017 1 次提交
-
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 25 8月, 2017 2 次提交
-
-
由 Chuck Lever 提交于
Close an attack vector by moving the arrays of server-side transport methods to read-only memory. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Vadim Lomovtsev 提交于
While running nfs/connectathon tests kernel NULL-pointer exception has been observed due to races in svcsock.c. Race is appear when kernel accepts connection by kernel_accept (which creates new socket) and start queuing ingress packets to new socket. This happens in ksoftirq context which could run concurrently on a different core while new socket setup is not done yet. The fix is to re-order socket user data init sequence and add write/read barrier calls to be sure that we got proper values for callback pointers before actually calling them. Test results: nfs/connectathon reports '0' failed tests for about 200+ iterations. Crash log: ---<-snip->--- [ 6708.638984] Unable to handle kernel NULL pointer dereference at virtual address 00000000 [ 6708.647093] pgd = ffff0000094e0000 [ 6708.650497] [00000000] *pgd=0000010ffff90003, *pud=0000010ffff90003, *pmd=0000010ffff80003, *pte=0000000000000000 [ 6708.660761] Internal error: Oops: 86000005 [#1] SMP [ 6708.665630] Modules linked in: nfsv3 nfnetlink_queue nfnetlink_log nfnetlink rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache overlay xt_CONNSECMARK xt_SECMARK xt_conntrack iptable_security ip_tables ah4 xfrm4_mode_transport sctp tun binfmt_misc ext4 jbd2 mbcache loop tcp_diag udp_diag inet_diag rpcrdma ib_isert iscsi_target_mod ib_iser rdma_cm iw_cm libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib ib_ucm ib_uverbs ib_umad ib_cm ib_core nls_koi8_u nls_cp932 ts_kmp nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack vfat fat ghash_ce sha2_ce sha1_ce cavium_rng_vf i2c_thunderx sg thunderx_edac i2c_smbus edac_core cavium_rng nfsd auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c nicvf nicpf ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops [ 6708.736446] ttm drm i2c_core thunder_bgx thunder_xcv mdio_thunder mdio_cavium dm_mirror dm_region_hash dm_log dm_mod [last unloaded: stap_3c300909c5b3f46dcacd49aab3334af_87021] [ 6708.752275] CPU: 84 PID: 0 Comm: swapper/84 Tainted: G W OE 4.11.0-4.el7.aarch64 #1 [ 6708.760787] Hardware name: www.cavium.com CRB-2S/CRB-2S, BIOS 0.3 Mar 13 2017 [ 6708.767910] task: ffff810006842e80 task.stack: ffff81000689c000 [ 6708.773822] PC is at 0x0 [ 6708.776739] LR is at svc_data_ready+0x38/0x88 [sunrpc] [ 6708.781866] pc : [<0000000000000000>] lr : [<ffff0000029d7378>] pstate: 60000145 [ 6708.789248] sp : ffff810ffbad3900 [ 6708.792551] x29: ffff810ffbad3900 x28: ffff000008c73d58 [ 6708.797853] x27: 0000000000000000 x26: ffff81000bbe1e00 [ 6708.803156] x25: 0000000000000020 x24: ffff800f7410bf28 [ 6708.808458] x23: ffff000008c63000 x22: ffff000008c63000 [ 6708.813760] x21: ffff800f7410bf28 x20: ffff81000bbe1e00 [ 6708.819063] x19: ffff810012412400 x18: 00000000d82a9df2 [ 6708.824365] x17: 0000000000000000 x16: 0000000000000000 [ 6708.829667] x15: 0000000000000000 x14: 0000000000000001 [ 6708.834969] x13: 0000000000000000 x12: 722e736f622e676e [ 6708.840271] x11: 00000000f814dd99 x10: 0000000000000000 [ 6708.845573] x9 : 7374687225000000 x8 : 0000000000000000 [ 6708.850875] x7 : 0000000000000000 x6 : 0000000000000000 [ 6708.856177] x5 : 0000000000000028 x4 : 0000000000000000 [ 6708.861479] x3 : 0000000000000000 x2 : 00000000e5000000 [ 6708.866781] x1 : 0000000000000000 x0 : ffff81000bbe1e00 [ 6708.872084] [ 6708.873565] Process swapper/84 (pid: 0, stack limit = 0xffff81000689c000) [ 6708.880341] Stack: (0xffff810ffbad3900 to 0xffff8100068a0000) [ 6708.886075] Call trace: [ 6708.888513] Exception stack(0xffff810ffbad3710 to 0xffff810ffbad3840) [ 6708.894942] 3700: ffff810012412400 0001000000000000 [ 6708.902759] 3720: ffff810ffbad3900 0000000000000000 0000000060000145 ffff800f79300000 [ 6708.910577] 3740: ffff000009274d00 00000000000003ea 0000000000000015 ffff000008c63000 [ 6708.918395] 3760: ffff810ffbad3830 ffff800f79300000 000000000000004d 0000000000000000 [ 6708.926212] 3780: ffff810ffbad3890 ffff0000080f88dc ffff800f79300000 000000000000004d [ 6708.934030] 37a0: ffff800f7930093c ffff000008c63000 0000000000000000 0000000000000140 [ 6708.941848] 37c0: ffff000008c2c000 0000000000040b00 ffff81000bbe1e00 0000000000000000 [ 6708.949665] 37e0: 00000000e5000000 0000000000000000 0000000000000000 0000000000000028 [ 6708.957483] 3800: 0000000000000000 0000000000000000 0000000000000000 7374687225000000 [ 6708.965300] 3820: 0000000000000000 00000000f814dd99 722e736f622e676e 0000000000000000 [ 6708.973117] [< (null)>] (null) [ 6708.977824] [<ffff0000086f9fa4>] tcp_data_queue+0x754/0xc5c [ 6708.983386] [<ffff0000086fa64c>] tcp_rcv_established+0x1a0/0x67c [ 6708.989384] [<ffff000008704120>] tcp_v4_do_rcv+0x15c/0x22c [ 6708.994858] [<ffff000008707418>] tcp_v4_rcv+0xaf0/0xb58 [ 6709.000077] [<ffff0000086df784>] ip_local_deliver_finish+0x10c/0x254 [ 6709.006419] [<ffff0000086dfea4>] ip_local_deliver+0xf0/0xfc [ 6709.011980] [<ffff0000086dfad4>] ip_rcv_finish+0x208/0x3a4 [ 6709.017454] [<ffff0000086e018c>] ip_rcv+0x2dc/0x3c8 [ 6709.022328] [<ffff000008692fc8>] __netif_receive_skb_core+0x2f8/0xa0c [ 6709.028758] [<ffff000008696068>] __netif_receive_skb+0x38/0x84 [ 6709.034580] [<ffff00000869611c>] netif_receive_skb_internal+0x68/0xdc [ 6709.041010] [<ffff000008696bc0>] napi_gro_receive+0xcc/0x1a8 [ 6709.046690] [<ffff0000014b0fc4>] nicvf_cq_intr_handler+0x59c/0x730 [nicvf] [ 6709.053559] [<ffff0000014b1380>] nicvf_poll+0x38/0xb8 [nicvf] [ 6709.059295] [<ffff000008697a6c>] net_rx_action+0x2f8/0x464 [ 6709.064771] [<ffff000008081824>] __do_softirq+0x11c/0x308 [ 6709.070164] [<ffff0000080d14e4>] irq_exit+0x12c/0x174 [ 6709.075206] [<ffff00000813101c>] __handle_domain_irq+0x78/0xc4 [ 6709.081027] [<ffff000008081608>] gic_handle_irq+0x94/0x190 [ 6709.086501] Exception stack(0xffff81000689fdf0 to 0xffff81000689ff20) [ 6709.092929] fde0: 0000810ff2ec0000 ffff000008c10000 [ 6709.100747] fe00: ffff000008c70ef4 0000000000000001 0000000000000000 ffff810ffbad9b18 [ 6709.108565] fe20: ffff810ffbad9c70 ffff8100169d3800 ffff810006843ab0 ffff81000689fe80 [ 6709.116382] fe40: 0000000000000bd0 0000ffffdf979cd0 183f5913da192500 0000ffff8a254ce4 [ 6709.124200] fe60: 0000ffff8a254b78 0000aaab10339808 0000000000000000 0000ffff8a0c2a50 [ 6709.132018] fe80: 0000ffffdf979b10 ffff000008d6d450 ffff000008c10000 ffff000008d6d000 [ 6709.139836] fea0: 0000000000000054 ffff000008cd3dbc 0000000000000000 0000000000000000 [ 6709.147653] fec0: 0000000000000000 0000000000000000 0000000000000000 ffff81000689ff20 [ 6709.155471] fee0: ffff000008085240 ffff81000689ff20 ffff000008085244 0000000060000145 [ 6709.163289] ff00: ffff81000689ff10 ffff00000813f1e4 ffffffffffffffff ffff00000813f238 [ 6709.171107] [<ffff000008082eb4>] el1_irq+0xb4/0x140 [ 6709.175976] [<ffff000008085244>] arch_cpu_idle+0x44/0x11c [ 6709.181368] [<ffff0000087bf3b8>] default_idle_call+0x20/0x30 [ 6709.187020] [<ffff000008116d50>] do_idle+0x158/0x1e4 [ 6709.191973] [<ffff000008116ff4>] cpu_startup_entry+0x2c/0x30 [ 6709.197624] [<ffff00000808e7cc>] secondary_start_kernel+0x13c/0x160 [ 6709.203878] [<0000000001bc71c4>] 0x1bc71c4 [ 6709.207967] Code: bad PC value [ 6709.211061] SMP: stopping secondary CPUs [ 6709.218830] Starting crashdump kernel... [ 6709.222749] Bye! ---<-snip>--- Signed-off-by: NVadim Lomovtsev <vlomovts@redhat.com> Reviewed-by: NJeff Layton <jlayton@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 19 8月, 2017 1 次提交
-
-
由 Trond Myklebust 提交于
This further reduces contention with the transport_lock, and allows us to convert to using a non-bh-safe spinlock, since the list is now never accessed from a bh context. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 10 3月, 2017 1 次提交
-
-
由 Kinglong Mee 提交于
The xprt for backchannel is created separately, not in TCP/UDP code. It needs the XPT_CONG_CTRL flag set on it too--otherwise requests on the NFSv4.1 backchannel are rjected in svc_process_common(): 1191 if (versp->vs_need_cong_ctrl && 1192 !test_bit(XPT_CONG_CTRL, &rqstp->rq_xprt->xpt_flags)) 1193 goto err_bad_vers; Fixes: 5283b03e ("nfs/nfsd/sunrpc: enforce transport...") Signed-off-by: NKinglong Mee <kinglongmee@gmail.com> Reviewed-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 28 2月, 2017 1 次提交
-
-
由 Alexey Dobriyan 提交于
Now that %z is standartised in C99 there is no reason to support %Z. Unlike %L it doesn't even make format strings smaller. Use BUILD_BUG_ON in a couple ATM drivers. In case anyone didn't notice lib/vsprintf.o is about half of SLUB which is in my opinion is quite an achievement. Hopefully this patch inspires someone else to trim vsprintf.c more. Link: http://lkml.kernel.org/r/20170103230126.GA30170@avx2Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 25 2月, 2017 1 次提交
-
-
由 Jeff Layton 提交于
NFSv4 requires a transport protocol with congestion control in most cases. On an IP network, that means that NFSv4 over UDP should be forbidden. The situation with RDMA is a bit more nuanced, but most RDMA transports are suitable for this. For now, we assume that all RDMA transports are suitable, but we may need to revise that at some point. Signed-off-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 26 12月, 2016 1 次提交
-
-
由 Thomas Gleixner 提交于
ktime is a union because the initial implementation stored the time in scalar nanoseconds on 64 bit machine and in a endianess optimized timespec variant for 32bit machines. The Y2038 cleanup removed the timespec variant and switched everything to scalar nanoseconds. The union remained, but become completely pointless. Get rid of the union and just keep ktime_t as simple typedef of type s64. The conversion was done with coccinelle and some manual mopping up. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org>
-
- 25 12月, 2016 1 次提交
-
-
由 Linus Torvalds 提交于
This was entirely automated, using the script by Al: PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>' sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \ $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h) to do the replacement at the end of the merge window. Requested-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 14 11月, 2016 1 次提交
-
-
由 Scott Mayhew 提交于
This fixes the following panic that can occur with NFSoRDMA. general protection fault: 0000 [#1] SMP Modules linked in: rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx5_ib ib_core intel_powerclamp coretemp kvm_intel kvm sg ioatdma ipmi_devintf ipmi_ssif dcdbas iTCO_wdt iTCO_vendor_support pcspkr irqbypass sb_edac shpchp dca crc32_pclmul ghash_clmulni_intel edac_core lpc_ich aesni_intel lrw gf128mul glue_helper ablk_helper mei_me mei ipmi_si cryptd wmi ipmi_msghandler acpi_pad acpi_power_meter nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt ahci fb_sys_fops ttm libahci mlx5_core tg3 crct10dif_pclmul drm crct10dif_common ptp i2c_core libata crc32c_intel pps_core fjes dm_mirror dm_region_hash dm_log dm_mod CPU: 1 PID: 120 Comm: kworker/1:1 Not tainted 3.10.0-514.el7.x86_64 #1 Hardware name: Dell Inc. PowerEdge R320/0KM5PX, BIOS 2.4.2 01/29/2015 Workqueue: events check_lifetime task: ffff88031f506dd0 ti: ffff88031f584000 task.ti: ffff88031f584000 RIP: 0010:[<ffffffff8168d847>] [<ffffffff8168d847>] _raw_spin_lock_bh+0x17/0x50 RSP: 0018:ffff88031f587ba8 EFLAGS: 00010206 RAX: 0000000000020000 RBX: 20041fac02080072 RCX: ffff88031f587fd8 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 20041fac02080072 RBP: ffff88031f587bb0 R08: 0000000000000008 R09: ffffffff8155be77 R10: ffff880322a59b00 R11: ffffea000bf39f00 R12: 20041fac02080072 R13: 000000000000000d R14: ffff8800c4fbd800 R15: 0000000000000001 FS: 0000000000000000(0000) GS:ffff880322a40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f3c52d4547e CR3: 00000000019ba000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Stack: 20041fac02080002 ffff88031f587bd0 ffffffff81557830 20041fac02080002 ffff88031f587c78 ffff88031f587c40 ffffffff8155ae08 000000010157df32 0000000800000001 ffff88031f587c20 ffffffff81096acb ffffffff81aa37d0 Call Trace: [<ffffffff81557830>] lock_sock_nested+0x20/0x50 [<ffffffff8155ae08>] sock_setsockopt+0x78/0x940 [<ffffffff81096acb>] ? lock_timer_base.isra.33+0x2b/0x50 [<ffffffff8155397d>] kernel_setsockopt+0x4d/0x50 [<ffffffffa0386284>] svc_age_temp_xprts_now+0x174/0x1e0 [sunrpc] [<ffffffffa03b681d>] nfsd_inetaddr_event+0x9d/0xd0 [nfsd] [<ffffffff81691ebc>] notifier_call_chain+0x4c/0x70 [<ffffffff810b687d>] __blocking_notifier_call_chain+0x4d/0x70 [<ffffffff810b68b6>] blocking_notifier_call_chain+0x16/0x20 [<ffffffff815e8538>] __inet_del_ifa+0x168/0x2d0 [<ffffffff815e8cef>] check_lifetime+0x25f/0x270 [<ffffffff810a7f3b>] process_one_work+0x17b/0x470 [<ffffffff810a8d76>] worker_thread+0x126/0x410 [<ffffffff810a8c50>] ? rescuer_thread+0x460/0x460 [<ffffffff810b052f>] kthread+0xcf/0xe0 [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140 [<ffffffff81696418>] ret_from_fork+0x58/0x90 [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140 Code: ca 75 f1 5d c3 0f 1f 80 00 00 00 00 eb d9 66 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb e8 7e 04 a0 ff b8 00 00 02 00 <f0> 0f c1 03 89 c2 c1 ea 10 66 39 c2 75 03 5b 5d c3 83 e2 fe 0f RIP [<ffffffff8168d847>] _raw_spin_lock_bh+0x17/0x50 RSP <ffff88031f587ba8> Signed-off-by: NScott Mayhew <smayhew@redhat.com> Fixes: c3d4879e ("sunrpc: Add a function to close temporary transports immediately") Reviewed-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 08 11月, 2016 1 次提交
-
-
由 Paolo Abeni 提交于
A new argument is added to __skb_recv_datagram to provide an explicit skb destructor, invoked under the receive queue lock. The UDP protocol uses such argument to perform memory reclaiming on dequeue, so that the UDP protocol does not set anymore skb->desctructor. Instead explicit memory reclaiming is performed at close() time and when skbs are removed from the receive queue. The in kernel UDP protocol users now need to call a skb_recv_udp() variant instead of skb_recv_datagram() to properly perform memory accounting on dequeue. Overall, this allows acquiring only once the receive queue lock on dequeue. Tested using pktgen with random src port, 64 bytes packet, wire-speed on a 10G link as sender and udp_sink as the receiver, using an l4 tuple rxhash to stress the contention, and one or more udp_sink instances with reuseport. nr sinks vanilla patched 1 440 560 3 2150 2300 6 3650 3800 9 4450 4600 12 6250 6450 v1 -> v2: - do rmem and allocated memory scheduling under the receive lock - do bulk scheduling in first_packet_length() and in udp_destruct_sock() - avoid the typdef for the dequeue callback Suggested-by: NEric Dumazet <edumazet@google.com> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 10月, 2016 1 次提交
-
-
由 Paolo Abeni 提交于
Completely avoid default sock memory accounting and replace it with udp-specific accounting. Since the new memory accounting model encapsulates completely the required locking, remove the socket lock on both enqueue and dequeue, and avoid using the backlog on enqueue. Be sure to clean-up rx queue memory on socket destruction, using udp its own sk_destruct. Tested using pktgen with random src port, 64 bytes packet, wire-speed on a 10G link as sender and udp_sink as the receiver, using an l4 tuple rxhash to stress the contention, and one or more udp_sink instances with reuseport. nr readers Kpps (vanilla) Kpps (patched) 1 170 440 3 1250 2150 6 3000 3650 9 4200 4450 12 5700 6250 v4 -> v5: - avoid unneeded test in first_packet_length v3 -> v4: - remove useless sk_rcvqueues_full() call v2 -> v3: - do not set the now unsed backlog_rcv callback v1 -> v2: - add memory pressure support - fixed dropwatch accounting for ipv6 Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 8月, 2016 2 次提交
-
-
由 Trond Myklebust 提交于
This modification is useful for debugging issues that happen while the socket is being initialised. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Trond Myklebust 提交于
We're seeing traces of the following form: [10952.396347] svc: transport ffff88042ba4a 000 dequeued, inuse=2 [10952.396351] svc: tcp_accept ffff88042ba4 a000 sock ffff88042a6e4c80 [10952.396362] nfsd: connect from 10.2.6.1, port=187 [10952.396364] svc: svc_setup_socket ffff8800b99bcf00 [10952.396368] setting up TCP socket for reading [10952.396370] svc: svc_setup_socket created ffff8803eb10a000 (inet ffff88042b75b800) [10952.396373] svc: transport ffff8803eb10a000 put into queue [10952.396375] svc: transport ffff88042ba4a000 put into queue [10952.396377] svc: server ffff8800bb0ec000 waiting for data (to = 3600000) [10952.396380] svc: transport ffff8803eb10a000 dequeued, inuse=2 [10952.396381] svc_recv: found XPT_CLOSE [10952.396397] svc: svc_delete_xprt(ffff8803eb10a000) [10952.396398] svc: svc_tcp_sock_detach(ffff8803eb10a000) [10952.396399] svc: svc_sock_detach(ffff8803eb10a000) [10952.396412] svc: svc_sock_free(ffff8803eb10a000) i.e. an immediate close of the socket after initialisation. The culprit appears to be the test at the end of svc_tcp_init, which checks if the newly created socket is in the TCP_ESTABLISHED state, and immediately closes it if not. The evidence appears to suggest that the socket might still be in the SYN_RECV state at this time. The fix is to check for both states, and then to add a check in svc_tcp_state_change() to ensure we don't close the socket when it transitions into TCP_ESTABLISHED. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 14 7月, 2016 4 次提交
-
-
由 Trond Myklebust 提交于
The current server rpc tcp code attempts to predict how much writeable socket space will be available to a given RPC call before accepting it for processing. On a 40GigE network, we've found this throttles individual clients long before the network or disk is saturated. The server may handle more clients easily, but the bandwidth of individual clients is still artificially limited. Instead of trying (and failing) to predict how much writeable socket space will be available to the RPC call, just fall back to the simple model of deferring processing until the socket is uncongested. This may increase the risk of fast clients starving slower clients; in such cases, the previous patch allows setting a hard per-connection limit. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Trond Myklebust 提交于
Don't call svc_xprt_enqueue() if the XPT_DATA flag is already set. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Trond Myklebust 提交于
Rather than code up our own versions of the socket callbacks, just call the defaults. This also allows us to merge svc_udp_data_ready() and svc_tcp_data_ready(). Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Trond Myklebust 提交于
Prevent callbacks from triggering while we're detaching the socket. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 14 4月, 2016 1 次提交
-
-
由 Hannes Frederic Sowa 提交于
sock_owned_by_user should not be used without socket lock held. It seems to be a common practice to check .owned before lock reclassification, so provide a little help to abstract this check away. Cc: linux-cifs@vger.kernel.org Cc: linux-bluetooth@vger.kernel.org Cc: linux-nfs@vger.kernel.org Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 12 4月, 2016 1 次提交
-
-
由 Willem de Bruijn 提交于
Commit e6afc8ac modified the udp receive path by pulling the udp header before queuing an skbuff onto the receive queue. Sunrpc also calls skb_recv_datagram to dequeue an skb from a udp socket. Modify this receive path to also no longer expect udp headers. Fixes: e6afc8ac ("udp: remove headers from UDP packets before queueing") Reported-by: NFranklin S Cooper Jr. <fcooper@ti.com> Signed-off-by: NWillem de Bruijn <willemb@google.com> Tested-by: NThierry Reding <treding@nvidia.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 11月, 2015 1 次提交
-
-
由 J. Bruce Fields 提交于
We're missing memory barriers in net/sunrpc/svcsock.c in some spots we'd expect them. But it doesn't appear they're necessary in our case, and this is likely a hot path--for now just document the odd behavior. Kosuke Tatsukawa found this issue while looking through the linux source code for places calling waitqueue_active() before wake_up*(), but without preceding memory barriers, after sending a patch to fix a similar issue in drivers/tty/n_tty.c (Details about the original issue can be found here: https://lkml.org/lkml/2015/9/28/849). Reported-by: NKosuke Tatsukawa <tatsu@ab.jp.nec.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 10 11月, 2015 1 次提交
-
-
由 Stefan Hajnoczi 提交于
The svc_setup_socket() function does set the send and receive buffer sizes, so the comment is out-of-date: Signed-off-by: NStefan Hajnoczi <stefanha@redhat.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 24 10月, 2015 1 次提交
-
-
由 Trond Myklebust 提交于
If we're sending more pages via kernel_sendpage(), then set MSG_SENDPAGE_NOTLAST. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 12 4月, 2015 1 次提交
-
-
由 Al Viro 提交于
it's equal to iov_iter_count(&msg->msg_iter) in all cases Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 10 12月, 2014 1 次提交
-
-
由 Jeff Layton 提交于
Signed-off-by: NJeff Layton <jlayton@primarydata.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 20 11月, 2014 1 次提交
-
-
由 Trond Myklebust 提交于
Both xprt_lookup_rqst() and xprt_complete_rqst() require that you take the transport lock in order to avoid races with xprt_transmit(). Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org Reviewed-by: NJeff Layton <jlayton@primarydata.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 29 8月, 2014 1 次提交
-
-
由 Chuck Lever 提交于
xprt_lookup_rqst() and bc_send_request() display a byte-swapped XID, but receive_cb_reply() does not. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 18 8月, 2014 1 次提交
-
-
由 Trond Myklebust 提交于
We really do not want to do ioctls in the server's fast path. Instead, let's use the fact that we managed to read a full record as the indicator that we should try to read the socket again. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 30 7月, 2014 2 次提交
-
-
由 Trond Myklebust 提交于
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Trond Myklebust 提交于
If requests are queued in the socket inbuffer waiting for an svc_tcp_has_wspace() requirement to be satisfied, then we do not want to clear the SOCK_NOSPACE flag until we've satisfied that requirement. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 18 7月, 2014 1 次提交
-
-
由 Chuck Lever 提交于
The current code always selects XPRT_TRANSPORT_BC_TCP for the back channel, even when the forward channel was not TCP (eg, RDMA). When a 4.1 mount is attempted with RDMA, the server panics in the TCP BC code when trying to send CB_NULL. Instead, construct the transport protocol number from the forward channel transport or'd with XPRT_TRANSPORT_BC. Transports that do not support bi-directional RPC will not have registered a "BC" transport, causing create_backchannel_client() to fail immediately. Fixes: https://bugzilla.linux-nfs.org/show_bug.cgi?id=265Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 31 5月, 2014 1 次提交
-
-
由 Kinglong Mee 提交于
When debugging, rpc prints messages from dprintk(KERN_WARNING ...) with "^A4" prefixed, [ 2780.339988] ^A4nfsd: connect from unprivileged port: 127.0.0.1, port=35316 Trond tells, > dprintk != printk. We have NEVER supported dprintk(KERN_WARNING...) This patch removes using of dprintk with KERN_WARNING. Signed-off-by: NKinglong Mee <kinglongmee@gmail.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 23 5月, 2014 2 次提交
-
-
由 NeilBrown 提交于
If an incoming NFS request is coming from the local host, then nfsd will need to perform some special handling. So detect that possibility and make the source visible in rq_local. Signed-off-by: NNeilBrown <neilb@suse.de> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Chuck Lever 提交于
An NFS/RDMA client's source port is meaningless for RDMA transports. The transport layer typically sets the source port value on the connection to a random ephemeral port. Currently, NFS server administrators must specify the "insecure" export option to enable clients to access exports via RDMA. But this means NFS clients can access such an export via IP using an ephemeral port, which may not be desirable. This patch eliminates the need to specify the "insecure" export option to allow NFS/RDMA clients access to an export. BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=250Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-