- 02 11月, 2017 1 次提交
-
-
由 Greg Kroah-Hartman 提交于
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: NKate Stewart <kstewart@linuxfoundation.org> Reviewed-by: NPhilippe Ombredanne <pombredanne@nexb.com> Reviewed-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 22 10月, 2017 1 次提交
-
-
由 Craig Gallek 提交于
Syzkaller stumbled upon a way to trigger WARNING: CPU: 1 PID: 13881 at net/core/sock_reuseport.c:41 reuseport_alloc+0x306/0x3b0 net/core/sock_reuseport.c:39 There are two initialization paths for the sock_reuseport structure in a socket: Through the udp/tcp bind paths of SO_REUSEPORT sockets or through SO_ATTACH_REUSEPORT_[CE]BPF before bind. The existing implementation assumedthat the socket lock protected both of these paths when it actually only protects the SO_ATTACH_REUSEPORT path. Syzkaller triggered this double allocation by running these paths concurrently. This patch moves the check for double allocation into the reuseport_alloc function which is protected by a global spin lock. Fixes: e32ea7e7 ("soreuseport: fast reuseport UDP socket selection") Fixes: c125e80b ("soreuseport: fast reuseport TCP socket selection") Signed-off-by: NCraig Gallek <kraig@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 10月, 2017 2 次提交
-
-
由 Matteo Croce 提交于
In the UDP code there are two leftover error messages with very few meaning. Replace them with a more descriptive error message as some users reported them as "strange network error". Signed-off-by: NMatteo Croce <mcroce@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
syzkaller found another bug in DCCP/TCP stacks [1] For the reasons explained in commit ce105008 ("tcp/dccp: fix ireq->pktopts race"), we need to make sure we do not access ireq->opt unless we own the request sock. Note the opt field is renamed to ireq_opt to ease grep games. [1] BUG: KASAN: use-after-free in ip_queue_xmit+0x1687/0x18e0 net/ipv4/ip_output.c:474 Read of size 1 at addr ffff8801c951039c by task syz-executor5/3295 CPU: 1 PID: 3295 Comm: syz-executor5 Not tainted 4.14.0-rc4+ #80 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:16 [inline] dump_stack+0x194/0x257 lib/dump_stack.c:52 print_address_description+0x73/0x250 mm/kasan/report.c:252 kasan_report_error mm/kasan/report.c:351 [inline] kasan_report+0x25b/0x340 mm/kasan/report.c:409 __asan_report_load1_noabort+0x14/0x20 mm/kasan/report.c:427 ip_queue_xmit+0x1687/0x18e0 net/ipv4/ip_output.c:474 tcp_transmit_skb+0x1ab7/0x3840 net/ipv4/tcp_output.c:1135 tcp_send_ack.part.37+0x3bb/0x650 net/ipv4/tcp_output.c:3587 tcp_send_ack+0x49/0x60 net/ipv4/tcp_output.c:3557 __tcp_ack_snd_check+0x2c6/0x4b0 net/ipv4/tcp_input.c:5072 tcp_ack_snd_check net/ipv4/tcp_input.c:5085 [inline] tcp_rcv_state_process+0x2eff/0x4850 net/ipv4/tcp_input.c:6071 tcp_child_process+0x342/0x990 net/ipv4/tcp_minisocks.c:816 tcp_v4_rcv+0x1827/0x2f80 net/ipv4/tcp_ipv4.c:1682 ip_local_deliver_finish+0x2e2/0xba0 net/ipv4/ip_input.c:216 NF_HOOK include/linux/netfilter.h:249 [inline] ip_local_deliver+0x1ce/0x6e0 net/ipv4/ip_input.c:257 dst_input include/net/dst.h:464 [inline] ip_rcv_finish+0x887/0x19a0 net/ipv4/ip_input.c:397 NF_HOOK include/linux/netfilter.h:249 [inline] ip_rcv+0xc3f/0x1820 net/ipv4/ip_input.c:493 __netif_receive_skb_core+0x1a3e/0x34b0 net/core/dev.c:4476 __netif_receive_skb+0x2c/0x1b0 net/core/dev.c:4514 netif_receive_skb_internal+0x10b/0x670 net/core/dev.c:4587 netif_receive_skb+0xae/0x390 net/core/dev.c:4611 tun_rx_batched.isra.50+0x5ed/0x860 drivers/net/tun.c:1372 tun_get_user+0x249c/0x36d0 drivers/net/tun.c:1766 tun_chr_write_iter+0xbf/0x160 drivers/net/tun.c:1792 call_write_iter include/linux/fs.h:1770 [inline] new_sync_write fs/read_write.c:468 [inline] __vfs_write+0x68a/0x970 fs/read_write.c:481 vfs_write+0x18f/0x510 fs/read_write.c:543 SYSC_write fs/read_write.c:588 [inline] SyS_write+0xef/0x220 fs/read_write.c:580 entry_SYSCALL_64_fastpath+0x1f/0xbe RIP: 0033:0x40c341 RSP: 002b:00007f469523ec10 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000718000 RCX: 000000000040c341 RDX: 0000000000000037 RSI: 0000000020004000 RDI: 0000000000000015 RBP: 0000000000000086 R08: 0000000000000000 R09: 0000000000000000 R10: 00000000000f4240 R11: 0000000000000293 R12: 00000000004b7fd1 R13: 00000000ffffffff R14: 0000000020000000 R15: 0000000000025000 Allocated by task 3295: save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59 save_stack+0x43/0xd0 mm/kasan/kasan.c:447 set_track mm/kasan/kasan.c:459 [inline] kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:551 __do_kmalloc mm/slab.c:3725 [inline] __kmalloc+0x162/0x760 mm/slab.c:3734 kmalloc include/linux/slab.h:498 [inline] tcp_v4_save_options include/net/tcp.h:1962 [inline] tcp_v4_init_req+0x2d3/0x3e0 net/ipv4/tcp_ipv4.c:1271 tcp_conn_request+0xf6d/0x3410 net/ipv4/tcp_input.c:6283 tcp_v4_conn_request+0x157/0x210 net/ipv4/tcp_ipv4.c:1313 tcp_rcv_state_process+0x8ea/0x4850 net/ipv4/tcp_input.c:5857 tcp_v4_do_rcv+0x55c/0x7d0 net/ipv4/tcp_ipv4.c:1482 tcp_v4_rcv+0x2d10/0x2f80 net/ipv4/tcp_ipv4.c:1711 ip_local_deliver_finish+0x2e2/0xba0 net/ipv4/ip_input.c:216 NF_HOOK include/linux/netfilter.h:249 [inline] ip_local_deliver+0x1ce/0x6e0 net/ipv4/ip_input.c:257 dst_input include/net/dst.h:464 [inline] ip_rcv_finish+0x887/0x19a0 net/ipv4/ip_input.c:397 NF_HOOK include/linux/netfilter.h:249 [inline] ip_rcv+0xc3f/0x1820 net/ipv4/ip_input.c:493 __netif_receive_skb_core+0x1a3e/0x34b0 net/core/dev.c:4476 __netif_receive_skb+0x2c/0x1b0 net/core/dev.c:4514 netif_receive_skb_internal+0x10b/0x670 net/core/dev.c:4587 netif_receive_skb+0xae/0x390 net/core/dev.c:4611 tun_rx_batched.isra.50+0x5ed/0x860 drivers/net/tun.c:1372 tun_get_user+0x249c/0x36d0 drivers/net/tun.c:1766 tun_chr_write_iter+0xbf/0x160 drivers/net/tun.c:1792 call_write_iter include/linux/fs.h:1770 [inline] new_sync_write fs/read_write.c:468 [inline] __vfs_write+0x68a/0x970 fs/read_write.c:481 vfs_write+0x18f/0x510 fs/read_write.c:543 SYSC_write fs/read_write.c:588 [inline] SyS_write+0xef/0x220 fs/read_write.c:580 entry_SYSCALL_64_fastpath+0x1f/0xbe Freed by task 3306: save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59 save_stack+0x43/0xd0 mm/kasan/kasan.c:447 set_track mm/kasan/kasan.c:459 [inline] kasan_slab_free+0x71/0xc0 mm/kasan/kasan.c:524 __cache_free mm/slab.c:3503 [inline] kfree+0xca/0x250 mm/slab.c:3820 inet_sock_destruct+0x59d/0x950 net/ipv4/af_inet.c:157 __sk_destruct+0xfd/0x910 net/core/sock.c:1560 sk_destruct+0x47/0x80 net/core/sock.c:1595 __sk_free+0x57/0x230 net/core/sock.c:1603 sk_free+0x2a/0x40 net/core/sock.c:1614 sock_put include/net/sock.h:1652 [inline] inet_csk_complete_hashdance+0xd5/0xf0 net/ipv4/inet_connection_sock.c:959 tcp_check_req+0xf4d/0x1620 net/ipv4/tcp_minisocks.c:765 tcp_v4_rcv+0x17f6/0x2f80 net/ipv4/tcp_ipv4.c:1675 ip_local_deliver_finish+0x2e2/0xba0 net/ipv4/ip_input.c:216 NF_HOOK include/linux/netfilter.h:249 [inline] ip_local_deliver+0x1ce/0x6e0 net/ipv4/ip_input.c:257 dst_input include/net/dst.h:464 [inline] ip_rcv_finish+0x887/0x19a0 net/ipv4/ip_input.c:397 NF_HOOK include/linux/netfilter.h:249 [inline] ip_rcv+0xc3f/0x1820 net/ipv4/ip_input.c:493 __netif_receive_skb_core+0x1a3e/0x34b0 net/core/dev.c:4476 __netif_receive_skb+0x2c/0x1b0 net/core/dev.c:4514 netif_receive_skb_internal+0x10b/0x670 net/core/dev.c:4587 netif_receive_skb+0xae/0x390 net/core/dev.c:4611 tun_rx_batched.isra.50+0x5ed/0x860 drivers/net/tun.c:1372 tun_get_user+0x249c/0x36d0 drivers/net/tun.c:1766 tun_chr_write_iter+0xbf/0x160 drivers/net/tun.c:1792 call_write_iter include/linux/fs.h:1770 [inline] new_sync_write fs/read_write.c:468 [inline] __vfs_write+0x68a/0x970 fs/read_write.c:481 vfs_write+0x18f/0x510 fs/read_write.c:543 SYSC_write fs/read_write.c:588 [inline] SyS_write+0xef/0x220 fs/read_write.c:580 entry_SYSCALL_64_fastpath+0x1f/0xbe Fixes: e994b2f0 ("tcp: do not lock listener to process SYN packets") Fixes: 079096f1 ("tcp/dccp: install syn_recv requests into ehash table") Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 13 10月, 2017 1 次提交
-
-
由 Stephen Hemminger 提交于
The kernel config help for policy routing was still pointing at an ancient document from 2000 that refers to Linux 2.1. Update it to point to something that is at least occasionally updated. Signed-off-by: NStephen Hemminger <sthemmin@microsoft.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 10月, 2017 1 次提交
-
-
由 Eric Dumazet 提交于
This reverts commit fbb1fb4a. This was not the proper fix, lets cleanly revert it, so that following patch can be carried to stable versions. sock_cgroup_ptr() callers do not expect a NULL return value. Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 10 10月, 2017 4 次提交
-
-
由 Eric Dumazet 提交于
sk_clone_lock() might run while TCP/DCCP listener already vanished. In order to prevent use after free, it is better to defer cgroup_sk_alloc() to the point we know both parent and child exist, and from process context. Fixes: e994b2f0 ("tcp: do not lock listener to process SYN packets") Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Instead of calling mem_cgroup_sk_alloc() from BH context, it is better to call it from inet_csk_accept() in process context. Not only this removes code in mem_cgroup_sk_alloc(), but it also fixes a bug since listener might have been dismantled and css_get() might cause a use-after-free. Fixes: e994b2f0 ("tcp: do not lock listener to process SYN packets") Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Paolo Abeni 提交于
The commit bc044e8d ("udp: perform source validation for mcast early demux") does not take into account that broadcast packets lands in the same code path and they need different checks for the source address - notably, zero source address are valid for bcast and invalid for mcast. As a result, 2nd and later broadcast packets with 0 source address landing to the same socket are dropped. This breaks dhcp servers. Since we don't have stringent performance requirements for ingress broadcast traffic, fix it by disabling UDP early demux such traffic. Reported-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Fixes: bc044e8d ("udp: perform source validation for mcast early demux") Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Steffen Klassert 提交于
A recent patch removed the dst_free() on the allocated dst_entry in ipv4_blackhole_route(). The dst_free() marked the dst_entry as dead and added it to the gc list. I.e. it was setup for a one time usage. As a result we may now have a blackhole route cached at a socket on some IPsec scenarios. This makes the connection unusable. Fix this by marking the dst_entry directly at allocation time as 'dead', so it is used only once. Fixes: b838d5e1 ("ipv4: mark DST_NOGC and remove the operation of dst_free()") Reported-by: NTobias Brunner <tobias@strongswan.org> Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 10月, 2017 2 次提交
-
-
由 Lin Zhang 提交于
In function {ipv4,ipv6}_synproxy_hook we expect a normal tcp packet, but the real server maybe reply an icmp error packet related to the exist tcp conntrack, so we will access wrong tcp data. Fix it by checking for the protocol field and only process tcp traffic. Signed-off-by: NLin Zhang <xiaolou4617@gmail.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Alexey Kodanev 提交于
When gso_size reset to zero for the tail segment in skb_segment(), later in ipv6_gso_segment(), __skb_udp_tunnel_segment() and gre_gso_segment() we will get incorrect results (payload length, pcsum) for that segment. inet_gso_segment() already has a check for gso_size before calculating payload. The issue was found with LTP vxlan & gre tests over ixgbe NIC. Fixes: 07b26c94 ("gso: Support partial splitting at the frag_list pointer") Signed-off-by: NAlexey Kodanev <alexey.kodanev@oracle.com> Acked-by: NAlexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 10月, 2017 4 次提交
-
-
由 Xin Long 提交于
The patch 'ip_gre: ipgre_tap device should keep dst' fixed the issue ipgre_tap dev mtu couldn't be updated in tx path. The same fix is needed for erspan as well. Fixes: 84e54fe0 ("gre: introduce native tunnel support for ERSPAN") Signed-off-by: NXin Long <lucien.xin@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Xin Long 提交于
According to __gre_tunnel_init, tunnel->hlen should be set as the headers' length between inner packet and outer iphdr. It would be used especially to calculate a proper mtu when updating mtu in tnl_update_pmtu. Now without setting it, a bigger mtu value than expected would be updated, which hurts performance a lot. This patch is to fix it by setting tunnel->hlen with: tunnel->tun_hlen + tunnel->encap_hlen + sizeof(struct erspanhdr) Fixes: 84e54fe0 ("gre: introduce native tunnel support for ERSPAN") Signed-off-by: NXin Long <lucien.xin@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Xin Long 提交于
As a ARPHRD_ETHER device, skb->len in erspan_xmit is the length of the whole ether packet. So before checking if a packet size exceeds the mtu, skb->len should subtract dev->hard_header_len. Otherwise, all packets with max size according to mtu would be trimmed to be truncated packet. Fixes: 84e54fe0 ("gre: introduce native tunnel support for ERSPAN") Signed-off-by: NXin Long <lucien.xin@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Xin Long 提交于
erspan only uses the first 10 bits of session_id as the key to look up the tunnel. But in erspan_rcv, it missed 'session_id & ID_MASK' when getting the key from session_id. If any other flag is also set in session_id in a packet, it would fail to find the tunnel due to incorrect key in erspan_rcv. This patch is to add 'session_id & ID_MASK' there and also remove the unnecessary variable session_id. Fixes: 84e54fe0 ("gre: introduce native tunnel support for ERSPAN") Signed-off-by: NXin Long <lucien.xin@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 10月, 2017 3 次提交
-
-
由 Paolo Abeni 提交于
The UDP early demux can leverate the rx dst cache even for multicast unconnected sockets. In such scenario the ipv4 source address is validated only on the first packet in the given flow. After that, when we fetch the dst entry from the socket rx cache, we stop enforcing the rp_filter and we even start accepting any kind of martian addresses. Disabling the dst cache for unconnected multicast socket will cause large performace regression, nearly reducing by half the max ingress tput. Instead we factor out a route helper to completely validate an skb source address for multicast packets and we call it from the UDP early demux for mcast packets landing on unconnected sockets, after successful fetching the related cached dst entry. This still gives a measurable, but limited performance regression: rp_filter = 0 rp_filter = 1 edmux disabled: 1182 Kpps 1127 Kpps edmux before: 2238 Kpps 2238 Kpps edmux after: 2037 Kpps 2019 Kpps The above figures are on top of current net tree. Applying the net-next commit 6e617de8 ("net: avoid a full fib lookup when rp_filter is disabled.") the delta with rp_filter == 0 will decrease even more. Fixes: 421b3885 ("udp: ipv4: Add udp early demux") Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Paolo Abeni 提交于
Currently no error is emitted, but this infrastructure will used by the next patch to allow source address validation for mcast sockets. Since early demux can do a route lookup and an ipv4 route lookup can return an error code this is consistent with the current ipv4 route infrastructure. Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Xin Long 提交于
Without keeping dst, the tunnel will not update any mtu/pmtu info, since it does not have a dst on the skb. Reproducer: client(ipgre_tap1 - eth1) <-----> (eth1 - ipgre_tap1)server After reducing eth1's mtu on client, then perforamnce became 0. This patch is to netif_keep_dst in gre_tap_init, as ipgre does. Reported-by: NJianlin Shi <jishi@redhat.com> Signed-off-by: NXin Long <lucien.xin@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 9月, 2017 1 次提交
-
-
由 Eric Dumazet 提交于
My prior fix was not complete, as we were dereferencing a pointer three times per node, not twice as I initially thought. Fixes: 4cc5b44b ("inetpeer: fix RCU lookup()") Fixes: b145425f ("inetpeer: remove AVL implementation in favor of RB tree") Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 27 9月, 2017 1 次提交
-
-
由 Alexey Kodanev 提交于
When running LTP IPsec tests, KASan might report: BUG: KASAN: use-after-free in vti_tunnel_xmit+0xeee/0xff0 [ip_vti] Read of size 4 at addr ffff880dc6ad1980 by task swapper/0/0 ... Call Trace: <IRQ> dump_stack+0x63/0x89 print_address_description+0x7c/0x290 kasan_report+0x28d/0x370 ? vti_tunnel_xmit+0xeee/0xff0 [ip_vti] __asan_report_load4_noabort+0x19/0x20 vti_tunnel_xmit+0xeee/0xff0 [ip_vti] ? vti_init_net+0x190/0x190 [ip_vti] ? save_stack_trace+0x1b/0x20 ? save_stack+0x46/0xd0 dev_hard_start_xmit+0x147/0x510 ? icmp_echo.part.24+0x1f0/0x210 __dev_queue_xmit+0x1394/0x1c60 ... Freed by task 0: save_stack_trace+0x1b/0x20 save_stack+0x46/0xd0 kasan_slab_free+0x70/0xc0 kmem_cache_free+0x81/0x1e0 kfree_skbmem+0xb1/0xe0 kfree_skb+0x75/0x170 kfree_skb_list+0x3e/0x60 __dev_queue_xmit+0x1298/0x1c60 dev_queue_xmit+0x10/0x20 neigh_resolve_output+0x3a8/0x740 ip_finish_output2+0x5c0/0xe70 ip_finish_output+0x4ba/0x680 ip_output+0x1c1/0x3a0 xfrm_output_resume+0xc65/0x13d0 xfrm_output+0x1e4/0x380 xfrm4_output_finish+0x5c/0x70 Can be fixed if we get skb->len before dst_output(). Fixes: b9959fd3 ("vti: switch to new ip tunnel code") Fixes: 22e1b23d ("vti6: Support inter address family tunneling.") Signed-off-by: NAlexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 9月, 2017 3 次提交
-
-
由 Josef Bacik 提交于
When doing my reuseport rework I screwed up and changed a if (hlist_empty(&tb->owners)) to if (!hlist_empty(&tb->owners)) This is obviously bad as all of the reuseport/reuse logic was reversed, which caused weird problems like allowing an ipv4 bind conflict if we opened an ipv4 only socket on a port followed by an ipv6 only socket on the same port. Fixes: b9470c27 ("inet: kill smallest_size and smallest_port") Reported-by: NCole Robinson <crobinso@redhat.com> Signed-off-by: NJosef Bacik <jbacik@fb.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Josef Bacik 提交于
In ipv6_rcv_saddr_equal() we need to use inet6_rcv_saddr(sk) for the ipv6 compare with the fast socket information to make sure we're doing the proper comparisons. Fixes: 637bc8bb ("inet: reset tb->fastreuseport when adding a reuseport sk") Reported-and-tested-by: NCole Robinson <crobinso@redhat.com> Signed-off-by: NJosef Bacik <jbacik@fb.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Josef Bacik 提交于
We need to set the tb->fast_sk_family properly so we can use the proper comparison function for all subsequent reuseport bind requests. Fixes: 637bc8bb ("inet: reset tb->fastreuseport when adding a reuseport sk") Reported-and-tested-by: NCole Robinson <crobinso@redhat.com> Signed-off-by: NJosef Bacik <jbacik@fb.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 9月, 2017 1 次提交
-
-
由 Eric Dumazet 提交于
Our recent change exposed a bug in TCP Fastopen Client that syzkaller found right away [1] When we prepare skb with SYN+DATA, we attempt to transmit it, and we update socket state as if the transmit was a success. In socket RTX queue we have two skbs, one with the SYN alone, and a second one containing the DATA. When (malicious) ACK comes in, we now complain that second one had no skb_mstamp. The proper fix is to make sure that if the transmit failed, we do not pretend we sent the DATA skb, and make it our send_head. When 3WHS completes, we can now send the DATA right away, without having to wait for a timeout. [1] WARNING: CPU: 0 PID: 100189 at net/ipv4/tcp_input.c:3117 tcp_clean_rtx_queue+0x2057/0x2ab0 net/ipv4/tcp_input.c:3117() WARN_ON_ONCE(last_ackt == 0); Modules linked in: CPU: 0 PID: 100189 Comm: syz-executor1 Not tainted Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 0000000000000000 ffff8800b35cb1d8 ffffffff81cad00d 0000000000000000 ffffffff828a4347 ffff88009f86c080 ffffffff8316eb20 0000000000000d7f ffff8800b35cb220 ffffffff812c33c2 ffff8800baad2440 00000009d46575c0 Call Trace: [<ffffffff81cad00d>] __dump_stack [<ffffffff81cad00d>] dump_stack+0xc1/0x124 [<ffffffff812c33c2>] warn_slowpath_common+0xe2/0x150 [<ffffffff812c361e>] warn_slowpath_null+0x2e/0x40 [<ffffffff828a4347>] tcp_clean_rtx_queue+0x2057/0x2ab0 n [<ffffffff828ae6fd>] tcp_ack+0x151d/0x3930 [<ffffffff828baa09>] tcp_rcv_state_process+0x1c69/0x4fd0 [<ffffffff828efb7f>] tcp_v4_do_rcv+0x54f/0x7c0 [<ffffffff8258aacb>] sk_backlog_rcv [<ffffffff8258aacb>] __release_sock+0x12b/0x3a0 [<ffffffff8258ad9e>] release_sock+0x5e/0x1c0 [<ffffffff8294a785>] inet_wait_for_connect [<ffffffff8294a785>] __inet_stream_connect+0x545/0xc50 [<ffffffff82886f08>] tcp_sendmsg_fastopen [<ffffffff82886f08>] tcp_sendmsg+0x2298/0x35a0 [<ffffffff82952515>] inet_sendmsg+0xe5/0x520 [<ffffffff8257152f>] sock_sendmsg_nosec [<ffffffff8257152f>] sock_sendmsg+0xcf/0x110 Fixes: 8c72c65b ("tcp: update skb->skb_mstamp more carefully") Fixes: 783237e8 ("net-tcp: Fast Open client - sending SYN-data") Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: NDmitry Vyukov <dvyukov@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 9月, 2017 1 次提交
-
-
由 Yuchung Cheng 提交于
remove tcp_may_send_now and tcp_snd_test that are no longer used Fixes: 840a3cbe ("tcp: remove forward retransmit feature") Signed-off-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 9月, 2017 1 次提交
-
-
由 Eric Dumazet 提交于
Now skb->mstamp_skb is updated later, we also need to call tcp_rate_skb_sent() after the update is done. Fixes: 8c72c65b ("tcp: update skb->skb_mstamp more carefully") Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NSoheil Hassas Yeganeh <soheil@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 9月, 2017 2 次提交
-
-
由 Eric Dumazet 提交于
liujian reported a problem in TCP_USER_TIMEOUT processing with a patch in tcp_probe_timer() : https://www.spinics.net/lists/netdev/msg454496.html After investigations, the root cause of the problem is that we update skb->skb_mstamp of skbs in write queue, even if the attempt to send a clone or copy of it failed. One reason being a routing problem. This patch prevents this, solving liujian issue. It also removes a potential RTT miscalculation, since __tcp_retransmit_skb() is not OR-ing TCP_SKB_CB(skb)->sacked with TCPCB_EVER_RETRANS if a failure happens, but skb->skb_mstamp has been changed. A future ACK would then lead to a very small RTT sample and min_rtt would then be lowered to this too small value. Tested: # cat user_timeout.pkt --local_ip=192.168.102.64 0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 +0 bind(3, ..., ...) = 0 +0 listen(3, 1) = 0 +0 `ifconfig tun0 192.168.102.64/16; ip ro add 192.0.2.1 dev tun0` +0 < S 0:0(0) win 0 <mss 1460> +0 > S. 0:0(0) ack 1 <mss 1460> +.1 < . 1:1(0) ack 1 win 65530 +0 accept(3, ..., ...) = 4 +0 setsockopt(4, SOL_TCP, TCP_USER_TIMEOUT, [3000], 4) = 0 +0 write(4, ..., 24) = 24 +0 > P. 1:25(24) ack 1 win 29200 +.1 < . 1:1(0) ack 25 win 65530 //change the ipaddress +1 `ifconfig tun0 192.168.0.10/16` +1 write(4, ..., 24) = 24 +1 write(4, ..., 24) = 24 +1 write(4, ..., 24) = 24 +1 write(4, ..., 24) = 24 +0 `ifconfig tun0 192.168.102.64/16` +0 < . 1:2(1) ack 25 win 65530 +0 `ifconfig tun0 192.168.0.10/16` +3 write(4, ..., 24) = -1 # ./packetdrill user_timeout.pkt Signed-off-by: NEric Dumazet <edumazet@googl.com> Reported-by: Nliujian <liujian56@huawei.com> Acked-by: NNeal Cardwell <ncardwell@google.com> Acked-by: NYuchung Cheng <ycheng@google.com> Acked-by: NSoheil Hassas Yeganeh <soheil@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Ahern 提交于
rt_iif is only set to the actual egress device for the output path. The recent change to consider the l3slave flag when returning IP_PKTINFO works for local traffic (the correct device index is returned), but it broke the more typical use case of packets received from a remote host always returning the VRF index rather than the original ingress device. Update the fixup to consider l3slave and rt_iif actually getting set. Fixes: 1dfa7639 ("net: ipv4: add check for l3slave for index returned in IP_PKTINFO") Signed-off-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 13 9月, 2017 2 次提交
-
-
由 Haishuang Yan 提交于
In collect_md mode, if the tun dev is down, it still can call ip_tunnel_rcv to receive on packets, and the rx statistics increase improperly. When the md tunnel is down, it's not neccessary to increase RX drops for the tunnel device, packets would be recieved on fallback tunnel, and the RX drops on fallback device will be increased as expected. Fixes: 2e15ea39 ("ip_gre: Add support to collect tunnel metadata.") Cc: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: NHaishuang Yan <yanhaishuang@cmss.chinamobile.com> Acked-by: NPravin B Shelar <pshelar@ovn.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Back in linux-4.4, I inadvertently put a call to reqsk_put() in inet_child_forget(), forgetting it could be called from two different points. In the case it is called from inet_csk_reqsk_queue_add(), we want to keep the reference on the request socket, since it is released later by the caller (tcp_v{4|6}_rcv()) This bug never showed up because atomic_dec_and_test() was not signaling the underflow, and SLAB_DESTROY_BY RCU semantic for request sockets prevented the request to be put in quarantine. Recent conversion of socket refcount from atomic_t to refcount_t finally exposed the bug. So move the reqsk_put() to inet_csk_listen_stop() to fix this. Thanks to Shankara Pailoor for using syzkaller and providing a nice set of .config and C repro. WARNING: CPU: 2 PID: 4277 at lib/refcount.c:186 refcount_sub_and_test+0x167/0x1b0 lib/refcount.c:186 Kernel panic - not syncing: panic_on_warn set ... CPU: 2 PID: 4277 Comm: syz-executor0 Not tainted 4.13.0-rc7 #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:16 [inline] dump_stack+0xf7/0x1aa lib/dump_stack.c:52 panic+0x1ae/0x3a7 kernel/panic.c:180 __warn+0x1c4/0x1d9 kernel/panic.c:541 report_bug+0x211/0x2d0 lib/bug.c:183 fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:190 do_trap_no_signal arch/x86/kernel/traps.c:224 [inline] do_trap+0x260/0x390 arch/x86/kernel/traps.c:273 do_error_trap+0x118/0x340 arch/x86/kernel/traps.c:310 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:323 invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:846 RIP: 0010:refcount_sub_and_test+0x167/0x1b0 lib/refcount.c:186 RSP: 0018:ffff88006e006b60 EFLAGS: 00010286 RAX: 0000000000000026 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000026 RSI: 1ffff1000dc00d2c RDI: ffffed000dc00d60 RBP: ffff88006e006bf0 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff1000dc00d6d R13: 00000000ffffffff R14: 0000000000000001 R15: ffff88006ce9d340 refcount_dec_and_test+0x1a/0x20 lib/refcount.c:211 reqsk_put+0x71/0x2b0 include/net/request_sock.h:123 tcp_v4_rcv+0x259e/0x2e20 net/ipv4/tcp_ipv4.c:1729 ip_local_deliver_finish+0x2e2/0xba0 net/ipv4/ip_input.c:216 NF_HOOK include/linux/netfilter.h:248 [inline] ip_local_deliver+0x1ce/0x6d0 net/ipv4/ip_input.c:257 dst_input include/net/dst.h:477 [inline] ip_rcv_finish+0x8db/0x19c0 net/ipv4/ip_input.c:397 NF_HOOK include/linux/netfilter.h:248 [inline] ip_rcv+0xc3f/0x17d0 net/ipv4/ip_input.c:488 __netif_receive_skb_core+0x1fb7/0x31f0 net/core/dev.c:4298 __netif_receive_skb+0x2c/0x1b0 net/core/dev.c:4336 process_backlog+0x1c5/0x6d0 net/core/dev.c:5102 napi_poll net/core/dev.c:5499 [inline] net_rx_action+0x6d3/0x14a0 net/core/dev.c:5565 __do_softirq+0x2cb/0xb2d kernel/softirq.c:284 do_softirq_own_stack+0x1c/0x30 arch/x86/entry/entry_64.S:898 </IRQ> do_softirq.part.16+0x63/0x80 kernel/softirq.c:328 do_softirq kernel/softirq.c:176 [inline] __local_bh_enable_ip+0x84/0x90 kernel/softirq.c:181 local_bh_enable include/linux/bottom_half.h:31 [inline] rcu_read_unlock_bh include/linux/rcupdate.h:705 [inline] ip_finish_output2+0x8ad/0x1360 net/ipv4/ip_output.c:231 ip_finish_output+0x74e/0xb80 net/ipv4/ip_output.c:317 NF_HOOK_COND include/linux/netfilter.h:237 [inline] ip_output+0x1cc/0x850 net/ipv4/ip_output.c:405 dst_output include/net/dst.h:471 [inline] ip_local_out+0x95/0x160 net/ipv4/ip_output.c:124 ip_queue_xmit+0x8c6/0x1810 net/ipv4/ip_output.c:504 tcp_transmit_skb+0x1963/0x3320 net/ipv4/tcp_output.c:1123 tcp_send_ack.part.35+0x38c/0x620 net/ipv4/tcp_output.c:3575 tcp_send_ack+0x49/0x60 net/ipv4/tcp_output.c:3545 tcp_rcv_synsent_state_process net/ipv4/tcp_input.c:5795 [inline] tcp_rcv_state_process+0x4876/0x4b60 net/ipv4/tcp_input.c:5930 tcp_v4_do_rcv+0x58a/0x820 net/ipv4/tcp_ipv4.c:1483 sk_backlog_rcv include/net/sock.h:907 [inline] __release_sock+0x124/0x360 net/core/sock.c:2223 release_sock+0xa4/0x2a0 net/core/sock.c:2715 inet_wait_for_connect net/ipv4/af_inet.c:557 [inline] __inet_stream_connect+0x671/0xf00 net/ipv4/af_inet.c:643 inet_stream_connect+0x58/0xa0 net/ipv4/af_inet.c:682 SYSC_connect+0x204/0x470 net/socket.c:1628 SyS_connect+0x24/0x30 net/socket.c:1609 entry_SYSCALL_64_fastpath+0x18/0xad RIP: 0033:0x451e59 RSP: 002b:00007f474843fc08 EFLAGS: 00000216 ORIG_RAX: 000000000000002a RAX: ffffffffffffffda RBX: 0000000000718000 RCX: 0000000000451e59 RDX: 0000000000000010 RSI: 0000000020002000 RDI: 0000000000000007 RBP: 0000000000000046 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000216 R12: 0000000000000000 R13: 00007ffc040a0f8f R14: 00007f47484409c0 R15: 0000000000000000 Fixes: ebb516af ("tcp/dccp: fix race at listener dismantle phase") Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: NShankara Pailoor <sp3485@columbia.edu> Tested-by: NShankara Pailoor <sp3485@columbia.edu> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 9月, 2017 3 次提交
-
-
由 Haishuang Yan 提交于
ttl and tos variables are declared and assigned, but are not used in iptunnel_xmit() function. Fixes: cfc7381b ("ip_tunnel: add collect_md mode to IPIP tunnel") Cc: Alexei Starovoitov <ast@fb.com> Signed-off-by: NHaishuang Yan <yanhaishuang@cmss.chinamobile.com> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
While the cited commit fixed a possible deadlock, it added a leak of the request socket, since reqsk_put() must be called if the BPF filter decided the ACK packet must be dropped. Fixes: d624d276 ("tcp: fix possible deadlock in TCP stack vs BPF filter") Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Florian Westphal 提交于
There are reports about spurious softlockups during iptables-restore, a backtrace i saw points at get_counters -- it uses a sequence lock and also has unbounded restart loop. Signed-off-by: NFlorian Westphal <fw@strlen.de> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
- 08 9月, 2017 1 次提交
-
-
由 Paolo Abeni 提交于
After commit 0ddf3fb2 ("udp: preserve skb->dst if required for IP options processing") we clear the skb head state as soon as the skb carrying them is first processed. Since the same skb can be processed several times when MSG_PEEK is used, we can end up lacking the required head states, and eventually oopsing. Fix this clearing the skb head state only when processing the last skb reference. Reported-by: NEric Dumazet <edumazet@google.com> Fixes: 0ddf3fb2 ("udp: preserve skb->dst if required for IP options processing") Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 9月, 2017 5 次提交
-
-
由 Varsha Rao 提交于
This patch removes CONFIG_NETFILTER_DEBUG and _ASSERT() macros as they are no longer required. Replace _ASSERT() macros with WARN_ON(). Signed-off-by: NVarsha Rao <rvarsha016@gmail.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Varsha Rao 提交于
This patch removes NF_CT_ASSERT() and instead uses WARN_ON(). Signed-off-by: NVarsha Rao <rvarsha016@gmail.com>
-
由 Florian Westphal 提交于
tested with allmodconfig build. Signed-off-by: NFlorian Westphal <fw@strlen.de>
-
由 Jesper Dangaard Brouer 提交于
This reverts commit 1d6119ba. After reverting commit 6d7b857d ("net: use lib/percpu_counter API for fragmentation mem accounting") then here is no need for this fix-up patch. As percpu_counter is no longer used, it cannot memory leak it any-longer. Fixes: 6d7b857d ("net: use lib/percpu_counter API for fragmentation mem accounting") Fixes: 1d6119ba ("net: fix percpu memory leaks") Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jesper Dangaard Brouer 提交于
This reverts commit 6d7b857d. There is a bug in fragmentation codes use of the percpu_counter API, that can cause issues on systems with many CPUs. The frag_mem_limit() just reads the global counter (fbc->count), without considering other CPUs can have upto batch size (130K) that haven't been subtracted yet. Due to the 3MBytes lower thresh limit, this become dangerous at >=24 CPUs (3*1024*1024/130000=24). The correct API usage would be to use __percpu_counter_compare() which does the right thing, and takes into account the number of (online) CPUs and batch size, to account for this and call __percpu_counter_sum() when needed. We choose to revert the use of the lib/percpu_counter API for frag memory accounting for several reasons: 1) On systems with CPUs > 24, the heavier fully locked __percpu_counter_sum() is always invoked, which will be more expensive than the atomic_t that is reverted to. Given systems with more than 24 CPUs are becoming common this doesn't seem like a good option. To mitigate this, the batch size could be decreased and thresh be increased. 2) The add_frag_mem_limit+sub_frag_mem_limit pairs happen on the RX CPU, before SKBs are pushed into sockets on remote CPUs. Given NICs can only hash on L2 part of the IP-header, the NIC-RXq's will likely be limited. Thus, a fair chance that atomic add+dec happen on the same CPU. Revert note that commit 1d6119ba ("net: fix percpu memory leaks") removed init_frag_mem_limit() and instead use inet_frags_init_net(). After this revert, inet_frags_uninit_net() becomes empty. Fixes: 6d7b857d ("net: use lib/percpu_counter API for fragmentation mem accounting") Fixes: 1d6119ba ("net: fix percpu memory leaks") Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com> Acked-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-