- 07 12月, 2021 10 次提交
-
-
由 Eric Dumazet 提交于
Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Eric Dumazet 提交于
Note that other ip_tunnel users do not seem to hold a reference on tunnel->dev. Probably needs some investigations. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Eric Dumazet 提交于
Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Eric Dumazet 提交于
We want to track all dev_hold()/dev_put() to ease leak hunting. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Eric Dumazet 提交于
We want to track all dev_hold()/dev_put() to ease leak hunting. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Eric Dumazet 提交于
Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Eric Dumazet 提交于
This helper might hold a netdev reference for a long time, lets add reference tracking. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Eric Dumazet 提交于
This will help debugging pesky netdev reference leaks. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Eric Dumazet 提交于
This helps debugging net device refcount leaks. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Eric Dumazet 提交于
net device are refcounted. Over the years we had numerous bugs caused by imbalanced dev_hold() and dev_put() calls. The general idea is to be able to precisely pair each decrement with a corresponding prior increment. Both share a cookie, basically a pointer to private data storing stack traces. This patch adds dev_hold_track() and dev_put_track(). To use these helpers, each data structure owning a refcount should also use a "netdevice_tracker" to pair the hold and put. netdevice_tracker dev_tracker; ... dev_hold_track(dev, &dev_tracker, GFP_ATOMIC); ... dev_put_track(dev, &dev_tracker); Whenever a leak happens, we will get precise stack traces of the point dev_hold_track() happened, at device dismantle phase. We will also get a stack trace if too many dev_put_track() for the same netdevice_tracker are attempted. This is guarded by CONFIG_NET_DEV_REFCNT_TRACKER option. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 03 12月, 2021 1 次提交
-
-
由 Chris Mi 提交于
Short-lived connections increase the insertion rate requirements, fill the offload table and provide very limited offload value since they process a very small amount of packets. The ct ASSURED flag is designed to filter short-lived connections for early expiration. Offload connections when they are ESTABLISHED and ASSURED. Signed-off-by: NChris Mi <cmi@nvidia.com> Reviewed-by: NOz Shlomo <ozsh@nvidia.com> Reviewed-by: NRoi Dayan <roid@nvidia.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 12月, 2021 13 次提交
-
-
由 Xu Wang 提交于
The 'if (dev)' statement already move into dev_{put , hold}, so remove redundant if statements. Signed-off-by: NXu Wang <vulab@iscas.ac.cn> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Xu Wang 提交于
The 'if (dev)' statement already move into dev_{put , hold}, so remove redundant if statements. Signed-off-by: NXu Wang <vulab@iscas.ac.cn> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiapeng Chong 提交于
Eliminate the follow smatch warning: net/ipv6/ip6_offload.c:249 ipv6_gro_receive() warn: inconsistent indenting. Reported-by: NAbaci Robot <abaci@linux.alibaba.com> Signed-off-by: NJiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 William Kucharski 提交于
Correct an error where setting /proc/sys/net/rds/tcp/rds_tcp_rcvbuf would instead modify the socket's sk_sndbuf and would leave sk_rcvbuf untouched. Fixes: c6a58ffe ("RDS: TCP: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket") Signed-off-by: NWilliam Kucharski <william.kucharski@oracle.com> Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Matt Johnston 提交于
We need to test against the existing route type, not the rtm_type in the netlink request. Fixes: 83f0a0b7 ("mctp: Specify route types, require rtm_type in RTM_*ROUTE messages") Signed-off-by: NMatt Johnston <matt@codeconstruct.com.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tony Lu 提交于
When smc_close_final() returns error, the return code overwrites by kernel_sock_shutdown() in smc_close_active(). The return code of smc_close_final() is more important than kernel_sock_shutdown(), and it will pass to userspace directly. Fix it by keeping both return codes, if smc_close_final() raises an error, return it or kernel_sock_shutdown()'s. Link: https://lore.kernel.org/linux-s390/1f67548e-cbf6-0dce-82b5-10288a4583bd@linux.ibm.com/ Fixes: 606a63c9 ("net/smc: Ensure the active closing peer first closes clcsock") Suggested-by: NKarsten Graul <kgraul@linux.ibm.com> Signed-off-by: NTony Lu <tonylu@linux.alibaba.com> Reviewed-by: NWen Gu <guwen@linux.alibaba.com> Acked-by: NKarsten Graul <kgraul@linux.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Dust Li 提交于
smc_lgr_cleanup_early() meant to delete the link group from the link group list, but it deleted the list head by mistake. This may cause memory corruption since we didn't remove the real link group from the list and later memseted the link group structure. We got a list corruption panic when testing: [ 231.277259] list_del corruption. prev->next should be ffff8881398a8000, but was 0000000000000000 [ 231.278222] ------------[ cut here ]------------ [ 231.278726] kernel BUG at lib/list_debug.c:53! [ 231.279326] invalid opcode: 0000 [#1] SMP NOPTI [ 231.279803] CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted 5.10.46+ #435 [ 231.280466] Hardware name: Alibaba Cloud ECS, BIOS 8c24b4c 04/01/2014 [ 231.281248] Workqueue: events smc_link_down_work [ 231.281732] RIP: 0010:__list_del_entry_valid+0x70/0x90 [ 231.282258] Code: 4c 60 82 e8 7d cc 6a 00 0f 0b 48 89 fe 48 c7 c7 88 4c 60 82 e8 6c cc 6a 00 0f 0b 48 89 fe 48 c7 c7 c0 4c 60 82 e8 5b cc 6a 00 <0f> 0b 48 89 fe 48 c7 c7 00 4d 60 82 e8 4a cc 6a 00 0f 0b cc cc cc [ 231.284146] RSP: 0018:ffffc90000033d58 EFLAGS: 00010292 [ 231.284685] RAX: 0000000000000054 RBX: ffff8881398a8000 RCX: 0000000000000000 [ 231.285415] RDX: 0000000000000001 RSI: ffff88813bc18040 RDI: ffff88813bc18040 [ 231.286141] RBP: ffffffff8305ad40 R08: 0000000000000003 R09: 0000000000000001 [ 231.286873] R10: ffffffff82803da0 R11: ffffc90000033b90 R12: 0000000000000001 [ 231.287606] R13: 0000000000000000 R14: ffff8881398a8000 R15: 0000000000000003 [ 231.288337] FS: 0000000000000000(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000 [ 231.289160] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 231.289754] CR2: 0000000000e72058 CR3: 000000010fa96006 CR4: 00000000003706f0 [ 231.290485] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 231.291211] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 231.291940] Call Trace: [ 231.292211] smc_lgr_terminate_sched+0x53/0xa0 [ 231.292677] smc_switch_conns+0x75/0x6b0 [ 231.293085] ? update_load_avg+0x1a6/0x590 [ 231.293517] ? ttwu_do_wakeup+0x17/0x150 [ 231.293907] ? update_load_avg+0x1a6/0x590 [ 231.294317] ? newidle_balance+0xca/0x3d0 [ 231.294716] smcr_link_down+0x50/0x1a0 [ 231.295090] ? __wake_up_common_lock+0x77/0x90 [ 231.295534] smc_link_down_work+0x46/0x60 [ 231.295933] process_one_work+0x18b/0x350 Fixes: a0a62ee1 ("net/smc: separate locks for SMCD and SMCR link group lists") Signed-off-by: NDust Li <dust.li@linux.alibaba.com> Acked-by: NKarsten Graul <kgraul@linux.ibm.com> Reviewed-by: NTony Lu <tonylu@linux.alibaba.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Before commit faa041a4 ("ipv4: Create cleanup helper for fib_nh") changes to net->ipv4.fib_num_tclassid_users were protected by RTNL. After the change, this is no longer the case, as free_fib_info_rcu() runs after rcu grace period, without rtnl being held. Fixes: faa041a4 ("ipv4: Create cleanup helper for fib_nh") Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: David Ahern <dsahern@kernel.org> Reviewed-by: NDavid Ahern <dsahern@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
syzbot found that __dev_queue_xmit() is reading txq->xmit_lock_owner without annotations. No serious issue there, let's document what is happening there. BUG: KCSAN: data-race in __dev_queue_xmit / __dev_queue_xmit write to 0xffff888139d09484 of 4 bytes by interrupt on cpu 0: __netif_tx_unlock include/linux/netdevice.h:4437 [inline] __dev_queue_xmit+0x948/0xf70 net/core/dev.c:4229 dev_queue_xmit_accel+0x19/0x20 net/core/dev.c:4265 macvlan_queue_xmit drivers/net/macvlan.c:543 [inline] macvlan_start_xmit+0x2b3/0x3d0 drivers/net/macvlan.c:567 __netdev_start_xmit include/linux/netdevice.h:4987 [inline] netdev_start_xmit include/linux/netdevice.h:5001 [inline] xmit_one+0x105/0x2f0 net/core/dev.c:3590 dev_hard_start_xmit+0x72/0x120 net/core/dev.c:3606 sch_direct_xmit+0x1b2/0x7c0 net/sched/sch_generic.c:342 __dev_xmit_skb+0x83d/0x1370 net/core/dev.c:3817 __dev_queue_xmit+0x590/0xf70 net/core/dev.c:4194 dev_queue_xmit+0x13/0x20 net/core/dev.c:4259 neigh_hh_output include/net/neighbour.h:511 [inline] neigh_output include/net/neighbour.h:525 [inline] ip6_finish_output2+0x995/0xbb0 net/ipv6/ip6_output.c:126 __ip6_finish_output net/ipv6/ip6_output.c:191 [inline] ip6_finish_output+0x444/0x4c0 net/ipv6/ip6_output.c:201 NF_HOOK_COND include/linux/netfilter.h:296 [inline] ip6_output+0x10e/0x210 net/ipv6/ip6_output.c:224 dst_output include/net/dst.h:450 [inline] NF_HOOK include/linux/netfilter.h:307 [inline] ndisc_send_skb+0x486/0x610 net/ipv6/ndisc.c:508 ndisc_send_rs+0x3b0/0x3e0 net/ipv6/ndisc.c:702 addrconf_rs_timer+0x370/0x540 net/ipv6/addrconf.c:3898 call_timer_fn+0x2e/0x240 kernel/time/timer.c:1421 expire_timers+0x116/0x240 kernel/time/timer.c:1466 __run_timers+0x368/0x410 kernel/time/timer.c:1734 run_timer_softirq+0x2e/0x60 kernel/time/timer.c:1747 __do_softirq+0x158/0x2de kernel/softirq.c:558 __irq_exit_rcu kernel/softirq.c:636 [inline] irq_exit_rcu+0x37/0x70 kernel/softirq.c:648 sysvec_apic_timer_interrupt+0x3e/0xb0 arch/x86/kernel/apic/apic.c:1097 asm_sysvec_apic_timer_interrupt+0x12/0x20 read to 0xffff888139d09484 of 4 bytes by interrupt on cpu 1: __dev_queue_xmit+0x5e3/0xf70 net/core/dev.c:4213 dev_queue_xmit_accel+0x19/0x20 net/core/dev.c:4265 macvlan_queue_xmit drivers/net/macvlan.c:543 [inline] macvlan_start_xmit+0x2b3/0x3d0 drivers/net/macvlan.c:567 __netdev_start_xmit include/linux/netdevice.h:4987 [inline] netdev_start_xmit include/linux/netdevice.h:5001 [inline] xmit_one+0x105/0x2f0 net/core/dev.c:3590 dev_hard_start_xmit+0x72/0x120 net/core/dev.c:3606 sch_direct_xmit+0x1b2/0x7c0 net/sched/sch_generic.c:342 __dev_xmit_skb+0x83d/0x1370 net/core/dev.c:3817 __dev_queue_xmit+0x590/0xf70 net/core/dev.c:4194 dev_queue_xmit+0x13/0x20 net/core/dev.c:4259 neigh_resolve_output+0x3db/0x410 net/core/neighbour.c:1523 neigh_output include/net/neighbour.h:527 [inline] ip6_finish_output2+0x9be/0xbb0 net/ipv6/ip6_output.c:126 __ip6_finish_output net/ipv6/ip6_output.c:191 [inline] ip6_finish_output+0x444/0x4c0 net/ipv6/ip6_output.c:201 NF_HOOK_COND include/linux/netfilter.h:296 [inline] ip6_output+0x10e/0x210 net/ipv6/ip6_output.c:224 dst_output include/net/dst.h:450 [inline] NF_HOOK include/linux/netfilter.h:307 [inline] ndisc_send_skb+0x486/0x610 net/ipv6/ndisc.c:508 ndisc_send_rs+0x3b0/0x3e0 net/ipv6/ndisc.c:702 addrconf_rs_timer+0x370/0x540 net/ipv6/addrconf.c:3898 call_timer_fn+0x2e/0x240 kernel/time/timer.c:1421 expire_timers+0x116/0x240 kernel/time/timer.c:1466 __run_timers+0x368/0x410 kernel/time/timer.c:1734 run_timer_softirq+0x2e/0x60 kernel/time/timer.c:1747 __do_softirq+0x158/0x2de kernel/softirq.c:558 __irq_exit_rcu kernel/softirq.c:636 [inline] irq_exit_rcu+0x37/0x70 kernel/softirq.c:648 sysvec_apic_timer_interrupt+0x8d/0xb0 arch/x86/kernel/apic/apic.c:1097 asm_sysvec_apic_timer_interrupt+0x12/0x20 kcsan_setup_watchpoint+0x94/0x420 kernel/kcsan/core.c:443 folio_test_anon include/linux/page-flags.h:581 [inline] PageAnon include/linux/page-flags.h:586 [inline] zap_pte_range+0x5ac/0x10e0 mm/memory.c:1347 zap_pmd_range mm/memory.c:1467 [inline] zap_pud_range mm/memory.c:1496 [inline] zap_p4d_range mm/memory.c:1517 [inline] unmap_page_range+0x2dc/0x3d0 mm/memory.c:1538 unmap_single_vma+0x157/0x210 mm/memory.c:1583 unmap_vmas+0xd0/0x180 mm/memory.c:1615 exit_mmap+0x23d/0x470 mm/mmap.c:3170 __mmput+0x27/0x1b0 kernel/fork.c:1113 mmput+0x3d/0x50 kernel/fork.c:1134 exit_mm+0xdb/0x170 kernel/exit.c:507 do_exit+0x608/0x17a0 kernel/exit.c:819 do_group_exit+0xce/0x180 kernel/exit.c:929 get_signal+0xfc3/0x1550 kernel/signal.c:2852 arch_do_signal_or_restart+0x8c/0x2e0 arch/x86/kernel/signal.c:868 handle_signal_work kernel/entry/common.c:148 [inline] exit_to_user_mode_loop kernel/entry/common.c:172 [inline] exit_to_user_mode_prepare+0x113/0x190 kernel/entry/common.c:207 __syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline] syscall_exit_to_user_mode+0x20/0x40 kernel/entry/common.c:300 do_syscall_64+0x50/0xd0 arch/x86/entry/common.c:86 entry_SYSCALL_64_after_hwframe+0x44/0xae value changed: 0x00000000 -> 0xffffffff Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 28712 Comm: syz-executor.0 Tainted: G W 5.16.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Fixes: 1da177e4 ("Linux-2.6.12-rc2") Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: Nsyzbot <syzkaller@googlegroups.com> Link: https://lore.kernel.org/r/20211130170155.2331929-1-eric.dumazet@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Eric Dumazet 提交于
This reverts commit aeeecb88. The new SNMP variable (TCPSmallQueueFailure) can be incremented for good reasons, even on a 100Gbit single TCP_STREAM flow. If we really wanted to ease driver debugging [1], this would require something more sophisticated. [1] Usually, if a driver is delaying TX completions too much, this can lead to stalls in TCP output. Various work arounds have been used in the past, like skb_orphan() in ndo_start_xmit(). Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Menglong Dong <imagedong@tencent.com> Link: https://lore.kernel.org/r/20211201033246.2826224-1-eric.dumazet@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Russell King (Oracle) 提交于
Support the use of phylink_generic_validate() when there is no phylink_validate method given in the DSA switch operations and mac_capabilities have been set in the phylink_config structure by the DSA switch driver. This gives DSA switch drivers the option to use this if they provide the supported_interfaces and mac_capabilities, while still giving them an option to override the default implementation if necessary. Reviewed-by: NVladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: NRussell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: NMarek Behún <kabel@kernel.org> Reviewed-by: NAndrew Lunn <andrew@lunn.ch> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Russell King (Oracle) 提交于
Phylink needs slightly more information than phylink_get_interfaces() allows us to get from the DSA drivers - we need the MAC capabilities. Replace the phylink_get_interfaces() method with phylink_get_caps() to allow DSA drivers to fill in the phylink_config MAC capabilities field as well. Signed-off-by: NRussell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: NMarek Behún <kabel@kernel.org> Reviewed-by: NAndrew Lunn <andrew@lunn.ch> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Russell King (Oracle) 提交于
The code in port.c and slave.c creating the phylink instance is very similar - let's consolidate this into a single function. Signed-off-by: NRussell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: NMarek Behún <kabel@kernel.org> Reviewed-by: NAndrew Lunn <andrew@lunn.ch> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 01 12月, 2021 2 次提交
-
-
由 Yang Yingliang 提交于
The skb will be checked inside kfree_skb(), so remove the outside check. Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20211130031243.768823-1-yangyingliang@huawei.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Harshit Mogalapalli 提交于
Adding a check on len parameter to avoid empty skb. This prevents a division error in netem_enqueue function which is caused when skb->len=0 and skb->data_len=0 in the randomized corruption step as shown below. skb->data[prandom_u32() % skb_headlen(skb)] ^= 1<<(prandom_u32() % 8); Crash Report: [ 343.170349] netdevsim netdevsim0 netdevsim3: set [1, 0] type 2 family 0 port 6081 - 0 [ 343.216110] netem: version 1.3 [ 343.235841] divide error: 0000 [#1] PREEMPT SMP KASAN NOPTI [ 343.236680] CPU: 3 PID: 4288 Comm: reproducer Not tainted 5.16.0-rc1+ [ 343.237569] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 [ 343.238707] RIP: 0010:netem_enqueue+0x1590/0x33c0 [sch_netem] [ 343.239499] Code: 89 85 58 ff ff ff e8 5f 5d e9 d3 48 8b b5 48 ff ff ff 8b 8d 50 ff ff ff 8b 85 58 ff ff ff 48 8b bd 70 ff ff ff 31 d2 2b 4f 74 <f7> f1 48 b8 00 00 00 00 00 fc ff df 49 01 d5 4c 89 e9 48 c1 e9 03 [ 343.241883] RSP: 0018:ffff88800bcd7368 EFLAGS: 00010246 [ 343.242589] RAX: 00000000ba7c0a9c RBX: 0000000000000001 RCX: 0000000000000000 [ 343.243542] RDX: 0000000000000000 RSI: ffff88800f8edb10 RDI: ffff88800f8eda40 [ 343.244474] RBP: ffff88800bcd7458 R08: 0000000000000000 R09: ffffffff94fb8445 [ 343.245403] R10: ffffffff94fb8336 R11: ffffffff94fb8445 R12: 0000000000000000 [ 343.246355] R13: ffff88800a5a7000 R14: ffff88800a5b5800 R15: 0000000000000020 [ 343.247291] FS: 00007fdde2bd7700(0000) GS:ffff888109780000(0000) knlGS:0000000000000000 [ 343.248350] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 343.249120] CR2: 00000000200000c0 CR3: 000000000ef4c000 CR4: 00000000000006e0 [ 343.250076] Call Trace: [ 343.250423] <TASK> [ 343.250713] ? memcpy+0x4d/0x60 [ 343.251162] ? netem_init+0xa0/0xa0 [sch_netem] [ 343.251795] ? __sanitizer_cov_trace_pc+0x21/0x60 [ 343.252443] netem_enqueue+0xe28/0x33c0 [sch_netem] [ 343.253102] ? stack_trace_save+0x87/0xb0 [ 343.253655] ? filter_irq_stacks+0xb0/0xb0 [ 343.254220] ? netem_init+0xa0/0xa0 [sch_netem] [ 343.254837] ? __kasan_check_write+0x14/0x20 [ 343.255418] ? _raw_spin_lock+0x88/0xd6 [ 343.255953] dev_qdisc_enqueue+0x50/0x180 [ 343.256508] __dev_queue_xmit+0x1a7e/0x3090 [ 343.257083] ? netdev_core_pick_tx+0x300/0x300 [ 343.257690] ? check_kcov_mode+0x10/0x40 [ 343.258219] ? _raw_spin_unlock_irqrestore+0x29/0x40 [ 343.258899] ? __kasan_init_slab_obj+0x24/0x30 [ 343.259529] ? setup_object.isra.71+0x23/0x90 [ 343.260121] ? new_slab+0x26e/0x4b0 [ 343.260609] ? kasan_poison+0x3a/0x50 [ 343.261118] ? kasan_unpoison+0x28/0x50 [ 343.261637] ? __kasan_slab_alloc+0x71/0x90 [ 343.262214] ? memcpy+0x4d/0x60 [ 343.262674] ? write_comp_data+0x2f/0x90 [ 343.263209] ? __kasan_check_write+0x14/0x20 [ 343.263802] ? __skb_clone+0x5d6/0x840 [ 343.264329] ? __sanitizer_cov_trace_pc+0x21/0x60 [ 343.264958] dev_queue_xmit+0x1c/0x20 [ 343.265470] netlink_deliver_tap+0x652/0x9c0 [ 343.266067] netlink_unicast+0x5a0/0x7f0 [ 343.266608] ? netlink_attachskb+0x860/0x860 [ 343.267183] ? __sanitizer_cov_trace_pc+0x21/0x60 [ 343.267820] ? write_comp_data+0x2f/0x90 [ 343.268367] netlink_sendmsg+0x922/0xe80 [ 343.268899] ? netlink_unicast+0x7f0/0x7f0 [ 343.269472] ? __sanitizer_cov_trace_pc+0x21/0x60 [ 343.270099] ? write_comp_data+0x2f/0x90 [ 343.270644] ? netlink_unicast+0x7f0/0x7f0 [ 343.271210] sock_sendmsg+0x155/0x190 [ 343.271721] ____sys_sendmsg+0x75f/0x8f0 [ 343.272262] ? kernel_sendmsg+0x60/0x60 [ 343.272788] ? write_comp_data+0x2f/0x90 [ 343.273332] ? write_comp_data+0x2f/0x90 [ 343.273869] ___sys_sendmsg+0x10f/0x190 [ 343.274405] ? sendmsg_copy_msghdr+0x80/0x80 [ 343.274984] ? slab_post_alloc_hook+0x70/0x230 [ 343.275597] ? futex_wait_setup+0x240/0x240 [ 343.276175] ? security_file_alloc+0x3e/0x170 [ 343.276779] ? write_comp_data+0x2f/0x90 [ 343.277313] ? __sanitizer_cov_trace_pc+0x21/0x60 [ 343.277969] ? write_comp_data+0x2f/0x90 [ 343.278515] ? __fget_files+0x1ad/0x260 [ 343.279048] ? __sanitizer_cov_trace_pc+0x21/0x60 [ 343.279685] ? write_comp_data+0x2f/0x90 [ 343.280234] ? __sanitizer_cov_trace_pc+0x21/0x60 [ 343.280874] ? sockfd_lookup_light+0xd1/0x190 [ 343.281481] __sys_sendmsg+0x118/0x200 [ 343.281998] ? __sys_sendmsg_sock+0x40/0x40 [ 343.282578] ? alloc_fd+0x229/0x5e0 [ 343.283070] ? write_comp_data+0x2f/0x90 [ 343.283610] ? write_comp_data+0x2f/0x90 [ 343.284135] ? __sanitizer_cov_trace_pc+0x21/0x60 [ 343.284776] ? ktime_get_coarse_real_ts64+0xb8/0xf0 [ 343.285450] __x64_sys_sendmsg+0x7d/0xc0 [ 343.285981] ? syscall_enter_from_user_mode+0x4d/0x70 [ 343.286664] do_syscall_64+0x3a/0x80 [ 343.287158] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 343.287850] RIP: 0033:0x7fdde24cf289 [ 343.288344] Code: 01 00 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d b7 db 2c 00 f7 d8 64 89 01 48 [ 343.290729] RSP: 002b:00007fdde2bd6d98 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 343.291730] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fdde24cf289 [ 343.292673] RDX: 0000000000000000 RSI: 00000000200000c0 RDI: 0000000000000004 [ 343.293618] RBP: 00007fdde2bd6e20 R08: 0000000100000001 R09: 0000000000000000 [ 343.294557] R10: 0000000100000001 R11: 0000000000000246 R12: 0000000000000000 [ 343.295493] R13: 0000000000021000 R14: 0000000000000000 R15: 00007fdde2bd7700 [ 343.296432] </TASK> [ 343.296735] Modules linked in: sch_netem ip6_vti ip_vti ip_gre ipip sit ip_tunnel geneve macsec macvtap tap ipvlan macvlan 8021q garp mrp hsr wireguard libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libblake2s blake2s_x86_64 libblake2s_generic curve25519_x86_64 libcurve25519_generic libchacha xfrm_interface xfrm6_tunnel tunnel4 veth netdevsim psample batman_adv nlmon dummy team bonding tls vcan ip6_gre ip6_tunnel tunnel6 gre tun ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_security iptable_raw ebtable_filter ebtables rfkill ip6table_filter ip6_tables iptable_filter ppdev bochs drm_vram_helper drm_ttm_helper ttm drm_kms_helper cec parport_pc drm joydev floppy parport sg syscopyarea sysfillrect sysimgblt i2c_piix4 qemu_fw_cfg fb_sys_fops pcspkr [ 343.297459] ip_tables xfs virtio_net net_failover failover sd_mod sr_mod cdrom t10_pi ata_generic pata_acpi ata_piix libata virtio_pci virtio_pci_legacy_dev serio_raw virtio_pci_modern_dev dm_mirror dm_region_hash dm_log dm_mod [ 343.311074] Dumping ftrace buffer: [ 343.311532] (ftrace buffer empty) [ 343.312040] ---[ end trace a2e3db5a6ae05099 ]--- [ 343.312691] RIP: 0010:netem_enqueue+0x1590/0x33c0 [sch_netem] [ 343.313481] Code: 89 85 58 ff ff ff e8 5f 5d e9 d3 48 8b b5 48 ff ff ff 8b 8d 50 ff ff ff 8b 85 58 ff ff ff 48 8b bd 70 ff ff ff 31 d2 2b 4f 74 <f7> f1 48 b8 00 00 00 00 00 fc ff df 49 01 d5 4c 89 e9 48 c1 e9 03 [ 343.315893] RSP: 0018:ffff88800bcd7368 EFLAGS: 00010246 [ 343.316622] RAX: 00000000ba7c0a9c RBX: 0000000000000001 RCX: 0000000000000000 [ 343.317585] RDX: 0000000000000000 RSI: ffff88800f8edb10 RDI: ffff88800f8eda40 [ 343.318549] RBP: ffff88800bcd7458 R08: 0000000000000000 R09: ffffffff94fb8445 [ 343.319503] R10: ffffffff94fb8336 R11: ffffffff94fb8445 R12: 0000000000000000 [ 343.320455] R13: ffff88800a5a7000 R14: ffff88800a5b5800 R15: 0000000000000020 [ 343.321414] FS: 00007fdde2bd7700(0000) GS:ffff888109780000(0000) knlGS:0000000000000000 [ 343.322489] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 343.323283] CR2: 00000000200000c0 CR3: 000000000ef4c000 CR4: 00000000000006e0 [ 343.324264] Kernel panic - not syncing: Fatal exception in interrupt [ 343.333717] Dumping ftrace buffer: [ 343.334175] (ftrace buffer empty) [ 343.334653] Kernel Offset: 0x13600000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) [ 343.336027] Rebooting in 86400 seconds.. Reported-by: Nsyzkaller <syzkaller@googlegroups.com> Signed-off-by: NHarshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Link: https://lore.kernel.org/r/20211129175328.55339-1-harshit.m.mogalapalli@oracle.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 30 11月, 2021 6 次提交
-
-
由 Leon Romanovsky 提交于
The devlink_resources_unregister() used second parameter as an entry point for the recursive removal of devlink resources. None of the callers outside of devlink core needed to use this field, so let's remove it. As part of this removal, the "struct devlink_resource" was moved from .h to .c file as it is not possible to use in any place in the code except devlink.c. Signed-off-by: NLeon Romanovsky <leonro@nvidia.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Nikolay Aleksandrov 提交于
We can remove a bit of code duplication by reusing the new fib6_nh_release_dsts helper in fib6_nh_release. Their only difference is that fib6_nh_release's version doesn't use atomic operation to swap the pointers because it assumes the fib6_nh is no longer visible, while fib6_nh_release_dsts can be used anywhere. Suggested-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NNikolay Aleksandrov <nikolay@nvidia.com> Reviewed-by: NDavid Ahern <dsahern@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Nikolay Aleksandrov 提交于
We can optimize resilient nexthop group replaces by reducing the number of synchronize_net calls. After commit 1005f19b ("net: nexthop: release IPv6 per-cpu dsts when replacing a nexthop group") we always do a synchronize_net because we must ensure no new dsts can be created for the replaced group's removed nexthops, but we already did that when replacing resilient groups, so if we always call synchronize_net after any group type replacement we'll take care of both cases and reduce synchronize_net calls for resilient groups. Suggested-by: NIdo Schimmel <idosch@idosch.org> Signed-off-by: NNikolay Aleksandrov <nikolay@nvidia.com> Reviewed-by: NIdo Schimmel <idosch@nvidia.com> Tested-by: NIdo Schimmel <idosch@nvidia.com> Reviewed-by: NDavid Ahern <dsahern@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tianjia Zhang 提交于
Assigning crypto_info variables in advance can simplify the logic of accessing value and move related local variables to a smaller scope. Signed-off-by: NTianjia Zhang <tianjia.zhang@linux.alibaba.com> Reviewed-by: NJakub Kicinski <kuba@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Christophe JAILLET 提交于
The 'dest' bitmap is fully initialized by the 'for' loop, so there is no need to explicitly reset it. This also makes this function in line with 'ethnl_features_to_bitmap32()' which does not clear the destination before writing it. Signed-off-by: NChristophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/17fca158231c6f03689bd891254f0dd1f4e84cb8.1638091829.git.christophe.jaillet@wanadoo.frSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Jason A. Donenfeld 提交于
Each peer's endpoint contains a dst_cache entry that takes a reference to another netdev. When the containing namespace exits, we take down the socket and prevent future sockets from being created (by setting creating_net to NULL), which removes that potential reference on the netns. However, it doesn't release references to the netns that a netdev cached in dst_cache might be taking, so the netns still might fail to exit. Since the socket is gimped anyway, we can simply clear all the dst_caches (by way of clearing the endpoint src), which will release all references. However, the current dst_cache_reset function only releases those references lazily. But it turns out that all of our usages of wg_socket_clear_peer_endpoint_src are called from contexts that are not exactly high-speed or bottle-necked. For example, when there's connection difficulty, or when userspace is reconfiguring the interface. And in particular for this patch, when the netns is exiting. So for those cases, it makes more sense to call dst_release immediately. For that, we add a small helper function to dst_cache. This patch also adds a test to netns.sh from Hangbin Liu to ensure this doesn't regress. Tested-by: NHangbin Liu <liuhangbin@gmail.com> Reported-by: NXiumei Mu <xmu@redhat.com> Cc: Toke Høiland-Jørgensen <toke@redhat.com> Cc: Paolo Abeni <pabeni@redhat.com> Fixes: 900575aa ("wireguard: device: avoid circular netns references") Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 29 11月, 2021 8 次提交
-
-
由 Eiichi Tsukata 提交于
Need to call rxrpc_put_local() for peer candidate before kfree() as it holds a ref to rxrpc_local. [DH: v2: Changed to abstract the peer freeing code out into a function] Fixes: 9ebeddef ("rxrpc: rxrpc_peer needs to hold a ref on the rxrpc_local record") Signed-off-by: NEiichi Tsukata <eiichi.tsukata@nutanix.com> Signed-off-by: NDavid Howells <dhowells@redhat.com> Reviewed-by: NMarc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/all/20211121041608.133740-2-eiichi.tsukata@nutanix.com/ # v1
-
由 Eiichi Tsukata 提交于
Need to call rxrpc_put_peer() for bundle candidate before kfree() as it holds a ref to rxrpc_peer. [DH: v2: Changed to abstract out the bundle freeing code into a function] Fixes: 245500d8 ("rxrpc: Rewrite the client connection manager") Signed-off-by: NEiichi Tsukata <eiichi.tsukata@nutanix.com> Signed-off-by: NDavid Howells <dhowells@redhat.com> Reviewed-by: NMarc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/20211121041608.133740-1-eiichi.tsukata@nutanix.com/ # v1
-
由 msizanoen1 提交于
The kernel leaks memory when a `fib` rule is present in IPv6 nftables firewall rules and a suppress_prefix rule is present in the IPv6 routing rules (used by certain tools such as wg-quick). In such scenarios, every incoming packet will leak an allocation in `ip6_dst_cache` slab cache. After some hours of `bpftrace`-ing and source code reading, I tracked down the issue to ca7a03c4 ("ipv6: do not free rt if FIB_LOOKUP_NOREF is set on suppress rule"). The problem with that change is that the generic `args->flags` always have `FIB_LOOKUP_NOREF` set[1][2] but the IPv6-specific flag `RT6_LOOKUP_F_DST_NOREF` might not be, leading to `fib6_rule_suppress` not decreasing the refcount when needed. How to reproduce: - Add the following nftables rule to a prerouting chain: meta nfproto ipv6 fib saddr . mark . iif oif missing drop This can be done with: sudo nft create table inet test sudo nft create chain inet test test_chain '{ type filter hook prerouting priority filter + 10; policy accept; }' sudo nft add rule inet test test_chain meta nfproto ipv6 fib saddr . mark . iif oif missing drop - Run: sudo ip -6 rule add table main suppress_prefixlength 0 - Watch `sudo slabtop -o | grep ip6_dst_cache` to see memory usage increase with every incoming ipv6 packet. This patch exposes the protocol-specific flags to the protocol specific `suppress` function, and check the protocol-specific `flags` argument for RT6_LOOKUP_F_DST_NOREF instead of the generic FIB_LOOKUP_NOREF when decreasing the refcount, like this. [1]: https://github.com/torvalds/linux/blob/ca7a03c4175366a92cee0ccc4fec0038c3266e26/net/ipv6/fib6_rules.c#L71 [2]: https://github.com/torvalds/linux/blob/ca7a03c4175366a92cee0ccc4fec0038c3266e26/net/ipv6/fib6_rules.c#L99 Link: https://bugzilla.kernel.org/show_bug.cgi?id=215105 Fixes: ca7a03c4 ("ipv6: do not free rt if FIB_LOOKUP_NOREF is set on suppress rule") Cc: stable@vger.kernel.org Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Menglong Dong 提交于
Once tcp small queue check failed in tcp_small_queue_check(), the throughput of tcp will be limited, and it's hard to distinguish whether it is out of tcp congestion control. Add statistics of LINUX_MIB_TCPSMALLQUEUEFAILURE for this scene. Signed-off-by: NMenglong Dong <imagedong@tencent.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Leon Romanovsky 提交于
DEVLINK_CMD_HEALTH_REPORTER_DUMP_GET command doesn't have .doit callback and has no use in internal_flags at all. Remove this misleading assignment. Fixes: e44ef4e4 ("devlink: Hang reporter's dump method on a dumpit cb") Signed-off-by: NLeon Romanovsky <leonro@nvidia.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jeremy Kerr 提交于
In our test device, we're currently freeing skbs in the transmit path with kfree(), rather than kfree_skb(). This change uses the correct kfree_skb() instead. Fixes: ded21b72 ("mctp: Add test utils") Reported-by: Nkernel test robot <lkp@intel.com> Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NJeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tianjia Zhang 提交于
When the TLS cipher suite uses CCM mode, including AES CCM and SM4 CCM, the first byte of the B0 block is flags, and the real IV starts from the second byte. The XOR operation of the IV and rec_seq should be skip this byte, that is, add the iv_offset. Fixes: f295b3ae ("net/tls: Add support of AES128-CCM based ciphers") Signed-off-by: NTianjia Zhang <tianjia.zhang@linux.alibaba.com> Cc: Vakul Garg <vakul.garg@nxp.com> Cc: stable@vger.kernel.org # v5.2+ Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Benjamin Poirier 提交于
There are separate for_nexthops and change_nexthops iterators. The for_nexthops variant should use const. Signed-off-by: NBenjamin Poirier <bpoirier@nvidia.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-