1. 26 10月, 2022 1 次提交
  2. 14 6月, 2022 1 次提交
  3. 31 5月, 2022 1 次提交
    • E
      inet: fully convert sk->sk_rx_dst to RCU rules · 5f0c3540
      Eric Dumazet 提交于
      mainline inclusion
      from mainline-v5.16-rc7
      commit 8f905c0e
      category: bugfix
      bugzilla: 186714 https://gitee.com/src-openeuler/kernel/issues/I57QUK
      
      --------------------------------
      
      syzbot reported various issues around early demux,
      one being included in this changelog [1]
      
      sk->sk_rx_dst is using RCU protection without clearly
      documenting it.
      
      And following sequences in tcp_v4_do_rcv()/tcp_v6_do_rcv()
      are not following standard RCU rules.
      
      [a]    dst_release(dst);
      [b]    sk->sk_rx_dst = NULL;
      
      They look wrong because a delete operation of RCU protected
      pointer is supposed to clear the pointer before
      the call_rcu()/synchronize_rcu() guarding actual memory freeing.
      
      In some cases indeed, dst could be freed before [b] is done.
      
      We could cheat by clearing sk_rx_dst before calling
      dst_release(), but this seems the right time to stick
      to standard RCU annotations and debugging facilities.
      
      [1]
      BUG: KASAN: use-after-free in dst_check include/net/dst.h:470 [inline]
      BUG: KASAN: use-after-free in tcp_v4_early_demux+0x95b/0x960 net/ipv4/tcp_ipv4.c:1792
      Read of size 2 at addr ffff88807f1cb73a by task syz-executor.5/9204
      
      CPU: 0 PID: 9204 Comm: syz-executor.5 Not tainted 5.16.0-rc5-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       <TASK>
       __dump_stack lib/dump_stack.c:88 [inline]
       dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
       print_address_description.constprop.0.cold+0x8d/0x320 mm/kasan/report.c:247
       __kasan_report mm/kasan/report.c:433 [inline]
       kasan_report.cold+0x83/0xdf mm/kasan/report.c:450
       dst_check include/net/dst.h:470 [inline]
       tcp_v4_early_demux+0x95b/0x960 net/ipv4/tcp_ipv4.c:1792
       ip_rcv_finish_core.constprop.0+0x15de/0x1e80 net/ipv4/ip_input.c:340
       ip_list_rcv_finish.constprop.0+0x1b2/0x6e0 net/ipv4/ip_input.c:583
       ip_sublist_rcv net/ipv4/ip_input.c:609 [inline]
       ip_list_rcv+0x34e/0x490 net/ipv4/ip_input.c:644
       __netif_receive_skb_list_ptype net/core/dev.c:5508 [inline]
       __netif_receive_skb_list_core+0x549/0x8e0 net/core/dev.c:5556
       __netif_receive_skb_list net/core/dev.c:5608 [inline]
       netif_receive_skb_list_internal+0x75e/0xd80 net/core/dev.c:5699
       gro_normal_list net/core/dev.c:5853 [inline]
       gro_normal_list net/core/dev.c:5849 [inline]
       napi_complete_done+0x1f1/0x880 net/core/dev.c:6590
       virtqueue_napi_complete drivers/net/virtio_net.c:339 [inline]
       virtnet_poll+0xca2/0x11b0 drivers/net/virtio_net.c:1557
       __napi_poll+0xaf/0x440 net/core/dev.c:7023
       napi_poll net/core/dev.c:7090 [inline]
       net_rx_action+0x801/0xb40 net/core/dev.c:7177
       __do_softirq+0x29b/0x9c2 kernel/softirq.c:558
       invoke_softirq kernel/softirq.c:432 [inline]
       __irq_exit_rcu+0x123/0x180 kernel/softirq.c:637
       irq_exit_rcu+0x5/0x20 kernel/softirq.c:649
       common_interrupt+0x52/0xc0 arch/x86/kernel/irq.c:240
       asm_common_interrupt+0x1e/0x40 arch/x86/include/asm/idtentry.h:629
      RIP: 0033:0x7f5e972bfd57
      Code: 39 d1 73 14 0f 1f 80 00 00 00 00 48 8b 50 f8 48 83 e8 08 48 39 ca 77 f3 48 39 c3 73 3e 48 89 13 48 8b 50 f8 48 89 38 49 8b 0e <48> 8b 3e 48 83 c3 08 48 83 c6 08 eb bc 48 39 d1 72 9e 48 39 d0 73
      RSP: 002b:00007fff8a413210 EFLAGS: 00000283
      RAX: 00007f5e97108990 RBX: 00007f5e97108338 RCX: ffffffff81d3aa45
      RDX: ffffffff81d3aa45 RSI: 00007f5e97108340 RDI: ffffffff81d3aa45
      RBP: 00007f5e97107eb8 R08: 00007f5e97108d88 R09: 0000000093c2e8d9
      R10: 0000000000000000 R11: 0000000000000000 R12: 00007f5e97107eb0
      R13: 00007f5e97108338 R14: 00007f5e97107ea8 R15: 0000000000000019
       </TASK>
      
      Allocated by task 13:
       kasan_save_stack+0x1e/0x50 mm/kasan/common.c:38
       kasan_set_track mm/kasan/common.c:46 [inline]
       set_alloc_info mm/kasan/common.c:434 [inline]
       __kasan_slab_alloc+0x90/0xc0 mm/kasan/common.c:467
       kasan_slab_alloc include/linux/kasan.h:259 [inline]
       slab_post_alloc_hook mm/slab.h:519 [inline]
       slab_alloc_node mm/slub.c:3234 [inline]
       slab_alloc mm/slub.c:3242 [inline]
       kmem_cache_alloc+0x202/0x3a0 mm/slub.c:3247
       dst_alloc+0x146/0x1f0 net/core/dst.c:92
       rt_dst_alloc+0x73/0x430 net/ipv4/route.c:1613
       ip_route_input_slow+0x1817/0x3a20 net/ipv4/route.c:2340
       ip_route_input_rcu net/ipv4/route.c:2470 [inline]
       ip_route_input_noref+0x116/0x2a0 net/ipv4/route.c:2415
       ip_rcv_finish_core.constprop.0+0x288/0x1e80 net/ipv4/ip_input.c:354
       ip_list_rcv_finish.constprop.0+0x1b2/0x6e0 net/ipv4/ip_input.c:583
       ip_sublist_rcv net/ipv4/ip_input.c:609 [inline]
       ip_list_rcv+0x34e/0x490 net/ipv4/ip_input.c:644
       __netif_receive_skb_list_ptype net/core/dev.c:5508 [inline]
       __netif_receive_skb_list_core+0x549/0x8e0 net/core/dev.c:5556
       __netif_receive_skb_list net/core/dev.c:5608 [inline]
       netif_receive_skb_list_internal+0x75e/0xd80 net/core/dev.c:5699
       gro_normal_list net/core/dev.c:5853 [inline]
       gro_normal_list net/core/dev.c:5849 [inline]
       napi_complete_done+0x1f1/0x880 net/core/dev.c:6590
       virtqueue_napi_complete drivers/net/virtio_net.c:339 [inline]
       virtnet_poll+0xca2/0x11b0 drivers/net/virtio_net.c:1557
       __napi_poll+0xaf/0x440 net/core/dev.c:7023
       napi_poll net/core/dev.c:7090 [inline]
       net_rx_action+0x801/0xb40 net/core/dev.c:7177
       __do_softirq+0x29b/0x9c2 kernel/softirq.c:558
      
      Freed by task 13:
       kasan_save_stack+0x1e/0x50 mm/kasan/common.c:38
       kasan_set_track+0x21/0x30 mm/kasan/common.c:46
       kasan_set_free_info+0x20/0x30 mm/kasan/generic.c:370
       ____kasan_slab_free mm/kasan/common.c:366 [inline]
       ____kasan_slab_free mm/kasan/common.c:328 [inline]
       __kasan_slab_free+0xff/0x130 mm/kasan/common.c:374
       kasan_slab_free include/linux/kasan.h:235 [inline]
       slab_free_hook mm/slub.c:1723 [inline]
       slab_free_freelist_hook+0x8b/0x1c0 mm/slub.c:1749
       slab_free mm/slub.c:3513 [inline]
       kmem_cache_free+0xbd/0x5d0 mm/slub.c:3530
       dst_destroy+0x2d6/0x3f0 net/core/dst.c:127
       rcu_do_batch kernel/rcu/tree.c:2506 [inline]
       rcu_core+0x7ab/0x1470 kernel/rcu/tree.c:2741
       __do_softirq+0x29b/0x9c2 kernel/softirq.c:558
      
      Last potentially related work creation:
       kasan_save_stack+0x1e/0x50 mm/kasan/common.c:38
       __kasan_record_aux_stack+0xf5/0x120 mm/kasan/generic.c:348
       __call_rcu kernel/rcu/tree.c:2985 [inline]
       call_rcu+0xb1/0x740 kernel/rcu/tree.c:3065
       dst_release net/core/dst.c:177 [inline]
       dst_release+0x79/0xe0 net/core/dst.c:167
       tcp_v4_do_rcv+0x612/0x8d0 net/ipv4/tcp_ipv4.c:1712
       sk_backlog_rcv include/net/sock.h:1030 [inline]
       __release_sock+0x134/0x3b0 net/core/sock.c:2768
       release_sock+0x54/0x1b0 net/core/sock.c:3300
       tcp_sendmsg+0x36/0x40 net/ipv4/tcp.c:1441
       inet_sendmsg+0x99/0xe0 net/ipv4/af_inet.c:819
       sock_sendmsg_nosec net/socket.c:704 [inline]
       sock_sendmsg+0xcf/0x120 net/socket.c:724
       sock_write_iter+0x289/0x3c0 net/socket.c:1057
       call_write_iter include/linux/fs.h:2162 [inline]
       new_sync_write+0x429/0x660 fs/read_write.c:503
       vfs_write+0x7cd/0xae0 fs/read_write.c:590
       ksys_write+0x1ee/0x250 fs/read_write.c:643
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      The buggy address belongs to the object at ffff88807f1cb700
       which belongs to the cache ip_dst_cache of size 176
      The buggy address is located 58 bytes inside of
       176-byte region [ffff88807f1cb700, ffff88807f1cb7b0)
      The buggy address belongs to the page:
      page:ffffea0001fc72c0 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x7f1cb
      flags: 0xfff00000000200(slab|node=0|zone=1|lastcpupid=0x7ff)
      raw: 00fff00000000200 dead000000000100 dead000000000122 ffff8881413bb780
      raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      page_owner tracks the page as allocated
      page last allocated via order 0, migratetype Unmovable, gfp_mask 0x112a20(GFP_ATOMIC|__GFP_NOWARN|__GFP_NORETRY|__GFP_HARDWALL), pid 5, ts 108466983062, free_ts 108048976062
       prep_new_page mm/page_alloc.c:2418 [inline]
       get_page_from_freelist+0xa72/0x2f50 mm/page_alloc.c:4149
       __alloc_pages+0x1b2/0x500 mm/page_alloc.c:5369
       alloc_pages+0x1a7/0x300 mm/mempolicy.c:2191
       alloc_slab_page mm/slub.c:1793 [inline]
       allocate_slab mm/slub.c:1930 [inline]
       new_slab+0x32d/0x4a0 mm/slub.c:1993
       ___slab_alloc+0x918/0xfe0 mm/slub.c:3022
       __slab_alloc.constprop.0+0x4d/0xa0 mm/slub.c:3109
       slab_alloc_node mm/slub.c:3200 [inline]
       slab_alloc mm/slub.c:3242 [inline]
       kmem_cache_alloc+0x35c/0x3a0 mm/slub.c:3247
       dst_alloc+0x146/0x1f0 net/core/dst.c:92
       rt_dst_alloc+0x73/0x430 net/ipv4/route.c:1613
       __mkroute_output net/ipv4/route.c:2564 [inline]
       ip_route_output_key_hash_rcu+0x921/0x2d00 net/ipv4/route.c:2791
       ip_route_output_key_hash+0x18b/0x300 net/ipv4/route.c:2619
       __ip_route_output_key include/net/route.h:126 [inline]
       ip_route_output_flow+0x23/0x150 net/ipv4/route.c:2850
       ip_route_output_key include/net/route.h:142 [inline]
       geneve_get_v4_rt+0x3a6/0x830 drivers/net/geneve.c:809
       geneve_xmit_skb drivers/net/geneve.c:899 [inline]
       geneve_xmit+0xc4a/0x3540 drivers/net/geneve.c:1082
       __netdev_start_xmit include/linux/netdevice.h:4994 [inline]
       netdev_start_xmit include/linux/netdevice.h:5008 [inline]
       xmit_one net/core/dev.c:3590 [inline]
       dev_hard_start_xmit+0x1eb/0x920 net/core/dev.c:3606
       __dev_queue_xmit+0x299a/0x3650 net/core/dev.c:4229
      page last free stack trace:
       reset_page_owner include/linux/page_owner.h:24 [inline]
       free_pages_prepare mm/page_alloc.c:1338 [inline]
       free_pcp_prepare+0x374/0x870 mm/page_alloc.c:1389
       free_unref_page_prepare mm/page_alloc.c:3309 [inline]
       free_unref_page+0x19/0x690 mm/page_alloc.c:3388
       qlink_free mm/kasan/quarantine.c:146 [inline]
       qlist_free_all+0x5a/0xc0 mm/kasan/quarantine.c:165
       kasan_quarantine_reduce+0x180/0x200 mm/kasan/quarantine.c:272
       __kasan_slab_alloc+0xa2/0xc0 mm/kasan/common.c:444
       kasan_slab_alloc include/linux/kasan.h:259 [inline]
       slab_post_alloc_hook mm/slab.h:519 [inline]
       slab_alloc_node mm/slub.c:3234 [inline]
       kmem_cache_alloc_node+0x255/0x3f0 mm/slub.c:3270
       __alloc_skb+0x215/0x340 net/core/skbuff.c:414
       alloc_skb include/linux/skbuff.h:1126 [inline]
       alloc_skb_with_frags+0x93/0x620 net/core/skbuff.c:6078
       sock_alloc_send_pskb+0x783/0x910 net/core/sock.c:2575
       mld_newpack+0x1df/0x770 net/ipv6/mcast.c:1754
       add_grhead+0x265/0x330 net/ipv6/mcast.c:1857
       add_grec+0x1053/0x14e0 net/ipv6/mcast.c:1995
       mld_send_initial_cr.part.0+0xf6/0x230 net/ipv6/mcast.c:2242
       mld_send_initial_cr net/ipv6/mcast.c:1232 [inline]
       mld_dad_work+0x1d3/0x690 net/ipv6/mcast.c:2268
       process_one_work+0x9b2/0x1690 kernel/workqueue.c:2298
       worker_thread+0x658/0x11f0 kernel/workqueue.c:2445
      
      Memory state around the buggy address:
       ffff88807f1cb600: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff88807f1cb680: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
      >ffff88807f1cb700: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                              ^
       ffff88807f1cb780: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
       ffff88807f1cb800: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      
      Fixes: 41063e9d ("ipv4: Early TCP socket demux.")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20211220143330.680945-1-eric.dumazet@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      Signed-off-by: NZhengchao Shao <shaozhengchao@huawei.com>
      Conflict:
      	include/net/sock.h
      	net/ipv4/tcp_ipv4.c
      	net/ipv6/tcp_ipv6.c
      	net/ipv6/udp.c
      Reviewed-by: NWei Yongjun <weiyongjun1@huawei.com>
      Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
      5f0c3540
  4. 14 1月, 2022 4 次提交
  5. 07 1月, 2022 2 次提交
  6. 06 12月, 2021 2 次提交
  7. 15 10月, 2021 1 次提交
  8. 09 4月, 2021 2 次提交
  9. 08 2月, 2021 1 次提交
  10. 24 10月, 2020 1 次提交
  11. 03 10月, 2020 1 次提交
    • C
      tcp: use sendpage_ok() to detect misused .sendpage · cf83a17e
      Coly Li 提交于
      commit a10674bf ("tcp: detecting the misuse of .sendpage for Slab
      objects") adds the checks for Slab pages, but the pages don't have
      page_count are still missing from the check.
      
      Network layer's sendpage method is not designed to send page_count 0
      pages neither, therefore both PageSlab() and page_count() should be
      both checked for the sending page. This is exactly what sendpage_ok()
      does.
      
      This patch uses sendpage_ok() in do_tcp_sendpages() to detect misused
      .sendpage, to make the code more robust.
      
      Fixes: a10674bf ("tcp: detecting the misuse of .sendpage for Slab objects")
      Suggested-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NColy Li <colyli@suse.de>
      Cc: Vasily Averin <vvs@virtuozzo.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: stable@vger.kernel.org
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cf83a17e
  12. 01 10月, 2020 1 次提交
  13. 15 9月, 2020 3 次提交
  14. 11 9月, 2020 2 次提交
  15. 25 8月, 2020 4 次提交
  16. 11 8月, 2020 1 次提交
    • J
      tcp: correct read of TFO keys on big endian systems · f19008e6
      Jason Baron 提交于
      When TFO keys are read back on big endian systems either via the global
      sysctl interface or via getsockopt() using TCP_FASTOPEN_KEY, the values
      don't match what was written.
      
      For example, on s390x:
      
      # echo "1-2-3-4" > /proc/sys/net/ipv4/tcp_fastopen_key
      # cat /proc/sys/net/ipv4/tcp_fastopen_key
      02000000-01000000-04000000-03000000
      
      Instead of:
      
      # cat /proc/sys/net/ipv4/tcp_fastopen_key
      00000001-00000002-00000003-00000004
      
      Fix this by converting to the correct endianness on read. This was
      reported by Colin Ian King when running the 'tcp_fastopen_backup_key' net
      selftest on s390x, which depends on the read value matching what was
      written. I've confirmed that the test now passes on big and little endian
      systems.
      Signed-off-by: NJason Baron <jbaron@akamai.com>
      Fixes: 438ac880 ("net: fastopen: robustness and endianness fixes for SipHash")
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Eric Dumazet <edumazet@google.com>
      Reported-and-tested-by: NColin Ian King <colin.king@canonical.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f19008e6
  17. 01 8月, 2020 1 次提交
  18. 29 7月, 2020 1 次提交
  19. 25 7月, 2020 3 次提交
  20. 20 7月, 2020 1 次提交
  21. 10 7月, 2020 1 次提交
    • C
      tcp: make sure listeners don't initialize congestion-control state · ce69e563
      Christoph Paasch 提交于
      syzkaller found its way into setsockopt with TCP_CONGESTION "cdg".
      tcp_cdg_init() does a kcalloc to store the gradients. As sk_clone_lock
      just copies all the memory, the allocated pointer will be copied as
      well, if the app called setsockopt(..., TCP_CONGESTION) on the listener.
      If now the socket will be destroyed before the congestion-control
      has properly been initialized (through a call to tcp_init_transfer), we
      will end up freeing memory that does not belong to that particular
      socket, opening the door to a double-free:
      
      [   11.413102] ==================================================================
      [   11.414181] BUG: KASAN: double-free or invalid-free in tcp_cleanup_congestion_control+0x58/0xd0
      [   11.415329]
      [   11.415560] CPU: 3 PID: 4884 Comm: syz-executor.5 Not tainted 5.8.0-rc2 #80
      [   11.416544] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
      [   11.418148] Call Trace:
      [   11.418534]  <IRQ>
      [   11.418834]  dump_stack+0x7d/0xb0
      [   11.419297]  print_address_description.constprop.0+0x1a/0x210
      [   11.422079]  kasan_report_invalid_free+0x51/0x80
      [   11.423433]  __kasan_slab_free+0x15e/0x170
      [   11.424761]  kfree+0x8c/0x230
      [   11.425157]  tcp_cleanup_congestion_control+0x58/0xd0
      [   11.425872]  tcp_v4_destroy_sock+0x57/0x5a0
      [   11.426493]  inet_csk_destroy_sock+0x153/0x2c0
      [   11.427093]  tcp_v4_syn_recv_sock+0xb29/0x1100
      [   11.427731]  tcp_get_cookie_sock+0xc3/0x4a0
      [   11.429457]  cookie_v4_check+0x13d0/0x2500
      [   11.433189]  tcp_v4_do_rcv+0x60e/0x780
      [   11.433727]  tcp_v4_rcv+0x2869/0x2e10
      [   11.437143]  ip_protocol_deliver_rcu+0x23/0x190
      [   11.437810]  ip_local_deliver+0x294/0x350
      [   11.439566]  __netif_receive_skb_one_core+0x15d/0x1a0
      [   11.441995]  process_backlog+0x1b1/0x6b0
      [   11.443148]  net_rx_action+0x37e/0xc40
      [   11.445361]  __do_softirq+0x18c/0x61a
      [   11.445881]  asm_call_on_stack+0x12/0x20
      [   11.446409]  </IRQ>
      [   11.446716]  do_softirq_own_stack+0x34/0x40
      [   11.447259]  do_softirq.part.0+0x26/0x30
      [   11.447827]  __local_bh_enable_ip+0x46/0x50
      [   11.448406]  ip_finish_output2+0x60f/0x1bc0
      [   11.450109]  __ip_queue_xmit+0x71c/0x1b60
      [   11.451861]  __tcp_transmit_skb+0x1727/0x3bb0
      [   11.453789]  tcp_rcv_state_process+0x3070/0x4d3a
      [   11.456810]  tcp_v4_do_rcv+0x2ad/0x780
      [   11.457995]  __release_sock+0x14b/0x2c0
      [   11.458529]  release_sock+0x4a/0x170
      [   11.459005]  __inet_stream_connect+0x467/0xc80
      [   11.461435]  inet_stream_connect+0x4e/0xa0
      [   11.462043]  __sys_connect+0x204/0x270
      [   11.465515]  __x64_sys_connect+0x6a/0xb0
      [   11.466088]  do_syscall_64+0x3e/0x70
      [   11.466617]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [   11.467341] RIP: 0033:0x7f56046dc469
      [   11.467844] Code: Bad RIP value.
      [   11.468282] RSP: 002b:00007f5604dccdd8 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
      [   11.469326] RAX: ffffffffffffffda RBX: 000000000068bf00 RCX: 00007f56046dc469
      [   11.470379] RDX: 0000000000000010 RSI: 0000000020000000 RDI: 0000000000000004
      [   11.471311] RBP: 00000000ffffffff R08: 0000000000000000 R09: 0000000000000000
      [   11.472286] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
      [   11.473341] R13: 000000000041427c R14: 00007f5604dcd5c0 R15: 0000000000000003
      [   11.474321]
      [   11.474527] Allocated by task 4884:
      [   11.475031]  save_stack+0x1b/0x40
      [   11.475548]  __kasan_kmalloc.constprop.0+0xc2/0xd0
      [   11.476182]  tcp_cdg_init+0xf0/0x150
      [   11.476744]  tcp_init_congestion_control+0x9b/0x3a0
      [   11.477435]  tcp_set_congestion_control+0x270/0x32f
      [   11.478088]  do_tcp_setsockopt.isra.0+0x521/0x1a00
      [   11.478744]  __sys_setsockopt+0xff/0x1e0
      [   11.479259]  __x64_sys_setsockopt+0xb5/0x150
      [   11.479895]  do_syscall_64+0x3e/0x70
      [   11.480395]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [   11.481097]
      [   11.481321] Freed by task 4872:
      [   11.481783]  save_stack+0x1b/0x40
      [   11.482230]  __kasan_slab_free+0x12c/0x170
      [   11.482839]  kfree+0x8c/0x230
      [   11.483240]  tcp_cleanup_congestion_control+0x58/0xd0
      [   11.483948]  tcp_v4_destroy_sock+0x57/0x5a0
      [   11.484502]  inet_csk_destroy_sock+0x153/0x2c0
      [   11.485144]  tcp_close+0x932/0xfe0
      [   11.485642]  inet_release+0xc1/0x1c0
      [   11.486131]  __sock_release+0xc0/0x270
      [   11.486697]  sock_close+0xc/0x10
      [   11.487145]  __fput+0x277/0x780
      [   11.487632]  task_work_run+0xeb/0x180
      [   11.488118]  __prepare_exit_to_usermode+0x15a/0x160
      [   11.488834]  do_syscall_64+0x4a/0x70
      [   11.489326]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Wei Wang fixed a part of these CDG-malloc issues with commit c1201444
      ("tcp: memset ca_priv data to 0 properly").
      
      This patch here fixes the listener-scenario: We make sure that listeners
      setting the congestion-control through setsockopt won't initialize it
      (thus CDG never allocates on listeners). For those who use AF_UNSPEC to
      reuse a socket, tcp_disconnect() is changed to cleanup afterwards.
      
      (The issue can be reproduced at least down to v4.4.x.)
      
      Cc: Wei Wang <weiwan@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Fixes: 2b0a8c9e ("tcp: add CDG congestion control")
      Signed-off-by: NChristoph Paasch <cpaasch@apple.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ce69e563
  22. 03 7月, 2020 1 次提交
    • E
      tcp: md5: allow changing MD5 keys in all socket states · 1ca0fafd
      Eric Dumazet 提交于
      This essentially reverts commit 72123032 ("tcp: md5: reject TCP_MD5SIG
      or TCP_MD5SIG_EXT on established sockets")
      
      Mathieu reported that many vendors BGP implementations can
      actually switch TCP MD5 on established flows.
      
      Quoting Mathieu :
         Here is a list of a few network vendors along with their behavior
         with respect to TCP MD5:
      
         - Cisco: Allows for password to be changed, but within the hold-down
           timer (~180 seconds).
         - Juniper: When password is initially set on active connection it will
           reset, but after that any subsequent password changes no network
           resets.
         - Nokia: No notes on if they flap the tcp connection or not.
         - Ericsson/RedBack: Allows for 2 password (old/new) to co-exist until
           both sides are ok with new passwords.
         - Meta-Switch: Expects the password to be set before a connection is
           attempted, but no further info on whether they reset the TCP
           connection on a change.
         - Avaya: Disable the neighbor, then set password, then re-enable.
         - Zebos: Would normally allow the change when socket connected.
      
      We can revert my prior change because commit 9424e2e7 ("tcp: md5: fix potential
      overestimation of TCP option space") removed the leak of 4 kernel bytes to
      the wire that was the main reason for my patch.
      
      While doing my investigations, I found a bug when a MD5 key is changed, leading
      to these commits that stable teams want to consider before backporting this revert :
      
       Commit 6a2febec ("tcp: md5: add missing memory barriers in tcp_md5_do_add()/tcp_md5_hash_key()")
       Commit e6ced831 ("tcp: md5: refine tcp_md5_do_add()/tcp_md5_hash_key() barriers")
      
      Fixes: 72123032 "tcp: md5: reject TCP_MD5SIG or TCP_MD5SIG_EXT on established sockets"
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: NMathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1ca0fafd
  23. 02 7月, 2020 1 次提交
    • E
      tcp: md5: refine tcp_md5_do_add()/tcp_md5_hash_key() barriers · e6ced831
      Eric Dumazet 提交于
      My prior fix went a bit too far, according to Herbert and Mathieu.
      
      Since we accept that concurrent TCP MD5 lookups might see inconsistent
      keys, we can use READ_ONCE()/WRITE_ONCE() instead of smp_rmb()/smp_wmb()
      
      Clearing all key->key[] is needed to avoid possible KMSAN reports,
      if key->keylen is increased. Since tcp_md5_do_add() is not fast path,
      using __GFP_ZERO to clear all struct tcp_md5sig_key is simpler.
      
      data_race() was added in linux-5.8 and will prevent KCSAN reports,
      this can safely be removed in stable backports, if data_race() is
      not yet backported.
      
      v2: use data_race() both in tcp_md5_hash_key() and tcp_md5_do_add()
      
      Fixes: 6a2febec ("tcp: md5: add missing memory barriers in tcp_md5_do_add()/tcp_md5_hash_key()")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Marco Elver <elver@google.com>
      Reviewed-by: NMathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Acked-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e6ced831
  24. 01 7月, 2020 1 次提交
    • E
      tcp: md5: add missing memory barriers in tcp_md5_do_add()/tcp_md5_hash_key() · 6a2febec
      Eric Dumazet 提交于
      MD5 keys are read with RCU protection, and tcp_md5_do_add()
      might update in-place a prior key.
      
      Normally, typical RCU updates would allocate a new piece
      of memory. In this case only key->key and key->keylen might
      be updated, and we do not care if an incoming packet could
      see the old key, the new one, or some intermediate value,
      since changing the key on a live flow is known to be problematic
      anyway.
      
      We only want to make sure that in the case key->keylen
      is changed, cpus in tcp_md5_hash_key() wont try to use
      uninitialized data, or crash because key->keylen was
      read twice to feed sg_init_one() and ahash_request_set_crypt()
      
      Fixes: 9ea88a15 ("tcp: md5: check md5 signature without socket lock")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6a2febec
  25. 25 6月, 2020 1 次提交
  26. 10 6月, 2020 1 次提交