1. 25 Jul, 2020 10 commits
  2. 22 Jul, 2020 1 commit
  3. 20 Jul, 2020 11 commits
  4. 18 Jul, 2020 6 commits
  5. 14 Jul, 2020 1 commit
  6. 11 Jul, 2020 3 commits
    • inet: Remove an unnecessary argument of syn_ack_recalc(). · a594920f
      Kuniyuki Iwashima committed
      Commit 0c3d79bc ("tcp: reduce SYN-ACK retrans for TCP_DEFER_ACCEPT")
      introduces syn_ack_recalc(), which decides whether a minisock is kept
      and whether a SYN+ACK is retransmitted.
      
      If rskq_defer_accept is not zero in syn_ack_recalc(), max_retries always
      has the same value because max_retries is overwritten by rskq_defer_accept
      in reqsk_timer_handler().
      
      This commit adds three changes:
      - remove redundant non-zero check for rskq_defer_accept in
         reqsk_timer_handler().
      - remove max_retries from the arguments of syn_ack_recalc() and use
         rskq_defer_accept instead.
      - rename thresh to max_syn_ack_retries for readability.
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
      Reviewed-by: Benjamin Herrenschmidt <benh@amazon.com>
      CC: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
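The decision that syn_ack_recalc() makes can be modelled in a few lines. The following is a minimal userspace sketch, not the kernel code: `RequestSock` is a hypothetical stand-in for the minisock, and the deferred-accept conditions are an assumption based on the behaviour described for TCP_DEFER_ACCEPT. It takes `rskq_defer_accept` directly, with no separate `max_retries` argument.

```python
from dataclasses import dataclass


@dataclass
class RequestSock:
    """Hypothetical, simplified stand-in for a request minisock."""
    num_timeout: int  # SYN+ACK timer expirations so far
    acked: bool       # True once the peer's bare ACK has been seen


def syn_ack_recalc(req, max_syn_ack_retries, rskq_defer_accept):
    """Return (expire, resend) for one timer tick (assumed logic)."""
    if not rskq_defer_accept:
        # No deferral: always resend, expire once the retry budget is spent.
        return req.num_timeout >= max_syn_ack_retries, True
    # Deferral active: an ACKed request is held until the defer period ends.
    expire = (req.num_timeout >= max_syn_ack_retries and
              (not req.acked or req.num_timeout >= rskq_defer_accept))
    # Stay quiet while waiting for data after the ACK, and resend again
    # only near the end of the defer period.
    resend = (not req.acked) or req.num_timeout >= rskq_defer_accept - 1
    return expire, resend
```

With `rskq_defer_accept` set to 0 the function reduces to a plain retry counter, which is why a separate `max_retries` argument carries no extra information in this model.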
    • ethtool: add tunnel info interface · c7d759eb
      Jakub Kicinski committed
      Add an interface to report offloaded UDP ports via ethtool netlink.
      
      Now that the core takes care of tracking which UDP tunnel ports the
      NICs are aware of, we can quite easily export this information to
      user space.
      
      The responsibility of writing the netlink dumps is split between
      ethtool code and udp_tunnel_nic.c - since udp_tunnel module may
      not always be loaded, yet we should always report the capabilities
      of the NIC.
      
      $ ethtool --show-tunnels eth0
      Tunnel information for eth0:
        UDP port table 0:
          Size: 4
          Types: vxlan
          No entries
        UDP port table 1:
          Size: 4
          Types: geneve, vxlan-gpe
          Entries (1):
              port 1230, vxlan-gpe
      
      v4:
       - back to v2, build fix is now directly in udp_tunnel.h
      v3:
       - don't compile ETHTOOL_MSG_TUNNEL_INFO_GET in if CONFIG_INET
         not set.
      v2:
       - fix string set count,
       - reorder enums in the uAPI,
       - fix type of ETHTOOL_A_TUNNEL_UDP_TABLE_TYPES to bitset
         in docs and comments.
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
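The text output quoted in the commit message is regular enough to parse in a script. The sketch below is an illustration only: `parse_show_tunnels` is a hypothetical helper, and it assumes the layout stays exactly as in the sample above, which the commit does not promise.

```python
import re

# Sample copied from the commit message.
SAMPLE = """\
Tunnel information for eth0:
  UDP port table 0:
    Size: 4
    Types: vxlan
    No entries
  UDP port table 1:
    Size: 4
    Types: geneve, vxlan-gpe
    Entries (1):
        port 1230, vxlan-gpe
"""


def parse_show_tunnels(text):
    """Parse the sample layout into a list of per-table dicts."""
    tables = []
    for raw in text.splitlines():
        line = raw.strip()
        if m := re.fullmatch(r"UDP port table (\d+):", line):
            tables.append({"index": int(m.group(1)), "size": None,
                           "types": [], "entries": []})
        elif m := re.fullmatch(r"Size: (\d+)", line):
            tables[-1]["size"] = int(m.group(1))
        elif line.startswith("Types:"):
            names = line[len("Types:"):].split(",")
            tables[-1]["types"] = [n.strip() for n in names]
        elif m := re.fullmatch(r"port (\d+), (\S+)", line):
            tables[-1]["entries"].append((int(m.group(1)), m.group(2)))
    return tables


tables = parse_show_tunnels(SAMPLE)
```

For anything beyond a quick check, the ethtool netlink interface that this commit adds is the more robust source of the same data than scraping text.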
    • udp_tunnel: add central NIC RX port offload infrastructure · cc4e3835
      Jakub Kicinski committed
      Cater to devices which:
       (a) may want to sleep in the callbacks;
       (b) only have IPv4 support;
       (c) need all the programming to happen while the netdev is up.
      
      Drivers attach UDP tunnel offload info struct to their netdevs,
      where they declare how many UDP ports of various tunnel types
      they support. Core takes care of tracking which ports to offload.
      
      Use a fixed-size array since this matches what almost all drivers
      do, and avoids the complexity and uncertainty around memory
      allocations in an atomic context.
      
      Make sure that tunnel drivers don't try to replay the ports when
      a new NIC netdev is registered. Automatic replays would mess up
      reference counting, and will be removed completely once all drivers
      are converted.
      
      v4:
       - use a #define NULL to avoid build issues with CONFIG_INET=n.
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
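To make the fixed-size-array and reference-counting points concrete, here is a toy model of one port table. It is a sketch under stated assumptions, not the kernel's data structure: `PortTable`, its slot layout, and its return values are all hypothetical.

```python
class PortTable:
    """Toy fixed-size table of offloaded UDP tunnel ports (hypothetical)."""

    def __init__(self, size, tunnel_types):
        self.tunnel_types = frozenset(tunnel_types)
        # Allocated once up front; each slot is [port, type, use_count] or None.
        self.slots = [None] * size

    def add(self, port, tunnel_type):
        if tunnel_type not in self.tunnel_types:
            return False  # this table does not handle the tunnel type
        for slot in self.slots:
            if slot and slot[0] == port and slot[1] == tunnel_type:
                slot[2] += 1  # already offloaded: bump the use count
                return True
        for i, slot in enumerate(self.slots):
            if slot is None:
                self.slots[i] = [port, tunnel_type, 1]
                return True
        return False  # table full, port stays un-offloaded

    def remove(self, port, tunnel_type):
        for i, slot in enumerate(self.slots):
            if slot and slot[0] == port and slot[1] == tunnel_type:
                slot[2] -= 1
                if slot[2] == 0:
                    self.slots[i] = None  # last user gone, free the slot
                return True
        return False
```

In this model a replayed `add()` for a port that is already tracked would bump the use count a second time, and the slot would then never be released by a single `remove()`. That is the kind of reference-counting skew the commit message warns about.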
  7. 10 Jul, 2020 2 commits
    • tcp: make sure listeners don't initialize congestion-control state · ce69e563
      Christoph Paasch committed
      syzkaller found its way into setsockopt with TCP_CONGESTION "cdg".
      tcp_cdg_init() does a kcalloc to store the gradients. As sk_clone_lock
      just copies all the memory, the allocated pointer will be copied as
      well, if the app called setsockopt(..., TCP_CONGESTION) on the listener.
      If the cloned socket is then destroyed before its congestion-control
      has been properly initialized (through a call to tcp_init_transfer), we
      will end up freeing memory that does not belong to that particular
      socket, opening the door to a double-free:
      
      [   11.413102] ==================================================================
      [   11.414181] BUG: KASAN: double-free or invalid-free in tcp_cleanup_congestion_control+0x58/0xd0
      [   11.415329]
      [   11.415560] CPU: 3 PID: 4884 Comm: syz-executor.5 Not tainted 5.8.0-rc2 #80
      [   11.416544] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
      [   11.418148] Call Trace:
      [   11.418534]  <IRQ>
      [   11.418834]  dump_stack+0x7d/0xb0
      [   11.419297]  print_address_description.constprop.0+0x1a/0x210
      [   11.422079]  kasan_report_invalid_free+0x51/0x80
      [   11.423433]  __kasan_slab_free+0x15e/0x170
      [   11.424761]  kfree+0x8c/0x230
      [   11.425157]  tcp_cleanup_congestion_control+0x58/0xd0
      [   11.425872]  tcp_v4_destroy_sock+0x57/0x5a0
      [   11.426493]  inet_csk_destroy_sock+0x153/0x2c0
      [   11.427093]  tcp_v4_syn_recv_sock+0xb29/0x1100
      [   11.427731]  tcp_get_cookie_sock+0xc3/0x4a0
      [   11.429457]  cookie_v4_check+0x13d0/0x2500
      [   11.433189]  tcp_v4_do_rcv+0x60e/0x780
      [   11.433727]  tcp_v4_rcv+0x2869/0x2e10
      [   11.437143]  ip_protocol_deliver_rcu+0x23/0x190
      [   11.437810]  ip_local_deliver+0x294/0x350
      [   11.439566]  __netif_receive_skb_one_core+0x15d/0x1a0
      [   11.441995]  process_backlog+0x1b1/0x6b0
      [   11.443148]  net_rx_action+0x37e/0xc40
      [   11.445361]  __do_softirq+0x18c/0x61a
      [   11.445881]  asm_call_on_stack+0x12/0x20
      [   11.446409]  </IRQ>
      [   11.446716]  do_softirq_own_stack+0x34/0x40
      [   11.447259]  do_softirq.part.0+0x26/0x30
      [   11.447827]  __local_bh_enable_ip+0x46/0x50
      [   11.448406]  ip_finish_output2+0x60f/0x1bc0
      [   11.450109]  __ip_queue_xmit+0x71c/0x1b60
      [   11.451861]  __tcp_transmit_skb+0x1727/0x3bb0
      [   11.453789]  tcp_rcv_state_process+0x3070/0x4d3a
      [   11.456810]  tcp_v4_do_rcv+0x2ad/0x780
      [   11.457995]  __release_sock+0x14b/0x2c0
      [   11.458529]  release_sock+0x4a/0x170
      [   11.459005]  __inet_stream_connect+0x467/0xc80
      [   11.461435]  inet_stream_connect+0x4e/0xa0
      [   11.462043]  __sys_connect+0x204/0x270
      [   11.465515]  __x64_sys_connect+0x6a/0xb0
      [   11.466088]  do_syscall_64+0x3e/0x70
      [   11.466617]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [   11.467341] RIP: 0033:0x7f56046dc469
      [   11.467844] Code: Bad RIP value.
      [   11.468282] RSP: 002b:00007f5604dccdd8 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
      [   11.469326] RAX: ffffffffffffffda RBX: 000000000068bf00 RCX: 00007f56046dc469
      [   11.470379] RDX: 0000000000000010 RSI: 0000000020000000 RDI: 0000000000000004
      [   11.471311] RBP: 00000000ffffffff R08: 0000000000000000 R09: 0000000000000000
      [   11.472286] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
      [   11.473341] R13: 000000000041427c R14: 00007f5604dcd5c0 R15: 0000000000000003
      [   11.474321]
      [   11.474527] Allocated by task 4884:
      [   11.475031]  save_stack+0x1b/0x40
      [   11.475548]  __kasan_kmalloc.constprop.0+0xc2/0xd0
      [   11.476182]  tcp_cdg_init+0xf0/0x150
      [   11.476744]  tcp_init_congestion_control+0x9b/0x3a0
      [   11.477435]  tcp_set_congestion_control+0x270/0x32f
      [   11.478088]  do_tcp_setsockopt.isra.0+0x521/0x1a00
      [   11.478744]  __sys_setsockopt+0xff/0x1e0
      [   11.479259]  __x64_sys_setsockopt+0xb5/0x150
      [   11.479895]  do_syscall_64+0x3e/0x70
      [   11.480395]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [   11.481097]
      [   11.481321] Freed by task 4872:
      [   11.481783]  save_stack+0x1b/0x40
      [   11.482230]  __kasan_slab_free+0x12c/0x170
      [   11.482839]  kfree+0x8c/0x230
      [   11.483240]  tcp_cleanup_congestion_control+0x58/0xd0
      [   11.483948]  tcp_v4_destroy_sock+0x57/0x5a0
      [   11.484502]  inet_csk_destroy_sock+0x153/0x2c0
      [   11.485144]  tcp_close+0x932/0xfe0
      [   11.485642]  inet_release+0xc1/0x1c0
      [   11.486131]  __sock_release+0xc0/0x270
      [   11.486697]  sock_close+0xc/0x10
      [   11.487145]  __fput+0x277/0x780
      [   11.487632]  task_work_run+0xeb/0x180
      [   11.488118]  __prepare_exit_to_usermode+0x15a/0x160
      [   11.488834]  do_syscall_64+0x4a/0x70
      [   11.489326]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Wei Wang fixed a part of these CDG-malloc issues with commit c1201444
      ("tcp: memset ca_priv data to 0 properly").
      
      This patch fixes the listener scenario: we make sure that listeners
      setting the congestion-control through setsockopt won't initialize it
      (thus CDG never allocates on listeners). For those who use AF_UNSPEC to
      reuse a socket, tcp_disconnect() is changed to clean up afterwards.
      
      (The issue can be reproduced at least down to v4.4.x.)
      
      Cc: Wei Wang <weiwan@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Fixes: 2b0a8c9e ("tcp: add CDG congestion control")
      Signed-off-by: Christoph Paasch <cpaasch@apple.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
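The listener scenario can be replayed in a toy userspace model. Everything here is a hypothetical stand-in: `Heap` plays the role of the allocator plus KASAN, `copy.copy` plays the role of sk_clone_lock copying the parent's memory, and `skip_init_on_listener` models the fix described above.

```python
import copy


class Heap:
    """Toy allocator that reports double frees (stand-in for KASAN)."""

    def __init__(self):
        self.live = set()
        self.next_ptr = 1

    def alloc(self):
        ptr = self.next_ptr
        self.next_ptr += 1
        self.live.add(ptr)
        return ptr

    def free(self, ptr):
        if ptr not in self.live:
            raise RuntimeError("double-free or invalid-free")
        self.live.remove(ptr)


class Sock:
    def __init__(self, listening=False):
        self.listening = listening
        self.ca_priv = None  # pointer-like handle owned by congestion control


def set_congestion_control(heap, sk, skip_init_on_listener):
    if sk.listening and skip_init_on_listener:
        return  # fixed behaviour: listeners never allocate
    sk.ca_priv = heap.alloc()


def destroy(heap, sk):
    if sk.ca_priv is not None:
        heap.free(sk.ca_priv)
        sk.ca_priv = None


def scenario(fixed):
    heap = Heap()
    listener = Sock(listening=True)
    set_congestion_control(heap, listener, skip_init_on_listener=fixed)
    child = copy.copy(listener)  # shallow copy: the handle is shared
    child.listening = False
    try:
        destroy(heap, listener)  # listener is closed first
        destroy(heap, child)     # child dies before its own CC init
    except RuntimeError as err:
        return str(err)
    return "ok"
```

Calling `scenario(False)` reproduces the shared-handle problem and `scenario(True)` shows the effect of not initializing on listeners. The model ignores the tcp_disconnect() part of the patch.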
    • inet_diag: support for wider protocol numbers · 3f935c75
      Paolo Abeni committed
      After commit bf976514 ("sock: Make sk_protocol a 16-bit value")
      the current size of 'sdiag_protocol' is not sufficient to represent
      the possible protocol values.
      
      This change introduces a new inet diag request attribute to let
      user space specify the relevant protocol number using u32 values.
      
      The attribute is parsed by inet diag core on get/dump command
      and the extended protocol value, if available, is preferred to
      'sdiag_protocol' to lookup the diag handler.
      
      The parsed attributes are exposed to all the diag handlers via
      the cb->data.
      
      Note that inet_diag_dump_one_icsk() is left unmodified, as it
      will not be used by protocols using the extended attribute.
      Suggested-by: David S. Miller <davem@davemloft.net>
      Co-developed-by: Christoph Paasch <cpaasch@apple.com>
      Signed-off-by: Christoph Paasch <cpaasch@apple.com>
      Acked-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
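The size problem and the "prefer the extended value" rule can be illustrated with a short sketch. It assumes that 'sdiag_protocol' is an 8-bit field and uses IPPROTO_MPTCP (262) as the example of a protocol number that no longer fits. The handler table and function names are hypothetical labels, not kernel symbols.

```python
import struct

IPPROTO_TCP = 6
IPPROTO_MPTCP = 262  # larger than an 8-bit field can hold


def fits_in_u8(value):
    """True if value can be stored in an unsigned 8-bit field."""
    try:
        struct.pack("B", value)
        return True
    except struct.error:
        return False


def effective_protocol(sdiag_protocol, ext_protocol=None):
    """Prefer the extended u32 attribute when the request carries one."""
    return ext_protocol if ext_protocol is not None else sdiag_protocol


# Hypothetical handler registry keyed by protocol number.
HANDLERS = {IPPROTO_TCP: "tcp_diag", IPPROTO_MPTCP: "mptcp_diag"}


def lookup_handler(sdiag_protocol, ext_protocol=None):
    return HANDLERS.get(effective_protocol(sdiag_protocol, ext_protocol))
```

Requests that omit the attribute still resolve through 'sdiag_protocol' alone, so existing user space keeps working in this model.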
  8. 08 Jul, 2020 3 commits
  9. 07 Jul, 2020 1 commit
  10. 04 Jul, 2020 2 commits