1. 17 12月, 2021 3 次提交
    • E
      sit: do not call ipip6_dev_free() from sit_init_net() · e28587cc
      Eric Dumazet 提交于
      ipip6_dev_free is sit dev->priv_destructor, already called
      by register_netdevice() if something goes wrong.
      
      Alternative would be to make ipip6_dev_free() robust against
      multiple invocations, but other drivers do not implement this
      strategy.
      
      syzbot reported:
      
      dst_release underflow
      WARNING: CPU: 0 PID: 5059 at net/core/dst.c:173 dst_release+0xd8/0xe0 net/core/dst.c:173
      Modules linked in:
      CPU: 1 PID: 5059 Comm: syz-executor.4 Not tainted 5.16.0-rc5-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:dst_release+0xd8/0xe0 net/core/dst.c:173
      Code: 4c 89 f2 89 d9 31 c0 5b 41 5e 5d e9 da d5 44 f9 e8 1d 90 5f f9 c6 05 87 48 c6 05 01 48 c7 c7 80 44 99 8b 31 c0 e8 e8 67 29 f9 <0f> 0b eb 85 0f 1f 40 00 53 48 89 fb e8 f7 8f 5f f9 48 83 c3 a8 48
      RSP: 0018:ffffc9000aa5faa0 EFLAGS: 00010246
      RAX: d6894a925dd15a00 RBX: 00000000ffffffff RCX: 0000000000040000
      RDX: ffffc90005e19000 RSI: 000000000003ffff RDI: 0000000000040000
      RBP: 0000000000000000 R08: ffffffff816a1f42 R09: ffffed1017344f2c
      R10: ffffed1017344f2c R11: 0000000000000000 R12: 0000607f462b1358
      R13: 1ffffffff1bfd305 R14: ffffe8ffffcb1358 R15: dffffc0000000000
      FS:  00007f66c71a2700(0000) GS:ffff8880b9a00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f88aaed5058 CR3: 0000000023e0f000 CR4: 00000000003506f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <TASK>
       dst_cache_destroy+0x107/0x1e0 net/core/dst_cache.c:160
       ipip6_dev_free net/ipv6/sit.c:1414 [inline]
       sit_init_net+0x229/0x550 net/ipv6/sit.c:1936
       ops_init+0x313/0x430 net/core/net_namespace.c:140
       setup_net+0x35b/0x9d0 net/core/net_namespace.c:326
       copy_net_ns+0x359/0x5c0 net/core/net_namespace.c:470
       create_new_namespaces+0x4ce/0xa00 kernel/nsproxy.c:110
       unshare_nsproxy_namespaces+0x11e/0x180 kernel/nsproxy.c:226
       ksys_unshare+0x57d/0xb50 kernel/fork.c:3075
       __do_sys_unshare kernel/fork.c:3146 [inline]
       __se_sys_unshare kernel/fork.c:3144 [inline]
       __x64_sys_unshare+0x34/0x40 kernel/fork.c:3144
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      RIP: 0033:0x7f66c882ce99
      Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007f66c71a2168 EFLAGS: 00000246 ORIG_RAX: 0000000000000110
      RAX: ffffffffffffffda RBX: 00007f66c893ff60 RCX: 00007f66c882ce99
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000048040200
      RBP: 00007f66c8886ff1 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
      R13: 00007fff6634832f R14: 00007f66c71a2300 R15: 0000000000022000
       </TASK>
      
      Fixes: cf124db5 ("net: Fix inconsistent teardown and release of private netdev state.")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Link: https://lore.kernel.org/r/20211216111741.1387540-1-eric.dumazet@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      e28587cc
    • F
      net: systemport: Add global locking for descriptor lifecycle · 8b8e6e78
      Florian Fainelli 提交于
      The descriptor list is a shared resource across all of the transmit queues, and
      the locking mechanism used today only protects concurrency across a given
      transmit queue between the transmit and reclaiming. This creates an opportunity
      for the SYSTEMPORT hardware to work on corrupted descriptors if we have
      multiple producers at once which is the case when using multiple transmit
      queues.
      
      This was particularly noticeable when using multiple flows/transmit queues and
      it showed up in interesting ways in that UDP packets would get a correct UDP
      header checksum being calculated over an incorrect packet length. Similarly TCP
      packets would get an equally correct checksum computed by the hardware over an
      incorrect packet length.
      
      The SYSTEMPORT hardware maintains an internal descriptor list that it re-arranges
      when the driver produces a new descriptor anytime it writes to the
      WRITE_PORT_{HI,LO} registers, there is however some delay in the hardware to
      re-organize its descriptors and it is possible that concurrent TX queues
      eventually break this internal allocation scheme to the point where the
      length/status part of the descriptor gets used for an incorrect data buffer.
      
      The fix is to impose a global serialization for all TX queues in the short
      section where we are writing to the WRITE_PORT_{HI,LO} registers which solves
      the corruption even with multiple concurrent TX queues being used.
      
      Fixes: 80105bef ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver")
      Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Link: https://lore.kernel.org/r/20211215202450.4086240-1-f.fainelli@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      8b8e6e78
    • D
      net/smc: Prevent smc_release() from long blocking · 5c15b312
      D. Wythe 提交于
      In nginx/wrk benchmark, there's a hung problem with high probability
      on case likes that: (client will last several minutes to exit)
      
      server: smc_run nginx
      
      client: smc_run wrk -c 10000 -t 1 http://server
      
      Client hangs with the following backtrace:
      
      0 [ffffa7ce8Of3bbf8] __schedule at ffffffff9f9eOd5f
      1 [ffffa7ce8Of3bc88] schedule at ffffffff9f9eløe6
      2 [ffffa7ce8Of3bcaO] schedule_timeout at ffffffff9f9e3f3c
      3 [ffffa7ce8Of3bd2O] wait_for_common at ffffffff9f9el9de
      4 [ffffa7ce8Of3bd8O] __flush_work at ffffffff9fOfeOl3
      5 [ffffa7ce8øf3bdfO] smc_release at ffffffffcO697d24 [smc]
      6 [ffffa7ce8Of3be2O] __sock_release at ffffffff9f8O2e2d
      7 [ffffa7ce8Of3be4ø] sock_close at ffffffff9f8ø2ebl
      8 [ffffa7ce8øf3be48] __fput at ffffffff9f334f93
      9 [ffffa7ce8Of3be78] task_work_run at ffffffff9flOlff5
      10 [ffffa7ce8Of3beaO] do_exit at ffffffff9fOe5Ol2
      11 [ffffa7ce8Of3bflO] do_group_exit at ffffffff9fOe592a
      12 [ffffa7ce8Of3bf38] __x64_sys_exit_group at ffffffff9fOe5994
      13 [ffffa7ce8Of3bf4O] do_syscall_64 at ffffffff9f9d4373
      14 [ffffa7ce8Of3bfsO] entry_SYSCALL_64_after_hwframe at ffffffff9fa0007c
      
      This issue dues to flush_work(), which is used to wait for
      smc_connect_work() to finish in smc_release(). Once lots of
      smc_connect_work() was pending or all executing work dangling,
      smc_release() has to block until one worker comes to free, which
      is equivalent to wait another smc_connnect_work() to finish.
      
      In order to fix this, There are two changes:
      
      1. For those idle smc_connect_work(), cancel it from the workqueue; for
         executing smc_connect_work(), waiting for it to finish. For that
         purpose, replace flush_work() with cancel_work_sync().
      
      2. Since smc_connect() hold a reference for passive closing, if
         smc_connect_work() has been cancelled, release the reference.
      
      Fixes: 24ac3a08 ("net/smc: rebuild nonblocking connect")
      Reported-by: NTony Lu <tonylu@linux.alibaba.com>
      Tested-by: NDust Li <dust.li@linux.alibaba.com>
      Reviewed-by: NDust Li <dust.li@linux.alibaba.com>
      Reviewed-by: NTony Lu <tonylu@linux.alibaba.com>
      Signed-off-by: ND. Wythe <alibuda@linux.alibaba.com>
      Acked-by: NKarsten Graul <kgraul@linux.ibm.com>
      Link: https://lore.kernel.org/r/1639571361-101128-1-git-send-email-alibuda@linux.alibaba.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      5c15b312
  2. 16 12月, 2021 15 次提交
  3. 15 12月, 2021 9 次提交
  4. 14 12月, 2021 13 次提交