1. 11 Aug, 2022 1 commit
    • H
      net: fix refcount bug in sk_psock_get (2) · 2a013372
      Hawkins Jiawei committed
      Syzkaller reports refcount bug as follows:
      ------------[ cut here ]------------
      refcount_t: saturated; leaking memory.
      WARNING: CPU: 1 PID: 3605 at lib/refcount.c:19 refcount_warn_saturate+0xf4/0x1e0 lib/refcount.c:19
      Modules linked in:
      CPU: 1 PID: 3605 Comm: syz-executor208 Not tainted 5.18.0-syzkaller-03023-g7e062cda #0
       <TASK>
       __refcount_add_not_zero include/linux/refcount.h:163 [inline]
       __refcount_inc_not_zero include/linux/refcount.h:227 [inline]
       refcount_inc_not_zero include/linux/refcount.h:245 [inline]
       sk_psock_get+0x3bc/0x410 include/linux/skmsg.h:439
       tls_data_ready+0x6d/0x1b0 net/tls/tls_sw.c:2091
       tcp_data_ready+0x106/0x520 net/ipv4/tcp_input.c:4983
       tcp_data_queue+0x25f2/0x4c90 net/ipv4/tcp_input.c:5057
       tcp_rcv_state_process+0x1774/0x4e80 net/ipv4/tcp_input.c:6659
       tcp_v4_do_rcv+0x339/0x980 net/ipv4/tcp_ipv4.c:1682
       sk_backlog_rcv include/net/sock.h:1061 [inline]
       __release_sock+0x134/0x3b0 net/core/sock.c:2849
       release_sock+0x54/0x1b0 net/core/sock.c:3404
       inet_shutdown+0x1e0/0x430 net/ipv4/af_inet.c:909
       __sys_shutdown_sock net/socket.c:2331 [inline]
       __sys_shutdown_sock net/socket.c:2325 [inline]
       __sys_shutdown+0xf1/0x1b0 net/socket.c:2343
       __do_sys_shutdown net/socket.c:2351 [inline]
       __se_sys_shutdown net/socket.c:2349 [inline]
       __x64_sys_shutdown+0x50/0x70 net/socket.c:2349
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x46/0xb0
       </TASK>
      
      During the SMC fallback process in the connect syscall, the kernel
      replaces TCP with SMC. To forward wakeups to the smc socket
      waitqueue after fallback, the kernel sets clcsk->sk_user_data
      to the original smc socket in smc_fback_replace_callbacks().
      
      Later, in the shutdown syscall, the kernel calls
      sk_psock_get(), which treats clcsk->sk_user_data
      as a psock, triggering the refcount warning.
      
      The root cause is that both smc and psock use the
      sk_user_data field, so each can easily misinterpret
      what the other stored there.
      
      This patch solves it by using another bit (defined as
      SK_USER_DATA_PSOCK) in PTRMASK to mark whether
      sk_user_data points to a psock object or not.
      This patch depends on a PTRMASK introduced in commit f1ff5ce2
      ("net, sk_msg: Clear sk_user_data pointer on clone if tagged").
      
      Because more flags may be added to the sk_user_data field,
      this patch also refactors the sk_user_data flags code to be more
      generic and easier to maintain.
      
      Reported-and-tested-by: syzbot+5f26f85569bd179c18ce@syzkaller.appspotmail.com
      Suggested-by: Jakub Kicinski <kuba@kernel.org>
      Acked-by: Wen Gu <guwen@linux.alibaba.com>
      Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
      Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
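
The tagging idea in the commit above can be illustrated with a small userspace sketch. It is not the kernel code: `struct fake_sock`, the `UD_*` names and both helpers are hypothetical, and the only assumption carried over from the commit message is that spare low bits of an aligned pointer can record what kind of object the pointer refers to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits stored in the low bits of an aligned pointer. */
#define UD_FLAG_NOCOPY 1UL  /* illustrative: "do not copy on clone" */
#define UD_FLAG_PSOCK  2UL  /* illustrative: "points to a psock" */
#define UD_FLAGS_MASK  3UL
#define UD_PTR_MASK    (~UD_FLAGS_MASK)

struct fake_sock {
	uintptr_t user_data; /* pointer and flag bits packed together */
};

/* Store ptr together with flags; ptr must be at least 4-byte aligned. */
static void ud_assign(struct fake_sock *sk, void *ptr, uintptr_t flags)
{
	assert(((uintptr_t)ptr & UD_FLAGS_MASK) == 0);
	assert((flags & UD_PTR_MASK) == 0);
	sk->user_data = (uintptr_t)ptr | flags;
}

/* Return the pointer only if every bit in flags is set, else NULL. */
static void *ud_get_if_tagged(const struct fake_sock *sk, uintptr_t flags)
{
	if ((sk->user_data & flags) != flags)
		return NULL;
	return (void *)(sk->user_data & UD_PTR_MASK);
}
```

With this shape, a reader that expects a psock asks for `UD_FLAG_PSOCK` and gets `NULL` when another subsystem stored its own object in the field, instead of reinterpreting a foreign pointer.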
  2. 10 Aug, 2022 31 commits
    • J
      genetlink: correct uAPI defines · f329a0eb
      Jakub Kicinski committed
      Commit 50a896cf ("genetlink: properly support per-op policy dumping")
      seems to have copy'n'pasted things a little incorrectly.
      
      The #define CTRL_ATTR_MCAST_GRP_MAX should have stayed right
      after the previous enum. The new CTRL_ATTR_POLICY_* needs
      its own define for MAX and that max should not contain the
      superfluous _DUMP in the name.
      
      We probably can't do anything about the CTRL_ATTR_POLICY_DUMP_MAX
      any more, there's likely code which uses it. For consistency
      (*cough* codegen *cough*) let's add the correctly named define
      nonetheless.
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
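
The naming convention discussed in the commit above can be sketched as follows. All `DEMO_*` names are hypothetical and do not appear in the uAPI header; the sketch only shows each enum getting its own `_MAX` define directly after it, with a legacy alias kept for existing users.

```c
#include <assert.h>

/* Hypothetical attribute sets; the names are illustrative only. */
enum {
	DEMO_GRP_UNSPEC,
	DEMO_GRP_NAME,
	DEMO_GRP_ID,
	__DEMO_GRP_MAX,
};
/* The MAX define sits right after the enum it belongs to. */
#define DEMO_GRP_MAX (__DEMO_GRP_MAX - 1)

enum {
	DEMO_POLICY_UNSPEC,
	DEMO_POLICY_DO,
	DEMO_POLICY_DUMP,
	__DEMO_POLICY_MAX,
	/* Legacy alias kept only so existing users keep building. */
	DEMO_POLICY_DUMP_MAX = __DEMO_POLICY_MAX - 1
};
/* Correctly named define for the second enum. */
#define DEMO_POLICY_MAX (__DEMO_POLICY_MAX - 1)

/* Both spellings must name the same value. */
static_assert(DEMO_POLICY_MAX == DEMO_POLICY_DUMP_MAX,
	      "legacy alias and new define must agree");
```

Keeping the old name as an alias avoids breaking code that already uses it, while generated code can rely on the regular `<PREFIX>_MAX` pattern.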
    • I
      devlink: Fix use-after-free after a failed reload · 6b4db2e5
      Ido Schimmel committed
      After a failed devlink reload, devlink parameters are still registered,
      which means user space can set and get their values. In the case of the
      mlxsw "acl_region_rehash_interval" parameter, these operations will
      trigger a use-after-free [1].
      
      Fix this by rejecting set and get operations while in the failed state.
      Return the "-EOPNOTSUPP" error code which does not abort the parameters
      dump, but instead causes it to skip over the problematic parameter.
      
      Another possible fix is to perform these checks in the mlxsw parameter
      callbacks, but other drivers might be affected by the same problem and I
      am not aware of scenarios where these stricter checks will cause a
      regression.
      
      [1]
      mlxsw_spectrum3 0000:00:10.0: Port 125: Failed to register netdev
      mlxsw_spectrum3 0000:00:10.0: Failed to create ports
      
      ==================================================================
      BUG: KASAN: use-after-free in mlxsw_sp_acl_tcam_vregion_rehash_intrvl_get+0xbd/0xd0 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c:904
      Read of size 4 at addr ffff8880099dcfd8 by task kworker/u4:4/777
      
      CPU: 1 PID: 777 Comm: kworker/u4:4 Not tainted 5.19.0-rc7-custom-126601-gfe26f28c586d #1
      Hardware name: QEMU MSN4700, BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
      Workqueue: netns cleanup_net
      Call Trace:
       <TASK>
       __dump_stack lib/dump_stack.c:88 [inline]
       dump_stack_lvl+0x92/0xbd lib/dump_stack.c:106
       print_address_description mm/kasan/report.c:313 [inline]
       print_report.cold+0x5e/0x5cf mm/kasan/report.c:429
       kasan_report+0xb9/0xf0 mm/kasan/report.c:491
       __asan_report_load4_noabort+0x14/0x20 mm/kasan/report_generic.c:306
       mlxsw_sp_acl_tcam_vregion_rehash_intrvl_get+0xbd/0xd0 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c:904
       mlxsw_sp_acl_region_rehash_intrvl_get+0x49/0x60 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c:1106
       mlxsw_sp_params_acl_region_rehash_intrvl_get+0x33/0x80 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:3854
       devlink_param_get net/core/devlink.c:4981 [inline]
       devlink_nl_param_fill+0x238/0x12d0 net/core/devlink.c:5089
       devlink_param_notify+0xe5/0x230 net/core/devlink.c:5168
       devlink_ns_change_notify net/core/devlink.c:4417 [inline]
       devlink_ns_change_notify net/core/devlink.c:4396 [inline]
       devlink_reload+0x15f/0x700 net/core/devlink.c:4507
       devlink_pernet_pre_exit+0x112/0x1d0 net/core/devlink.c:12272
       ops_pre_exit_list net/core/net_namespace.c:152 [inline]
       cleanup_net+0x494/0xc00 net/core/net_namespace.c:582
       process_one_work+0x9fc/0x1710 kernel/workqueue.c:2289
       worker_thread+0x675/0x10b0 kernel/workqueue.c:2436
       kthread+0x30c/0x3d0 kernel/kthread.c:376
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306
       </TASK>
      
      The buggy address belongs to the physical page:
      page:ffffea0000267700 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x99dc
      flags: 0x100000000000000(node=0|zone=1)
      raw: 0100000000000000 0000000000000000 dead000000000122 0000000000000000
      raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff8880099dce80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
       ffff8880099dcf00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
      >ffff8880099dcf80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                                                          ^
       ffff8880099dd000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
       ffff8880099dd080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
      ==================================================================
      
      Fixes: 98bbf70c ("mlxsw: spectrum: add "acl_region_rehash_interval" devlink param")
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
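
A minimal userspace sketch of the approach described in the commit above: the core refuses parameter access while the device is in the failed state, and a dump treats "-EOPNOTSUPP" as "skip this parameter" rather than as a fatal error. The `demo_*` structures and functions are hypothetical stand-ins, not the devlink API.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real devlink structures. */
struct demo_param {
	const char *name;
	int (*get)(int *value);
};

struct demo_dev {
	bool reload_failed;
	const struct demo_param *params;
	size_t n_params;
};

/* Core-level get: refuse to call into the driver after a failed reload. */
static int demo_param_get(const struct demo_dev *dev,
			  const struct demo_param *p, int *value)
{
	assert(p->get != NULL);
	if (dev->reload_failed)
		return -EOPNOTSUPP;
	return p->get(value);
}

/* Dump: -EOPNOTSUPP skips one parameter, any other error aborts. */
static int demo_param_dump(const struct demo_dev *dev, size_t *n_reported)
{
	*n_reported = 0;
	for (size_t i = 0; i < dev->n_params; i++) {
		int value;
		int err = demo_param_get(dev, &dev->params[i], &value);

		if (err == -EOPNOTSUPP)
			continue;
		if (err)
			return err;
		(*n_reported)++;
	}
	return 0;
}

/* Example driver callback used for the demonstration. */
static int demo_get_interval(int *value)
{
	*value = 5000;
	return 0;
}
```

Placing the check in the core function rather than in each driver callback mirrors the reasoning in the commit message: every driver gets the protection without having to repeat it.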
    • S
      net:bonding:support balance-alb interface with vlan to bridge · d5410ac7
      Sun Shouxin committed
      In my test, balance-alb bonding was set up with two slaves, eth0 and eth1,
      and bond0.150 was then created as a VLAN interface attached to bond0.
      After adding bond0.150 to a Linux bridge, I noticed that bond0,
      bond0.150 and the bridge were all assigned the same MAC as eth0.
      Once bond0.150 receives a packet whose destination IP is the bridge's
      and whose destination MAC is eth1's, the Linux bridge does not find
      eth1's MAC in its FDB and does not handle the packet as expected.
      This patch fixes the issue. The topology is shown below:
      
      eth1(mac:eth1_mac)--bond0(balance-alb,mac:eth0_mac)--eth0(mac:eth0_mac)
                            |
                         bond0.150(mac:eth0_mac)
                            |
                         bridge(ip:br_ip, mac:eth0_mac)--other port
      Suggested-by: Hu Yadi <huyd12@chinatelecom.cn>
      Signed-off-by: Sun Shouxin <sunshouxin@chinatelecom.cn>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • C
      macsec: Fix traffic counters/statistics · 91ec9bd5
      Clayton Yager committed
      OutOctetsProtected, OutOctetsEncrypted, InOctetsValidated, and
      InOctetsDecrypted were incrementing by the total number of octets in frames
      instead of by the number of octets of User Data in frames.
      
      The Controlled Port statistics ifOutOctets and ifInOctets were incrementing
      by the total number of octets instead of the number of octets of the MSDUs
      plus octets of the destination and source MAC addresses.
      
      The Controlled Port statistics ifInDiscards and ifInErrors were not
      incrementing each time the counters they aggregate were.
      
      The Controlled Port statistic ifInErrors was not included in the output of
      macsec_get_stats64 so the value was not present in ip commands output.
      
      The ReceiveSA counters InPktsNotValid, InPktsNotUsingSA, and InPktsUnusedSA
      were not incrementing.
      Signed-off-by: Clayton Yager <Clayton_Yager@selinc.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • P
      vsock: Set socket state back to SS_UNCONNECTED in vsock_connect_timeout() · a3e7b29e
      Peilin Ye committed
      Imagine two non-blocking vsock_connect() requests on the same socket.
      The first request schedules @connect_work, and after it times out,
      vsock_connect_timeout() sets *sock* state back to TCP_CLOSE, but keeps
      *socket* state as SS_CONNECTING.
      
      Later, the second request returns -EALREADY, meaning the socket "already
      has a pending connection in progress", even though the first request has
      already timed out.
      
      As suggested by Stefano, fix it by setting *socket* state back to
      SS_UNCONNECTED, so that the second request will return -ETIMEDOUT.
      Suggested-by: Stefano Garzarella <sgarzare@redhat.com>
      Fixes: d021c344 ("VSOCK: Introduce VM Sockets")
      Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
      Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • P
      vsock: Fix memory leak in vsock_connect() · 7e97cfed
      Peilin Ye committed
      An O_NONBLOCK vsock_connect() request may try to reschedule
      @connect_work.  Imagine the following sequence of vsock_connect()
      requests:
      
        1. The 1st, non-blocking request schedules @connect_work, which will
           expire after 200 jiffies.  Socket state is now SS_CONNECTING;
      
        2. Later, the 2nd, blocking request gets interrupted by a signal after
           a few jiffies while waiting for the connection to be established.
           Socket state is back to SS_UNCONNECTED, but @connect_work is still
           pending, and will expire after 100 jiffies.
      
        3. Now, the 3rd, non-blocking request tries to schedule @connect_work
           again.  Since @connect_work is already scheduled,
           schedule_delayed_work() silently returns.  sock_hold() is called
           twice, but sock_put() will only be called once in
           vsock_connect_timeout(), causing a memory leak reported by syzbot:
      
        BUG: memory leak
        unreferenced object 0xffff88810ea56a40 (size 1232):
          comm "syz-executor756", pid 3604, jiffies 4294947681 (age 12.350s)
          hex dump (first 32 bytes):
            00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
            28 00 07 40 00 00 00 00 00 00 00 00 00 00 00 00  (..@............
          backtrace:
            [<ffffffff837c830e>] sk_prot_alloc+0x3e/0x1b0 net/core/sock.c:1930
            [<ffffffff837cbe22>] sk_alloc+0x32/0x2e0 net/core/sock.c:1989
            [<ffffffff842ccf68>] __vsock_create.constprop.0+0x38/0x320 net/vmw_vsock/af_vsock.c:734
            [<ffffffff842ce8f1>] vsock_create+0xc1/0x2d0 net/vmw_vsock/af_vsock.c:2203
            [<ffffffff837c0cbb>] __sock_create+0x1ab/0x2b0 net/socket.c:1468
            [<ffffffff837c3acf>] sock_create net/socket.c:1519 [inline]
            [<ffffffff837c3acf>] __sys_socket+0x6f/0x140 net/socket.c:1561
            [<ffffffff837c3bba>] __do_sys_socket net/socket.c:1570 [inline]
            [<ffffffff837c3bba>] __se_sys_socket net/socket.c:1568 [inline]
            [<ffffffff837c3bba>] __x64_sys_socket+0x1a/0x20 net/socket.c:1568
            [<ffffffff84512815>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
            [<ffffffff84512815>] do_syscall_64+0x35/0x80 arch/x86/entry/common.c:80
            [<ffffffff84600068>] entry_SYSCALL_64_after_hwframe+0x44/0xae
        <...>
      
      Use mod_delayed_work() instead: if @connect_work is already scheduled,
      reschedule it, and undo sock_hold() to keep the reference count
      balanced.
      
      Reported-and-tested-by: syzbot+b03f55bf128f9a38f064@syzkaller.appspotmail.com
      Fixes: d021c344 ("VSOCK: Introduce VM Sockets")
      Co-developed-by: Stefano Garzarella <sgarzare@redhat.com>
      Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
      Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
      Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
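
The reference-count balancing described in the commit above can be modeled in userspace. This is a sketch under stated assumptions: `struct demo_sock` and the `demo_*` functions are hypothetical, and `demo_mod_delayed_work()` only imitates the behavior the fix relies on, namely re-arming the work and reporting whether it was already pending.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of a socket with one delayed work item. */
struct demo_sock {
	int refcnt;
	bool work_pending;
	long expires;
};

/* Imitates a "modify delayed work" call: (re)arm the timer and report
 * whether the work item was already pending. */
static bool demo_mod_delayed_work(struct demo_sock *sk, long delay)
{
	bool was_pending = sk->work_pending;

	sk->work_pending = true;
	sk->expires = delay;
	return was_pending;
}

/* Arm the connect timeout, holding exactly one reference per pending work. */
static void demo_arm_connect_timeout(struct demo_sock *sk, long delay)
{
	sk->refcnt++;                 /* reference taken for the work item */
	if (demo_mod_delayed_work(sk, delay))
		sk->refcnt--;         /* pending work already holds one */
	assert(sk->refcnt >= 1);
}

/* Timeout handler: runs once and drops the work item's reference. */
static void demo_connect_timeout(struct demo_sock *sk)
{
	assert(sk->work_pending);
	sk->work_pending = false;
	sk->refcnt--;
}
```

Arming the timeout twice before it fires leaves a single extra reference, which the one timeout run then drops, so the count returns to its starting value.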
    • J
      Revert "net: usb: ax88179_178a needs FLAG_SEND_ZLP" · 6fd2c17f
      Jose Alonso committed
      This reverts commit 36a15e1c.
      
      The usage of FLAG_SEND_ZLP causes problems for other firmware/hardware
      versions that otherwise have no issues.
      
      The FLAG_SEND_ZLP is not safe to use in this context.
      See:
      https://patchwork.ozlabs.org/project/netdev/patch/1270599787.8900.8.camel@Linuxdev4-laptop/#118378
      The original problem needs to be solved another way.
      
      Fixes: 36a15e1c ("net: usb: ax88179_178a needs FLAG_SEND_ZLP")
      Cc: stable@vger.kernel.org
      Reported-by: Ronald Wahl <ronald.wahl@raritan.com>
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=216327
      Link: https://bugs.archlinux.org/task/75491
      Signed-off-by: Jose Alonso <joalonsof@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • T
      netlabel: fix typo in comment · 2cd0e8db
      Topi Miettinen committed
      'IPv4 and IPv4' should be 'IPv4 and IPv6'.
      Signed-off-by: Topi Miettinen <toiwoton@gmail.com>
      Acked-by: Paul Moore <paul@paul-moore.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • D
      Merge tag 'linux-can-fixes-for-6.0-20220810' of... · e7f16495
      David S. Miller committed
      Merge tag 'linux-can-fixes-for-6.0-20220810' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
      
      Marc Kleine-Budde says:
      
      ====================
      this is a pull request of 4 patches for net/master, with the
      whitespace issue fixed.
      
      Fedor Pchelkin contributes 2 fixes for the j1939 CAN protocol.
      
      A patch by me for the ems_usb driver fixes an unaligned access
      warning.
      
      Sebastian Würl's patch for the mcp251x driver fixes a race condition
      in the receive interrupt.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • J
      Merge branch 'do-not-use-rt_tos-for-ipv6-flowlabel' · 996237d9
      Jakub Kicinski committed
      Matthias May says:
      
      ====================
      Do not use RT_TOS for IPv6 flowlabel
      
      According to Guillaume Nault RT_TOS should never be used for IPv6.
      
      Quote:
      RT_TOS() is an old macro used to interpret IPv4 TOS as described in
      the obsolete RFC 1349. It's conceptually wrong to use it even in IPv4
      code, although, given the current state of the code, most of the
      existing calls have no consequence.
      
      But using RT_TOS() in IPv6 code is always a bug: IPv6 never had a "TOS"
      field to be interpreted the RFC 1349 way. There's no historical
      compatibility to worry about.
      ====================
      
      Link: https://lore.kernel.org/r/20220805191906.9323-1-matthias.may@westermo.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • M
      ipv6: do not use RT_TOS for IPv6 flowlabel · ab7e2e0d
      Matthias May committed
      According to Guillaume Nault RT_TOS should never be used for IPv6.
      
      Quote:
      RT_TOS() is an old macro used to interpret IPv4 TOS as described in
      the obsolete RFC 1349. It's conceptually wrong to use it even in IPv4
      code, although, given the current state of the code, most of the
      existing calls have no consequence.
      
      But using RT_TOS() in IPv6 code is always a bug: IPv6 never had a "TOS"
      field to be interpreted the RFC 1349 way. There's no historical
      compatibility to worry about.
      
      Fixes: 571912c6 ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.")
      Acked-by: Guillaume Nault <gnault@redhat.com>
      Signed-off-by: Matthias May <matthias.may@westermo.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • M
      mlx5: do not use RT_TOS for IPv6 flowlabel · bcb0da7f
      Matthias May committed
      According to Guillaume Nault RT_TOS should never be used for IPv6.
      
      Quote:
      RT_TOS() is an old macro used to interpret IPv4 TOS as described in
      the obsolete RFC 1349. It's conceptually wrong to use it even in IPv4
      code, although, given the current state of the code, most of the
      existing calls have no consequence.
      
      But using RT_TOS() in IPv6 code is always a bug: IPv6 never had a "TOS"
      field to be interpreted the RFC 1349 way. There's no historical
      compatibility to worry about.
      
      Fixes: ce99f6b9 ("net/mlx5e: Support SRIOV TC encapsulation offloads for IPv6 tunnels")
      Acked-by: Guillaume Nault <gnault@redhat.com>
      Signed-off-by: Matthias May <matthias.may@westermo.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • M
      vxlan: do not use RT_TOS for IPv6 flowlabel · e488d4f5
      Matthias May committed
      According to Guillaume Nault RT_TOS should never be used for IPv6.
      
      Quote:
      RT_TOS() is an old macro used to interpret IPv4 TOS as described in
      the obsolete RFC 1349. It's conceptually wrong to use it even in IPv4
      code, although, given the current state of the code, most of the
      existing calls have no consequence.
      
      But using RT_TOS() in IPv6 code is always a bug: IPv6 never had a "TOS"
      field to be interpreted the RFC 1349 way. There's no historical
      compatibility to worry about.
      
      Fixes: 1400615d ("vxlan: allow setting ipv6 traffic class")
      Acked-by: Guillaume Nault <gnault@redhat.com>
      Signed-off-by: Matthias May <matthias.may@westermo.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • M
      geneve: do not use RT_TOS for IPv6 flowlabel · ca2bb695
      Matthias May committed
      According to Guillaume Nault RT_TOS should never be used for IPv6.
      
      Quote:
      RT_TOS() is an old macro used to interpret IPv4 TOS as described in
      the obsolete RFC 1349. It's conceptually wrong to use it even in IPv4
      code, although, given the current state of the code, most of the
      existing calls have no consequence.
      
      But using RT_TOS() in IPv6 code is always a bug: IPv6 never had a "TOS"
      field to be interpreted the RFC 1349 way. There's no historical
      compatibility to worry about.
      
      Fixes: 3a56f86f ("geneve: handle ipv6 priority like ipv4 tos")
      Acked-by: Guillaume Nault <gnault@redhat.com>
      Signed-off-by: Matthias May <matthias.may@westermo.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • M
      geneve: fix TOS inheriting for ipv4 · b4ab94d6
      Matthias May committed
      The current code retrieves the TOS field after the lookup
      on the ipv4 routing table. The routing process currently
      only allows routing based on the original 3 TOS bits, and
      not on the full 6 DSCP bits.
      As a result the retrieved TOS is cut to the 3 bits.
      However for inheriting purposes the full 6 bits should be used.
      
      Extract the full 6 bits before the route lookup and use
      that instead of the cut off 3 TOS bits.
      
      Fixes: e305ac6c ("geneve: Add support to collect tunnel metadata.")
      Signed-off-by: Matthias May <matthias.may@westermo.com>
      Acked-by: Guillaume Nault <gnault@redhat.com>
      Link: https://lore.kernel.org/r/20220805190006.8078-1-matthias.may@westermo.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
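
The difference between the 3 routing TOS bits and the 6 DSCP bits described in the commit above can be shown with plain bit masks. The mask values and `demo_*` helpers are assumptions made for this sketch (3 legacy TOS bits at 0x1C, the 6-bit DSCP field at 0xFC within the IPv4 TOS byte); they are not taken from the patch itself, and the sketch leaves ECN handling out.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative masks, assumed for this sketch: 3 routing TOS bits
 * versus the 6-bit DSCP field of the IPv4 TOS byte. */
#define DEMO_RT_TOS_MASK 0x1Cu
#define DEMO_DSCP_MASK   0xFCu

/* Value usable as a routing key: only the legacy TOS bits survive. */
static uint8_t demo_route_tos(uint8_t tos)
{
	return tos & DEMO_RT_TOS_MASK;
}

/* Value to inherit into the outer header: keep all six DSCP bits. */
static uint8_t demo_inherit_dscp(uint8_t tos)
{
	uint8_t dscp = tos & DEMO_DSCP_MASK;

	assert((dscp & 0x03u) == 0); /* this sketch drops the two ECN bits */
	return dscp;
}
```

For an inner TOS byte of 0xB8, `demo_route_tos()` yields 0x18 while `demo_inherit_dscp()` yields 0xB8. This is why the value to inherit has to be extracted before it is narrowed for the route lookup.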
    • C
      net: atlantic: fix aq_vec index out of range error · 2ba5e47f
      Chia-Lin Kao (AceLan) committed
      The final update statement of the for loop goes past the end of the
      array: self->aq_vec[i] is dereferenced without a bounds check, which
      leads to the index-out-of-range error.
      Also fix the same coding pattern in the other for loops.
      
      [   97.937604] UBSAN: array-index-out-of-bounds in drivers/net/ethernet/aquantia/atlantic/aq_nic.c:1404:48
      [   97.937607] index 8 is out of range for type 'aq_vec_s *[8]'
      [   97.937608] CPU: 38 PID: 3767 Comm: kworker/u256:18 Not tainted 5.19.0+ #2
      [   97.937610] Hardware name: Dell Inc. Precision 7865 Tower/, BIOS 1.0.0 06/12/2022
      [   97.937611] Workqueue: events_unbound async_run_entry_fn
      [   97.937616] Call Trace:
      [   97.937617]  <TASK>
      [   97.937619]  dump_stack_lvl+0x49/0x63
      [   97.937624]  dump_stack+0x10/0x16
      [   97.937626]  ubsan_epilogue+0x9/0x3f
      [   97.937627]  __ubsan_handle_out_of_bounds.cold+0x44/0x49
      [   97.937629]  ? __scm_send+0x348/0x440
      [   97.937632]  ? aq_vec_stop+0x72/0x80 [atlantic]
      [   97.937639]  aq_nic_stop+0x1b6/0x1c0 [atlantic]
      [   97.937644]  aq_suspend_common+0x88/0x90 [atlantic]
      [   97.937648]  aq_pm_suspend_poweroff+0xe/0x20 [atlantic]
      [   97.937653]  pci_pm_suspend+0x7e/0x1a0
      [   97.937655]  ? pci_pm_suspend_noirq+0x2b0/0x2b0
      [   97.937657]  dpm_run_callback+0x54/0x190
      [   97.937660]  __device_suspend+0x14c/0x4d0
      [   97.937661]  async_suspend+0x23/0x70
      [   97.937663]  async_run_entry_fn+0x33/0x120
      [   97.937664]  process_one_work+0x21f/0x3f0
      [   97.937666]  worker_thread+0x4a/0x3c0
      [   97.937668]  ? process_one_work+0x3f0/0x3f0
      [   97.937669]  kthread+0xf0/0x120
      [   97.937671]  ? kthread_complete_and_exit+0x20/0x20
      [   97.937672]  ret_from_fork+0x22/0x30
      [   97.937676]  </TASK>
      
      v2. fixed "warning: variable 'aq_vec' set but not used"
      
      v3. simplified a for loop
      
      Fixes: 97bde5c4 ("net: ethernet: aquantia: Support for NIC-specific code")
      Signed-off-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
      Acked-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
      Link: https://lore.kernel.org/r/20220808081845.42005-1-acelan.kao@canonical.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
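
The loop problem described in the commit above can be sketched with a hypothetical `struct demo_nic`; none of these names come from the driver. The risky shape is kept as a comment only, and the function shows the form where the array is indexed in the body, after the loop condition has been checked.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_VEC_MAX 8

struct demo_vec { int stopped; };

struct demo_nic {
	struct demo_vec *vec[DEMO_VEC_MAX];
	size_t vec_cnt;
};

/*
 * Risky shape (shown only as a comment): the update expression reads
 * vec[i] after ++i, so the last update evaluates vec[vec_cnt], which is
 * one past the end when vec_cnt == DEMO_VEC_MAX:
 *
 *   for (i = 0, v = nic->vec[0]; nic->vec_cnt > i; ++i, v = nic->vec[i])
 */

/* Safer shape: index inside the body, after the bound check has passed. */
static size_t demo_stop_all(struct demo_nic *nic)
{
	size_t i;

	assert(nic->vec_cnt <= DEMO_VEC_MAX);
	for (i = 0; i < nic->vec_cnt; i++)
		nic->vec[i]->stopped = 1;
	return i; /* number of vectors visited */
}
```

With a full array of 8 entries, the safer form touches indices 0 through 7 and never evaluates index 8, the index named in the UBSAN report.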
    • J
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf · 690bf643
      Jakub Kicinski committed
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following patchset contains Netfilter fixes for net:
      
      1) Harden set element field checks to avoid out-of-bound memory access,
         this patch also fixes the type of issue described in 7e6bc1f6
         ("netfilter: nf_tables: stricter validation of element data") in a
         broader way.
      
      2) Patches to restrict the chain, set, and rule id lookup in the
         transaction to the corresponding top-level table, patches from
         Thadeu Lima de Souza Cascardo.
      
      3) Fix incorrect comment in ip6t_LOG.h
      
      4) nft_data_init() performs upfront validation of the expected data.
         struct nft_data_desc is used to describe the expected data to be
         received from userspace. The .size field represents the maximum size
         that can be stored, for bound checks. Then, .len is an input/output field
         which stores the expected length as input (this is optional, to restrict
         the checks), as output it stores the real length received from userspace
         (if it was not specified as input). This patch comes in response to
         7e6bc1f6 ("netfilter: nf_tables: stricter validation of element data")
         to address this type of issue in a more generic way by avoiding open-coded
         data validation. The next patch requires this as a dependency.
      
      5) Disallow jumps to an implicit chain from a set element; this configuration
         is invalid. Only jumps to a chain via an immediate expression are
         supported at this stage.
      
      6) Fix a possible null-pointer dereference in the error path of table updates,
         if memory allocation of the transaction fails. From Florian Westphal.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
        netfilter: nf_tables: fix null deref due to zeroed list head
        netfilter: nf_tables: disallow jump to implicit chain from set element
        netfilter: nf_tables: upfront validation of data via nft_data_init()
        netfilter: ip6t_LOG: Fix a typo in a comment
        netfilter: nf_tables: do not allow RULE_ID to refer to another chain
        netfilter: nf_tables: do not allow CHAIN_ID to refer to another table
        netfilter: nf_tables: do not allow SET_ID to refer to another table
        netfilter: nf_tables: validate variable length element extension
      ====================
      
      Link: https://lore.kernel.org/r/20220809220532.130240-1-pablo@netfilter.org/
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
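
Item 4 of the pull request above describes a descriptor with a maximum size and an input/output length. A minimal sketch of that idea follows. `struct demo_data_desc` and `demo_data_init()` are hypothetical and not the nf_tables API, and using 0 to mean "no expected length given" is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical descriptor; field meanings follow the description above. */
struct demo_data_desc {
	unsigned int size; /* capacity of the destination buffer */
	unsigned int len;  /* in: expected length (0 = any); out: real length */
};

/* Validate a received length up front, before any copy takes place. */
static int demo_data_init(struct demo_data_desc *desc, unsigned int recv_len)
{
	assert(desc->len <= desc->size);
	if (recv_len > desc->size)
		return -EINVAL;   /* would overflow the destination */
	if (desc->len != 0 && recv_len != desc->len)
		return -EINVAL;   /* caller asked for an exact length */
	desc->len = recv_len;     /* report the real length back */
	return 0;
}
```

Centralizing the check in one initializer means callers declare what they can accept instead of each repeating its own bounds check.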
    • S
      can: mcp251x: Fix race condition on receive interrupt · d80d60b0
      Sebastian Würl committed
      The mcp251x driver uses both receiving mailboxes of the CAN controller
      chips. For retrieving the CAN frames from the controller via SPI, it checks
      once per interrupt which mailboxes have been filled and will retrieve the
      messages accordingly.
      
      This introduces a race condition, as another CAN frame can enter mailbox 1
      while mailbox 0 is emptied. If now another CAN frame enters mailbox 0 until
      the interrupt handler is called next, mailbox 0 is emptied before
      mailbox 1, leading to out-of-order CAN frames in the network device.
      
      This is fixed by checking the interrupt flags once again after freeing
      mailbox 0, to correctly also empty mailbox 1 before leaving the handler.
      
      For reproducing the bug I created the following setup:
       - Two CAN devices, one Raspberry Pi with MCP2515, the other can be any.
       - Setup CAN to 1 MHz
       - Spam bursts of 5 CAN-messages with increasing CAN-ids
       - Continue sending the bursts while sleeping a second between the bursts
       - Check on the RPi whether the received messages have increasing CAN-ids
       - Without this patch, every burst of messages will contain a flipped pair
      
      v3: https://lore.kernel.org/all/20220804075914.67569-1-sebastian.wuerl@ororatech.com
      v2: https://lore.kernel.org/all/20220804064803.63157-1-sebastian.wuerl@ororatech.com
      v1: https://lore.kernel.org/all/20220803153300.58732-1-sebastian.wuerl@ororatech.com
      
      Fixes: bf66f373 ("can: mcp251x: Move to threaded interrupts instead of workqueues.")
      Signed-off-by: Sebastian Würl <sebastian.wuerl@ororatech.com>
      Link: https://lore.kernel.org/all/20220804081411.68567-1-sebastian.wuerl@ororatech.com
      [mkl: reduce scope of intf1, eflag1]
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
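
The reordering described in the commit above can be reproduced with a small userspace model. Everything here is hypothetical (`struct demo_ctrl`, the `DEMO_RX*` flags, the `recheck` switch) and no SPI access is modeled; the sketch only contrasts sampling the flags once per interrupt with re-reading them after mailbox 0 has been emptied.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_RX0 0x1u
#define DEMO_RX1 0x2u

/* Hypothetical model of a controller with two receive mailboxes. */
struct demo_ctrl {
	unsigned int flags;    /* DEMO_RX0 / DEMO_RX1 set when a mailbox is full */
	int mbox[2];           /* frame id held by each mailbox */
	int arrive_during_rx0; /* frame landing in mailbox 1 during a read, or -1 */
};

struct demo_log {
	int frames[8];
	size_t n;
};

static void demo_deliver(struct demo_log *log, int frame)
{
	assert(log->n < 8);
	log->frames[log->n++] = frame;
}

/* A new frame enters mailbox 0 between interrupts. */
static void demo_frame_to_rx0(struct demo_ctrl *c, int frame)
{
	c->mbox[0] = frame;
	c->flags |= DEMO_RX0;
}

/* Read mailbox 0; a frame may arrive in mailbox 1 while this is going on. */
static void demo_read_rx0(struct demo_ctrl *c, struct demo_log *log)
{
	demo_deliver(log, c->mbox[0]);
	c->flags &= ~DEMO_RX0;
	if (c->arrive_during_rx0 >= 0) {
		c->mbox[1] = c->arrive_during_rx0;
		c->flags |= DEMO_RX1;
		c->arrive_during_rx0 = -1;
	}
}

static void demo_read_rx1(struct demo_ctrl *c, struct demo_log *log)
{
	demo_deliver(log, c->mbox[1]);
	c->flags &= ~DEMO_RX1;
}

/* One interrupt; recheck selects whether flags are re-read after mailbox 0. */
static void demo_irq(struct demo_ctrl *c, struct demo_log *log, int recheck)
{
	unsigned int intf = c->flags; /* flags sampled once on entry */

	if (intf & DEMO_RX0) {
		demo_read_rx0(c, log);
		if (recheck)
			intf |= c->flags & DEMO_RX1; /* pick up a late mailbox 1 */
	}
	if (intf & DEMO_RX1)
		demo_read_rx1(c, log);
}
```

In this model, frame 1 sits in mailbox 0, frame 2 lands in mailbox 1 during the first read, and frame 3 arrives in mailbox 0 before the next interrupt. Without the re-check the frames are delivered as 1, 3, 2; with it they are delivered as 1, 2, 3.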
    • F
      plip: avoid rcu debug splat · bc3c8fe3
      Florian Westphal committed
      WARNING: suspicious RCU usage
      5.2.0-rc2-00605-g2638eb8b #1 Not tainted
      drivers/net/plip/plip.c:1110 suspicious rcu_dereference_check() usage!
      
      plip_open is called with RTNL held, switch to the correct helper.
      
      Fixes: 2638eb8b ("net: ipv4: provide __rcu annotation for ifa_list")
      Reported-by: kernel test robot <oliver.sang@intel.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Link: https://lore.kernel.org/r/20220807115304.13257-1-fw@strlen.de
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • S
      net: bgmac: Fix a BUG triggered by wrong bytes_compl · 1b7680c6
      Sandor Bodo-Merle committed
      On one of our machines we got:
      
      kernel BUG at lib/dynamic_queue_limits.c:27!
      Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
      CPU: 0 PID: 1166 Comm: irq/41-bgmac Tainted: G        W  O    4.14.275-rt132 #1
      Hardware name: BRCM XGS iProc
      task: ee3415c0 task.stack: ee32a000
      PC is at dql_completed+0x168/0x178
      LR is at bgmac_poll+0x18c/0x6d8
      pc : [<c03b9430>]    lr : [<c04b5a18>]    psr: 800a0313
      sp : ee32be14  ip : 000005ea  fp : 00000bd4
      r10: ee558500  r9 : c0116298  r8 : 00000002
      r7 : 00000000  r6 : ef128810  r5 : 01993267  r4 : 01993851
      r3 : ee558000  r2 : 000070e1  r1 : 00000bd4  r0 : ee52c180
      Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
      Control: 12c5387d  Table: 8e88c04a  DAC: 00000051
      Process irq/41-bgmac (pid: 1166, stack limit = 0xee32a210)
      Stack: (0xee32be14 to 0xee32c000)
      be00:                                              ee558520 ee52c100 ef128810
      be20: 00000000 00000002 c0116298 c04b5a18 00000000 c0a0c8c4 c0951780 00000040
      be40: c0701780 ee558500 ee55d520 ef05b340 ef6f9780 ee558520 00000001 00000040
      be60: ffffe000 c0a56878 ef6fa040 c0952040 0000012c c0528744 ef6f97b0 fffcfb6a
      be80: c0a04104 2eda8000 c0a0c4ec c0a0d368 ee32bf44 c0153534 ee32be98 ee32be98
      bea0: ee32bea0 ee32bea0 ee32bea8 ee32bea8 00000000 c01462e4 ffffe000 ef6f22a8
      bec0: ffffe000 00000008 ee32bee4 c0147430 ffffe000 c094a2a8 00000003 ffffe000
      bee0: c0a54528 00208040 0000000c c0a0c8c4 c0a65980 c0124d3c 00000008 ee558520
      bf00: c094a23c c0a02080 00000000 c07a9910 ef136970 ef136970 ee30a440 ef136900
      bf20: ee30a440 00000001 ef136900 ee30a440 c016d990 00000000 c0108db0 c012500c
      bf40: ef136900 c016da14 ee30a464 ffffe000 00000001 c016dd14 00000000 c016db28
      bf60: ffffe000 ee21a080 ee30a400 00000000 ee32a000 ee30a440 c016dbfc ee25fd70
      bf80: ee21a09c c013edcc ee32a000 ee30a400 c013ec7c 00000000 00000000 00000000
      bfa0: 00000000 00000000 00000000 c0108470 00000000 00000000 00000000 00000000
      bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      bfe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
      [<c03b9430>] (dql_completed) from [<c04b5a18>] (bgmac_poll+0x18c/0x6d8)
      [<c04b5a18>] (bgmac_poll) from [<c0528744>] (net_rx_action+0x1c4/0x494)
      [<c0528744>] (net_rx_action) from [<c0124d3c>] (do_current_softirqs+0x1ec/0x43c)
      [<c0124d3c>] (do_current_softirqs) from [<c012500c>] (__local_bh_enable+0x80/0x98)
      [<c012500c>] (__local_bh_enable) from [<c016da14>] (irq_forced_thread_fn+0x84/0x98)
      [<c016da14>] (irq_forced_thread_fn) from [<c016dd14>] (irq_thread+0x118/0x1c0)
      [<c016dd14>] (irq_thread) from [<c013edcc>] (kthread+0x150/0x158)
      [<c013edcc>] (kthread) from [<c0108470>] (ret_from_fork+0x14/0x24)
      Code: a83f15e0 0200001a 0630a0e1 c3ffffea (f201f0e7)
      
      The issue seems similar to commit 90b3b339 ("net: hisilicon: Fix a BUG
      trigered by wrong bytes_compl") and potentially introduced by commit
      b38c83dd ("bgmac: simplify tx ring index handling").
      
      If there is an RX interrupt between setting ring->end
      and netdev_sent_queue() we can hit the BUG_ON as bgmac_dma_tx_free()
      can miscalculate the queue size while called from bgmac_poll().
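
      The ordering problem can be illustrated with a small userspace sketch.
      It assumes a simplified stand-in for the byte-queue-limit counters; the
      names toy_dql, toy_sent() and toy_completed() are hypothetical and are
      not the kernel's API. The sketch only shows why completing bytes that
      have not yet been accounted as sent trips the check.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the byte-queue-limit counters. */
struct toy_dql {
	unsigned int num_queued;    /* bytes accounted as sent */
	unsigned int num_completed; /* bytes accounted as completed */
};

/* Models the "sent" accounting step done on the transmit path. */
void toy_sent(struct toy_dql *dql, unsigned int bytes)
{
	dql->num_queued += bytes;
}

/*
 * Models the completion step. Returns false where the real code would
 * hit its BUG_ON: more bytes completed than are outstanding.
 */
bool toy_completed(struct toy_dql *dql, unsigned int bytes)
{
	if (bytes > dql->num_queued - dql->num_completed)
		return false;
	dql->num_completed += bytes;
	return true;
}

/* Racy order: the poll path completes the packet before it is accounted. */
bool racy_order(unsigned int len)
{
	struct toy_dql dql = { 0, 0 };
	bool ok = toy_completed(&dql, len); /* poll runs in the window */

	toy_sent(&dql, len);                /* accounting arrives too late */
	return ok;
}

/* Safe order: account the bytes first, then let completion see them. */
bool safe_order(unsigned int len)
{
	struct toy_dql dql = { 0, 0 };

	toy_sent(&dql, len);
	return toy_completed(&dql, len);
}
```

      In this model racy_order() fails for any non-zero length while
      safe_order() succeeds, which suggests accounting the bytes before the
      ring index becomes visible to the poll path.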
      
      The machine which triggered the BUG runs a v4.14 RT kernel - but the issue
      seems present in mainline too.
      
      Fixes: b38c83dd ("bgmac: simplify tx ring index handling")
      Signed-off-by: Sandor Bodo-Merle <sbodomerle@gmail.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Link: https://lore.kernel.org/r/20220808173939.193804-1-sbodomerle@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • V
      net: dsa: felix: suppress non-changes to the tagging protocol · 4c46bb49
      Committed by Vladimir Oltean
      The way in which dsa_tree_change_tag_proto() works is that when
      dsa_tree_notify() fails, it doesn't know whether the operation failed
      mid way in a multi-switch tree, or it failed for a single-switch tree.
      So even though drivers need to fail cleanly in
      ds->ops->change_tag_protocol(), DSA will still call dsa_tree_notify()
      again, to restore the old tag protocol for potential switches in the
      tree where the change did succeed (before failing for others).
      
      This means for the felix driver that if we report an error in
      felix_change_tag_protocol(), we'll get another call where proto_ops ==
      old_proto_ops. If we proceed to act upon that, we may do unexpected
      things. For example, we will call dsa_tag_8021q_register() twice in a
      row, without any dsa_tag_8021q_unregister() in between. Then we will
      actually call dsa_tag_8021q_unregister() via old_proto_ops->teardown,
      which (if it manages to run at all, after walking through corrupted data
      structures) will leave the ports inoperational anyway.
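
      A minimal sketch of suppressing such a non-change, assuming a
      simplified model of the driver: toy_switch, toy_proto_ops and
      toy_change_proto() are hypothetical names, not the felix driver's real
      structures. The double_setups counter exists only to make the
      unbalanced registration visible.

```c
#include <stddef.h>

struct toy_switch;

/* Hypothetical, simplified per-protocol callbacks. */
struct toy_proto_ops {
	int (*setup)(struct toy_switch *sw);
	void (*teardown)(struct toy_switch *sw);
};

struct toy_switch {
	const struct toy_proto_ops *cur; /* currently active protocol */
	int registered;                  /* 1 while protocol state exists */
	int double_setups;               /* setups issued while registered */
};

int toy_setup(struct toy_switch *sw)
{
	if (sw->registered)
		sw->double_setups++; /* register without unregister in between */
	sw->registered = 1;
	return 0;
}

void toy_teardown(struct toy_switch *sw)
{
	sw->registered = 0;
}

const struct toy_proto_ops toy_8021q_ops = { toy_setup, toy_teardown };

int toy_change_proto(struct toy_switch *sw,
		     const struct toy_proto_ops *new_ops)
{
	const struct toy_proto_ops *old_ops = sw->cur;
	int err;

	/* A "change" to the protocol already in use is treated as a no-op. */
	if (new_ops == old_ops)
		return 0;

	err = new_ops->setup(sw);
	if (err)
		return err;
	if (old_ops)
		old_ops->teardown(sw);
	sw->cur = new_ops;
	return 0;
}
```

      Without the early return, a second call with the same ops would run
      setup on already-registered state and then tear it down, leaving the
      toy switch unregistered.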
      
      The bug can be readily reproduced if we force an error while in
      tag_8021q mode; this crashes the kernel.
      
      echo ocelot-8021q > /sys/class/net/eno2/dsa/tagging
      echo edsa > /sys/class/net/eno2/dsa/tagging # -EPROTONOSUPPORT
      
      Unable to handle kernel NULL pointer dereference at virtual address 0000000000000014
      Call trace:
       vcap_entry_get+0x24/0x124
       ocelot_vcap_filter_del+0x198/0x270
       felix_tag_8021q_vlan_del+0xd4/0x21c
       dsa_switch_tag_8021q_vlan_del+0x168/0x2cc
       dsa_switch_event+0x68/0x1170
       dsa_tree_notify+0x14/0x34
       dsa_port_tag_8021q_vlan_del+0x84/0x110
       dsa_tag_8021q_unregister+0x15c/0x1c0
       felix_tag_8021q_teardown+0x16c/0x180
       felix_change_tag_protocol+0x1bc/0x230
       dsa_switch_event+0x14c/0x1170
       dsa_tree_change_tag_proto+0x118/0x1c0
      
      Fixes: 7a29d220 ("net: dsa: felix: reimplement tagging protocol change with function pointers")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Link: https://lore.kernel.org/r/20220808125127.3344094-1-vladimir.oltean@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • J
      Merge tag 'wireless-2022-08-09' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless · 7ba0fa7f
      Committed by Jakub Kicinski
      Kalle Valo says:
      
      ====================
      wireless fixes for v6.0
      
      First set of fixes for v6.0. Small one this time, fix a cfg80211
      warning seen with brcmfmac and remove an unnecessary inline keyword
      from wilc1000.
      
      * tag 'wireless-2022-08-09' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
        wifi: wilc1000: fix spurious inline in wilc_handle_disconnect()
        wifi: cfg80211: Fix validating BSS pointers in __cfg80211_connect_result
      ====================
      
      Link: https://lore.kernel.org/r/20220809164756.B1DAEC433D6@smtp.kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • F
      netfilter: nf_tables: fix null deref due to zeroed list head · 58007785
      Committed by Florian Westphal
      In nf_tables_updtable, if nf_tables_table_enable returns an error,
      nft_trans_destroy is called to free the transaction object.
      
      nft_trans_destroy() calls list_del(), but the transaction was never
      placed on a list -- the list head is all zeroes, which results in
      a null dereference:
      
      BUG: KASAN: null-ptr-deref in nft_trans_destroy+0x26/0x59
      Call Trace:
       nft_trans_destroy+0x26/0x59
       nf_tables_newtable+0x4bc/0x9bc
       [..]
      
      It's sane to assume that nft_trans_destroy() can be called
      on the transaction object returned by nft_trans_alloc(), so
      make sure the list head is initialised.
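
      The idea can be sketched in userspace with a tiny doubly linked list.
      This is an illustration under assumptions, not the nf_tables code: all
      toy_* names are hypothetical. A zeroed list head has NULL next/prev
      pointers, so unlinking it dereferences NULL; a self-linked head makes
      the unlink harmless.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical minimal circular doubly linked list node. */
struct toy_list_head {
	struct toy_list_head *next, *prev;
};

/* An initialised head points at itself in both directions. */
void toy_init_list_head(struct toy_list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Unlink an entry; dereferences next and prev, so they must be valid. */
void toy_list_del(struct toy_list_head *e)
{
	e->next->prev = e->prev;
	e->prev->next = e->next;
}

struct toy_trans {
	struct toy_list_head list;
	int msg_type;
};

/* Allocate a transaction and initialise its list head up front. */
struct toy_trans *toy_trans_alloc(int msg_type)
{
	struct toy_trans *t = calloc(1, sizeof(*t));

	if (!t)
		return NULL;
	toy_init_list_head(&t->list); /* without this, next/prev stay NULL */
	t->msg_type = msg_type;
	return t;
}

/* Safe even if the transaction was never added to any list. */
void toy_trans_destroy(struct toy_trans *t)
{
	toy_list_del(&t->list);
	free(t);
}
```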
      
      Fixes: 55dd6f93 ("netfilter: nf_tables: use new transaction infrastructure to handle table")
      Reported-by: mingi cho <mgcho.minic@gmail.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • P
      netfilter: nf_tables: disallow jump to implicit chain from set element · f323ef3a
      Committed by Pablo Neira Ayuso
      Extend struct nft_data_desc to add a flag field that specifies
      nft_data_init() is being called for set element data.
      
      Use it to disallow jumps to an implicit chain from a set element; only
      a jump to such a chain via an immediate expression is allowed.
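
      A rough sketch of such a flag check, under assumptions: the
      TOY_DESC_SETELEM flag, the toy_* structures and the -EINVAL return
      value are hypothetical and chosen for illustration only.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag: the data being initialised belongs to a set element. */
#define TOY_DESC_SETELEM 0x1u

struct toy_chain {
	bool implicit; /* true for an implicit chain */
};

struct toy_data_desc {
	unsigned int flags;
};

/* Reject a jump to an implicit chain when it comes from a set element. */
int toy_verdict_init(const struct toy_data_desc *desc,
		     const struct toy_chain *target)
{
	if ((desc->flags & TOY_DESC_SETELEM) && target->implicit)
		return -EINVAL;
	return 0;
}
```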
      
      Fixes: d0e2c7de ("netfilter: nf_tables: add NFT_CHAIN_BINDING")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • P
      netfilter: nf_tables: upfront validation of data via nft_data_init() · 341b6941
      Committed by Pablo Neira Ayuso
      Instead of parsing the data and then validating that the type and length
      are correct, pass a description of the expected data so it can be
      validated upfront, before parsing, to bail out earlier.
      
      This patch adds a new .size field to specify the maximum size of the
      data area. The .len field is optional and is used as an input/output
      field: in the input path it provides the specific length of the expected
      data. If the .len field is not specified, then the length obtained from the
      netlink attribute is stored. This is required by cmp, bitwise, range and
      immediate, which provide no netlink attribute that describes the data
      length. The immediate expression uses the destination register type to
      infer the expected data type.
      
      Relying on opencoded validation of the expected data might lead to
      subtle bugs as described in 7e6bc1f6 ("netfilter: nf_tables:
      stricter validation of element data").
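
      The .size/.len contract described above can be sketched as follows.
      This is a simplified model, not the nf_tables implementation: the
      toy_data_desc layout, toy_data_init() and the specific error codes are
      assumptions made for illustration.

```c
#include <errno.h>

/* Hypothetical description of the data a caller expects. */
struct toy_data_desc {
	unsigned int size; /* maximum size of the destination data area */
	unsigned int len;  /* in: expected length, 0 if unspecified;
			    * out: length actually accepted */
};

/* Validate the attribute length against the description before parsing. */
int toy_data_init(struct toy_data_desc *desc, unsigned int attr_len)
{
	if (attr_len > desc->size)
		return -EOVERFLOW;  /* would not fit in the data area */

	if (desc->len) {
		if (attr_len != desc->len)
			return -EINVAL; /* caller required an exact length */
	} else {
		desc->len = attr_len;   /* report the length that was seen */
	}
	return 0;
}
```

      A caller that knows the exact length sets .len beforehand; one that
      does not leaves it at zero and reads the stored length afterwards.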
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • C
      netfilter: ip6t_LOG: Fix a typo in a comment · 13494168
      Committed by Christophe JAILLET
      s/_IPT_LOG_H/_IP6T_LOG_H/
      
      While at it, add some surrounding space to ease reading.
      Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • T
      netfilter: nf_tables: do not allow RULE_ID to refer to another chain · 36d5b291
      Committed by Thadeu Lima de Souza Cascardo
      When looking up rules in the same batch by ID, a rule from a different
      chain can be used. If a rule is added to one chain (chain1) but asks to
      be positioned next to a rule from a different chain (chain2), it will be
      linked into chain2, but the use counter of chain1 would be the one
      incremented.
      
      When looking for rules by ID, use the chain that was used for the lookup by
      name. The chain used in the context copied to the transaction needs to
      match that same chain. That way, struct nft_rule does not need to get
      enlarged with another member.
      
      Fixes: 1a94e38d ("netfilter: nf_tables: add NFTA_RULE_ID attribute")
      Fixes: 75dd48e2 ("netfilter: nf_tables: Support RULE_ID reference in new rule")
      Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • T
      netfilter: nf_tables: do not allow CHAIN_ID to refer to another table · 95f466d2
      Committed by Thadeu Lima de Souza Cascardo
      When looking up chains in the same batch by ID, a chain from a different
      table can be used. If a rule is added to one table (table1) but refers
      to a chain in a different table (table2), it will be linked to the chain
      in table2, but would have expressions referring to objects in table1.
      
      Then, when table1 is removed, the rule will not be removed as it's linked to
      a chain in table2. When expressions in the rule are processed or removed,
      that will lead to a use-after-free.
      
      When looking for chains by ID, use the table that was used for the lookup
      by name, and only return chains belonging to that same table.
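
      A minimal sketch of a lookup that is restricted to the owning table,
      assuming a flat array of pending chains; toy_table, toy_chain and
      toy_chain_lookup_byid() are hypothetical names, not the nf_tables
      functions.

```c
#include <stddef.h>

/* Hypothetical table object; only its identity (address) matters here. */
struct toy_table {
	int family;
};

/* Hypothetical chain created earlier in the same batch. */
struct toy_chain {
	const struct toy_table *table; /* table that owns this chain */
	unsigned int id;               /* batch-local ID */
};

/* Return the chain with the given ID only if it belongs to 'table'. */
const struct toy_chain *
toy_chain_lookup_byid(const struct toy_chain *chains, size_t n,
		      const struct toy_table *table, unsigned int id)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (chains[i].id == id && chains[i].table == table)
			return &chains[i];
	}
	return NULL; /* no match, or the match lives in another table */
}
```

      With this filter, an ID that exists only in another table is treated
      the same as an unknown ID.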
      
      Fixes: 837830a4 ("netfilter: nf_tables: add NFTA_RULE_CHAIN_ID attribute")
      Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • T
      netfilter: nf_tables: do not allow SET_ID to refer to another table · 470ee20e
      Committed by Thadeu Lima de Souza Cascardo
      When looking up sets in the same batch by ID, a set from a different
      table can be used.
      
      Then, when the table is removed, a reference to the set may be kept after
      the set is freed, leading to a potential use-after-free.
      
      When looking for sets by ID, use the table that was used for the lookup by
      name, and only return sets belonging to that same table.
      
      This fixes CVE-2022-2586, also reported as ZDI-CAN-17470.
      
      Reported-by: Team Orca of Sea Security (@seasecresponse)
      Fixes: 958bee14 ("netfilter: nf_tables: use new transaction infrastructure to handle sets")
      Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • P
      netfilter: nf_tables: validate variable length element extension · 34aae2c2
      Committed by Pablo Neira Ayuso
      Update template to validate variable length extensions. This patch adds
      a new .ext_len[id] field to the template to store the expected extension
      length. This is used to sanity check the initialization of the variable
      length extension.
      
      Use PTR_ERR() in nft_set_elem_init() to report errors since, after this
      update, there are two reasons why this might fail: either ENOMEM or
      insufficient room in the extension field (EINVAL).
      
      Kernels up until 7e6bc1f6 ("netfilter: nf_tables: stricter
      validation of element data") allowed more data to be copied to the
      extension than was allocated. The ext_len field makes it possible to
      validate, as an additional check, that the destination has the correct size.
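
      The length check can be sketched like this, under assumptions: the
      toy_ext_tmpl layout, the TOY_EXT_NUM constant and toy_ext_init() are
      hypothetical, and the real template carries more fields than the
      per-extension length shown here.

```c
#include <errno.h>
#include <string.h>

#define TOY_EXT_NUM 4 /* hypothetical number of extension types */

/* Hypothetical template: expected length of each extension, by ID. */
struct toy_ext_tmpl {
	unsigned char ext_len[TOY_EXT_NUM];
};

/* Copy 'len' bytes into the extension only if the template has room. */
int toy_ext_init(const struct toy_ext_tmpl *tmpl, unsigned int id,
		 unsigned char *dst, const unsigned char *src,
		 unsigned int len)
{
	if (id >= TOY_EXT_NUM)
		return -EINVAL;
	if (len > tmpl->ext_len[id])
		return -EINVAL; /* insufficient room in the extension */

	memcpy(dst, src, len);
	return 0;
}
```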
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  3. 09 Aug, 2022 8 commits