1. 09 7月, 2021 7 次提交
    • I
      net/ncsi: fix restricted cast warning of sparse · 27fa107d
      Ivan Mikhaylov 提交于
      Sparse reports:
      net/ncsi/ncsi-rsp.c:406:24: warning: cast to restricted __be32
      net/ncsi/ncsi-manage.c:732:33: warning: cast to restricted __be32
      net/ncsi/ncsi-manage.c:756:25: warning: cast to restricted __be32
      net/ncsi/ncsi-manage.c:779:25: warning: cast to restricted __be32
      Signed-off-by: NIvan Mikhaylov <i.mikhaylov@yadro.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      27fa107d
    • R
      net: microchip: sparx5: fix kconfig warning · 96248d6d
      Randy Dunlap 提交于
      PHY_SPARX5_SERDES depends on OF so SPARX5_SWITCH should also depend
      on OF since 'select' does not follow any dependencies.
      
      WARNING: unmet direct dependencies detected for PHY_SPARX5_SERDES
        Depends on [n]: (ARCH_SPARX5 || COMPILE_TEST [=n]) && OF [=n] && HAS_IOMEM [=y]
        Selected by [y]:
        - SPARX5_SWITCH [=y] && NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_MICROCHIP [=y] && NET_SWITCHDEV [=y] && HAS_IOMEM [=y]
      
      Fixes: 3cfa11ba ("net: sparx5: add the basic sparx5 driver")
      Signed-off-by: NRandy Dunlap <rdunlap@infradead.org>
      Cc: Lars Povlsen <lars.povlsen@microchip.com>
      Cc: Steen Hegelund <Steen.Hegelund@microchip.com>
      Cc: UNGLinuxDriver@microchip.com
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: netdev@vger.kernel.org
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      96248d6d
    • S
      cxgb4: fix IRQ free race during driver unload · 015fe6fd
      Shahjada Abul Husain 提交于
      IRQs are requested during driver's ndo_open() and then later
      freed up in disable_interrupts() during driver unload.
      A race exists where driver can set the CXGB4_FULL_INIT_DONE
      flag in ndo_open() after the disable_interrupts() in driver
      unload path checks it, and hence misses calling free_irq().
      
      Fix by unregistering netdevice first and sync with driver's
      ndo_open(). This ensures disable_interrupts() checks the flag
      correctly and frees up the IRQs properly.
      
      Fixes: b37987e8 ("cxgb4: Disable interrupts and napi before unregistering netdev")
      Signed-off-by: NShahjada Abul Husain <shahjada@chelsio.com>
      Signed-off-by: NRaju Rangoju <rajur@chelsio.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      015fe6fd
    • A
      mt76: mt7921: continue to probe driver when fw already downloaded · c3426904
      Aaron Ma 提交于
      When reboot system, no power cycles, firmware is already downloaded,
      return -EIO will break driver as error:
      mt7921e: probe of 0000:03:00.0 failed with error -5
      
      Skip firmware download and continue to probe.
      Signed-off-by: NAaron Ma <aaron.ma@canonical.com>
      Fixes: 1c099ab4 ("mt76: mt7921: add MCU support")
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c3426904
    • G
      atl1c: fix Mikrotik 10/25G NIC detection · b9d233ea
      Gatis Peisenieks 提交于
      Since Mikrotik 10/25G NIC MDIO op emulation is not 100% reliable,
      on rare occasions it can happen that some physical functions of
      the NIC do not get initialized due to timeouted early MDIO op.
      
      This changes the atl1c probe on Mikrotik 10/25G NIC not to
      depend on MDIO op emulation.
      Signed-off-by: NGatis Peisenieks <gatis@mikrotik.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b9d233ea
    • J
      ptp: Relocate lookup cookie to correct block. · debdd8e3
      Jonathan Lemon 提交于
      An earlier commit set the pps_lookup cookie, but the line
      was somehow added to the wrong code block.  Correct this.
      
      Fixes: 8602e40f ("ptp: Set lookup cookie when creating a PTP PPS source.")
      Signed-off-by: NJonathan Lemon <jonathan.lemon@gmail.com>
      Signed-off-by: NDario Binacchi <dariobin@libero.it>
      Acked-by: NRichard Cochran <richardcochran@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      debdd8e3
    • E
      ipv6: tcp: drop silly ICMPv6 packet too big messages · c7bb4b89
      Eric Dumazet 提交于
      While TCP stack scales reasonably well, there is still one part that
      can be used to DDOS it.
      
      IPv6 Packet too big messages have to lookup/insert a new route,
      and if abused by attackers, can easily put hosts under high stress,
      with many cpus contending on a spinlock while one is stuck in fib6_run_gc()
      
      ip6_protocol_deliver_rcu()
       icmpv6_rcv()
        icmpv6_notify()
         tcp_v6_err()
          tcp_v6_mtu_reduced()
           inet6_csk_update_pmtu()
            ip6_rt_update_pmtu()
             __ip6_rt_update_pmtu()
              ip6_rt_cache_alloc()
               ip6_dst_alloc()
                dst_alloc()
                 ip6_dst_gc()
                  fib6_run_gc()
                   spin_lock_bh() ...
      
      Some of our servers have been hit by malicious ICMPv6 packets
      trying to _increase_ the MTU/MSS of TCP flows.
      
      We believe these ICMPv6 packets are a result of a bug in one ISP stack,
      since they were blindly sent back for _every_ (small) packet sent to them.
      
      These packets are for one TCP flow:
      09:24:36.266491 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240
      09:24:36.266509 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240
      09:24:36.316688 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240
      09:24:36.316704 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240
      09:24:36.608151 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240
      
      TCP stack can filter some silly requests :
      
      1) MTU below IPV6_MIN_MTU can be filtered early in tcp_v6_err()
      2) tcp_v6_mtu_reduced() can drop requests trying to increase current MSS.
      
      This tests happen before the IPv6 routing stack is entered, thus
      removing the potential contention and route exhaustion.
      
      Note that IPv6 stack was performing these checks, but too late
      (ie : after the route has been added, and after the potential
      garbage collect war)
      
      v2: fix typo caught by Martin, thanks !
      v3: exports tcp_mtu_to_mss(), caught by David, thanks !
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reviewed-by: NMaciej Żenczykowski <maze@google.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Acked-by: NMartin KaFai Lau <kafai@fb.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c7bb4b89
  2. 08 7月, 2021 8 次提交
  3. 07 7月, 2021 15 次提交
    • D
      netfilter: uapi: refer to nfnetlink_conntrack.h, not nf_conntrack_netlink.h · d322957e
      Duncan Roe 提交于
      nf_conntrack_netlink.h does not exist, refer to nfnetlink_conntrack.h instead.
      Signed-off-by: NDuncan Roe <duncan_roe@optusnet.com.au>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      d322957e
    • N
      ipv6: fix 'disable_policy' for fwd packets · ccd27f05
      Nicolas Dichtel 提交于
      The goal of commit df789fe7 ("ipv6: Provide ipv6 version of
      "disable_policy" sysctl") was to have the disable_policy from ipv4
      available on ipv6.
      However, it's not exactly the same mechanism. On IPv4, all packets coming
      from an interface, which has disable_policy set, bypass the policy check.
      For ipv6, this is done only for local packets, ie for packets destinated to
      an address configured on the incoming interface.
      
      Let's align ipv6 with ipv4 so that the 'disable_policy' sysctl has the same
      effect for both protocols.
      
      My first approach was to create a new kind of route cache entries, to be
      able to set DST_NOPOLICY without modifying routes. This would have added a
      lot of code. Because the local delivery path is already handled, I choose
      to focus on the forwarding path to minimize code churn.
      
      Fixes: df789fe7 ("ipv6: Provide ipv6 version of "disable_policy" sysctl")
      Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ccd27f05
    • C
      octeontx2-pf: Fix assigned error return value that is never used · ad1f3797
      Colin Ian King 提交于
      Currently when the call to otx2_mbox_alloc_msg_cgx_mac_addr_update fails
      the error return variable rc is being assigned -ENOMEM and does not
      return early. rc is then re-assigned and the error case is not handled
      correctly. Fix this by returning -ENOMEM rather than assigning rc.
      
      Addresses-Coverity: ("Unused value")
      Fixes: 79d2be38 ("octeontx2-pf: offload DMAC filters to CGX/RPM block")
      Signed-off-by: NColin Ian King <colin.king@canonical.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ad1f3797
    • D
      Merge branch 'bonding-ipsec' · 5ddef2ad
      David S. Miller 提交于
      Taehee Yoo says:
      
      ====================
      net: fix bonding ipsec offload problems
      
      This series fixes some problems related to bonding ipsec offload.
      
      The 1, 5, and 8th patches are to add a missing rcu_read_lock().
      The 2nd patch is to add null check code to bond_ipsec_add_sa.
      When bonding interface doesn't have an active real interface, the
      bond->curr_active_slave pointer is null.
      But bond_ipsec_add_sa() uses that pointer without null check.
      So that it results in null-ptr-deref.
      The 3 and 4th patches are to replace xs->xso.dev with xs->xso.real_dev.
      The 6th patch is to disallow to set ipsec offload if a real interface
      type is bonding.
      The 7th patch is to add struct bond_ipsec to manage SA.
      If bond mode is changed, or active real interface is changed, SA should
      be removed from old current active real interface then it should be added
      to new active real interface.
      But it can't, because it doesn't manage SA.
      The 9th patch is to fix incorrect return value of bond_ipsec_offload_ok().
      
      v1 -> v2:
       - Add 9th patch.
       - Do not print warning when there is no SA in bond_ipsec_add_sa_all().
       - Add comment for ipsec_lock.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5ddef2ad
    • T
      bonding: fix incorrect return value of bond_ipsec_offload_ok() · 168e696a
      Taehee Yoo 提交于
      bond_ipsec_offload_ok() is called to check whether the interface supports
      ipsec offload or not.
      bonding interface support ipsec offload only in active-backup mode.
      So, if a bond interface is not in active-backup mode, it should return
      false but it returns true.
      
      Fixes: a3b658cf ("bonding: allow xfrm offload setup post-module-load")
      Signed-off-by: NTaehee Yoo <ap420073@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      168e696a
    • T
      bonding: fix suspicious RCU usage in bond_ipsec_offload_ok() · 955b785e
      Taehee Yoo 提交于
      To dereference bond->curr_active_slave, it uses rcu_dereference().
      But it and the caller doesn't acquire RCU so a warning occurs.
      So add rcu_read_lock().
      
      Splat looks like:
      WARNING: suspicious RCU usage
      5.13.0-rc6+ #1179 Not tainted
      drivers/net/bonding/bond_main.c:571 suspicious
      rcu_dereference_check() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 2, debug_locks = 1
      1 lock held by ping/974:
       #0: ffff888109e7db70 (sk_lock-AF_INET){+.+.}-{0:0},
      at: raw_sendmsg+0x1303/0x2cb0
      
      stack backtrace:
      CPU: 2 PID: 974 Comm: ping Not tainted 5.13.0-rc6+ #1179
      Call Trace:
       dump_stack+0xa4/0xe5
       bond_ipsec_offload_ok+0x1f4/0x260 [bonding]
       xfrm_output+0x179/0x890
       xfrm4_output+0xfa/0x410
       ? __xfrm4_output+0x4b0/0x4b0
       ? __ip_make_skb+0xecc/0x2030
       ? xfrm4_udp_encap_rcv+0x800/0x800
       ? ip_local_out+0x21/0x3a0
       ip_send_skb+0x37/0xa0
       raw_sendmsg+0x1bfd/0x2cb0
      
      Fixes: 18cb261a ("bonding: support hardware encryption offload to slaves")
      Signed-off-by: NTaehee Yoo <ap420073@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      955b785e
    • T
      bonding: Add struct bond_ipesc to manage SA · 9a560550
      Taehee Yoo 提交于
      bonding has been supporting ipsec offload.
      When SA is added, bonding just passes SA to its own active real interface.
      But it doesn't manage SA.
      So, when events(add/del real interface, active real interface change, etc)
      occur, bonding can't handle that well because It doesn't manage SA.
      So some problems(panic, UAF, refcnt leak)occur.
      
      In order to make it stable, it should manage SA.
      That's the reason why struct bond_ipsec is added.
      When a new SA is added to bonding interface, it is stored in the
      bond_ipsec list. And the SA is passed to a current active real interface.
      If events occur, it uses bond_ipsec data to handle these events.
      bond->ipsec_list is protected by bond->ipsec_lock.
      
      If a current active real interface is changed, the following logic works.
      1. delete all SAs from old active real interface
      2. Add all SAs to the new active real interface.
      3. If a new active real interface doesn't support ipsec offload or SA's
      option, it sets real_dev to NULL.
      
      Fixes: 18cb261a ("bonding: support hardware encryption offload to slaves")
      Signed-off-by: NTaehee Yoo <ap420073@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9a560550
    • T
      bonding: disallow setting nested bonding + ipsec offload · b1216933
      Taehee Yoo 提交于
      bonding interface can be nested and it supports ipsec offload.
      So, it allows setting the nested bonding + ipsec scenario.
      But code does not support this scenario.
      So, it should be disallowed.
      
      interface graph:
      bond2
         |
      bond1
         |
      eth0
      
      The nested bonding + ipsec offload may not a real usecase.
      So, disallowing this scenario is fine.
      
      Fixes: 18cb261a ("bonding: support hardware encryption offload to slaves")
      Signed-off-by: NTaehee Yoo <ap420073@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b1216933
    • T
      bonding: fix suspicious RCU usage in bond_ipsec_del_sa() · a22c39b8
      Taehee Yoo 提交于
      To dereference bond->curr_active_slave, it uses rcu_dereference().
      But it and the caller doesn't acquire RCU so a warning occurs.
      So add rcu_read_lock().
      
      Test commands:
          ip netns add A
          ip netns exec A bash
          modprobe netdevsim
          echo "1 1" > /sys/bus/netdevsim/new_device
          ip link add bond0 type bond
          ip link set eth0 master bond0
          ip link set eth0 up
          ip link set bond0 up
          ip x s add proto esp dst 14.1.1.1 src 15.1.1.1 spi 0x07 mode \
      transport reqid 0x07 replay-window 32 aead 'rfc4106(gcm(aes))' \
      0x44434241343332312423222114131211f4f3f2f1 128 sel src 14.0.0.52/24 \
      dst 14.0.0.70/24 proto tcp offload dev bond0 dir in
          ip x s f
      
      Splat looks like:
      =============================
      WARNING: suspicious RCU usage
      5.13.0-rc3+ #1168 Not tainted
      -----------------------------
      drivers/net/bonding/bond_main.c:448 suspicious rcu_dereference_check()
      usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 2, debug_locks = 1
      2 locks held by ip/705:
       #0: ffff888106701780 (&net->xfrm.xfrm_cfg_mutex){+.+.}-{3:3},
      at: xfrm_netlink_rcv+0x59/0x80 [xfrm_user]
       #1: ffff8880075b0098 (&x->lock){+.-.}-{2:2},
      at: xfrm_state_delete+0x16/0x30
      
      stack backtrace:
      CPU: 6 PID: 705 Comm: ip Not tainted 5.13.0-rc3+ #1168
      Call Trace:
       dump_stack+0xa4/0xe5
       bond_ipsec_del_sa+0x16a/0x1c0 [bonding]
       __xfrm_state_delete+0x51f/0x730
       xfrm_state_delete+0x1e/0x30
       xfrm_state_flush+0x22f/0x390
       xfrm_flush_sa+0xd8/0x260 [xfrm_user]
       ? xfrm_flush_policy+0x290/0x290 [xfrm_user]
       xfrm_user_rcv_msg+0x331/0x660 [xfrm_user]
       ? rcu_read_lock_sched_held+0x91/0xc0
       ? xfrm_user_state_lookup.constprop.39+0x320/0x320 [xfrm_user]
       ? find_held_lock+0x3a/0x1c0
       ? mutex_lock_io_nested+0x1210/0x1210
       ? sched_clock_cpu+0x18/0x170
       netlink_rcv_skb+0x121/0x350
      [ ... ]
      
      Fixes: 18cb261a ("bonding: support hardware encryption offload to slaves")
      Signed-off-by: NTaehee Yoo <ap420073@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a22c39b8
    • T
      ixgbevf: use xso.real_dev instead of xso.dev in callback functions of struct xfrmdev_ops · 2de7e4f6
      Taehee Yoo 提交于
      There are two pointers in struct xfrm_state_offload, *dev, *real_dev.
      These are used in callback functions of struct xfrmdev_ops.
      The *dev points whether bonding interface or real interface.
      If bonding ipsec offload is used, it points bonding interface If not,
      it points real interface.
      And real_dev always points real interface.
      So, ixgbevf should always use real_dev instead of dev.
      Of course, real_dev always not be null.
      
      Test commands:
          ip link add bond0 type bond
          #eth0 is ixgbevf interface
          ip link set eth0 master bond0
          ip link set bond0 up
          ip x s add proto esp dst 14.1.1.1 src 15.1.1.1 spi 0x07 mode \
      transport reqid 0x07 replay-window 32 aead 'rfc4106(gcm(aes))' \
      0x44434241343332312423222114131211f4f3f2f1 128 sel src 14.0.0.52/24 \
      dst 14.0.0.70/24 proto tcp offload dev bond0 dir in
      
      Splat looks like:
      KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
      CPU: 6 PID: 688 Comm: ip Not tainted 5.13.0-rc3+ #1168
      RIP: 0010:ixgbevf_ipsec_find_empty_idx+0x28/0x1b0 [ixgbevf]
      Code: 00 00 0f 1f 44 00 00 55 53 48 89 fb 48 83 ec 08 40 84 f6 0f 84 9c
      00 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02
      84 c0 74 08 3c 01 0f 8e 4c 01 00 00 66 81 3b 00 04 0f
      RSP: 0018:ffff8880089af390 EFLAGS: 00010246
      RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000001
      RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
      RBP: ffff8880089af4f8 R08: 0000000000000003 R09: fffffbfff4287e11
      R10: 0000000000000001 R11: ffff888005de8908 R12: 0000000000000000
      R13: ffff88810936a000 R14: ffff88810936a000 R15: ffff888004d78040
      FS:  00007fdf9883a680(0000) GS:ffff88811a400000(0000)
      knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 000055bc14adbf40 CR3: 000000000b87c005 CR4: 00000000003706e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       ixgbevf_ipsec_add_sa+0x1bf/0x9c0 [ixgbevf]
       ? rcu_read_lock_sched_held+0x91/0xc0
       ? ixgbevf_ipsec_parse_proto_keys.isra.9+0x280/0x280 [ixgbevf]
       ? lock_acquire+0x191/0x720
       ? bond_ipsec_add_sa+0x48/0x350 [bonding]
       ? lockdep_hardirqs_on_prepare+0x3e0/0x3e0
       ? rcu_read_lock_held+0x91/0xa0
       ? rcu_read_lock_sched_held+0xc0/0xc0
       bond_ipsec_add_sa+0x193/0x350 [bonding]
       xfrm_dev_state_add+0x2a9/0x770
       ? memcpy+0x38/0x60
       xfrm_add_sa+0x2278/0x3b10 [xfrm_user]
       ? xfrm_get_policy+0xaa0/0xaa0 [xfrm_user]
       ? register_lock_class+0x1750/0x1750
       xfrm_user_rcv_msg+0x331/0x660 [xfrm_user]
       ? rcu_read_lock_sched_held+0x91/0xc0
       ? xfrm_user_state_lookup.constprop.39+0x320/0x320 [xfrm_user]
       ? find_held_lock+0x3a/0x1c0
       ? mutex_lock_io_nested+0x1210/0x1210
       ? sched_clock_cpu+0x18/0x170
       netlink_rcv_skb+0x121/0x350
      [ ... ]
      
      Fixes: 272c2330 ("xfrm: bail early on slave pass over skb")
      Signed-off-by: NTaehee Yoo <ap420073@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2de7e4f6
    • T
      net: netdevsim: use xso.real_dev instead of xso.dev in callback functions of struct xfrmdev_ops · 09adf756
      Taehee Yoo 提交于
      There are two pointers in struct xfrm_state_offload, *dev, *real_dev.
      These are used in callback functions of struct xfrmdev_ops.
      The *dev points whether bonding interface or real interface.
      If bonding ipsec offload is used, it points bonding interface If not,
      it points real interface.
      And real_dev always points real interface.
      So, netdevsim should always use real_dev instead of dev.
      Of course, real_dev always not be null.
      
      Test commands:
          ip netns add A
          ip netns exec A bash
          modprobe netdevsim
          echo "1 1" > /sys/bus/netdevsim/new_device
          ip link add bond0 type bond mode active-backup
          ip link set eth0 master bond0
          ip link set eth0 up
          ip link set bond0 up
          ip x s add proto esp dst 14.1.1.1 src 15.1.1.1 spi 0x07 mode \
      transport reqid 0x07 replay-window 32 aead 'rfc4106(gcm(aes))' \
      0x44434241343332312423222114131211f4f3f2f1 128 sel src 14.0.0.52/24 \
      dst 14.0.0.70/24 proto tcp offload dev bond0 dir in
      
      Splat looks like:
      BUG: spinlock bad magic on CPU#5, kworker/5:1/53
       lock: 0xffff8881068c2cc8, .magic: 11121314, .owner: <none>/-1,
      .owner_cpu: -235736076
      CPU: 5 PID: 53 Comm: kworker/5:1 Not tainted 5.13.0-rc3+ #1168
      Workqueue: events linkwatch_event
      Call Trace:
       dump_stack+0xa4/0xe5
       do_raw_spin_lock+0x20b/0x270
       ? rwlock_bug.part.1+0x90/0x90
       _raw_spin_lock_nested+0x5f/0x70
       bond_get_stats+0xe4/0x4c0 [bonding]
       ? rcu_read_lock_sched_held+0xc0/0xc0
       ? bond_neigh_init+0x2c0/0x2c0 [bonding]
       ? dev_get_alias+0xe2/0x190
       ? dev_get_port_parent_id+0x14a/0x360
       ? rtnl_unregister+0x190/0x190
       ? dev_get_phys_port_name+0xa0/0xa0
       ? memset+0x1f/0x40
       ? memcpy+0x38/0x60
       ? rtnl_phys_switch_id_fill+0x91/0x100
       dev_get_stats+0x8c/0x270
       rtnl_fill_stats+0x44/0xbe0
       ? nla_put+0xbe/0x140
       rtnl_fill_ifinfo+0x1054/0x3ad0
      [ ... ]
      
      Fixes: 272c2330 ("xfrm: bail early on slave pass over skb")
      Signed-off-by: NTaehee Yoo <ap420073@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      09adf756
    • T
      bonding: fix null dereference in bond_ipsec_add_sa() · 105cd17a
      Taehee Yoo 提交于
      If bond doesn't have real device, bond->curr_active_slave is null.
      But bond_ipsec_add_sa() dereferences bond->curr_active_slave without
      null checking.
      So, null-ptr-deref would occur.
      
      Test commands:
          ip link add bond0 type bond
          ip link set bond0 up
          ip x s add proto esp dst 14.1.1.1 src 15.1.1.1 spi \
      0x07 mode transport reqid 0x07 replay-window 32 aead 'rfc4106(gcm(aes))' \
      0x44434241343332312423222114131211f4f3f2f1 128 sel src 14.0.0.52/24 \
      dst 14.0.0.70/24 proto tcp offload dev bond0 dir in
      
      Splat looks like:
      KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
      CPU: 4 PID: 680 Comm: ip Not tainted 5.13.0-rc3+ #1168
      RIP: 0010:bond_ipsec_add_sa+0xc4/0x2e0 [bonding]
      Code: 85 21 02 00 00 4d 8b a6 48 0c 00 00 e8 75 58 44 ce 85 c0 0f 85 14
      01 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 e2 48 c1 ea 03 <80> 3c 02
      00 0f 85 fc 01 00 00 48 8d bb e0 02 00 00 4d 8b 2c 24 48
      RSP: 0018:ffff88810946f508 EFLAGS: 00010246
      RAX: dffffc0000000000 RBX: ffff88810b4e8040 RCX: 0000000000000001
      RDX: 0000000000000000 RSI: ffffffff8fe34280 RDI: ffff888115abe100
      RBP: ffff88810946f528 R08: 0000000000000003 R09: fffffbfff2287e11
      R10: 0000000000000001 R11: ffff888115abe0c8 R12: 0000000000000000
      R13: ffffffffc0aea9a0 R14: ffff88800d7d2000 R15: ffff88810b4e8330
      FS:  00007efc5552e680(0000) GS:ffff888119c00000(0000)
      knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 000055c2530dbf40 CR3: 0000000103056004 CR4: 00000000003706e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       xfrm_dev_state_add+0x2a9/0x770
       ? memcpy+0x38/0x60
       xfrm_add_sa+0x2278/0x3b10 [xfrm_user]
       ? xfrm_get_policy+0xaa0/0xaa0 [xfrm_user]
       ? register_lock_class+0x1750/0x1750
       xfrm_user_rcv_msg+0x331/0x660 [xfrm_user]
       ? rcu_read_lock_sched_held+0x91/0xc0
       ? xfrm_user_state_lookup.constprop.39+0x320/0x320 [xfrm_user]
       ? find_held_lock+0x3a/0x1c0
       ? mutex_lock_io_nested+0x1210/0x1210
       ? sched_clock_cpu+0x18/0x170
       netlink_rcv_skb+0x121/0x350
       ? xfrm_user_state_lookup.constprop.39+0x320/0x320 [xfrm_user]
       ? netlink_ack+0x9d0/0x9d0
       ? netlink_deliver_tap+0x17c/0xa50
       xfrm_netlink_rcv+0x68/0x80 [xfrm_user]
       netlink_unicast+0x41c/0x610
       ? netlink_attachskb+0x710/0x710
       netlink_sendmsg+0x6b9/0xb70
      [ ...]
      
      Fixes: 18cb261a ("bonding: support hardware encryption offload to slaves")
      Signed-off-by: NTaehee Yoo <ap420073@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      105cd17a
    • T
      bonding: fix suspicious RCU usage in bond_ipsec_add_sa() · b648eba4
      Taehee Yoo 提交于
      To dereference bond->curr_active_slave, it uses rcu_dereference().
      But it and the caller doesn't acquire RCU so a warning occurs.
      So add rcu_read_lock().
      
      Test commands:
          ip link add dummy0 type dummy
          ip link add bond0 type bond
          ip link set dummy0 master bond0
          ip link set dummy0 up
          ip link set bond0 up
          ip x s add proto esp dst 14.1.1.1 src 15.1.1.1 spi 0x07 \
      	    mode transport \
      	    reqid 0x07 replay-window 32 aead 'rfc4106(gcm(aes))' \
      	    0x44434241343332312423222114131211f4f3f2f1 128 sel \
      	    src 14.0.0.52/24 dst 14.0.0.70/24 proto tcp offload \
      	    dev bond0 dir in
      
      Splat looks like:
      =============================
      WARNING: suspicious RCU usage
      5.13.0-rc3+ #1168 Not tainted
      -----------------------------
      drivers/net/bonding/bond_main.c:411 suspicious rcu_dereference_check() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 2, debug_locks = 1
      1 lock held by ip/684:
       #0: ffffffff9a2757c0 (&net->xfrm.xfrm_cfg_mutex){+.+.}-{3:3},
      at: xfrm_netlink_rcv+0x59/0x80 [xfrm_user]
         55.191733][  T684] stack backtrace:
      CPU: 0 PID: 684 Comm: ip Not tainted 5.13.0-rc3+ #1168
      Call Trace:
       dump_stack+0xa4/0xe5
       bond_ipsec_add_sa+0x18c/0x1f0 [bonding]
       xfrm_dev_state_add+0x2a9/0x770
       ? memcpy+0x38/0x60
       xfrm_add_sa+0x2278/0x3b10 [xfrm_user]
       ? xfrm_get_policy+0xaa0/0xaa0 [xfrm_user]
       ? register_lock_class+0x1750/0x1750
       xfrm_user_rcv_msg+0x331/0x660 [xfrm_user]
       ? rcu_read_lock_sched_held+0x91/0xc0
       ? xfrm_user_state_lookup.constprop.39+0x320/0x320 [xfrm_user]
       ? find_held_lock+0x3a/0x1c0
       ? mutex_lock_io_nested+0x1210/0x1210
       ? sched_clock_cpu+0x18/0x170
       netlink_rcv_skb+0x121/0x350
       ? xfrm_user_state_lookup.constprop.39+0x320/0x320 [xfrm_user]
       ? netlink_ack+0x9d0/0x9d0
       ? netlink_deliver_tap+0x17c/0xa50
       xfrm_netlink_rcv+0x68/0x80 [xfrm_user]
       netlink_unicast+0x41c/0x610
       ? netlink_attachskb+0x710/0x710
       netlink_sendmsg+0x6b9/0xb70
      [ ... ]
      
      Fixes: 18cb261a ("bonding: support hardware encryption offload to slaves")
      Signed-off-by: NTaehee Yoo <ap420073@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b648eba4
    • N
      tcp: fix tcp_init_transfer() to not reset icsk_ca_initialized · be5d1b61
      Nguyen Dinh Phi 提交于
      This commit fixes a bug (found by syzkaller) that could cause spurious
      double-initializations for congestion control modules, which could cause
      memory leaks or other problems for congestion control modules (like CDG)
      that allocate memory in their init functions.
      
      The buggy scenario constructed by syzkaller was something like:
      
      (1) create a TCP socket
      (2) initiate a TFO connect via sendto()
      (3) while socket is in TCP_SYN_SENT, call setsockopt(TCP_CONGESTION),
          which calls:
             tcp_set_congestion_control() ->
               tcp_reinit_congestion_control() ->
                 tcp_init_congestion_control()
      (4) receive ACK, connection is established, call tcp_init_transfer(),
          set icsk_ca_initialized=0 (without first calling cc->release()),
          call tcp_init_congestion_control() again.
      
      Note that in this sequence tcp_init_congestion_control() is called
      twice without a cc->release() call in between. Thus, for CC modules
      that allocate memory in their init() function, e.g, CDG, a memory leak
      may occur. The syzkaller tool managed to find a reproducer that
      triggered such a leak in CDG.
      
      The bug was introduced when that commit 8919a9b3 ("tcp: Only init
      congestion control if not initialized already")
      introduced icsk_ca_initialized and set icsk_ca_initialized to 0 in
      tcp_init_transfer(), missing the possibility for a sequence like the
      one above, where a process could call setsockopt(TCP_CONGESTION) in
      state TCP_SYN_SENT (i.e. after the connect() or TFO open sendmsg()),
      which would call tcp_init_congestion_control(). It did not intend to
      reset any initialization that the user had already explicitly made;
      it just missed the possibility of that particular sequence (which
      syzkaller managed to find).
      
      Fixes: 8919a9b3 ("tcp: Only init congestion control if not initialized already")
      Reported-by: syzbot+f1e24a0594d4e3a895d3@syzkaller.appspotmail.com
      Signed-off-by: NNguyen Dinh Phi <phind.uet@gmail.com>
      Acked-by: NNeal Cardwell <ncardwell@google.com>
      Tested-by: NNeal Cardwell <ncardwell@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      be5d1b61
    • P
      skbuff: Release nfct refcount on napi stolen or re-used skbs · 8550ff8d
      Paul Blakey 提交于
      When multiple SKBs are merged to a new skb under napi GRO,
      or SKB is re-used by napi, if nfct was set for them in the
      driver, it will not be released while freeing their stolen
      head state or on re-use.
      
      Release nfct on napi's stolen or re-used SKBs, and
      in gro_list_prepare, check conntrack metadata diff.
      
      Fixes: 5c6b9460 ("net/mlx5e: CT: Handle misses after executing CT action")
      Reviewed-by: NRoi Dayan <roid@nvidia.com>
      Signed-off-by: NPaul Blakey <paulb@nvidia.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8550ff8d
  4. 06 7月, 2021 10 次提交