1. 17 Jul, 2021 6 commits
  2. 16 Jul, 2021 1 commit
    • R
      bus: mhi: pci-generic: configurable network interface MRU · 5c2c8531
      Richard Laing authored
      The MRU value used by the MHI MBIM network interface affects
      the throughput performance of the interface. Different modem
      models use different default MRU sizes based on their bandwidth
      capabilities. Large values generally result in higher throughput
      for larger packet sizes.
      
      In addition, if the MRU used by the MHI device is larger than that
      specified in the MHI net device, the data is fragmented and needs
      to be re-assembled, which generates a (single) warning message about
      the fragmented packets. Setting the MRU on both ends avoids the
      extra processing to re-assemble the packets.
      
      This patch allows the documented MRU for a modem to be automatically
      set as the MHI net device MRU avoiding fragmentation and improving
      throughput performance.
      Signed-off-by: Richard Laing <richard.laing@alliedtelesis.co.nz>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5c2c8531
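The fragmentation effect described in the commit above can be illustrated with a small sketch. It is not the driver's code: `mru_fragments` is a hypothetical helper, and the sizes in the usage note are illustrative values only.

```c
#include <stddef.h>

/*
 * Hypothetical helper: number of receive buffers of size net_mru needed
 * to hold one device transfer of size dev_mru. A result above 1 means
 * the data arrives fragmented and has to be re-assembled.
 */
static size_t mru_fragments(size_t dev_mru, size_t net_mru)
{
	if (net_mru == 0)
		return 0; /* invalid configuration, nothing can be received */
	return (dev_mru + net_mru - 1) / net_mru; /* ceiling division */
}
```

With an assumed device MRU of 32768 bytes and a net device MRU of 3500 bytes, one transfer would span 10 buffers; with both ends set to 32768 it fits in one.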
  3. 14 Jul, 2021 6 commits
    • Í
      sfc: add logs explaining XDP_TX/REDIRECT is not available · d2a16bde
      Íñigo Huguet authored
      If it's not possible to allocate enough channels for XDP, XDP_TX and
      XDP_REDIRECT don't work. However, the only message shown said that not
      enough channels were available, without stating the consequences.
      Users could not tell whether they could use XDP at all, whether
      performance was reduced, or what else to expect.
      Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d2a16bde
    • Í
      sfc: ensure correct number of XDP queues · 788bc000
      Íñigo Huguet authored
      Commit 99ba0ea6 ("sfc: adjust efx->xdp_tx_queue_count with the real
      number of initialized queues") intended to fix a problem caused by a
      round up when calculating the number of XDP channels and queues.
      However, this was not the real problem. The real problem was that the
      number of XDP TX queues had been reduced to half in
      commit e26ca4b5 ("sfc: reduce the number of requested xdp ev queues"),
      but the variable xdp_tx_queue_count had remained the same.
      
      Once the correct number of XDP TX queues is created again in the
      previous patch of this series, this can also be reverted since the
      error doesn't actually exist.

      Only a bug in the code could make xdp_queue_number and
      efx->xdp_tx_queue_count differ. Because of this, and per Edward Cree's
      suggestion, I add a WARN_ON instead to catch it if it happens again in
      the future.
      
      Note that the number of allocated queues can be higher than the number
      of used ones due to the round up, as explained in the existing comment
      in the code. That's why we also have to stop increasing xdp_queue_number
      beyond efx->xdp_tx_queue_count.
      Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      788bc000
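A simplified model of the check described in the commit above: more queues may be allocated than are used because of the round up, so the counter must stop at the wanted number, and a mismatch afterwards can only come from a bug. `assign_xdp_queues` and its parameters are hypothetical, and the `fprintf` call merely stands in for the kernel's WARN_ON.

```c
#include <stdio.h>

/*
 * 'allocated' queues exist (possibly rounded up), but only 'wanted' of
 * them (one per CPU) should be put to use.
 */
static unsigned int assign_xdp_queues(unsigned int allocated,
				      unsigned int wanted)
{
	unsigned int xdp_queue_number = 0;

	for (unsigned int q = 0; q < allocated; q++) {
		/* Stop counting once every wanted queue has been assigned. */
		if (xdp_queue_number < wanted)
			xdp_queue_number++;
	}

	/* Stand-in for WARN_ON(): reached only if too few queues exist. */
	if (xdp_queue_number != wanted)
		fprintf(stderr, "WARN: %u queues assigned, %u expected\n",
			xdp_queue_number, wanted);

	return xdp_queue_number;
}
```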
    • Í
      sfc: fix lack of XDP TX queues - error XDP TX failed (-22) · f28100cb
      Íñigo Huguet authored
      Fixes: e26ca4b5 ("sfc: reduce the number of requested xdp ev queues")

      The buggy commit intended to allocate fewer channels for XDP in order
      to make it less likely to reach the driver's limit of 32 channels.

      The idea was to use each IRQ/event queue for more XDP TX queues than
      before, calculating the maximum number of TX queues that one event
      queue can handle. For example, in EF10 each event queue could handle
      up to 8 queues, better than the 4 they were handling before the
      change. This way, it would have to allocate half as many channels as
      before for XDP TX.
      
      The problem is that the TX queues are also contained inside the channel
      structs, and there are only 4 queues per channel. Reducing the number of
      channels means also reducing the number of queues, resulting in not
      having the desired number of 1 queue per CPU.
      
      This leads to errors on XDP_TX and XDP_REDIRECT if they're executed
      on a high-numbered CPU, because queues actually exist only for the
      lower half of the CPUs. If XDP_TX/REDIRECT is executed on a
      low-numbered CPU, the error doesn't happen. This is the error in the
      logs (repeated many times, even though rate limited):
      sfc 0000:5e:00.0 ens3f0np0: XDP TX failed (-22)

      This error happens in the function efx_xdp_tx_buffers, which expects
      to have a dedicated XDP TX queue per CPU.

      Reverting the change again makes it more likely to reach the limit of
      32 channels on machines with many CPUs. If this happens, no
      XDP_TX/REDIRECT will be possible at all, and we will get these error
      messages in the log:
      
      At interface probe:
      sfc 0000:5e:00.0: Insufficient resources for 12 XDP event queues (24 other channels, max 32)
      
      At every subsequent XDP_TX/REDIRECT failure, rate limited:
      sfc 0000:5e:00.0 ens3f0np0: XDP TX failed (-22)
      
      However, without reverting the change, the user is led to think that
      everything is OK at probe time, but later it fails in an unpredictable
      way, depending on the CPU that handles the packet.

      It is better to restore the predictable behaviour. If users see the
      error message at probe time, they can try to configure the driver in
      the way that best fits their needs. They will have at least 2 options:
      - Accept that XDP_TX/REDIRECT is not available (they may not need it)
      - Load the sfc module with modparam 'rss_cpus' set to a lower number,
        thus creating fewer normal RX queues/channels and leaving more free
        resources for XDP, with some performance penalty.

      Anyway, keep the calculation of the maximum number of TX queues that
      can be handled by a single event queue, and use it only if it's less
      than the number of TX queues per channel. This doesn't happen in
      practice, but could happen if some constant values are tweaked in the
      future, such as EFX_MAX_TXQ_PER_CHANNEL, EFX_MAX_EVQ_SIZE or
      EFX_MAX_DMAQ_SIZE.

      Related mailing list thread:
      https://lore.kernel.org/bpf/20201215104327.2be76156@carbon/

      Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f28100cb
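The arithmetic behind the commit above can be sketched as follows. This is a model, not the sfc driver's code: the helper names are hypothetical, the constants 4 and 32 come from the commit text, and the 48-CPU machine in the usage note is an assumption chosen to match the "12 XDP event queues (24 other channels, max 32)" log line.

```c
#define TXQ_PER_CHANNEL 4   /* TX queues held by one channel struct */
#define MAX_CHANNELS    32  /* channel limit of the driver */

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Channels requested for XDP when one event queue is assumed to serve
 * tx_per_ev TX queues and one TX queue per CPU is wanted.
 */
static unsigned int xdp_channels(unsigned int n_cpus, unsigned int tx_per_ev)
{
	return DIV_ROUND_UP(n_cpus, tx_per_ev);
}

/* TX queues that really exist for that many channels. */
static unsigned int xdp_tx_queues(unsigned int n_channels)
{
	return n_channels * TXQ_PER_CHANNEL;
}

/* The fix: never assume more queues per event queue than per channel. */
static unsigned int clamp_tx_per_ev(unsigned int tx_per_ev)
{
	return tx_per_ev < TXQ_PER_CHANNEL ? tx_per_ev : TXQ_PER_CHANNEL;
}
```

For 48 CPUs, assuming 8 queues per event queue yields 6 channels and therefore only 24 TX queues, so CPUs 24 to 47 have none. Clamping to 4 yields 12 channels and 48 queues, but 24 + 12 channels then exceeds the limit of 32, which is the probe-time error shown above.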
    • P
      net: fddi: fix UAF in fza_probe · deb7178e
      Pavel Skripkin authored
      fp is the netdev private data, so it cannot be used after the
      free_netdev() call. Using fp after free_netdev() can cause a
      use-after-free (UAF) bug. Fix it by moving free_netdev() after the
      error message.

      Fixes: 61414f5e ("FDDI: defza: Add support for DEC FDDIcontroller 700
      TURBOchannel adapter")
      Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      deb7178e
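The ordering rule from the commit above, shown as a userspace sketch. `struct fake_netdev`, `fake_alloc` and `probe_fail` are hypothetical stand-ins; `free()` plays the role of free_netdev().

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for a net_device together with its private data. */
struct fake_netdev {
	char name[16];
};

static struct fake_netdev *fake_alloc(const char *name)
{
	struct fake_netdev *dev = calloc(1, sizeof(*dev));

	if (dev)
		snprintf(dev->name, sizeof(dev->name), "%s", name);
	return dev;
}

/* Error path: build the message first, free the device last. */
static void probe_fail(struct fake_netdev *dev, char *msg, size_t len)
{
	snprintf(msg, len, "%s: probe failed", dev->name); /* dev still valid */
	free(dev);                       /* nothing may read dev after this */
}
```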
    • V
      net: dsa: sja1105: fix address learning getting disabled on the CPU port · b0b33b04
      Vladimir Oltean authored
      In May 2019 when commit 640f763f ("net: dsa: sja1105: Add support
      for Spanning Tree Protocol") was introduced, the comment that "STP does
      not get called for the CPU port" was true. This changed after commit
      0394a63a ("net: dsa: enable and disable all ports") in August 2019
      and went largely unnoticed, because the sja1105_bridge_stp_state_set()
      method did nothing different compared to the static setup done by
      sja1105_init_mac_settings().
      
      With the ability to turn address learning off introduced by the blamed
      commit, there is a new priv->learn_ena port mask in the driver. When
      sja1105_bridge_stp_state_set() gets called and we are in
      BR_STATE_LEARNING or later, address learning is enabled or not depending
      on priv->learn_ena & BIT(port).
      
      So what happens is that priv->learn_ena is not being set from anywhere
      for the CPU port, and the static configuration done by
      sja1105_init_mac_settings() is being overwritten.
      
      To solve this, acknowledge that the static configuration of STP state is
      no longer necessary because the STP state is being set by the DSA core
      now, but what is necessary is to set priv->learn_ena for the CPU port.
      
      Fixes: 4d942354 ("net: dsa: sja1105: offload bridge port flags to device")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b0b33b04
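A minimal model of the port mask logic described in the commit above. Only the `learn_ena` field name and the `BIT(port)` test come from the text; the struct, the enum and the helper names are hypothetical.

```c
#include <stdbool.h>

#define BIT(n) (1UL << (n))

struct fake_priv {
	unsigned long learn_ena; /* one bit per port */
};

/* Hypothetical subset of STP states, in no particular numeric order. */
enum fake_stp_state { ST_BLOCKING, ST_LEARNING, ST_FORWARDING };

/* Learning applies in LEARNING or FORWARDING, and only if the bit is set. */
static bool port_learns(const struct fake_priv *priv, int port,
			enum fake_stp_state state)
{
	if (state != ST_LEARNING && state != ST_FORWARDING)
		return false;
	return (priv->learn_ena & BIT(port)) != 0;
}

/* The fix: the CPU port must have its bit set during setup. */
static void setup_cpu_port(struct fake_priv *priv, int cpu_port)
{
	priv->learn_ena |= BIT(cpu_port);
}
```

Without the call to `setup_cpu_port`, the CPU port never learns addresses, whatever its STP state.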
    • V
      net: ocelot: fix switchdev objects synced for wrong netdev with LAG offload · e56c6bbd
      Vladimir Oltean authored
      The point with a *dev and a *brport_dev is that when we have a LAG net
      device that is a bridge port, *dev is an ocelot net device and
      *brport_dev is the bonding/team net device. The ocelot net device
      beneath the LAG does not exist from the bridge's perspective, so we need
      to sync the switchdev objects belonging to the brport_dev and not to the
      dev.
      
      Fixes: e4bd44e8 ("net: ocelot: replay switchdev events when joining bridge")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e56c6bbd
  4. 13 Jul, 2021 2 commits
  5. 12 Jul, 2021 2 commits
  6. 11 Jul, 2021 1 commit
  7. 10 Jul, 2021 3 commits
  8. 09 Jul, 2021 5 commits
  9. 08 Jul, 2021 3 commits
  10. 07 Jul, 2021 10 commits
    • C
      octeontx2-pf: Fix assigned error return value that is never used · ad1f3797
      Colin Ian King authored
      Currently, when the call to otx2_mbox_alloc_msg_cgx_mac_addr_update
      fails, the error return variable rc is assigned -ENOMEM but the
      function does not return early. rc is then re-assigned and the error
      case is not handled correctly. Fix this by returning -ENOMEM rather
      than assigning rc.
      
      Addresses-Coverity: ("Unused value")
      Fixes: 79d2be38 ("octeontx2-pf: offload DMAC filters to CGX/RPM block")
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ad1f3797
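The bug pattern from the commit above, reduced to a sketch. `alloc_msg` and `send_msg` are hypothetical stand-ins for the mailbox helpers; only the "assign versus return" difference mirrors the commit.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical allocator: returns NULL when 'fail' is non-zero. */
static void *alloc_msg(int fail)
{
	static int msg;

	return fail ? NULL : &msg;
}

/* Hypothetical follow-up step that always succeeds. */
static int send_msg(void *req)
{
	(void)req;
	return 0;
}

static int update_mac_buggy(int fail)
{
	int rc = 0;
	void *req = alloc_msg(fail);

	if (!req)
		rc = -ENOMEM;   /* assigned, but execution continues */
	rc = send_msg(req);     /* overwrites the error value */
	return rc;
}

static int update_mac_fixed(int fail)
{
	void *req = alloc_msg(fail);

	if (!req)
		return -ENOMEM; /* return early, the error is preserved */
	return send_msg(req);
}
```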
    • T
      bonding: fix incorrect return value of bond_ipsec_offload_ok() · 168e696a
      Taehee Yoo authored
      bond_ipsec_offload_ok() is called to check whether the interface
      supports ipsec offload or not.
      A bonding interface supports ipsec offload only in active-backup mode.
      So, if a bond interface is not in active-backup mode, the function
      should return false, but it returns true.
      
      Fixes: a3b658cf ("bonding: allow xfrm offload setup post-module-load")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      168e696a
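The intended return values from the commit above, as a small model. The struct, the enum and the `slave_supports_offload` flag are hypothetical; the sketch only shows the mode check.

```c
#include <stdbool.h>

enum fake_bond_mode { MODE_ROUND_ROBIN, MODE_ACTIVE_BACKUP };

struct fake_bond {
	enum fake_bond_mode mode;
	bool slave_supports_offload; /* assumed capability of the active slave */
};

static bool ipsec_offload_ok(const struct fake_bond *bond)
{
	/* Offload is only supported in active-backup mode. */
	if (bond->mode != MODE_ACTIVE_BACKUP)
		return false; /* the buggy code returned true here */

	return bond->slave_supports_offload;
}
```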
    • T
      bonding: fix suspicious RCU usage in bond_ipsec_offload_ok() · 955b785e
      Taehee Yoo authored
      bond->curr_active_slave is dereferenced with rcu_dereference().
      But neither this function nor its caller holds the RCU read lock, so
      a warning occurs.
      So add rcu_read_lock().
      
      Splat looks like:
      WARNING: suspicious RCU usage
      5.13.0-rc6+ #1179 Not tainted
      drivers/net/bonding/bond_main.c:571 suspicious
      rcu_dereference_check() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 2, debug_locks = 1
      1 lock held by ping/974:
       #0: ffff888109e7db70 (sk_lock-AF_INET){+.+.}-{0:0},
      at: raw_sendmsg+0x1303/0x2cb0
      
      stack backtrace:
      CPU: 2 PID: 974 Comm: ping Not tainted 5.13.0-rc6+ #1179
      Call Trace:
       dump_stack+0xa4/0xe5
       bond_ipsec_offload_ok+0x1f4/0x260 [bonding]
       xfrm_output+0x179/0x890
       xfrm4_output+0xfa/0x410
       ? __xfrm4_output+0x4b0/0x4b0
       ? __ip_make_skb+0xecc/0x2030
       ? xfrm4_udp_encap_rcv+0x800/0x800
       ? ip_local_out+0x21/0x3a0
       ip_send_skb+0x37/0xa0
       raw_sendmsg+0x1bfd/0x2cb0
      
      Fixes: 18cb261a ("bonding: support hardware encryption offload to slaves")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      955b785e
    • T
      bonding: Add struct bond_ipsec to manage SA · 9a560550
      Taehee Yoo authored
      bonding supports ipsec offload.
      When an SA is added, bonding just passes the SA to its current active
      real interface, but it doesn't manage the SA itself.
      So, when events (adding/deleting a real interface, a change of the
      active real interface, etc.) occur, bonding can't handle them well.
      As a result, some problems (panic, UAF, refcnt leak) occur.

      To make it stable, bonding should manage SAs.
      That's the reason why struct bond_ipsec is added.
      When a new SA is added to a bonding interface, it is stored in the
      bond_ipsec list, and the SA is passed to the current active real
      interface.
      If events occur, bonding uses the bond_ipsec data to handle them.
      bond->ipsec_list is protected by bond->ipsec_lock.

      If the current active real interface is changed, the following logic
      applies.
      1. Delete all SAs from the old active real interface.
      2. Add all SAs to the new active real interface.
      3. If the new active real interface doesn't support ipsec offload or
      the SA's option, set real_dev to NULL.
      
      Fixes: 18cb261a ("bonding: support hardware encryption offload to slaves")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9a560550
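The three steps for an active interface change listed in the commit above can be modelled roughly as below. This is a sketch under simplifying assumptions: a fixed array replaces bond->ipsec_list, the bond->ipsec_lock locking is omitted, and all type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_SA 8

struct fake_dev {
	bool ipsec_offload; /* can this device offload SAs? */
	int n_sa;           /* SAs currently installed on it */
};

struct fake_sa {
	struct fake_dev *real_dev; /* device holding the SA, or NULL */
};

struct fake_bond {
	struct fake_sa sa[MAX_SA]; /* stand-in for bond->ipsec_list */
	size_t n_sa;
	struct fake_dev *active;
};

static void change_active(struct fake_bond *bond, struct fake_dev *new_active)
{
	for (size_t i = 0; i < bond->n_sa; i++) {
		struct fake_sa *sa = &bond->sa[i];

		/* 1. Delete the SA from the old active device. */
		if (sa->real_dev)
			sa->real_dev->n_sa--;

		if (new_active && new_active->ipsec_offload) {
			/* 2. Add the SA to the new active device. */
			new_active->n_sa++;
			sa->real_dev = new_active;
		} else {
			/* 3. No offload support: forget the device. */
			sa->real_dev = NULL;
		}
	}
	bond->active = new_active;
}
```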
    • T
      bonding: disallow setting nested bonding + ipsec offload · b1216933
      Taehee Yoo authored
      A bonding interface can be nested, and it supports ipsec offload.
      So, the nested bonding + ipsec offload scenario can currently be set
      up.
      But the code does not support this scenario.
      So, it should be disallowed.

      interface graph:
      bond2
         |
      bond1
         |
      eth0

      Nested bonding + ipsec offload is probably not a real use case.
      So, disallowing this scenario is fine.

      Fixes: 18cb261a ("bonding: support hardware encryption offload to slaves")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b1216933
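One way to picture the restriction from the commit above: reject the request when the device below the bond is itself a bond. The `is_bond_master` flag, the function name and the -EINVAL error code are assumptions made for illustration.

```c
#include <errno.h>
#include <stdbool.h>

struct fake_dev {
	bool is_bond_master; /* true for bond1 in the graph, false for eth0 */
};

/* Called with the device directly below the bond that receives the SA. */
static int ipsec_add_sa_check(const struct fake_dev *real_dev)
{
	if (real_dev->is_bond_master)
		return -EINVAL; /* nested bonding + ipsec offload: refuse */
	return 0;
}
```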
    • T
      bonding: fix suspicious RCU usage in bond_ipsec_del_sa() · a22c39b8
      Taehee Yoo authored
      bond->curr_active_slave is dereferenced with rcu_dereference().
      But neither this function nor its caller holds the RCU read lock, so
      a warning occurs.
      So add rcu_read_lock().
      
      Test commands:
          ip netns add A
          ip netns exec A bash
          modprobe netdevsim
          echo "1 1" > /sys/bus/netdevsim/new_device
          ip link add bond0 type bond
          ip link set eth0 master bond0
          ip link set eth0 up
          ip link set bond0 up
          ip x s add proto esp dst 14.1.1.1 src 15.1.1.1 spi 0x07 mode \
      transport reqid 0x07 replay-window 32 aead 'rfc4106(gcm(aes))' \
      0x44434241343332312423222114131211f4f3f2f1 128 sel src 14.0.0.52/24 \
      dst 14.0.0.70/24 proto tcp offload dev bond0 dir in
          ip x s f
      
      Splat looks like:
      =============================
      WARNING: suspicious RCU usage
      5.13.0-rc3+ #1168 Not tainted
      -----------------------------
      drivers/net/bonding/bond_main.c:448 suspicious rcu_dereference_check()
      usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 2, debug_locks = 1
      2 locks held by ip/705:
       #0: ffff888106701780 (&net->xfrm.xfrm_cfg_mutex){+.+.}-{3:3},
      at: xfrm_netlink_rcv+0x59/0x80 [xfrm_user]
       #1: ffff8880075b0098 (&x->lock){+.-.}-{2:2},
      at: xfrm_state_delete+0x16/0x30
      
      stack backtrace:
      CPU: 6 PID: 705 Comm: ip Not tainted 5.13.0-rc3+ #1168
      Call Trace:
       dump_stack+0xa4/0xe5
       bond_ipsec_del_sa+0x16a/0x1c0 [bonding]
       __xfrm_state_delete+0x51f/0x730
       xfrm_state_delete+0x1e/0x30
       xfrm_state_flush+0x22f/0x390
       xfrm_flush_sa+0xd8/0x260 [xfrm_user]
       ? xfrm_flush_policy+0x290/0x290 [xfrm_user]
       xfrm_user_rcv_msg+0x331/0x660 [xfrm_user]
       ? rcu_read_lock_sched_held+0x91/0xc0
       ? xfrm_user_state_lookup.constprop.39+0x320/0x320 [xfrm_user]
       ? find_held_lock+0x3a/0x1c0
       ? mutex_lock_io_nested+0x1210/0x1210
       ? sched_clock_cpu+0x18/0x170
       netlink_rcv_skb+0x121/0x350
      [ ... ]
      
      Fixes: 18cb261a ("bonding: support hardware encryption offload to slaves")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a22c39b8
    • T
      ixgbevf: use xso.real_dev instead of xso.dev in callback functions of struct xfrmdev_ops · 2de7e4f6
      Taehee Yoo authored
      There are two pointers in struct xfrm_state_offload, *dev and *real_dev.
      These are used in callback functions of struct xfrmdev_ops.
      *dev points to either the bonding interface or the real interface.
      If bonding ipsec offload is used, it points to the bonding interface;
      if not, it points to the real interface.
      real_dev always points to the real interface.
      So, ixgbevf should always use real_dev instead of dev.
      Of course, real_dev is never NULL.
      
      Test commands:
          ip link add bond0 type bond
          #eth0 is ixgbevf interface
          ip link set eth0 master bond0
          ip link set bond0 up
          ip x s add proto esp dst 14.1.1.1 src 15.1.1.1 spi 0x07 mode \
      transport reqid 0x07 replay-window 32 aead 'rfc4106(gcm(aes))' \
      0x44434241343332312423222114131211f4f3f2f1 128 sel src 14.0.0.52/24 \
      dst 14.0.0.70/24 proto tcp offload dev bond0 dir in
      
      Splat looks like:
      KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
      CPU: 6 PID: 688 Comm: ip Not tainted 5.13.0-rc3+ #1168
      RIP: 0010:ixgbevf_ipsec_find_empty_idx+0x28/0x1b0 [ixgbevf]
      Code: 00 00 0f 1f 44 00 00 55 53 48 89 fb 48 83 ec 08 40 84 f6 0f 84 9c
      00 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02
      84 c0 74 08 3c 01 0f 8e 4c 01 00 00 66 81 3b 00 04 0f
      RSP: 0018:ffff8880089af390 EFLAGS: 00010246
      RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000001
      RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
      RBP: ffff8880089af4f8 R08: 0000000000000003 R09: fffffbfff4287e11
      R10: 0000000000000001 R11: ffff888005de8908 R12: 0000000000000000
      R13: ffff88810936a000 R14: ffff88810936a000 R15: ffff888004d78040
      FS:  00007fdf9883a680(0000) GS:ffff88811a400000(0000)
      knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 000055bc14adbf40 CR3: 000000000b87c005 CR4: 00000000003706e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       ixgbevf_ipsec_add_sa+0x1bf/0x9c0 [ixgbevf]
       ? rcu_read_lock_sched_held+0x91/0xc0
       ? ixgbevf_ipsec_parse_proto_keys.isra.9+0x280/0x280 [ixgbevf]
       ? lock_acquire+0x191/0x720
       ? bond_ipsec_add_sa+0x48/0x350 [bonding]
       ? lockdep_hardirqs_on_prepare+0x3e0/0x3e0
       ? rcu_read_lock_held+0x91/0xa0
       ? rcu_read_lock_sched_held+0xc0/0xc0
       bond_ipsec_add_sa+0x193/0x350 [bonding]
       xfrm_dev_state_add+0x2a9/0x770
       ? memcpy+0x38/0x60
       xfrm_add_sa+0x2278/0x3b10 [xfrm_user]
       ? xfrm_get_policy+0xaa0/0xaa0 [xfrm_user]
       ? register_lock_class+0x1750/0x1750
       xfrm_user_rcv_msg+0x331/0x660 [xfrm_user]
       ? rcu_read_lock_sched_held+0x91/0xc0
       ? xfrm_user_state_lookup.constprop.39+0x320/0x320 [xfrm_user]
       ? find_held_lock+0x3a/0x1c0
       ? mutex_lock_io_nested+0x1210/0x1210
       ? sched_clock_cpu+0x18/0x170
       netlink_rcv_skb+0x121/0x350
      [ ... ]
      
      Fixes: 272c2330 ("xfrm: bail early on slave pass over skb")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2de7e4f6
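Why a driver callback has to read its private data through real_dev can be seen in this sketch. The structs are simplified stand-ins, not the kernel's definitions; only the `dev` and `real_dev` field names come from the commit text.

```c
#include <stddef.h>

struct fake_netdev {
	void *priv; /* driver private data; not the driver's own on a bond */
};

struct fake_xso {
	struct fake_netdev *dev;      /* bonding or real interface */
	struct fake_netdev *real_dev; /* always the real interface */
};

/* Driver callback helper: private data must come from real_dev. */
static void *driver_priv(const struct fake_xso *xso)
{
	return xso->real_dev->priv;
}
```

With bonding offload in use, `xso->dev` is the bond, so reading driver data through it would not yield the adapter the driver expects.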
    • T
      net: netdevsim: use xso.real_dev instead of xso.dev in callback functions of struct xfrmdev_ops · 09adf756
      Taehee Yoo authored
      There are two pointers in struct xfrm_state_offload, *dev and *real_dev.
      These are used in callback functions of struct xfrmdev_ops.
      *dev points to either the bonding interface or the real interface.
      If bonding ipsec offload is used, it points to the bonding interface;
      if not, it points to the real interface.
      real_dev always points to the real interface.
      So, netdevsim should always use real_dev instead of dev.
      Of course, real_dev is never NULL.
      
      Test commands:
          ip netns add A
          ip netns exec A bash
          modprobe netdevsim
          echo "1 1" > /sys/bus/netdevsim/new_device
          ip link add bond0 type bond mode active-backup
          ip link set eth0 master bond0
          ip link set eth0 up
          ip link set bond0 up
          ip x s add proto esp dst 14.1.1.1 src 15.1.1.1 spi 0x07 mode \
      transport reqid 0x07 replay-window 32 aead 'rfc4106(gcm(aes))' \
      0x44434241343332312423222114131211f4f3f2f1 128 sel src 14.0.0.52/24 \
      dst 14.0.0.70/24 proto tcp offload dev bond0 dir in
      
      Splat looks like:
      BUG: spinlock bad magic on CPU#5, kworker/5:1/53
       lock: 0xffff8881068c2cc8, .magic: 11121314, .owner: <none>/-1,
      .owner_cpu: -235736076
      CPU: 5 PID: 53 Comm: kworker/5:1 Not tainted 5.13.0-rc3+ #1168
      Workqueue: events linkwatch_event
      Call Trace:
       dump_stack+0xa4/0xe5
       do_raw_spin_lock+0x20b/0x270
       ? rwlock_bug.part.1+0x90/0x90
       _raw_spin_lock_nested+0x5f/0x70
       bond_get_stats+0xe4/0x4c0 [bonding]
       ? rcu_read_lock_sched_held+0xc0/0xc0
       ? bond_neigh_init+0x2c0/0x2c0 [bonding]
       ? dev_get_alias+0xe2/0x190
       ? dev_get_port_parent_id+0x14a/0x360
       ? rtnl_unregister+0x190/0x190
       ? dev_get_phys_port_name+0xa0/0xa0
       ? memset+0x1f/0x40
       ? memcpy+0x38/0x60
       ? rtnl_phys_switch_id_fill+0x91/0x100
       dev_get_stats+0x8c/0x270
       rtnl_fill_stats+0x44/0xbe0
       ? nla_put+0xbe/0x140
       rtnl_fill_ifinfo+0x1054/0x3ad0
      [ ... ]
      
      Fixes: 272c2330 ("xfrm: bail early on slave pass over skb")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      09adf756
    • T
      bonding: fix null dereference in bond_ipsec_add_sa() · 105cd17a
      Taehee Yoo authored
      If a bond doesn't have a real device, bond->curr_active_slave is NULL.
      But bond_ipsec_add_sa() dereferences bond->curr_active_slave without
      a NULL check.
      So, a null-ptr-deref occurs.
      
      Test commands:
          ip link add bond0 type bond
          ip link set bond0 up
          ip x s add proto esp dst 14.1.1.1 src 15.1.1.1 spi \
      0x07 mode transport reqid 0x07 replay-window 32 aead 'rfc4106(gcm(aes))' \
      0x44434241343332312423222114131211f4f3f2f1 128 sel src 14.0.0.52/24 \
      dst 14.0.0.70/24 proto tcp offload dev bond0 dir in
      
      Splat looks like:
      KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
      CPU: 4 PID: 680 Comm: ip Not tainted 5.13.0-rc3+ #1168
      RIP: 0010:bond_ipsec_add_sa+0xc4/0x2e0 [bonding]
      Code: 85 21 02 00 00 4d 8b a6 48 0c 00 00 e8 75 58 44 ce 85 c0 0f 85 14
      01 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 e2 48 c1 ea 03 <80> 3c 02
      00 0f 85 fc 01 00 00 48 8d bb e0 02 00 00 4d 8b 2c 24 48
      RSP: 0018:ffff88810946f508 EFLAGS: 00010246
      RAX: dffffc0000000000 RBX: ffff88810b4e8040 RCX: 0000000000000001
      RDX: 0000000000000000 RSI: ffffffff8fe34280 RDI: ffff888115abe100
      RBP: ffff88810946f528 R08: 0000000000000003 R09: fffffbfff2287e11
      R10: 0000000000000001 R11: ffff888115abe0c8 R12: 0000000000000000
      R13: ffffffffc0aea9a0 R14: ffff88800d7d2000 R15: ffff88810b4e8330
      FS:  00007efc5552e680(0000) GS:ffff888119c00000(0000)
      knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 000055c2530dbf40 CR3: 0000000103056004 CR4: 00000000003706e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       xfrm_dev_state_add+0x2a9/0x770
       ? memcpy+0x38/0x60
       xfrm_add_sa+0x2278/0x3b10 [xfrm_user]
       ? xfrm_get_policy+0xaa0/0xaa0 [xfrm_user]
       ? register_lock_class+0x1750/0x1750
       xfrm_user_rcv_msg+0x331/0x660 [xfrm_user]
       ? rcu_read_lock_sched_held+0x91/0xc0
       ? xfrm_user_state_lookup.constprop.39+0x320/0x320 [xfrm_user]
       ? find_held_lock+0x3a/0x1c0
       ? mutex_lock_io_nested+0x1210/0x1210
       ? sched_clock_cpu+0x18/0x170
       netlink_rcv_skb+0x121/0x350
       ? xfrm_user_state_lookup.constprop.39+0x320/0x320 [xfrm_user]
       ? netlink_ack+0x9d0/0x9d0
       ? netlink_deliver_tap+0x17c/0xa50
       xfrm_netlink_rcv+0x68/0x80 [xfrm_user]
       netlink_unicast+0x41c/0x610
       ? netlink_attachskb+0x710/0x710
       netlink_sendmsg+0x6b9/0xb70
      [ ... ]

      Fixes: 18cb261a ("bonding: support hardware encryption offload to slaves")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      105cd17a
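The missing check from the commit above, as a sketch. The struct layout, the function name and the -ENODEV error code are assumptions; only the `curr_active_slave` field name comes from the commit text.

```c
#include <errno.h>
#include <stddef.h>

struct fake_slave {
	int id;
};

struct fake_bond {
	struct fake_slave *curr_active_slave; /* NULL without a real device */
};

/* Stores the active slave's id in *id, or fails if there is no slave. */
static int active_slave_id(const struct fake_bond *bond, int *id)
{
	const struct fake_slave *slave = bond->curr_active_slave;

	if (!slave)
		return -ENODEV; /* check before any dereference */

	*id = slave->id;
	return 0;
}
```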
    • T
      bonding: fix suspicious RCU usage in bond_ipsec_add_sa() · b648eba4
      Taehee Yoo authored
      bond->curr_active_slave is dereferenced with rcu_dereference().
      But neither this function nor its caller holds the RCU read lock, so
      a warning occurs.
      So add rcu_read_lock().
      
      Test commands:
          ip link add dummy0 type dummy
          ip link add bond0 type bond
          ip link set dummy0 master bond0
          ip link set dummy0 up
          ip link set bond0 up
          ip x s add proto esp dst 14.1.1.1 src 15.1.1.1 spi 0x07 \
      	    mode transport \
      	    reqid 0x07 replay-window 32 aead 'rfc4106(gcm(aes))' \
      	    0x44434241343332312423222114131211f4f3f2f1 128 sel \
      	    src 14.0.0.52/24 dst 14.0.0.70/24 proto tcp offload \
      	    dev bond0 dir in
      
      Splat looks like:
      =============================
      WARNING: suspicious RCU usage
      5.13.0-rc3+ #1168 Not tainted
      -----------------------------
      drivers/net/bonding/bond_main.c:411 suspicious rcu_dereference_check() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 2, debug_locks = 1
      1 lock held by ip/684:
       #0: ffffffff9a2757c0 (&net->xfrm.xfrm_cfg_mutex){+.+.}-{3:3},
      at: xfrm_netlink_rcv+0x59/0x80 [xfrm_user]

      stack backtrace:
      CPU: 0 PID: 684 Comm: ip Not tainted 5.13.0-rc3+ #1168
      Call Trace:
       dump_stack+0xa4/0xe5
       bond_ipsec_add_sa+0x18c/0x1f0 [bonding]
       xfrm_dev_state_add+0x2a9/0x770
       ? memcpy+0x38/0x60
       xfrm_add_sa+0x2278/0x3b10 [xfrm_user]
       ? xfrm_get_policy+0xaa0/0xaa0 [xfrm_user]
       ? register_lock_class+0x1750/0x1750
       xfrm_user_rcv_msg+0x331/0x660 [xfrm_user]
       ? rcu_read_lock_sched_held+0x91/0xc0
       ? xfrm_user_state_lookup.constprop.39+0x320/0x320 [xfrm_user]
       ? find_held_lock+0x3a/0x1c0
       ? mutex_lock_io_nested+0x1210/0x1210
       ? sched_clock_cpu+0x18/0x170
       netlink_rcv_skb+0x121/0x350
       ? xfrm_user_state_lookup.constprop.39+0x320/0x320 [xfrm_user]
       ? netlink_ack+0x9d0/0x9d0
       ? netlink_deliver_tap+0x17c/0xa50
       xfrm_netlink_rcv+0x68/0x80 [xfrm_user]
       netlink_unicast+0x41c/0x610
       ? netlink_attachskb+0x710/0x710
       netlink_sendmsg+0x6b9/0xb70
      [ ... ]
      
      Fixes: 18cb261a ("bonding: support hardware encryption offload to slaves")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b648eba4
  11. 06 Jul, 2021 1 commit