1. 24 May, 2021 5 commits
    • net/sched: fq_pie: fix OOB access in the traffic path · e70f7a11
      Davide Caratti committed
      the following script:
      
        # tc qdisc add dev eth0 handle 0x1 root fq_pie flows 2
        # tc qdisc add dev eth0 clsact
        # tc filter add dev eth0 egress matchall action skbedit priority 0x10002
        # ping 192.0.2.2 -I eth0 -c2 -w1 -q
      
      produces the following splat:
      
       BUG: KASAN: slab-out-of-bounds in fq_pie_qdisc_enqueue+0x1314/0x19d0 [sch_fq_pie]
       Read of size 4 at addr ffff888171306924 by task ping/942
      
       CPU: 3 PID: 942 Comm: ping Not tainted 5.12.0+ #441
       Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014
       Call Trace:
        dump_stack+0x92/0xc1
        print_address_description.constprop.7+0x1a/0x150
        kasan_report.cold.13+0x7f/0x111
        fq_pie_qdisc_enqueue+0x1314/0x19d0 [sch_fq_pie]
        __dev_queue_xmit+0x1034/0x2b10
        ip_finish_output2+0xc62/0x2120
        __ip_finish_output+0x553/0xea0
        ip_output+0x1ca/0x4d0
        ip_send_skb+0x37/0xa0
        raw_sendmsg+0x1c4b/0x2d00
        sock_sendmsg+0xdb/0x110
        __sys_sendto+0x1d7/0x2b0
        __x64_sys_sendto+0xdd/0x1b0
        do_syscall_64+0x3c/0x80
        entry_SYSCALL_64_after_hwframe+0x44/0xae
       RIP: 0033:0x7fe69735c3eb
       Code: 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 f3 0f 1e fa 48 8d 05 75 42 2c 00 41 89 ca 8b 00 85 c0 75 14 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 75 c3 0f 1f 40 00 41 57 4d 89 c7 41 56 41 89
       RSP: 002b:00007fff06d7fb38 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
       RAX: ffffffffffffffda RBX: 000055e961413700 RCX: 00007fe69735c3eb
       RDX: 0000000000000040 RSI: 000055e961413700 RDI: 0000000000000003
       RBP: 0000000000000040 R08: 000055e961410500 R09: 0000000000000010
       R10: 0000000000000000 R11: 0000000000000246 R12: 00007fff06d81260
       R13: 00007fff06d7fb40 R14: 00007fff06d7fc30 R15: 000055e96140f0a0
      
       Allocated by task 917:
        kasan_save_stack+0x19/0x40
        __kasan_kmalloc+0x7f/0xa0
        __kmalloc_node+0x139/0x280
        fq_pie_init+0x555/0x8e8 [sch_fq_pie]
        qdisc_create+0x407/0x11b0
        tc_modify_qdisc+0x3c2/0x17e0
        rtnetlink_rcv_msg+0x346/0x8e0
        netlink_rcv_skb+0x120/0x380
        netlink_unicast+0x439/0x630
        netlink_sendmsg+0x719/0xbf0
        sock_sendmsg+0xe2/0x110
        ____sys_sendmsg+0x5ba/0x890
        ___sys_sendmsg+0xe9/0x160
        __sys_sendmsg+0xd3/0x170
        do_syscall_64+0x3c/0x80
        entry_SYSCALL_64_after_hwframe+0x44/0xae
      
       The buggy address belongs to the object at ffff888171306800
        which belongs to the cache kmalloc-256 of size 256
       The buggy address is located 36 bytes to the right of
        256-byte region [ffff888171306800, ffff888171306900)
       The buggy address belongs to the page:
       page:00000000bcfb624e refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x171306
       head:00000000bcfb624e order:1 compound_mapcount:0
       flags: 0x17ffffc0010200(slab|head|node=0|zone=2|lastcpupid=0x1fffff)
       raw: 0017ffffc0010200 dead000000000100 dead000000000122 ffff888100042b40
       raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
       page dumped because: kasan: bad access detected
      
       Memory state around the buggy address:
        ffff888171306800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        ffff888171306880: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc
       >ffff888171306900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
                                      ^
        ffff888171306980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
        ffff888171306a00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      
      fix fq_pie traffic path to avoid selecting 'q->flows + q->flows_cnt' as a
      valid flow: it's an address beyond the allocated memory.
      
      Fixes: ec97ecf1 ("net: sched: add Flow Queue PIE packet scheduler")
      CC: stable@vger.kernel.org
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
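      The bounds problem described in the fq_pie commit above can be sketched in
      plain C. This is a minimal userspace illustration, not the actual patch:
      select_flow_index(), HANDLE_MAJ() and HANDLE_MIN() are hypothetical names
      (the macros stand in for the kernel's TC_H_MAJ()/TC_H_MIN() helpers), and it
      assumes class minors are 1-based, so minor N maps to array index N - 1.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the kernel's TC_H_MAJ()/TC_H_MIN() handle helpers. */
#define HANDLE_MAJ(h) ((h) & 0xFFFF0000U)
#define HANDLE_MIN(h) ((h) & 0x0000FFFFU)

/*
 * Hypothetical helper: map a packet priority to a zero-based flow index.
 * Returns -1 when the priority does not name a valid class of this qdisc.
 * Minors are assumed to be 1-based, so minor == flows_cnt is the last
 * valid flow and maps to index flows_cnt - 1, never to flows_cnt.
 */
static int select_flow_index(uint32_t priority, uint32_t qdisc_handle,
                             uint32_t flows_cnt)
{
    uint32_t minor = HANDLE_MIN(priority);

    assert(flows_cnt > 0);
    if (HANDLE_MAJ(priority) != qdisc_handle)
        return -1;
    if (minor == 0 || minor > flows_cnt)
        return -1;
    return (int)(minor - 1);
}
```

      With the reproducer's values (handle 0x1, i.e. 0x10000, two flows, priority
      0x10002) the sketch selects index 1, the last element, instead of the
      address one past the end of the array.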
    • net/sched: fq_pie: re-factor fix for fq_pie endless loop · 3a62fed2
      Davide Caratti committed
      the patch that fixed an endless loop in fq_pie_init() was not considering
      that 65535 is a valid class id. The correct bugfix for this infinite loop
      is to change 'idx' to become a u32, like Colin proposed in the past [1].
      
      Fix this as follows:
       - restore 65536 as the maximum possible value of 'flows_cnt'
       - use u32 'idx' when iterating on 'q->flows'
       - fix the TDC selftest
      
      This reverts commit bb2f930d.
      
      [1] https://lore.kernel.org/netdev/20210407163808.499027-1-colin.king@canonical.com/
      
      CC: Colin Ian King <colin.king@canonical.com>
      CC: stable@vger.kernel.org
      Fixes: bb2f930d ("net/sched: fix infinite loop in sch_fq_pie")
      Fixes: ec97ecf1 ("net: sched: add Flow Queue PIE packet scheduler")
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
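      The reason a 16-bit loop index cannot terminate when 'flows_cnt' is 65536
      can be shown with a small userspace sketch. The function names below are
      hypothetical and the code only illustrates the integer-width argument from
      the commit message above; it is not taken from sch_fq_pie.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Iterate with a 32-bit index, as the commit proposes. The loop ends
 * after exactly flows_cnt iterations, including for flows_cnt == 65536.
 */
static uint32_t count_flows_u32(uint32_t flows_cnt)
{
    uint32_t visited = 0;

    assert(flows_cnt <= 65536); /* maximum stated in the commit message */
    for (uint32_t idx = 0; idx < flows_cnt; idx++)
        visited++;
    return visited;
}

/*
 * A 16-bit index wraps from 65535 back to 0, so the condition
 * "idx < 65536" would stay true forever with a u16 index.
 */
static uint16_t next_u16(uint16_t idx)
{
    return (uint16_t)(idx + 1);
}
```

      Capping 'flows_cnt' at 65535 also ends the loop, but per the commit message
      that cap rejects a valid class id, which is why widening 'idx' is preferred.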
    • net: macb: ensure the device is available before accessing GEMGXL control registers · 5eff1461
      Zong Li committed
      If runtime power management is enabled, the gigabit ethernet PLL would
      be disabled after macb_probe(). During this period of time, the system
      would hang up if we try to access GEMGXL control registers.
      
      We can't put runtime_pm_get/runtime_pm_put there due to the issue of
      sleep inside atomic section (7fa2955f ("sh_eth: Fix sleeping
      function called from invalid context")). Add netif_running checking to
      ensure the device is available before accessing GEMGXL device.
      
      Changed in v2:
       - Use netif_running instead of its own flag
      Signed-off-by: Zong Li <zong.li@sifive.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: mtk_eth_soc: Fix packet statistics support for MT7628/88 · ad79fd2c
      Stefan Roese committed
      The MT7628/88 SoC(s) have other (limited) packet counter registers than
      currently supported in the mtk_eth_soc driver. This patch adds support
      for reading these registers, so that the packet statistics are correctly
      updated.
      
      Additionally the defines for the non-MT7628 variant packet counter
      registers are added and used in this patch instead of using hard coded
      values.
      Signed-off-by: Stefan Roese <sr@denx.de>
      Fixes: 296c9120 ("net: ethernet: mediatek: Add MT7628/88 SoC support")
      Cc: Felix Fietkau <nbd@nbd.name>
      Cc: John Crispin <john@phrozen.org>
      Cc: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com>
      Cc: Reto Schneider <code@reto-schneider.ch>
      Cc: Reto Schneider <reto.schneider@husqvarnagroup.com>
      Cc: David S. Miller <davem@davemloft.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • MAINTAINERS: Add entries for CBS, ETF and taprio qdiscs · 1e69abf9
      Vinicius Costa Gomes committed
      Add Vinicius Costa Gomes as maintainer for these qdiscs.
      
      These qdiscs are all TSN (Time Sensitive Networking) related.
      Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
      Acked-by: Cong Wang <cong.wang@bytedance.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 22 May, 2021 8 commits
  3. 21 May, 2021 7 commits
    • Merge branch 'stmmac-fixes' · 5cb4a593
      David S. Miller committed
      Joakim Zhang says:
      
      ====================
      net: fixes for stmmac
      
      Two clock fixes for stmmac driver.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: stmmac: fix system hang if change mac address after interface ifdown · 4691ffb1
      Joakim Zhang committed
      Fix system hang with below sequences:
      ~# ifconfig ethx down
      ~# ifconfig ethx hw ether xx:xx:xx:xx:xx:xx
      
      After ethx goes down, all stmmac clocks are gated off, so any register
      access causes a system hang.
      
      Fixes: 5ec55823 ("net: stmmac: add clocks management for gmac driver")
      Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: stmmac: correct clocks enabled in stmmac_vlan_rx_kill_vid() · b3dcb312
      Joakim Zhang committed
      This was most likely a mistake made while resolving conflicts when the
      RFC tag was removed to repost the patch.
      
      Fixes: 5ec55823 ("net: stmmac: add clocks management for gmac driver")
      Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/qla3xxx: fix schedule while atomic in ql_sem_spinlock · 13a6f315
      Zheyu Ma committed
      When calling the 'ql_sem_spinlock', the driver has already acquired the
      spin lock, so the driver should not call 'ssleep' in atomic context.
      
      This bug can be fixed by using 'mdelay' instead of 'ssleep'.
      
      The KASAN's log reveals it:
      
      [    3.238124 ] BUG: scheduling while atomic: swapper/0/1/0x00000002
      [    3.238748 ] 2 locks held by swapper/0/1:
      [    3.239151 ]  #0: ffff88810177b240 (&dev->mutex){....}-{3:3}, at: __device_driver_lock+0x41/0x60
      [    3.240026 ]  #1: ffff888107c60e28 (&qdev->hw_lock){....}-{2:2}, at: ql3xxx_probe+0x2aa/0xea0
      [    3.240873 ] Modules linked in:
      [    3.241187 ] irq event stamp: 460854
      [    3.241541 ] hardirqs last  enabled at (460853): [<ffffffff843051bf>] _raw_spin_unlock_irqrestore+0x4f/0x70
      [    3.242245 ] hardirqs last disabled at (460854): [<ffffffff843058ca>] _raw_spin_lock_irqsave+0x2a/0x70
      [    3.242245 ] softirqs last  enabled at (446076): [<ffffffff846002e4>] __do_softirq+0x2e4/0x4b1
      [    3.242245 ] softirqs last disabled at (446069): [<ffffffff811ba5e0>] irq_exit_rcu+0x100/0x110
      [    3.242245 ] Preemption disabled at:
      [    3.242245 ] [<ffffffff828ca5ba>] ql3xxx_probe+0x2aa/0xea0
      [    3.242245 ] Kernel panic - not syncing: scheduling while atomic
      [    3.242245 ] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.13.0-rc1-00145-gee7dc339169-dirty #16
      [    3.242245 ] Call Trace:
      [    3.242245 ]  dump_stack+0xba/0xf5
      [    3.242245 ]  ? ql3xxx_probe+0x1f0/0xea0
      [    3.242245 ]  panic+0x15a/0x3f2
      [    3.242245 ]  ? vprintk+0x76/0x150
      [    3.242245 ]  ? ql3xxx_probe+0x2aa/0xea0
      [    3.242245 ]  __schedule_bug+0xae/0xe0
      [    3.242245 ]  __schedule+0x72e/0xa00
      [    3.242245 ]  schedule+0x43/0xf0
      [    3.242245 ]  schedule_timeout+0x28b/0x500
      [    3.242245 ]  ? del_timer_sync+0xf0/0xf0
      [    3.242245 ]  ? msleep+0x2f/0x70
      [    3.242245 ]  msleep+0x59/0x70
      [    3.242245 ]  ql3xxx_probe+0x307/0xea0
      [    3.242245 ]  ? _raw_spin_unlock_irqrestore+0x3a/0x70
      [    3.242245 ]  ? pci_device_remove+0x110/0x110
      [    3.242245 ]  local_pci_probe+0x45/0xa0
      [    3.242245 ]  pci_device_probe+0x12b/0x1d0
      [    3.242245 ]  really_probe+0x2a9/0x610
      [    3.242245 ]  driver_probe_device+0x90/0x1d0
      [    3.242245 ]  ? mutex_lock_nested+0x1b/0x20
      [    3.242245 ]  device_driver_attach+0x68/0x70
      [    3.242245 ]  __driver_attach+0x124/0x1b0
      [    3.242245 ]  ? device_driver_attach+0x70/0x70
      [    3.242245 ]  bus_for_each_dev+0xbb/0x110
      [    3.242245 ]  ? rdinit_setup+0x45/0x45
      [    3.242245 ]  driver_attach+0x27/0x30
      [    3.242245 ]  bus_add_driver+0x1eb/0x2a0
      [    3.242245 ]  driver_register+0xa9/0x180
      [    3.242245 ]  __pci_register_driver+0x82/0x90
      [    3.242245 ]  ? yellowfin_init+0x25/0x25
      [    3.242245 ]  ql3xxx_driver_init+0x23/0x25
      [    3.242245 ]  do_one_initcall+0x7f/0x3d0
      [    3.242245 ]  ? rdinit_setup+0x45/0x45
      [    3.242245 ]  ? rcu_read_lock_sched_held+0x4f/0x80
      [    3.242245 ]  kernel_init_freeable+0x2aa/0x301
      [    3.242245 ]  ? rest_init+0x2c0/0x2c0
      [    3.242245 ]  kernel_init+0x18/0x190
      [    3.242245 ]  ? rest_init+0x2c0/0x2c0
      [    3.242245 ]  ? rest_init+0x2c0/0x2c0
      [    3.242245 ]  ret_from_fork+0x1f/0x30
      [    3.242245 ] Dumping ftrace buffer:
      [    3.242245 ]    (ftrace buffer empty)
      [    3.242245 ] Kernel Offset: disabled
      [    3.242245 ] Rebooting in 1 seconds.
      Reported-by: Zheyu Ma <zheyuma97@gmail.com>
      Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: encx24j600: fix kernel-doc syntax in file headers · 503c599a
      Aditya Srivastava committed
      The opening comment mark '/**' is used for highlighting the beginning of
      kernel-doc comments.
      The header for drivers/net/ethernet/microchip/encx24j600 files follows
      this syntax, but the content inside does not comply with kernel-doc.
      
      This line was probably not meant for kernel-doc parsing, but is parsed
      due to the presence of kernel-doc like comment syntax (i.e., '/**'), which
      causes an unexpected warning from kernel-doc.
      For example, running scripts/kernel-doc -none
      drivers/net/ethernet/microchip/encx24j600_hw.h emits:
      warning: expecting prototype for h(). Prototype was for _ENCX24J600_HW_H() instead
      
      Provide a simple fix by replacing such occurrences with general comment
      format, i.e. '/*', to prevent kernel-doc from parsing it.
      Signed-off-by: Aditya Srivastava <yashsri421@gmail.com>
      Acked-by: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ixgbe: fix large MTU request from VF · 63e39d29
      Jesse Brandeburg committed
      Check that the MTU value requested by the VF is in the supported
      range of MTUs before attempting to set the VF large packet enable,
      otherwise reject the request. This also avoids unnecessary
      register updates in the case of the 82599 controller.
      
      Fixes: 872844dd ("ixgbe: Enable jumbo frames support w/ SR-IOV")
      Co-developed-by: Piotr Skajewski <piotrx.skajewski@intel.com>
      Signed-off-by: Piotr Skajewski <piotrx.skajewski@intel.com>
      Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Co-developed-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
      Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
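      The "validate first, then configure" order described in the ixgbe commit
      above might look roughly like this. It is a userspace sketch under stated
      assumptions: check_vf_frame_size() is a hypothetical name, and the bounds
      are passed in by the caller because the commit message does not state the
      actual limits the driver uses.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: validate a frame size requested by a VF before
 * any register is touched. min_frame and max_frame are caller-supplied
 * bounds (assumed here, not taken from the driver).
 * Returns 0 if the request is acceptable, -EINVAL otherwise.
 */
static int check_vf_frame_size(uint32_t requested, uint32_t min_frame,
                               uint32_t max_frame)
{
    assert(min_frame <= max_frame);
    if (requested < min_frame || requested > max_frame)
        return -EINVAL;
    return 0;
}
```

      Rejecting the request up front means the later "large packet enable" step
      is never reached with an out-of-range value.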
    • selftests: Add .gitignore for nci test suite · 8570e75a
      David Matlack committed
      Building the nci test suite produces a binary, nci_dev, that git then
      tries to track. Add a .gitignore file to tell git to ignore this binary.
      Signed-off-by: David Matlack <dmatlack@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 20 May, 2021 5 commits
  5. 19 May, 2021 15 commits
    • mlx5e: add add missing BH locking around napi_schdule() · e63052a5
      Jakub Kicinski committed
      It's not correct to call napi_schedule() in pure process
      context. Because we use __raise_softirq_irqoff() we require
      callers to be in a context which will eventually lead to
      softirq handling (hardirq, bh disabled, etc.).
      
      With code as is users will see:
      
       NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
      
      Fixes: a8dd7ac1 ("net/mlx5e: Generalize RQ activation")
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5: Set term table as an unmanaged flow table · 6ff51ab8
      Ariel Levkovich committed
      Termination tables are restricted to have the default miss action and
      cannot be set to forward to another table in case of a miss.
      If the fs prio of the termination table is not the last one in the
      list, fs_core will attempt to attach it to another table.
      
      Set the unmanaged ft flag when creating the termination table ft
      and select the tc offload prio for it to prevent fs_core from selecting
      the forwarding to next ft miss action and use the default one.
      
      In addition, set the flow that forwards to the termination table to
      ignore ft level restrictions since the ft level is not set by fs_core
      for unmanaged fts.
      
      Fixes: 249ccc3c ("net/mlx5e: Add support for offloading traffic from uplink to uplink")
      Signed-off-by: Ariel Levkovich <lariel@nvidia.com>
    • net/mlx5: Don't overwrite HCA capabilities when setting MSI-X count · 75e8564e
      Leon Romanovsky committed
      During driver probe of device that has dynamic MSI-X feature enabled,
      the following error is printed in some FW flavour (not released yet).
      
       mlx5_core 0000:06:00.0: firmware version: 4.7.4387
       mlx5_core 0000:06:00.0: 126.016 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x16 link)
       mlx5_core 0000:06:00.0: mlx5_cmd_check:777:(pid 70599): SET_HCA_CAP(0x109) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x0)
       mlx5_core 0000:06:00.0: set_hca_cap:622:(pid 70599): handle_hca_cap failed
       mlx5_core 0000:06:00.0: mlx5_function_setup:1045:(pid 70599): set_hca_cap failed
       mlx5_core 0000:06:00.0: probe_one:1465:(pid 70599): mlx5_init_one failed with error code -22
       mlx5_core: probe of 0000:06:00.0 failed with error -22
      
      In order to make the setting capability of MSI-X future proof, let's
      query the current capabilities first.
      
      Fixes: 604774ad ("net/mlx5: Dynamically assign MSI-X vectors count")
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • {net,vdpa}/mlx5: Configure interface MAC into mpfs L2 table · 7c9f131f
      Eli Cohen committed
      net/mlx5: Expose MPFS configuration API
      
      MPFS is the multi physical function switch that bridges traffic between
      the physical port and any physical functions associated with it. The
      driver is required to add or remove MAC entries to properly forward
      incoming traffic to the correct physical function.
      
      We export the API to control MPFS so that other drivers, such as
      mlx5_vdpa are able to add MAC addresses of their network interfaces.
      
      The MAC address of the vdpa interface must be configured into the MPFS L2
      address. Failing to do so could cause, in some NIC configurations, failure
      to forward packets to the vdpa network device instance.
      
      Fix this by adding calls to update the MPFS table.
      
      CC: <mst@redhat.com>
      CC: <jasowang@redhat.com>
      CC: <virtualization@lists.linux-foundation.org>
      Fixes: 1a86b377 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")
      Signed-off-by: Eli Cohen <elic@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Fix error path of updating netdev queues · 5e7923ac
      Aya Levin committed
      Avoid division by zero in the error flow. In the driver, the TC number
      can be either 1 or 8. When the TC count is set to 1, the driver zeroes
      netdev->num_tc, so it must be converted back from 0 to 1 in the error flow.
      
      Fixes: fa374877 ("net/mlx5e: Handle errors from netif_set_real_num_{tx,rx}_queues")
      Signed-off-by: Aya Levin <ayal@nvidia.com>
      Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
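      The 0-to-1 conversion that the mlx5e error-path commit above describes can
      be illustrated with a small userspace sketch. queues_per_tc() is a
      hypothetical helper, and treating a stored num_tc of 0 as "one traffic
      class" is an assumption drawn from the commit text rather than from the
      driver source.

```c
#include <assert.h>

/*
 * Hypothetical helper: number of TX queues per traffic class.
 * A stored num_tc of 0 is taken to mean "a single TC" (assumption based
 * on the commit text), so it is mapped back to 1 before dividing.
 */
static unsigned int queues_per_tc(unsigned int num_queues, unsigned int num_tc)
{
    unsigned int ntc = num_tc ? num_tc : 1;

    assert(ntc != 0); /* the divisor can no longer be zero */
    return num_queues / ntc;
}
```

      Without the mapping, a call such as queues_per_tc(16, 0) would divide by
      zero; with it, the same call yields 16.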
    • net/mlx5e: Reject mirroring on source port change encap rules · 7d1a3d08
      Vlad Buslov committed
      Rules with MLX5_ESW_DEST_CHAIN_WITH_SRC_PORT_CHANGE dest flag are
      translated to destination FT in eswitch. Currently it is not possible to
      mirror such rules because firmware doesn't support mixing FT and Vport
      destinations in single rule when one of them adds encapsulation. Since the
      only use case for MLX5_ESW_DEST_CHAIN_WITH_SRC_PORT_CHANGE destination is
      support for tunnel endpoints on VF and trying to offload such rule with
      mirror action causes either crash in fs_core or firmware error with
      syndrome 0xff6a1d, reject all such rules in mlx5 TC layer.
      
      Fixes: 10742efc ("net/mlx5e: VF tunnel TX traffic offloading")
      Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Fix multipath lag activation · 97817fcc
      Dima Chumak committed
      When handling FIB_EVENT_ENTRY_REPLACE event for a new multipath route,
      lag activation can be missed if a stale (struct lag_mp)->mfi pointer
      exists, which was associated with an older multipath route that had been
      removed.
      
      Normally, when a route is removed, it triggers mlx5_lag_fib_event(),
      which handles FIB_EVENT_ENTRY_DEL and clears mfi pointer. But, if
      mlx5_lag_check_prereq() condition isn't met, for example when eswitch is
      in legacy mode, the fib event is skipped and mfi pointer becomes stale.
      
      Fix by resetting mfi pointer to NULL every time mlx5_lag_mp_init() is
      called.
      
      Fixes: 544fe7c2 ("net/mlx5e: Activate HW multipath and handle port affinity based on FIB events")
      Signed-off-by: Dima Chumak <dchumak@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: reset XPS on error flow if netdev isn't registered yet · 77ecd10d
      Saeed Mahameed committed
      mlx5e_attach_netdev can be called prior to registering the netdevice:
      Example stack:
      
      ipoib_new_child_link ->
      ipoib_intf_init->
      rdma_init_netdev->
      mlx5_rdma_setup_rn->
      
      mlx5e_attach_netdev->
      mlx5e_num_channels_changed ->
      mlx5e_set_default_xps_cpumasks ->
      netif_set_xps_queue ->
      __netif_set_xps_queue -> kmalloc
      
      If any later stage fails at any point after mlx5e_num_channels_changed()
      returns, XPS allocated maps will never be freed as they
      are only freed during netdev unregistration, which will never happen for
      yet to be registered netdevs.
      
      Fixes: 3909a12e ("net/mlx5e: Fix configuration of XPS cpumasks and netdev queues in corner cases")
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      Signed-off-by: Aya Levin <ayal@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
    • net/mlx5e: Make sure fib dev exists in fib event · eb96cc15
      Roi Dayan committed
      For an unreachable route entry the fib dev does not exist.
      
      Fixes: 8914add2 ("net/mlx5e: Handle FIB events to update tunnel endpoint device")
      Reported-by: Dennis Afanasev <dennis.afanasev@stateless.net>
      Signed-off-by: Roi Dayan <roid@nvidia.com>
      Reviewed-by: Maor Dickman <maord@nvidia.com>
      Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Fix null deref accessing lag dev · 83026d83
      Roi Dayan committed
      The lag dev can be NULL, in which case stop processing the event.
      In bond_enslave() the active/backup slave is set before the
      upper dev is set, so the first event arrives without an upper dev.
      After setting the upper dev with bond_master_upper_dev_link() there is
      a second event and in that event we have an upper dev.
      
      Fixes: 7e51891a ("net/mlx5e: Use netdev events to set/del egress acl forward-to-vport rule")
      Signed-off-by: Roi Dayan <roid@nvidia.com>
      Reviewed-by: Maor Dickman <maord@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Fix nullptr in mlx5e_tc_add_fdb_flow() · fe7738eb
      Dima Chumak committed
      The result of __dev_get_by_index() is not checked for NULL before it is
      passed to mlx5e_attach_encap(), where it gets dereferenced.
      
      Also, in case of a successful lookup, the net_device reference count is
      not incremented, which may result in net_device pointer becoming invalid
      at any time during mlx5e_attach_encap() execution.
      
      Fix by using dev_get_by_index(), which does proper reference counting on
      the net_device pointer. Also, handle nullptr return value when mirred
      device is not found.
      
      It's safe to call dev_put() on the mirred net_device pointer, right
      after mlx5e_attach_encap() call, because it's not being saved/copied
      down the call chain.
      
      Fixes: 3c37745e ("net/mlx5e: Properly deal with encap flows add/del under neigh update")
      Addresses-Coverity: ("Dereference null return value")
      Signed-off-by: Dima Chumak <dchumak@nvidia.com>
      Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5: SF, Fix show state inactive when its inactivated · 82041634
      Parav Pandit committed
      When an SF is inactivated and is in the TEARDOWN_REQUEST state, the
      driver still reports its state as active. This is incorrect.
      Fix it by treating TEARDOWN_REQUEST as an inactive state. When an SF
      is still attached to the driver, a user request to reactivate it
      returns EINVAL. Inform the user with the more fitting EBUSY code and
      an informative error message.
      
      Fixes: 6a327321 ("net/mlx5: SF, Port function state change support")
      Signed-off-by: Parav Pandit <parav@nvidia.com>
      Reviewed-by: Vu Pham <vuhuong@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5: Fix err prints and return when creating termination table · fca08661
      Roi Dayan committed
      Fix the print so that it reports the correct error code instead of using
      IS_ERR(), which will just result in always printing 1.
      Also return the real err instead of always -EOPNOTSUPP.
      
      Fixes: 10caabda ("net/mlx5e: Use termination table for VLAN push actions")
      Signed-off-by: Roi Dayan <roid@nvidia.com>
      Reviewed-by: Maor Dickman <maord@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
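      Why printing the result of IS_ERR() always shows 1 on the error path can be
      seen with a userspace imitation of the kernel's error-pointer convention.
      The lowercase helpers below are hypothetical stand-ins written for this
      sketch (modelled on ERR_PTR(), PTR_ERR() and IS_ERR()); they are not the
      kernel implementations and say nothing about how the patch itself prints
      the code.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer in the top 4095 addresses. */
static inline void *err_ptr(long error)
{
    assert(error < 0 && error >= -MAX_ERRNO);
    return (void *)error;
}

/* Recover the errno value that was encoded in the pointer. */
static inline long ptr_err(const void *ptr)
{
    return (long)ptr;
}

/* Boolean test only: 1 for any encoded error, 0 for other pointers. */
static inline int is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}
```

      For a pointer built with err_ptr(-22), is_err() is 1 regardless of the
      code, while ptr_err() gives back -22, which is the value worth logging and
      returning.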
    • net/mlx5: Set reformat action when needed for termination rules · 442b3d7b
      Jianbo Liu committed
      For remote mirroring, after the tunnel packets are received, they are
      decapsulated and sent to representor, then re-encapsulated and sent
      out over another tunnel. So reformat action is set only when the
      destination is required to do encapsulation.
      
      Fixes: 249ccc3c ("net/mlx5e: Add support for offloading traffic from uplink to uplink")
      Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
      Reviewed-by: Ariel Levkovich <lariel@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Fix nullptr in add_vlan_push_action() · dca59f4a
      Dima Chumak committed
      The result of dev_get_by_index_rcu() is not checked for NULL and then
      gets dereferenced immediately.
      
      Also, the RCU lock must be held by the caller of dev_get_by_index_rcu(),
      which isn't satisfied by the call stack.
      
      Fix by handling nullptr return value when iflink device is not found.
      Add RCU locking around dev_get_by_index_rcu() to avoid possible adverse
      effects while iterating over the net_device's hlist.
      
      It is safe not to increment reference count of the net_device pointer in
      case of a successful lookup, because it's already handled by VLAN code
      during VLAN device registration (see register_vlan_dev and
      netdev_upper_dev_link).
      
      Fixes: 278748a9 ("net/mlx5e: Offload TC e-switch rules with egress VLAN device")
      Addresses-Coverity: ("Dereference null return value")
      Signed-off-by: Dima Chumak <dchumak@nvidia.com>
      Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>