1. 17 Nov, 2021 23 commits
    • D
      Merge tag 'mlx5-fixes-2021-11-16' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · 9311ccef
      David S. Miller committed
      Saeed Mahameed says:
      
      ====================
      mlx5-fixes-2021-11-16
      
      Please pull this mlx5 fixes series, or let me know in case of any problem.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9311ccef
    • T
      net: stmmac: Fix signed/unsigned wreckage · 3751c3d3
      Thomas Gleixner committed
      The recent addition of timestamp correction to compensate the CDC error
      introduced a subtle signed/unsigned bug in stmmac_get_tx_hwtstamp() while
      it managed for some obscure reason to avoid that in stmmac_get_rx_hwtstamp().
      
      The issue is:
      
          s64 adjust = 0;
          u64 ns;
      
          adjust += -(2 * (NSEC_PER_SEC / priv->plat->clk_ptp_rate));
          ns += adjust;
      
      works by chance on 64bit, but falls apart on 32bit because the compiler
      knows that adjust fits into 32bit and then treats the addition as a u64 +
      u32 resulting in an off by ~2 seconds failure.
      
      The RX variant uses an u64 for adjust and does the adjustment via
      
          ns -= adjust;
      
      because consistency is obviously overrated.
      
      Get rid of the pointless zero initialized adjust variable and do:
      
      	ns -= (2 * NSEC_PER_SEC) / priv->plat->clk_ptp_rate;
      
      which is obviously correct and spares the adjust obfuscation. Aside of that
      it yields a more accurate result because the multiplication takes place
      before the integer divide truncation and not afterwards.
      
      Stick the calculation into an inline so it can't be accidentally
      disimproved. Return an u32 from that inline as the result is guaranteed
      to fit which lets the compiler optimize the subtraction.
      
      Cc: stable@vger.kernel.org
      Fixes: 3600be5f ("net: stmmac: add timestamp correction to rid CDC sync error")
      Reported-by: Benedikt Spranger <b.spranger@linutronix.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Benedikt Spranger <b.spranger@linutronix.de>
      Tested-by: Kurt Kanzenbach <kurt@linutronix.de> # Intel EHL
      Link: https://lore.kernel.org/r/87mtm578cs.ffs@tglx
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      3751c3d3
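The signed/unsigned problem described in the commit above can be modelled in userspace. The following is a minimal sketch, not the driver code: all function and macro names are hypothetical, and the 32-bit behaviour is emulated by forcing the arithmetic into 32-bit unsigned types (this assumes a platform where `int` is 32 bits wide). The size of the error in this model is 2^32 ns and may differ from the figure quoted in the commit message.

```c
#include <stdint.h>

/* 32-bit unsigned stand-in for NSEC_PER_SEC as a 32-bit build would see it. */
#define DEMO_NSEC_PER_SEC_U32 UINT32_C(1000000000)

/*
 * Buggy pattern: the negation is applied to an unsigned 32-bit value, so it
 * wraps to 2^32 - x instead of becoming -x. Widening to s64 afterwards keeps
 * the large positive value.
 */
static uint64_t demo_adjust_buggy(uint64_t ns, uint32_t clk_ptp_rate)
{
	int64_t adjust = 0;

	adjust += -(2 * (DEMO_NSEC_PER_SEC_U32 / clk_ptp_rate));
	ns += adjust;
	return ns;
}

/*
 * Fixed pattern: multiply before dividing (in 64 bits), return the result as
 * an unsigned 32-bit value and subtract it. clk_ptp_rate must be non-zero.
 */
static inline uint32_t demo_cdc_adjust(uint32_t clk_ptp_rate)
{
	return (uint32_t)((2 * UINT64_C(1000000000)) / clk_ptp_rate);
}

static uint64_t demo_adjust_fixed(uint64_t ns, uint32_t clk_ptp_rate)
{
	ns -= demo_cdc_adjust(clk_ptp_rate);
	return ns;
}
```

With a 400 MHz clock the two orderings also differ in accuracy: dividing first gives 2 * (1000000000 / 400000000) = 4, while multiplying first gives 2000000000 / 400000000 = 5.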
    • J
      Merge branch 'net-fix-the-mirred-packet-drop-due-to-the-incorrect-dst' · e4ca7823
      Jakub Kicinski committed
      Xin Long says:
      
      ====================
      net: fix the mirred packet drop due to the incorrect dst
      
      This issue was found when using OVS HWOL on OVN-k8s. These packets
      dropped on rx path were seen with output dst, which should've been
      dropped from the skbs when redirecting them.
      
      The 1st patch is the fix and the 2nd is a selftest to reproduce
      and verify it.
      ====================
      
      Link: https://lore.kernel.org/r/cover.1636734751.git.lucien.xin@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      e4ca7823
    • D
      selftests: add a test case for mirred egress to ingress · 1d127eff
      Davide Caratti committed
      add a selftest that verifies the correct behavior of TC act_mirred egress
      to ingress: in particular, it checks if the dst_entry is removed from skb
      before redirect egress -> ingress. The correct behavior is: an ICMP 'echo
      request' generated by ping will be received and generate a reply the same
      way as the one generated by mausezahn.
      Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Acked-by: Cong Wang <cong.wang@bytedance.com>
      Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      1d127eff
    • X
      net: sched: act_mirred: drop dst for the direction from egress to ingress · f799ada6
      Xin Long committed
      Without dropping dst, packets that are mirrored or redirected from a
      local egress path to ingress may still use the old dst. ip_rcv() will
      drop them because the old dst is for output and its .input is dst_discard.
      
      Fix this by also dropping dst for packets that are mirrored or
      redirected from egress to ingress in act_mirred.
      
      Note that dst is not dropped for the direction change from ingress to
      egress, since there may be a use case where a metadata dst attached by
      act_tunnel_key is used later.
      
      Fixes: b57dc7c1 ("net/sched: Introduce action ct")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Cong Wang <cong.wang@bytedance.com>
      Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      f799ada6
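The direction rule from the act_mirred commit above can be illustrated with a toy model. This is only a sketch under stated assumptions: `toy_skb`, `toy_dst` and `toy_mirred_redirect()` are hypothetical stand-ins and do not mirror the kernel's data structures or the actual patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a route entry and a packet buffer. */
struct toy_dst {
	bool is_output;		/* true if the entry was set up for the output path */
};

struct toy_skb {
	struct toy_dst *dst;	/* cached route, may be NULL */
};

/*
 * Apply the rule described in the commit message:
 *  - egress -> ingress: forget the cached output dst so the receive path
 *    performs a fresh lookup instead of hitting a discard handler;
 *  - ingress -> egress: keep dst, since it may be a metadata dst that a
 *    tunnel action attached for later use.
 */
static void toy_mirred_redirect(struct toy_skb *skb, bool at_ingress,
				bool want_ingress)
{
	if (!at_ingress && want_ingress)
		skb->dst = NULL;
}
```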
    • T
      amt: cancel delayed_work synchronously in amt_fini() · b0024a04
      Taehee Yoo committed
      When the amt module is being removed, it calls cancel_delayed_work()
      to cancel pending delayed_work. But this function doesn't wait for
      a running work item to finish, so workers can still be running after
      the module has been removed.
      
      In order to avoid this, cancel_delayed_work_sync() should be used instead.
      Suggested-by: Jakub Kicinski <kuba@kernel.org>
      Fixes: bc54e49c ("amt: add multicast(IGMP) report message handler")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Link: https://lore.kernel.org/r/20211116160923.25258-1-ap420073@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      b0024a04
    • P
      MAINTAINERS: remove GR-everest-linux-l2@marvell.com · 0a83f96f
      Pavel Skripkin committed
      I've sent a patch to GR-everest-linux-l2@marvell.com a few days ago and
      got a reply from postmaster@marvell.com:
      
      	Delivery has failed to these recipients or groups:
      
      	gr-everest-linux-l2@marvell.com<mailto:gr-everest-linux-l2@marvell.com>
      	The email address you entered couldn't be found. Please check the
      	recipient's email address and try to resend the message. If the problem
      	continues, please contact your helpdesk.
      
      As requested by Alok Prasad, replacing GR-everest-linux-l2@marvell.com
      with Manish Chopra's email address. [0]
      
      Link: https://lore.kernel.org/all/20211116081601.11208-1-palok@marvell.com/ [0]
      Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
      Link: https://lore.kernel.org/r/20211116141303.32180-1-paskripkin@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      0a83f96f
    • M
      bnxt_en: Fix compile error regression when CONFIG_BNXT_SRIOV is not set · 9f536391
      Michael Chan committed
      bp->sriov_cfg is not defined when CONFIG_BNXT_SRIOV is not set.  Fix
      it by adding a helper function bnxt_sriov_cfg() to handle the logic
      with or without the config option.
      
      Fixes: 46d08f55 ("bnxt_en: extend RTNL to VF check in devlink driver_reinit")
      Reported-by: kernel test robot <lkp@intel.com>
      Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
      Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Link: https://lore.kernel.org/r/1637090770-22835-1-git-send-email-michael.chan@broadcom.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      9f536391
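The helper approach named in the bnxt_en commit above follows a common pattern: when a struct member exists only under a config option, callers go through an accessor that compiles in both configurations. Below is a minimal sketch of that pattern; `CONFIG_DEMO_SRIOV`, `struct demo_dev` and `demo_sriov_cfg()` are hypothetical names, not the driver's definitions.

```c
#include <stdbool.h>

/* Build with -DCONFIG_DEMO_SRIOV to compile the optional member in. */
struct demo_dev {
	int id;
#ifdef CONFIG_DEMO_SRIOV
	bool sriov_cfg;		/* only present when the option is enabled */
#endif
};

/*
 * Accessor that is valid in both configurations. Callers never touch the
 * member directly, so they build whether or not the option is set.
 */
static inline bool demo_sriov_cfg(const struct demo_dev *dev)
{
#ifdef CONFIG_DEMO_SRIOV
	return dev->sriov_cfg;
#else
	(void)dev;		/* unused when the option is off */
	return false;
#endif
}
```

Code that previously read the member directly would call `demo_sriov_cfg(dev)` instead, which keeps the `#ifdef` in one place.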
    • M
      net: mvmdio: fix compilation warning · 2460386b
      Marcin Wojtas committed
      The kernel test robot reported the following issue:
      
      >> drivers/net/ethernet/marvell/mvmdio.c:426:36: warning:
      unused variable 'orion_mdio_acpi_match' [-Wunused-const-variable]
         static const struct acpi_device_id orion_mdio_acpi_match[] = {
                                            ^
         1 warning generated.
      
      Fix that by surrounding the variable with an appropriate ifdef.
      
      Fixes: c54da4c1 ("net: mvmdio: add ACPI support")
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Marcin Wojtas <mw@semihalf.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Link: https://lore.kernel.org/r/20211115153024.209083-1-mw@semihalf.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      2460386b
    • J
      Merge tag 'mac80211-for-net-2021-11-16' of... · f5c74160
      Jakub Kicinski committed
      Merge tag 'mac80211-for-net-2021-11-16' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
      
      Johannes Berg says:
      
      ====================
      Couple of fixes:
       * bad dont-reorder check
       * throughput LED trigger for various new(ish) paths
       * radiotap header generation
       * locking assertions in mac80211 with monitor mode
       * radio statistics
       * don't try to access IV when not present
       * call stop_ap for P2P_GO as well as we should
      
      * tag 'mac80211-for-net-2021-11-16' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211:
        mac80211: fix throughput LED trigger
        mac80211: fix monitor_sdata RCU/locking assertions
        mac80211: drop check for DONT_REORDER in __ieee80211_select_queue
        mac80211: fix radiotap header generation
        mac80211: do not access the IV when it was stripped
        nl80211: fix radio statistics in survey dump
        cfg80211: call cfg80211_stop_ap when switch from P2P_GO type
      ====================
      
      Link: https://lore.kernel.org/r/20211116160845.157214-1-johannes@sipsolutions.net
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      f5c74160
    • J
      Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · f083ec31
      Jakub Kicinski committed
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2021-11-16
      
      We've added 12 non-merge commits during the last 5 day(s) which contain
      a total of 23 files changed, 573 insertions(+), 73 deletions(-).
      
      The main changes are:
      
      1) Fix pruning regression where verifier went overly conservative rejecting
         previously accepted programs, from Alexei Starovoitov and Lorenz Bauer.
      
      2) Fix verifier TOCTOU bug when using read-only map's values as constant
         scalars during verification, from Daniel Borkmann.
      
      3) Fix a crash due to a double free in XSK's buffer pool, from Magnus Karlsson.
      
      4) Fix libbpf regression when cross-building runqslower, from Jean-Philippe Brucker.
      
      5) Forbid use of bpf_ktime_get_coarse_ns() and bpf_timer_*() helpers in tracing
         programs due to deadlock possibilities, from Dmitrii Banshchikov.
      
      6) Fix checksum validation in sockmap's udp_read_sock() callback, from Cong Wang.
      
      7) Various BPF sample fixes such as XDP stats in xdp_sample_user, from Alexander Lobakin.
      
      8) Fix libbpf gen_loader error handling wrt fd cleanup, from Kumar Kartikeya Dwivedi.
      
      * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
        udp: Validate checksum in udp_read_sock()
        bpf: Fix toctou on read-only map's constant scalar tracking
        samples/bpf: Fix build error due to -isystem removal
        selftests/bpf: Add tests for restricted helpers
        bpf: Forbid bpf_ktime_get_coarse_ns and bpf_timer_* in tracing progs
        libbpf: Perform map fd cleanup for gen_loader in case of error
        samples/bpf: Fix incorrect use of strlen in xdp_redirect_cpu
        tools/runqslower: Fix cross-build
        samples/bpf: Fix summary per-sec stats in xdp_sample_user
        selftests/bpf: Check map in map pruning
        bpf: Fix inner map state pruning regression.
        xsk: Fix crash on double free in buffer pool
      ====================
      
      Link: https://lore.kernel.org/r/20211116141134.6490-1-daniel@iogearbox.net
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      f083ec31
    • R
      net/mlx5: E-Switch, return error if encap isn't supported · c4c31767
      Raed Salem committed
      On regular ConnectX HCAs getting encap mode isn't supported when the
      E-Switch is in NONE mode. Current code would return no error code when
      trying to get encap mode in such case which is wrong.
      
      Fix by returning error value to indicate failure to caller in such case.
      
      Fixes: 8e0aa4bc ("net/mlx5: E-switch, Protect eswitch mode changes")
      Signed-off-by: Raed Salem <raeds@nvidia.com>
      Reviewed-by: Mark Bloch <mbloch@nvidia.com>
      Reviewed-by: Maor Dickman <maord@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      c4c31767
    • M
      net/mlx5: Lag, update tracker when state change event received · ae396d85
      Maher Sanalla committed
      Currently, in NETDEV_CHANGELOWERSTATE/NETDEV_CHANGEUPPERSTATE event
      handling, tracking is not fully completed if the LAG device is not ready
      at the time the events occur. But we must keep track of the upper and
      lower states after receiving the events because RoCE needs this info in
      mlx5_lag_get_roce_netdev() in order to return the corresponding port
      that it's running on. Returning the wrong (not most recent) port will lead
      to the gids table being incorrect.
      
      For example: if during the attachment of a slave to the bond the other
      non-attached port performs pci_reload, then the LAG device is not ready,
      but that should not result in automatically dismissing the attached
      slave's tracker update (which is performed in
      mlx5_handle_changelowerstate()), since these events might not come
      later, which can lead to both bond ports having tx_enabled=0, which is
      not a valid state of a LAG bond.
      
      Fixes: 9b412cc3 ("net/mlx5e: Add LAG warning if bond slave is not lag master")
      Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
      Reviewed-by: Mark Bloch <mbloch@nvidia.com>
      Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      ae396d85
    • R
      net/mlx5e: CT, Fix multiple allocations and memleak of mod acts · 806401c2
      Roi Dayan committed
      CT clear action offload adds additional mod hdr actions to the
      flow's original mod actions in order to clear the registers which
      hold ct_state.
      When such flow also includes encap action, a neigh update event
      can cause the driver to unoffload the flow and then reoffload it.
      
      Each time this happens, the ct clear handling adds that same set
      of mod hdr actions to reset ct_state until the max of mod hdr
      actions is reached.
      
      Also, the driver never releases the allocated mod hdr actions,
      causing a memleak.
      
      Fix above two issues by moving CT clear mod acts allocation
      into the parsing actions phase and only use it when offloading the rule.
      The release of mod acts will be done in the normal flow_put().
      
       backtrace:
          [<000000007316e2f3>] krealloc+0x83/0xd0
          [<00000000ef157de1>] mlx5e_mod_hdr_alloc+0x147/0x300 [mlx5_core]
          [<00000000970ce4ae>] mlx5e_tc_match_to_reg_set_and_get_id+0xd7/0x240 [mlx5_core]
          [<0000000067c5fa17>] mlx5e_tc_match_to_reg_set+0xa/0x20 [mlx5_core]
          [<00000000d032eb98>] mlx5_tc_ct_entry_set_registers.isra.0+0x36/0xc0 [mlx5_core]
          [<00000000fd23b869>] mlx5_tc_ct_flow_offload+0x272/0x1f10 [mlx5_core]
          [<000000004fc24acc>] mlx5e_tc_offload_fdb_rules.part.0+0x150/0x620 [mlx5_core]
          [<00000000dc741c17>] mlx5e_tc_encap_flows_add+0x489/0x690 [mlx5_core]
          [<00000000e92e49d7>] mlx5e_rep_update_flows+0x6e4/0x9b0 [mlx5_core]
          [<00000000f60f5602>] mlx5e_rep_neigh_update+0x39a/0x5d0 [mlx5_core]
      
      Fixes: 1ef3018f ("net/mlx5e: CT: Support clear action")
      Signed-off-by: Roi Dayan <roid@nvidia.com>
      Reviewed-by: Paul Blakey <paulb@nvidia.com>
      Reviewed-by: Maor Dickman <maord@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      806401c2
    • A
      net/mlx5: Fix flow counters SF bulk query len · 38a54cae
      Avihai Horon committed
      When doing a flow counters bulk query, the number of counters to query
      must be aligned to 4. Current SF bulk query len is not aligned to 4,
      which leads to an error when trying to query more than 4 counters.
      
      Fix it by aligning SF bulk query len to 4.
      
      Fixes: 2fdeb4f4 ("net/mlx5: Reduce flow counters bulk query buffer size for SFs")
      Signed-off-by: Avihai Horon <avihaih@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      38a54cae
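The alignment requirement in the flow counters commit above (bulk query length must be a multiple of 4) can be expressed with a small rounding helper. This is a sketch only: the names are hypothetical, the input values are arbitrary examples, and the actual patch may simply change a constant rather than round at run time.

```c
#include <stdbool.h>
#include <stdint.h>

/* Required granularity of a bulk query length, per the commit message. */
#define DEMO_BULK_QUERY_ALIGN 4u

/* True if a requested length already satisfies the alignment rule. */
static inline bool demo_bulk_len_is_aligned(uint32_t len)
{
	return (len % DEMO_BULK_QUERY_ALIGN) == 0;
}

/* Round a length up to the next multiple of 4 (power-of-two mask trick). */
static inline uint32_t demo_bulk_len_align_up(uint32_t len)
{
	return (len + DEMO_BULK_QUERY_ALIGN - 1) & ~(DEMO_BULK_QUERY_ALIGN - 1);
}
```

For instance, a length of 6 is rejected by `demo_bulk_len_is_aligned()` and becomes 8 after `demo_bulk_len_align_up()`.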
    • M
      net/mlx5: E-Switch, rebuild lag only when needed · 2eb0cb31
      Mark Bloch committed
      A user can enable VFs without changing E-Switch mode. This can happen
      when a user moves straight to switchdev mode and enables VFs via the
      sysfs interface only once in switchdev.
      
      The cited commit assumed this isn't possible and exposed a single
      API function where the E-switch calls into the lag code, breaks the lag
      and prevents any other lag operations from taking place until the
      E-switch update has ended.
      
      Breaking the hardware lag when it isn't needed can make it such that
      hardware lag can't be enabled again.
      
      In the sysfs call path check if the current E-Switch mode is NONE,
      in the context of the function it can only mean the E-Switch is moving
      out of NONE mode and the hardware lag should be disabled and enabled
      once the mode change has ended. If the mode isn't NONE it means
      VFs are about to be enabled and such operation doesn't require
      toggling the hardware lag.
      
      Fixes: cac1eb2c ("net/mlx5: Lag, properly lock eswitch if needed")
      Signed-off-by: Mark Bloch <mbloch@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      2eb0cb31
    • N
      net/mlx5: Update error handler for UCTX and UMEM · ba50cd94
      Neta Ostrovsky committed
      In the fast unload flow, the device state is set to internal error,
      which indicates that the driver started the destroy process.
      In this case, when a destroy command is being executed, it should return
      MLX5_CMD_STAT_OK.
      Fix MLX5_CMD_OP_DESTROY_UCTX and MLX5_CMD_OP_DESTROY_UMEM to return OK
      instead of EIO.
      
      This fixes a call trace in the umem release process -
      [ 2633.536695] Call Trace:
      [ 2633.537518]  ib_uverbs_remove_one+0xc3/0x140 [ib_uverbs]
      [ 2633.538596]  remove_client_context+0x8b/0xd0 [ib_core]
      [ 2633.539641]  disable_device+0x8c/0x130 [ib_core]
      [ 2633.540615]  __ib_unregister_device+0x35/0xa0 [ib_core]
      [ 2633.541640]  ib_unregister_device+0x21/0x30 [ib_core]
      [ 2633.542663]  __mlx5_ib_remove+0x38/0x90 [mlx5_ib]
      [ 2633.543640]  auxiliary_bus_remove+0x1e/0x30 [auxiliary]
      [ 2633.544661]  device_release_driver_internal+0x103/0x1f0
      [ 2633.545679]  bus_remove_device+0xf7/0x170
      [ 2633.546640]  device_del+0x181/0x410
      [ 2633.547606]  mlx5_rescan_drivers_locked.part.10+0x63/0x160 [mlx5_core]
      [ 2633.548777]  mlx5_unregister_device+0x27/0x40 [mlx5_core]
      [ 2633.549841]  mlx5_uninit_one+0x21/0xc0 [mlx5_core]
      [ 2633.550864]  remove_one+0x69/0xe0 [mlx5_core]
      [ 2633.551819]  pci_device_remove+0x3b/0xc0
      [ 2633.552731]  device_release_driver_internal+0x103/0x1f0
      [ 2633.553746]  unbind_store+0xf6/0x130
      [ 2633.554657]  kernfs_fop_write+0x116/0x190
      [ 2633.555567]  vfs_write+0xa5/0x1a0
      [ 2633.556407]  ksys_write+0x4f/0xb0
      [ 2633.557233]  do_syscall_64+0x5b/0x1a0
      [ 2633.558071]  entry_SYSCALL_64_after_hwframe+0x65/0xca
      [ 2633.559018] RIP: 0033:0x7f9977132648
      [ 2633.559821] Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 55 6f 2d 00 8b 00 85 c0 75 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 49 89 d4 55
      [ 2633.562332] RSP: 002b:00007fffb1a83888 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      [ 2633.563472] RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007f9977132648
      [ 2633.564541] RDX: 000000000000000c RSI: 000055b90546e230 RDI: 0000000000000001
      [ 2633.565596] RBP: 000055b90546e230 R08: 00007f9977406860 R09: 00007f9977a54740
      [ 2633.566653] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f99774056e0
      [ 2633.567692] R13: 000000000000000c R14: 00007f9977400880 R15: 000000000000000c
      [ 2633.568725] ---[ end trace 10b4fe52945e544d ]---
      
      Fixes: 6a6fabbf ("net/mlx5: Update pci error handler entries and command translation")
      Signed-off-by: Neta Ostrovsky <netao@nvidia.com>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      ba50cd94
    • Y
      net/mlx5: DR, Fix check for unsupported fields in match param · 455832d4
      Yevgeny Kliteynik committed
      The existing loop doesn't cast the buffer while scanning it, which
      results in out-of-bounds read and failure to create the matcher.
      
      Fixes: 941f1979 ("net/mlx5: DR, Add check for unsupported fields in match param")
      Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      455832d4
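The commit above attributes the out-of-bounds read to scanning a buffer without casting it. One plausible reading, shown here as a sketch with hypothetical types and names, is that a byte-count loop indexed a struct pointer directly, so each step advanced by the size of the whole struct rather than by one byte. Casting to a byte pointer first keeps the scan inside the object.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parameter block: supported fields first, unsupported last. */
struct demo_match_param {
	uint32_t outer[4];
	uint32_t unsupported[4];
};

/*
 * Return true if any byte in [start_off, sizeof(*p)) is non-zero.
 * The cast matters: indexing "p" itself with a byte offset would move in
 * units of sizeof(struct demo_match_param) and read far past the object.
 */
static bool demo_tail_is_set(const struct demo_match_param *p, size_t start_off)
{
	const uint8_t *bytes = (const uint8_t *)p;
	size_t i;

	for (i = start_off; i < sizeof(*p); i++) {
		if (bytes[i])
			return true;
	}
	return false;
}
```

A caller would pass `offsetof(struct demo_match_param, unsupported)` as `start_off` to check only the trailing fields.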
    • Y
      net/mlx5: DR, Handle eswitch manager and uplink vports separately · 9091b821
      Yevgeny Kliteynik committed
      When querying eswitch manager vport capabilities as "other = 1",
      we encounter a FW compatibility issue with older FW versions.
      To maintain backward compatibility, eswitch manager vport should
      be queried as "other = 0" vport both for ECPF and non-ECPF cases.
      
      This patch fixes these queries and improves the code readability
      by handling eswitch manager and uplink vports separately, avoiding
      the excessive 'if' conditions. Also, uplink caps are stored similar
      to esw manager and not as part of xarray.
      
      Fixes: dd4acb2a ("net/mlx5: DR, Add missing query for vport 0")
      Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      9091b821
    • V
      net/mlx5e: nullify cq->dbg pointer in mlx5_debug_cq_remove() · 76ded29d
      Valentine Fatiev committed
      Prior to this patch, if mlx5_core_destroy_cq() failed, it still proceeded
      to the rest of the destroy operations. mlx5_core_destroy_cq() could then
      be called again by the user and cause an additional call of
      mlx5_debug_cq_remove(). cq->dbg was not nullified in the previous call,
      which caused the crash.
      
      Fix it by nullifying the cq->dbg pointer after removal.
      
      Also proceed to the destroy operations only if FW returns 0
      for the MLX5_CMD_OP_DESTROY_CQ command.
      
      general protection fault, probably for non-canonical address 0x2000300004058: 0000 [#1] SMP PTI
      CPU: 5 PID: 1228 Comm: python Not tainted 5.15.0-rc5_for_upstream_min_debug_2021_10_14_11_06 #1
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
      RIP: 0010:lockref_get+0x1/0x60
      Code: 5d e9 53 ff ff ff 48 8d 7f 70 e8 0a 2e 48 00 c7 85 d0 00 00 00 02
      00 00 00 c6 45 70 00 fb 5d c3 c3 cc cc cc cc cc cc cc cc 53 <48> 8b 17
      48 89 fb 85 d2 75 3d 48 89 d0 bf 64 00 00 00 48 89 c1 48
      RSP: 0018:ffff888137dd7a38 EFLAGS: 00010206
      RAX: 0000000000000000 RBX: ffff888107d5f458 RCX: 00000000fffffffe
      RDX: 000000000002c2b0 RSI: ffffffff8155e2e0 RDI: 0002000300004058
      RBP: ffff888137dd7a88 R08: 0002000300004058 R09: ffff8881144a9f88
      R10: 0000000000000000 R11: 0000000000000000 R12: ffff8881141d4000
      R13: ffff888137dd7c68 R14: ffff888137dd7d58 R15: ffff888137dd7cc0
      FS:  00007f4644f2a4c0(0000) GS:ffff8887a2d40000(0000)
      knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 000055b4500f4380 CR3: 0000000114f7a003 CR4: 0000000000170ea0
      Call Trace:
        simple_recursive_removal+0x33/0x2e0
        ? debugfs_remove+0x60/0x60
        debugfs_remove+0x40/0x60
        mlx5_debug_cq_remove+0x32/0x70 [mlx5_core]
        mlx5_core_destroy_cq+0x41/0x1d0 [mlx5_core]
        devx_obj_cleanup+0x151/0x330 [mlx5_ib]
        ? __pollwait+0xd0/0xd0
        ? xas_load+0x5/0x70
        ? xa_load+0x62/0xa0
        destroy_hw_idr_uobject+0x20/0x80 [ib_uverbs]
        uverbs_destroy_uobject+0x3b/0x360 [ib_uverbs]
        uobj_destroy+0x54/0xa0 [ib_uverbs]
        ib_uverbs_cmd_verbs+0xaf2/0x1160 [ib_uverbs]
        ? uverbs_finalize_object+0xd0/0xd0 [ib_uverbs]
        ib_uverbs_ioctl+0xc4/0x1b0 [ib_uverbs]
        __x64_sys_ioctl+0x3e4/0x8e0
      
      Fixes: 94b960b9 ("net/mlx5e: Fix memory leak in mlx5_core_destroy_cq() error path")
      Signed-off-by: Valentine Fatiev <valentinef@nvidia.com>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      76ded29d
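The two ideas in the commit above (clear the pointer after removal, and stop early when the firmware command fails) can be sketched in userspace. Everything here is a hypothetical stand-in: `demo_cq`, the `demo_*` functions and the `fw_status` parameter are illustrative and are not the mlx5 API.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a completion queue with a debug handle. */
struct demo_cq {
	void *dbg;	/* debug resource, NULL when absent */
};

static int demo_debug_cq_add(struct demo_cq *cq)
{
	cq->dbg = malloc(16);
	return cq->dbg ? 0 : -1;
}

/* Safe to call repeatedly: the pointer is cleared after the first removal. */
static void demo_debug_cq_remove(struct demo_cq *cq)
{
	if (!cq->dbg)
		return;
	free(cq->dbg);
	cq->dbg = NULL;
}

/*
 * Tear down only when the (simulated) firmware command succeeded, so a
 * failed destroy leaves the object intact for a later retry.
 */
static int demo_destroy_cq(struct demo_cq *cq, int fw_status)
{
	if (fw_status)
		return fw_status;
	demo_debug_cq_remove(cq);
	return 0;
}
```

With this shape, a failed destroy followed by a retry, or a double removal, no longer touches a stale pointer.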
    • P
      net/mlx5: E-Switch, Fix resetting of encap mode when entering switchdev · d7751d64
      Paul Blakey committed
      E-Switch encap mode is relevant only when in switchdev mode.
      The RDMA driver can query the encap configuration via
      mlx5_eswitch_get_encap_mode(). Make sure it returns the currently
      used mode and not the set one.
      
      This reverts the cited commit which reset the encap mode
      on entering switchdev and fixes the original issue properly.
      
      Fixes: 9a64144d ("net/mlx5: E-Switch, Fix default encap mode")
      Signed-off-by: Paul Blakey <paulb@nvidia.com>
      Reviewed-by: Mark Bloch <mbloch@nvidia.com>
      Reviewed-by: Maor Dickman <maord@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      d7751d64
    • V
      net/mlx5e: Wait for concurrent flow deletion during neigh/fib events · 362980ea
      Vlad Buslov committed
      Function mlx5e_take_tmp_flow() skips flows with zero reference count. This
      can cause syndrome 0x179e84 when called from neigh or route update code
      and the skipped flow is not removed from the hardware by the time
      underlying encap/decap resource is deleted. Add new completion
      'del_hw_done' that is completed when flow is unoffloaded. This is safe to
      do because flow with reference count zero needs to be detached from
      encap/decap entry before its memory is deallocated, which requires taking
      the encap_tbl_lock mutex that is held by the event handlers code.
      
      Fixes: 8914add2 ("net/mlx5e: Handle FIB events to update tunnel endpoint device")
      Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      362980ea
    • T
      net/mlx5e: kTLS, Fix crash in RX resync flow · cc4a9cc0
      Tariq Toukan committed
      For the TLS RX resync flow, we maintain a list of TLS contexts
      that require some attention, to communicate their resync information
      to the HW.
      Here we fix list corruptions, by protecting the entries against
      movements coming from resync_handle_seq_match(), until their resync
      handling in napi is fully completed.
      
      Fixes: e9ce991b ("net/mlx5e: kTLS, Add resiliency to RX resync failures")
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      cc4a9cc0
  2. 16 Nov, 2021 17 commits
    • D
      Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · 848e5d66
      David S. Miller committed
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2021-11-15
      
      This series contains updates to iavf driver only.
      
      Mateusz adds a wait for reset completion when changing queue count which
      could otherwise cause issues with VF reset.
      
      Nick adds a null check for vf_res in iavf_fix_features(), corrects
      ordering of function calls to resolve dependency issues, and prevents
      possible freeing of a lock which isn't being held.
      
      Piotr fixes logic that did not allow setting all multicast mode without
      promiscuous mode.
      
      Jake prevents possible accidental freeing of filter structure.
      
      Mitch adds null checks for key and indir parameters in iavf_get_rxfh().
      
      Surabhi adds an additional check that would, previously, cause the driver
      to print a false error due to values obtained while the VF is in reset.
      
      Grzegorz prevents a queue request of 0 which would cause queue count to
      reset to default values.
      
      Akeem restores VLAN filters when bringing the interface back up.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      848e5d66
    • C
      udp: Validate checksum in udp_read_sock() · 099f896f
      Cong Wang committed
      It turns out the skb's in sock receive queue could have bad checksums, as
      both ->poll() and ->recvmsg() validate checksums. We have to do the same
      for ->read_sock() path too before they are redirected in sockmap.
      
      Fixes: d7f57118 ("udp: Implement ->read_sock() for sockmap")
      Reported-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Cong Wang <cong.wang@bytedance.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Link: https://lore.kernel.org/bpf/20211115044006.26068-1-xiyou.wangcong@gmail.com
      099f896f
    • D
      bpf: Fix toctou on read-only map's constant scalar tracking · 353050be
      Daniel Borkmann committed
      Commit a23740ec ("bpf: Track contents of read-only maps as scalars") is
      checking whether maps are read-only both from BPF program side and user space
      side, and then, given their content is constant, reading out their data via
      map->ops->map_direct_value_addr() which is then subsequently used as known
      scalar value for the register, that is, it is marked as __mark_reg_known()
      with the read value at verification time. Before a23740ec, the register
      content was marked as an unknown scalar so the verifier could not make any
      assumptions about the map content.
      
      The current implementation however is prone to a TOCTOU race, meaning, the
      value read as known scalar for the register is not guaranteed to be exactly
      the same at a later point when the program is executed, and as such, the
      prior made assumptions of the verifier with regards to the program will be
      invalid which can cause issues such as OOB access, etc.
      
      While the BPF_F_RDONLY_PROG map flag is always fixed and required to be
      specified at map creation time, the map->frozen property is initially set to
      false for the map given the map value needs to be populated, e.g. for global
      data sections. Once complete, the loader "freezes" the map from user space
      such that no subsequent updates/deletes are possible anymore. For the rest
      of the lifetime of the map, this freeze one-time trigger cannot be undone
      anymore after a successful BPF_MAP_FREEZE cmd return. Meaning, any new BPF_*
      cmd calls which would update/delete map entries will be rejected with -EPERM
      since map_get_sys_perms() removes the FMODE_CAN_WRITE permission. This also
      means that pending update/delete map entries must still complete before this
      guarantee is given. This corner case is not an issue for loaders since they
      create and prepare such program private map in successive steps.
      
      However, a malicious user is able to trigger this TOCTOU race in two different
      ways: i) via userfaultfd, and ii) via batched updates. For i) userfaultfd is
      used to widen the race window, so that map_update_elem() can modify
      the contents of the map after map_freeze() and bpf_prog_load() were executed.
      This works, because userfaultfd halts the parallel thread which triggered a
      map_update_elem() at the time where we copy key/value from the user buffer and
      this already passed the FMODE_CAN_WRITE capability test given at that time the
      map was not "frozen". Then, the main thread performs the map_freeze() and
      bpf_prog_load(), and once that had completed successfully, the other thread
      is woken up to complete the pending map_update_elem() which then changes the
      map content. For ii) the idea of the batched update is similar: when a
      large number of updates has to be processed, the race window between the
      two grows accordingly. It is therefore possible in practice to modify the
      contents of the map after executing map_freeze() and bpf_prog_load().
      
      One way to fix both i) and ii) at the same time is to expand the use of the
      map's map->writecnt. The latter was introduced in fc970227 ("bpf: Add mmap()
      support for BPF_MAP_TYPE_ARRAY") and further refined in 1f6cb19b ("bpf:
      Prevent re-mmap()'ing BPF map as writable for initially r/o mapping") with
      the rationale to make a writable mmap()'ing of a map mutually exclusive with
      read-only freezing. The counter indicates writable mmap() mappings and then
      prevents/fails the freeze operation. Its semantics can be expanded beyond
      just mmap() by generally indicating ongoing write phases. This would essentially
      span any parallel regular and batched flavor of update/delete operation and
      then also have map_freeze() fail with -EBUSY. For the check_mem_access() in
      the verifier we expand upon the bpf_map_is_rdonly() check ensuring that all
      last pending writes have completed via bpf_map_write_active() test. Once the
      map->frozen is set and bpf_map_write_active() indicates a map->writecnt of 0
      only then we are really guaranteed to use the map's data as known constants.
      For map->frozen being set and pending writes in process of still being completed
      we fall back to marking that register as unknown scalar so we don't end up
      making assumptions about it. With this, both TOCTOU reproducers from i) and
      ii) are fixed.
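
      The write-phase accounting described above can be sketched as a small
      user-space model. This is only an illustration under stated assumptions:
      struct model_map and every model_* function are hypothetical names, not
      kernel APIs, and the model is single-threaded apart from the counter.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space model; this is NOT the kernel's struct bpf_map. */
struct model_map {
	bool rdonly_prog;      /* stands in for BPF_F_RDONLY_PROG */
	bool frozen;           /* stands in for map->frozen */
	atomic_llong writecnt; /* stands in for map->writecnt */
};

/* True while any update/delete (or writable mapping) is still in flight. */
static bool model_write_active(struct model_map *map)
{
	return atomic_load(&map->writecnt) != 0;
}

/* Mark the end of a write phase; begin/end calls must be balanced. */
static void model_write_end(struct model_map *map)
{
	long long prev = atomic_fetch_sub(&map->writecnt, 1);

	assert(prev > 0);
	(void)prev;
}

/* Start a write phase, then check permissions, as an update would. */
static int model_update_begin(struct model_map *map)
{
	atomic_fetch_add(&map->writecnt, 1);
	if (map->frozen) {
		model_write_end(map);
		return -EPERM;
	}
	return 0; /* caller copies its data, then calls model_write_end() */
}

/* Freezing fails with -EBUSY while a write phase is pending. */
static int model_freeze(struct model_map *map)
{
	if (model_write_active(map))
		return -EBUSY;
	map->frozen = true;
	return 0;
}

/* Contents may be treated as known scalars only if nothing can change them. */
static bool model_contents_are_constant(struct model_map *map)
{
	return map->rdonly_prog && map->frozen && !model_write_active(map);
}
```

      In this model a writer that is still pending makes model_freeze() fail,
      and even a frozen map is not treated as constant until the counter has
      dropped back to zero.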
      
      Note that the map->writecnt has been converted into an atomic64 in the fix in
      order to avoid a double freeze_mutex mutex_{un,}lock() pair when updating
      map->writecnt in the various map update/delete BPF_* cmd flavors. Spanning
      the freeze_mutex over entire map update/delete operations in syscall side
      would not be possible due to then causing everything to be serialized.
      Similarly, something like synchronize_rcu() after setting map->frozen to wait
      for update/deletes to complete is not possible either since it would also
      have to span the user copy which can sleep. On the libbpf side, this won't
      break d66562fb ("libbpf: Add BPF object skeleton support") as the
      anonymous mmap()-ed "map initialization image" is remapped as a BPF map-backed
      mmap()-ed memory where for .rodata it's non-writable.
      
      Fixes: a23740ec ("bpf: Track contents of read-only maps as scalars")
      Reported-by: w1tcher.bupt@gmail.com
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      353050be
    • A
      samples/bpf: Fix build error due to -isystem removal · 6060a6cb
      Alexander Lobakin committed
      Since recent Kbuild updates we no longer include files from compiler
      directories. However, samples/bpf/hbm_kern.h hasn't been tuned for
      this (LLVM 13):
      
        CLANG-bpf  samples/bpf/hbm_out_kern.o
      In file included from samples/bpf/hbm_out_kern.c:55:
      samples/bpf/hbm_kern.h:12:10: fatal error: 'stddef.h' file not found
               ^~~~~~~~~~
      1 error generated.
        CLANG-bpf  samples/bpf/hbm_edt_kern.o
      In file included from samples/bpf/hbm_edt_kern.c:53:
      samples/bpf/hbm_kern.h:12:10: fatal error: 'stddef.h' file not found
               ^~~~~~~~~~
      1 error generated.
      
      It is enough to just drop both stdbool.h and stddef.h from includes
      to fix those.
      
      Fixes: 04e85bbf ("isystem: delete global -isystem compile option")
      Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
      Link: https://lore.kernel.org/bpf/20211115130741.3584-1-alexandr.lobakin@intel.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      6060a6cb
    • A
      Merge branch 'Forbid bpf_ktime_get_coarse_ns and bpf_timer_* in tracing progs' · 9e4dc892
      Alexei Starovoitov committed
      Dmitrii Banshchikov says:
      
      ====================
      
      Various locking issues are possible with bpf_ktime_get_coarse_ns() and
      bpf_timer_* set of helpers.
      
      syzbot found a locking issue with bpf_ktime_get_coarse_ns() helper executed in
      BPF_PROG_TYPE_PERF_EVENT prog type - [1]. The issue is possible because the
      helper uses the non-fast version of the time accessor, which isn't safe in every context.
      The helper was added because it provided performance benefits in comparison to
      bpf_ktime_get_ns() helper.
      
      A similar locking issue is possible with bpf_timer_* set of helpers when used
      in tracing progs.
      
      The solution is to restrict use of the helpers in tracing progs.
      
      In the [1] discussion it was stated that bpf_spin_lock related helpers shall
      also be excluded for tracing progs. The verifier has a compatibility check
      between a map and a program. If a tracing program tries to use a map whose
      value contains struct bpf_spin_lock, the verifier rejects it, which is why
      bpf_spin_lock is already restricted.
      
      Patch 1 restricts helpers
      Patch 2 adds tests
      
      v1 -> v2:
       * Limit the helpers via func proto getters instead of allowed callback
       * Add note about helpers' restrictions to linux/bpf.h
       * Add Fixes tag
       * Remove extra \0 from btf_str_sec
       * Beside asm tests add prog tests
       * Trim CC
      
      1. https://lore.kernel.org/all/00000000000013aebd05cff8e064@google.com/
      ====================
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      9e4dc892
    • D
      selftests/bpf: Add tests for restricted helpers · e60e6962
      Dmitrii Banshchikov committed
      This patch adds tests that bpf_ktime_get_coarse_ns(), bpf_timer_* and
      bpf_spin_lock()/bpf_spin_unlock() helpers are forbidden in tracing progs
      as their use there may result in various locking issues.
      Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20211113142227.566439-3-me@ubique.spb.ru
      e60e6962
    • D
      bpf: Forbid bpf_ktime_get_coarse_ns and bpf_timer_* in tracing progs · 5e0bc308
      Dmitrii Banshchikov committed
      Use of bpf_ktime_get_coarse_ns() and bpf_timer_* helpers in tracing
      progs may result in locking issues.
      
      bpf_ktime_get_coarse_ns() uses ktime_get_coarse_ns() time accessor that
      isn't safe for any context:
      ======================================================
      WARNING: possible circular locking dependency detected
      5.15.0-syzkaller #0 Not tainted
      ------------------------------------------------------
      syz-executor.4/14877 is trying to acquire lock:
      ffffffff8cb30008 (tk_core.seq.seqcount){----}-{0:0}, at: ktime_get_coarse_ts64+0x25/0x110 kernel/time/timekeeping.c:2255
      
      but task is already holding lock:
      ffffffff90dbf200 (&obj_hash[i].lock){-.-.}-{2:2}, at: debug_object_deactivate+0x61/0x400 lib/debugobjects.c:735
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #1 (&obj_hash[i].lock){-.-.}-{2:2}:
             lock_acquire+0x19f/0x4d0 kernel/locking/lockdep.c:5625
             __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
             _raw_spin_lock_irqsave+0xd1/0x120 kernel/locking/spinlock.c:162
             __debug_object_init+0xd9/0x1860 lib/debugobjects.c:569
             debug_hrtimer_init kernel/time/hrtimer.c:414 [inline]
             debug_init kernel/time/hrtimer.c:468 [inline]
             hrtimer_init+0x20/0x40 kernel/time/hrtimer.c:1592
             ntp_init_cmos_sync kernel/time/ntp.c:676 [inline]
             ntp_init+0xa1/0xad kernel/time/ntp.c:1095
             timekeeping_init+0x512/0x6bf kernel/time/timekeeping.c:1639
             start_kernel+0x267/0x56e init/main.c:1030
             secondary_startup_64_no_verify+0xb1/0xbb
      
      -> #0 (tk_core.seq.seqcount){----}-{0:0}:
             check_prev_add kernel/locking/lockdep.c:3051 [inline]
             check_prevs_add kernel/locking/lockdep.c:3174 [inline]
             validate_chain+0x1dfb/0x8240 kernel/locking/lockdep.c:3789
             __lock_acquire+0x1382/0x2b00 kernel/locking/lockdep.c:5015
             lock_acquire+0x19f/0x4d0 kernel/locking/lockdep.c:5625
             seqcount_lockdep_reader_access+0xfe/0x230 include/linux/seqlock.h:103
             ktime_get_coarse_ts64+0x25/0x110 kernel/time/timekeeping.c:2255
             ktime_get_coarse include/linux/timekeeping.h:120 [inline]
             ktime_get_coarse_ns include/linux/timekeeping.h:126 [inline]
             ____bpf_ktime_get_coarse_ns kernel/bpf/helpers.c:173 [inline]
             bpf_ktime_get_coarse_ns+0x7e/0x130 kernel/bpf/helpers.c:171
             bpf_prog_a99735ebafdda2f1+0x10/0xb50
             bpf_dispatcher_nop_func include/linux/bpf.h:721 [inline]
             __bpf_prog_run include/linux/filter.h:626 [inline]
             bpf_prog_run include/linux/filter.h:633 [inline]
             BPF_PROG_RUN_ARRAY include/linux/bpf.h:1294 [inline]
             trace_call_bpf+0x2cf/0x5d0 kernel/trace/bpf_trace.c:127
             perf_trace_run_bpf_submit+0x7b/0x1d0 kernel/events/core.c:9708
             perf_trace_lock+0x37c/0x440 include/trace/events/lock.h:39
             trace_lock_release+0x128/0x150 include/trace/events/lock.h:58
             lock_release+0x82/0x810 kernel/locking/lockdep.c:5636
             __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:149 [inline]
             _raw_spin_unlock_irqrestore+0x75/0x130 kernel/locking/spinlock.c:194
             debug_hrtimer_deactivate kernel/time/hrtimer.c:425 [inline]
             debug_deactivate kernel/time/hrtimer.c:481 [inline]
             __run_hrtimer kernel/time/hrtimer.c:1653 [inline]
             __hrtimer_run_queues+0x2f9/0xa60 kernel/time/hrtimer.c:1749
             hrtimer_interrupt+0x3b3/0x1040 kernel/time/hrtimer.c:1811
             local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1086 [inline]
             __sysvec_apic_timer_interrupt+0xf9/0x270 arch/x86/kernel/apic/apic.c:1103
             sysvec_apic_timer_interrupt+0x8c/0xb0 arch/x86/kernel/apic/apic.c:1097
             asm_sysvec_apic_timer_interrupt+0x12/0x20
             __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:152 [inline]
             _raw_spin_unlock_irqrestore+0xd4/0x130 kernel/locking/spinlock.c:194
             try_to_wake_up+0x702/0xd20 kernel/sched/core.c:4118
             wake_up_process kernel/sched/core.c:4200 [inline]
             wake_up_q+0x9a/0xf0 kernel/sched/core.c:953
             futex_wake+0x50f/0x5b0 kernel/futex/waitwake.c:184
             do_futex+0x367/0x560 kernel/futex/syscalls.c:127
             __do_sys_futex kernel/futex/syscalls.c:199 [inline]
             __se_sys_futex+0x401/0x4b0 kernel/futex/syscalls.c:180
             do_syscall_x64 arch/x86/entry/common.c:50 [inline]
             do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
             entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      There is a possible deadlock with bpf_timer_* set of helpers:
      hrtimer_start()
        lock_base();
        trace_hrtimer...()
          perf_event()
            bpf_run()
              bpf_timer_start()
                hrtimer_start()
                  lock_base()         <- DEADLOCK
      
      Forbid use of bpf_ktime_get_coarse_ns() and bpf_timer_* helpers in
      BPF_PROG_TYPE_KPROBE, BPF_PROG_TYPE_TRACEPOINT, BPF_PROG_TYPE_PERF_EVENT
      and BPF_PROG_TYPE_RAW_TRACEPOINT prog types.
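
      The series restricts the helpers through the func proto getters. A
      minimal user-space sketch of that idea follows; the enums, the proto
      table and model_get_func_proto() are hypothetical stand-ins, and in this
      model a NULL return simply means the helper is unavailable to that
      program type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's program types and helper IDs. */
enum model_prog_type {
	MODEL_PROG_SOCKET_FILTER,
	MODEL_PROG_KPROBE,
	MODEL_PROG_TRACEPOINT,
	MODEL_PROG_PERF_EVENT,
	MODEL_PROG_RAW_TRACEPOINT,
};

enum model_helper {
	MODEL_KTIME_GET_NS,
	MODEL_KTIME_GET_COARSE_NS,
	MODEL_TIMER_START,
	MODEL_HELPER_COUNT,
};

struct model_func_proto {
	const char *name;
};

static const struct model_func_proto model_protos[MODEL_HELPER_COUNT] = {
	[MODEL_KTIME_GET_NS]        = { "ktime_get_ns" },
	[MODEL_KTIME_GET_COARSE_NS] = { "ktime_get_coarse_ns" },
	[MODEL_TIMER_START]         = { "timer_start" },
};

/* The four program types named in the commit message. */
static bool model_is_tracing_prog(enum model_prog_type type)
{
	switch (type) {
	case MODEL_PROG_KPROBE:
	case MODEL_PROG_TRACEPOINT:
	case MODEL_PROG_PERF_EVENT:
	case MODEL_PROG_RAW_TRACEPOINT:
		return true;
	default:
		return false;
	}
}

/* Return the helper's proto, or NULL if it is forbidden for this type. */
static const struct model_func_proto *
model_get_func_proto(enum model_helper id, enum model_prog_type type)
{
	assert(id < MODEL_HELPER_COUNT);

	if (model_is_tracing_prog(type) &&
	    (id == MODEL_KTIME_GET_COARSE_NS || id == MODEL_TIMER_START))
		return NULL;

	return &model_protos[id];
}
```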
      
      Fixes: d0551261 ("bpf: Add bpf_ktime_get_coarse_ns helper")
      Fixes: b00628b1 ("bpf: Introduce bpf timers.")
      Reported-by: syzbot+43fd005b5a1b4d10781e@syzkaller.appspotmail.com
      Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20211113142227.566439-2-me@ubique.spb.ru
      5e0bc308
    • A
      iavf: Restore VLAN filters after link down · 42930142
      Akeem G Abodunrin committed
      Restore VLAN filters after the link is brought down and back up, since
      all filters are deleted from HW during the netdev link down routine.
      
      Fixes: ed1f5b58 ("i40evf: remove VLAN filters on close")
      Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
      Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      42930142
    • G
      iavf: Fix for setting queues to 0 · 9a6e9e48
      Grzegorz Szczurek committed
      Setting combine to 0 is now rejected with the
      appropriate error code.
      This is implemented by adding a condition that checks
      whether the value of combine equals zero.
      Without this patch, when the user requested it, no error was
      returned and combine was set to the default value for the VF.
      
      Fixes: 5520deb1 ("iavf: Enable support for up to 16 queues")
      Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com>
      Tested-by: Tony Brelinski <tony.brelinski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      9a6e9e48
    • S
      iavf: Fix for the false positive ASQ/ARQ errors while issuing VF reset · 321421b5
      Surabhi Boob committed
      While a VF reset is issued from the guest OS, the VF driver logs
      critical / overflow error detections. These are not actual errors:
      the VF_MBX_ARQLEN register is set to all FFs for a short period of
      time, and the VF sees those bits set if it happens to read the
      register during that window.
      This patch adds a check that ignores this condition while the VF
      is in reset.
      
      Fixes: 19b73d8e ("i40evf: Add additional check for reset")
      Signed-off-by: Surabhi Boob <surabhi.boob@intel.com>
      Tested-by: Tony Brelinski <tony.brelinski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      321421b5
    • M
      iavf: validate pointers · 131b0edc
      Mitch Williams committed
      In some cases, the ethtool get_rxfh handler may be called with a null
      key or indir parameter. So check these pointers, or you will have a very
      bad day.
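
      A minimal sketch of the pointer checks, written as a user-space model.
      struct model_rss, the sizes and model_get_rxfh() are hypothetical and
      only mirror the shape of a get_rxfh-style handler in which either
      output pointer may be NULL.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MODEL_KEY_SIZE 8 /* hypothetical sizes, chosen for the example */
#define MODEL_LUT_SIZE 4

struct model_rss {
	uint8_t key[MODEL_KEY_SIZE];
	uint8_t lut[MODEL_LUT_SIZE];
};

/* Copy out only the parts the caller asked for; NULL means "not wanted". */
static int model_get_rxfh(const struct model_rss *rss, uint32_t *indir,
			  uint8_t *key)
{
	size_t i;

	assert(rss != NULL);

	if (key)
		memcpy(key, rss->key, MODEL_KEY_SIZE);

	if (indir)
		for (i = 0; i < MODEL_LUT_SIZE; i++)
			indir[i] = rss->lut[i];

	return 0;
}
```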
      
      Fixes: 43a3d9ba ("i40evf: Allow PF driver to configure RSS")
      Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
      Tested-by: Tony Brelinski <tony.brelinski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      131b0edc
    • J
      iavf: prevent accidental free of filter structure · 4f040080
      Jacob Keller committed
      In iavf_config_clsflower, the filter structure could be accidentally
      released at the end, if iavf_parse_cls_flower or iavf_handle_tclass ever
      return a non-zero but positive value.
      
      In this case, the function continues through to the end, and will call
      kfree() on the filter structure even though it has been added to the
      linked list.
      
      This can actually happen because iavf_parse_cls_flower will return
      a positive IAVF_ERR_CONFIG value instead of the traditional negative
      error codes.
      
      Fix this by ensuring that the kfree() check and error checks are
      similar. Use the more idiomatic "if (err)" to catch all non-zero error
      codes.
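
      The mismatch between the two checks can be illustrated with a small
      model. The names and the positive value MODEL_ERR_CONFIG are
      hypothetical (the latter merely stands in for a positive status code),
      and flags replace the real list insertion and kfree() so the effect is
      observable without touching freed memory.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_ERR_CONFIG 4 /* hypothetical positive status code */

struct model_result {
	bool added_to_list; /* filter was linked into the list */
	bool freed;         /* filter was released on exit */
};

/* Buggy shape: the early exit tests "err < 0", the cleanup tests "err". */
static struct model_result model_config_buggy(int parse_err)
{
	struct model_result r = { false, false };
	int err = parse_err;

	if (err < 0)
		goto out;
	r.added_to_list = true;
out:
	if (err)
		r.freed = true;
	return r;
}

/* Fixed shape: both places use "if (err)", so they can never disagree. */
static struct model_result model_config_fixed(int parse_err)
{
	struct model_result r = { false, false };
	int err = parse_err;

	if (err)
		goto out;
	r.added_to_list = true;
out:
	if (err)
		r.freed = true;
	/* A filter must never be both on the list and freed. */
	assert(!(r.added_to_list && r.freed));
	return r;
}
```

      With a positive code the buggy variant reports a filter that is both
      listed and freed, while the fixed variant frees it without ever adding
      it to the list.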
      
      Fixes: 0075fa0f ("i40evf: Add support to apply cloud filters")
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Tony Brelinski <tony.brelinski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      4f040080
    • P
      iavf: Fix failure to exit out from last all-multicast mode · 8905072a
      Piotr Marczak committed
      The driver could only leave allmulti mode when both allmulti and promisc
      modes were turned on at the same time. If promisc was off, there was no
      way to turn off allmulti mode.
      This patch corrects that behavior: switching allmulti no longer depends
      on the promisc mode state.
      
      Fixes: f42a5c74 ("i40e: Add allmulti support for the VF")
      Signed-off-by: Piotr Marczak <piotr.marczak@intel.com>
      Tested-by: Tony Brelinski <tony.brelinski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      8905072a
    • N
      iavf: don't clear a lock we don't hold · 2135a8d5
      Nicholas Nunley committed
      In iavf_configure_clsflower() the function will bail out if it is unable
      to obtain the crit_section lock in a reasonable time. However, it still
      clears the lock when exiting, even though it never acquired it, so fix this.
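
      A user-space sketch of the intended pattern, assuming a simple bit-style
      lock with a bounded number of acquisition attempts. All model_* names
      are hypothetical, and -EBUSY is an assumed return value chosen for the
      example, not taken from the driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the adapter's crit_section lock bit. */
static atomic_flag model_crit_section = ATOMIC_FLAG_INIT;

/* Try to take the lock, giving up after max_tries attempts. */
static bool model_lock_timeout(int max_tries)
{
	while (max_tries-- > 0) {
		if (!atomic_flag_test_and_set(&model_crit_section))
			return true;
	}
	return false;
}

/* Release the lock; only the current holder may call this. */
static void model_unlock(void)
{
	bool was_held = atomic_flag_test_and_set(&model_crit_section);

	assert(was_held);
	(void)was_held;
	atomic_flag_clear(&model_crit_section);
}

static int model_configure(int max_tries)
{
	/* Bail out WITHOUT unlocking: the lock belongs to someone else. */
	if (!model_lock_timeout(max_tries))
		return -EBUSY;

	/* ... work done under the lock would go here ... */

	model_unlock();
	return 0;
}
```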
      
      Fixes: 640a8af5 ("i40evf: Reorder configure_clsflower to avoid deadlock on error")
      Signed-off-by: Nicholas Nunley <nicholas.d.nunley@intel.com>
      Tested-by: Tony Brelinski <tony.brelinski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      2135a8d5
    • N
      iavf: free q_vectors before queues in iavf_disable_vf · 89f22f12
      Nicholas Nunley committed
      iavf_free_queues() clears adapter->num_active_queues, which
      iavf_free_q_vectors() relies on, so swap the order of these two function
      calls in iavf_disable_vf(). This resolves a panic encountered when the
      interface is disabled and then later brought up again after PF
      communication is restored.
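
      The ordering dependency can be modelled in user space as follows.
      struct model_adapter and the model_* functions are hypothetical; the
      only property carried over from the commit message is that freeing the
      queues resets the count that the vector teardown relies on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model of the adapter state involved. */
struct model_adapter {
	int num_active_queues;
	int *queues;
	int *q_vectors;
};

static int model_alloc(struct model_adapter *a, int n)
{
	a->queues = calloc((size_t)n, sizeof(*a->queues));
	a->q_vectors = calloc((size_t)n, sizeof(*a->q_vectors));
	if (!a->queues || !a->q_vectors) {
		free(a->queues);
		free(a->q_vectors);
		a->queues = a->q_vectors = NULL;
		a->num_active_queues = 0;
		return -1;
	}
	a->num_active_queues = n;
	return 0;
}

/* Frees the queues and clears the active queue count. */
static void model_free_queues(struct model_adapter *a)
{
	free(a->queues);
	a->queues = NULL;
	a->num_active_queues = 0;
}

/* Tears down one vector per active queue; returns how many were handled. */
static int model_free_q_vectors(struct model_adapter *a)
{
	int i, torn_down = 0;

	assert(a->q_vectors || a->num_active_queues == 0);
	for (i = 0; i < a->num_active_queues; i++) {
		a->q_vectors[i] = 0; /* per-vector teardown placeholder */
		torn_down++;
	}
	free(a->q_vectors);
	a->q_vectors = NULL;
	return torn_down;
}

/* vectors_first selects the fixed order; false reproduces the old order. */
static int model_disable(struct model_adapter *a, bool vectors_first)
{
	int torn_down;

	if (vectors_first) {
		torn_down = model_free_q_vectors(a);
		model_free_queues(a);
	} else {
		model_free_queues(a);
		torn_down = model_free_q_vectors(a);
	}
	return torn_down;
}
```

      In the old order the count is already zero when the vectors are
      released, so no per-vector teardown runs in this model.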
      
      Fixes: 65c7006f ("i40evf: assign num_active_queues inside i40evf_alloc_queues")
      Signed-off-by: Nicholas Nunley <nicholas.d.nunley@intel.com>
      Tested-by: Tony Brelinski <tony.brelinski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      89f22f12
    • N
      iavf: check for null in iavf_fix_features · 8a4a126f
      Nicholas Nunley committed
      If the driver has lost contact with the PF then it enters a disabled state
      and frees adapter->vf_res. However, ndo_fix_features can still be called on
      the interface, so we need to check for this condition first. Since we have
      no information on the features at this time simply leave them unmodified
      and return.
      
      Fixes: c4445aed ("i40evf: Fix VLAN features")
      Signed-off-by: Nicholas Nunley <nicholas.d.nunley@intel.com>
      Tested-by: Tony Brelinski <tony.brelinski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      8a4a126f
    • M
      iavf: Fix return of set the new channel count · 4e5e6b5d
      Mateusz Palczewski committed
      Return the correct code when setting a new channel count.
      This is implemented by checking whether the reset completes within an
      appropriate time.
      This gives the PF extra time to reset the VF when the user sets a
      new channel count for all VFs.
      Without this patch, a misleading return code could be reported to the
      user and the VF reset might not be performed correctly by the PF.
      
      Fixes: 5520deb1 ("iavf: Enable support for up to 16 queues")
      Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com>
      Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      4e5e6b5d