1. 25 Jun, 2021 1 commit
    • intel: Remove rcu_read_lock() around XDP program invocation · 49589b23
      Toke Høiland-Jørgensen committed
      The Intel drivers all have rcu_read_lock()/rcu_read_unlock() pairs around
      XDP program invocations. However, the actual lifetime of the objects
      referred to by the XDP program invocation is longer, all the way through to
      the call to xdp_do_flush(), making the scope of the rcu_read_lock() too
      small. This turns out to be harmless because it all happens in a single
      NAPI poll cycle (and thus under local_bh_disable()), but it makes the
      rcu_read_lock() misleading.
      
      Rather than extend the scope of the rcu_read_lock(), just get rid of it
      entirely. With the addition of RCU annotations to the XDP_REDIRECT map
      types that take bh execution into account, lockdep even understands this to
      be safe, so there's really no reason to keep it around.
      Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Tested-by: Jesper Dangaard Brouer <brouer@redhat.com> # i40e
      Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Cc: Tony Nguyen <anthony.l.nguyen@intel.com>
      Cc: intel-wired-lan@lists.osuosl.org
      Link: https://lore.kernel.org/bpf/20210624160609.292325-12-toke@redhat.com
  2. 03 Jun, 2021 1 commit
  3. 08 May, 2021 1 commit
    • i40e: fix broken XDP support · ae4393df
      Magnus Karlsson committed
      Commit 12738ac4 ("i40e: Fix sparse errors in i40e_txrx.c") broke
      XDP support in the i40e driver. That commit was fixing a sparse error
      in the code by introducing a new variable xdp_res instead of
      overloading this into the skb pointer. The problem is that the code
      later uses the skb pointer in if statements and these were not
      extended to also test for the new xdp_res variable. Fix this by adding
      the correct tests for xdp_res in these places.
      
      The skb pointer was used to store the result of the XDP program by
      overloading the results in the error pointer
      ERR_PTR(-result). Therefore, the allocation failure test that used to
      only test for !skb now needs to be extended to also consider !xdp_res.
      
      i40e_cleanup_headers() had a check based on the skb value being
      an error pointer, i.e. a result from the XDP program != XDP_PASS, and
      if so started to process a new packet immediately, instead of populating
      skb fields and sending the skb to the stack. This check is not needed
      anymore, since we have added an explicit test for xdp_res being set
      and, if so, just continue to pick the next packet from the NIC.
      
      Fixes: 12738ac4 ("i40e: Fix sparse errors in i40e_txrx.c")
      Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
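
The control flow described in the i40e XDP fix above can be sketched in plain userspace C. This is only an illustration under stated assumptions: `struct skb_sketch`, the `XDP_RES_*` values and `rx_classify()` are hypothetical stand-ins and are not taken from the driver. The point it shows is that once the XDP verdict lives in its own `xdp_res` variable, a NULL skb alone no longer means an allocation failure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the XDP verdict; not the driver's real values. */
enum xdp_res_sketch {
    XDP_RES_PASS = 0,     /* XDP did not take the packet */
    XDP_RES_CONSUMED = 1  /* XDP dropped, transmitted or redirected it */
};

/* Hypothetical stand-in for struct sk_buff. */
struct skb_sketch {
    int len;
};

enum rx_step_sketch {
    RX_STEP_NEXT_PACKET,  /* XDP handled it: continue with the next packet */
    RX_STEP_ALLOC_FAILED, /* no verdict and no skb: buffer allocation failed */
    RX_STEP_TO_STACK      /* populate skb fields and hand it to the stack */
};

/*
 * Decide what the Rx loop does next. With the verdict no longer encoded in
 * the skb pointer, skb is NULL whenever XDP consumed the packet, so both
 * variables have to be tested.
 */
static enum rx_step_sketch rx_classify(int xdp_res, const struct skb_sketch *skb)
{
    /* In this model a packet consumed by XDP never has an skb. */
    assert(!(xdp_res != XDP_RES_PASS && skb != NULL));

    if (xdp_res != XDP_RES_PASS)
        return RX_STEP_NEXT_PACKET;
    if (skb == NULL)
        return RX_STEP_ALLOC_FAILED;
    return RX_STEP_TO_STACK;
}
```

Calling `rx_classify(XDP_RES_CONSUMED, NULL)` yields `RX_STEP_NEXT_PACKET`, whereas a test for `skb == NULL` alone would have reported an allocation failure for the same input.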
  4. 09 Apr, 2021 1 commit
  5. 24 Mar, 2021 1 commit
  6. 18 Mar, 2021 1 commit
  7. 12 Mar, 2021 1 commit
  8. 20 Feb, 2021 1 commit
    • i40e: Fix endianness conversions · b32cddd2
      Norbert Ciosek committed
      Fixes the following sparse warnings:
      i40e_main.c:5953:32: warning: cast from restricted __le16
      i40e_main.c:8008:29: warning: incorrect type in assignment (different base types)
      i40e_main.c:8008:29:    expected unsigned int [assigned] [usertype] ipa
      i40e_main.c:8008:29:    got restricted __le32 [usertype]
      i40e_main.c:8008:29: warning: incorrect type in assignment (different base types)
      i40e_main.c:8008:29:    expected unsigned int [assigned] [usertype] ipa
      i40e_main.c:8008:29:    got restricted __le32 [usertype]
      i40e_txrx.c:1950:59: warning: incorrect type in initializer (different base types)
      i40e_txrx.c:1950:59:    expected unsigned short [usertype] vlan_tag
      i40e_txrx.c:1950:59:    got restricted __le16 [usertype] l2tag1
      i40e_txrx.c:1953:40: warning: cast to restricted __le16
      i40e_xsk.c:448:38: warning: invalid assignment: |=
      i40e_xsk.c:448:38:    left side has type restricted __le64
      i40e_xsk.c:448:38:    right side has type int
      
      Fixes: 2f4b411a ("i40e: Enable cloud filters via tc-flower")
      Fixes: 2a508c64 ("i40e: fix VLAN.TCI == 0 RX HW offload")
      Fixes: 3106c580 ("i40e: Use batched xsk Tx interfaces to increase performance")
      Fixes: 8f88b303 ("i40e: Add infrastructure for queue channel support")
      Signed-off-by: Norbert Ciosek <norbertx.ciosek@intel.com>
      Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
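
The sparse warnings listed above come from mixing little-endian descriptor fields with CPU-order integers. The commit message does not show the fix itself; in kernel code the usual remedy is an explicit conversion such as le16_to_cpu(). A minimal userspace sketch of what such a conversion does, with hypothetical helper names (`le16_to_cpu_sketch`, `cpu_to_le16_sketch`) that are not kernel APIs:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Decode a 16-bit little-endian field from its two wire bytes. Working on
 * bytes makes the result independent of the host byte order.
 */
static uint16_t le16_to_cpu_sketch(const uint8_t bytes[2])
{
    assert(bytes);
    return (uint16_t)(bytes[0] | ((uint16_t)bytes[1] << 8));
}

/* Encode a CPU-order value into two little-endian wire bytes. */
static void cpu_to_le16_sketch(uint16_t value, uint8_t bytes[2])
{
    assert(bytes);
    bytes[0] = (uint8_t)(value & 0xff); /* low byte first */
    bytes[1] = (uint8_t)(value >> 8);   /* high byte second */
}
```

For example, a descriptor field holding the bytes `0x64 0x00` decodes to VLAN tag 100 on any host, which is the kind of value the `vlan_tag` initializer flagged in i40e_txrx.c expects.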
  9. 19 Feb, 2021 1 commit
  10. 13 Feb, 2021 3 commits
  11. 11 Feb, 2021 2 commits
    • i40e: VLAN field for flow director · a9219b33
      Przemyslaw Patynowski committed
      Allow user to specify VLAN field and add it to flow director. Show VLAN
      field in "ethtool -n ethx" command.
      Handle VLAN type and tag field provided by ethtool command. Refactored
      filter addition, by replacing static arrays with runtime dummy packet
      creation, which allows specifying VLAN field.
      Previously, VLAN field was omitted.
      Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
      Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
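
To illustrate why building the dummy packet at runtime makes an optional VLAN field possible, here is a rough userspace sketch. It is not the driver's code: `build_dummy_l2()`, `put_be16()` and the `*_SKETCH` constants are hypothetical names, and only the Ethernet part of the packet is modelled. The 802.1Q TPID (0x8100) and IPv4 EtherType (0x0800) values are the standard ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN_SKETCH    6       /* bytes in a MAC address */
#define ETH_P_8021Q_SKETCH 0x8100u /* 802.1Q tag protocol identifier */
#define ETH_P_IP_SKETCH    0x0800u /* IPv4 EtherType */

/* Store a 16-bit value in network byte order; returns bytes written. */
static size_t put_be16(uint8_t *buf, uint16_t v)
{
    buf[0] = (uint8_t)(v >> 8);
    buf[1] = (uint8_t)(v & 0xff);
    return 2;
}

/*
 * Build the L2 header of a dummy packet. With has_vlan set, an 802.1Q tag
 * (TPID + TCI) is placed before the EtherType, which a fixed static array
 * could not express. Returns the header length (14 or 18), or 0 if buf is
 * too small.
 */
static size_t build_dummy_l2(uint8_t *buf, size_t cap, bool has_vlan,
                             uint16_t vlan_tci)
{
    size_t need = 2 * ETH_ALEN_SKETCH + 2 + (has_vlan ? 4 : 0);
    size_t off;

    assert(buf);
    if (cap < need)
        return 0;

    memset(buf, 0, 2 * ETH_ALEN_SKETCH); /* zeroed dst and src MAC */
    off = 2 * ETH_ALEN_SKETCH;
    if (has_vlan) {
        off += put_be16(buf + off, ETH_P_8021Q_SKETCH);
        off += put_be16(buf + off, vlan_tci);
    }
    off += put_be16(buf + off, ETH_P_IP_SKETCH);
    return off;
}
```

With `has_vlan` false the header is 14 bytes long; with a TCI of 100 it grows to 18 bytes and the EtherType moves from offset 12 to offset 16.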
    • i40e: Add flow director support for IPv6 · efca91e8
      Przemyslaw Patynowski committed
      Flow director for IPv6 is not supported.
      1) Implementation of support for IPv6 flow director.
      2) Added handlers for addition of TCP6, UDP6, SCTP6, IPv6.
      3) Refactored legacy code to make it more generic.
      4) Added packet templates for TCP6, UDP6, SCTP6, IPv6.
      5) Added handling of IPv6 source and destination address for flow director.
      6) Improved argument passing for source and destination port in TCP6, UDP6
         and SCTP6.
      7) Added handling of ethtool -n for IPv6, TCP6, UDP6, SCTP6.
      8) Used correct bit flag regarding FLEXOFF field of flow director data
         descriptor.
      
      Without this patch, there would be no support for flow director on IPv6,
      TCP6, UDP6, SCTP6.
      Tested based on x710 datasheet by using:
      ethtool -N enp133s0f0 flow-type tcp4 src-port 13 dst-port 37 user-def 0x44142 action 1
      ethtool -N enp133s0f0 flow-type tcp6 src-port 13 dst-port 40 user-def 0x44142 action 2
      ethtool -N enp133s0f0 flow-type udp4 src-port 20 dst-port 40 user-def 0x44142 action 3
      ethtool -N enp133s0f0 flow-type udp6 src-port 25 dst-port 40 user-def 0x44142 action 4
      ethtool -N enp133s0f0 flow-type sctp4 src-port 55 dst-port 65 user-def 0x44142 action 5
      ethtool -N enp133s0f0 flow-type sctp6 src-port 60 dst-port 40 user-def 0x44142 action 6
      ethtool -N enp133s0f0 flow-type ip4 src-ip 1.1.1.1 dst-ip 1.1.1.4 user-def 0x44142 action 7
      ethtool -N enp133s0f0 flow-type ip6 src-ip fe80::3efd:feff:fe6f:bbbb dst-ip fe80::3efd:feff:fe6f:aaaa user-def 0x44142 action 8
      Then send traffic from client which matches the criteria provided to ethtool.
      Observe that packets are redirected to user set queues with ethtool -S <interface>
      Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
      Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
  12. 05 Feb, 2021 1 commit
  13. 09 Jan, 2021 2 commits
  14. 10 Dec, 2020 1 commit
    • i40e: avoid premature Rx buffer reuse · 75aab4e1
      Björn Töpel committed
      The page recycle code incorrectly assumed that a page fragment
      could not be freed inside xdp_do_redirect(). Because of this
      assumption, page fragments that are used by the stack/XDP redirect
      can be reused and overwritten.
      
      To avoid this, store the page count prior to invoking xdp_do_redirect().
      
      Longer explanation:
      
      Intel NICs have a recycle mechanism. The main idea is that a page is
      split into two parts. One part is owned by the driver, one part might
      be owned by someone else, such as the stack.
      
      t0: Page is allocated, and put on the Rx ring
                    +---------------
      used by NIC ->| upper buffer
      (rx_buffer)   +---------------
                    | lower buffer
                    +---------------
        page count  == USHRT_MAX
        rx_buffer->pagecnt_bias == USHRT_MAX
      
      t1: Buffer is received, and passed to the stack (e.g.)
                    +---------------
                    | upper buff (skb)
                    +---------------
      used by NIC ->| lower buffer
      (rx_buffer)   +---------------
        page count  == USHRT_MAX
        rx_buffer->pagecnt_bias == USHRT_MAX - 1
      
      t2: Buffer is received, and redirected
                    +---------------
                    | upper buff (skb)
                    +---------------
      used by NIC ->| lower buffer
      (rx_buffer)   +---------------
      
      Now, prior to calling xdp_do_redirect():
        page count  == USHRT_MAX
        rx_buffer->pagecnt_bias == USHRT_MAX - 2
      
      This means that buffer *cannot* be flipped/reused, because the skb is
      still using it.
      
      The problem arises when xdp_do_redirect() actually frees the
      segment. Then we get:
        page count  == USHRT_MAX - 1
        rx_buffer->pagecnt_bias == USHRT_MAX - 2
      
      From a recycle perspective, the buffer can be flipped and reused,
      which means that the skb data area is passed to the Rx HW ring!
      
      To work around this, the page count is stored prior to calling
      xdp_do_redirect().
      
      Note that this is not optimal, since the NIC could actually reuse the
      "lower buffer" again. However, then we need to track whether
      XDP_REDIRECT consumed the buffer or not.
      
      Fixes: d9314c47 ("i40e: add support for XDP_REDIRECT")
      Reported-and-analyzed-by: Li RongQing <lirongqing@baidu.com>
      Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
      Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
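
The t2 scenario from the commit message above can be replayed with a small userspace model. This is a sketch under assumptions, not the driver's code: `struct rx_buffer_sketch`, `can_reuse_page()` and `redirect_frees_fragment()` are hypothetical names, and the reuse rule (at most one reference beyond the driver's bias) is inferred from the numbers in the message.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical model of an Rx buffer and the page behind it. */
struct rx_buffer_sketch {
    int page_refcount; /* stand-in for the page count of the page */
    int pagecnt_bias;  /* references the driver still holds itself */
};

/*
 * The buffer may be flipped and reused only if at most one reference exists
 * beyond the driver's bias. The count is passed in so the caller can choose
 * between the live value and a snapshot taken before the redirect.
 */
static bool can_reuse_page(const struct rx_buffer_sketch *buf, int page_count)
{
    assert(buf);
    return (page_count - buf->pagecnt_bias) <= 1;
}

/* Model xdp_do_redirect() freeing its fragment before it returns. */
static void redirect_frees_fragment(struct rx_buffer_sketch *buf)
{
    assert(buf);
    buf->page_refcount--;
}
```

Starting from a page count of `USHRT_MAX` and a bias of `USHRT_MAX - 2`, the live count after `redirect_frees_fragment()` makes the buffer look reusable even though the skb still owns the other half. Passing the count saved before the redirect keeps it out of the recycle path.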
  15. 01 Dec, 2020 1 commit
  16. 18 Nov, 2020 1 commit
  17. 26 Sep, 2020 1 commit
    • intel-ethernet: clean up W=1 warnings in kdoc · b50f7bca
      Jesse Brandeburg committed
      This takes care of all of the trivial W=1 fixes in the Intel
      Ethernet drivers, which allows developers and maintainers to
      build more of the networking tree with more complete warning
      checks.
      
      There are three classes of kdoc warnings fixed:
       - cannot understand function prototype: 'x'
       - Excess function parameter 'x' description in 'y'
       - Function parameter or member 'x' not described in 'y'
      
      All of the changes were trivial comment updates on
      function headers.
      
      Inspired by Lee Jones' series of wireless work to do the same.
      Compile tested only, and passes simple test of
      $ git ls-files *.[ch] | egrep drivers/net/ethernet/intel | \
        xargs scripts/kernel-doc -none
      Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Tested-by: Aaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  18. 15 Sep, 2020 3 commits
  19. 01 Sep, 2020 1 commit
  20. 27 Aug, 2020 1 commit
  21. 02 Jul, 2020 3 commits
  22. 02 Jun, 2020 1 commit
  23. 22 May, 2020 2 commits
  24. 15 May, 2020 1 commit
    • i40e: Add XDP frame size to driver · 24104024
      Jesper Dangaard Brouer committed
      This driver uses different memory models depending on PAGE_SIZE at
      compile time. For PAGE_SIZE 4K it uses page splitting, meaning that
      for normal MTU the frame size is 2048 bytes (and headroom 192 bytes).
      For larger MTUs the driver still uses page splitting, by allocating
      order-1 pages (8192 bytes) for RX frames. For PAGE_SIZE larger than
      4K, the driver instead advances its rx_buffer->page_offset with the
      frame size "truesize".
      
      For XDP frame size calculations, this means that in PAGE_SIZE larger
      than 4K mode the frame_sz changes on a per packet basis. For the page
      split 4K PAGE_SIZE mode, xdp.frame_sz is more constant and can be
      updated once outside the main NAPI loop.
      
      The default setting in the driver uses build_skb(), which provides
      the necessary headroom and tailroom for XDP-redirect in RX-frame
      (in both modes).
      
      There is one complication, which is legacy-rx mode (configurable via
      ethtool priv-flags). There is zero headroom in this mode, and headroom
      is a requirement for XDP-redirect to work. The conversion to xdp_frame
      (convert_to_xdp_frame) will detect this insufficient space, and the
      xdp_do_redirect() call will fail. This is deemed acceptable, as it
      allows other XDP actions to still work in legacy-mode. In
      legacy-mode + larger PAGE_SIZE, due to lacking tailroom, we also
      accept that xdp_adjust_tail shrink doesn't work.
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Cc: intel-wired-lan@lists.osuosl.org
      Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Cc: Alexander Duyck <alexander.duyck@gmail.com>
      Link: https://lore.kernel.org/bpf/158945346494.97035.12809400414566061815.stgit@firesoul
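
The two frame size modes described above can be sketched as a single function. This is an illustration only: `xdp_frame_sz_sketch()` is a hypothetical name, the 64-byte alignment and the 320-byte shared-info size are assumed values that differ between kernel configurations, and passing a headroom of zero is used here to model legacy-rx mode.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_ALIGN_SKETCH 64u  /* assumed data alignment granularity */
#define SHINFO_SIZE_SKETCH 320u /* assumed aligned size of the skb shared info */

/* Round n up to the next multiple of a. */
static size_t align_up(size_t n, size_t a)
{
    return (n + a - 1) / a * a;
}

/*
 * Frame size seen by XDP for one received buffer.
 * - 4K pages: the (possibly order-1) page is split in two, so the value is
 *   constant for the ring and can be computed once per NAPI poll.
 * - larger pages: the value follows the packet length, so it has to be
 *   recomputed for every packet.
 */
static size_t xdp_frame_sz_sketch(size_t page_size, unsigned int page_order,
                                  size_t headroom, size_t pkt_len)
{
    assert(page_size > 0);

    if (page_size <= 4096)
        return (page_size << page_order) / 2;

    if (headroom == 0) /* legacy-rx model: no headroom or tailroom reserved */
        return align_up(pkt_len, CACHE_ALIGN_SKETCH);

    return align_up(headroom + pkt_len, CACHE_ALIGN_SKETCH) + SHINFO_SIZE_SKETCH;
}
```

With 4K pages the function returns 2048 for order-0 and 4096 for order-1 pages regardless of packet length. With 64K pages and the assumed constants, a 1500-byte packet behind 192 bytes of headroom gives 1728 + 320 = 2048 bytes.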
  25. 30 Oct, 2019 1 commit
  26. 31 Jul, 2019 1 commit
  27. 23 Jul, 2019 1 commit
  28. 15 Jun, 2019 1 commit
  29. 24 Apr, 2019 1 commit
    • net: pass net_device argument to the eth_get_headlen · c43f1255
      Stanislav Fomichev committed
      Update all users of eth_get_headlen to pass the network device, fetch
      the network namespace from it and pass it down to the flow dissector.
      This commit is a noop until an administrator inserts a BPF flow
      dissector program.
      
      Cc: Maxim Krasnyansky <maxk@qti.qualcomm.com>
      Cc: Saeed Mahameed <saeedm@mellanox.com>
      Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Cc: intel-wired-lan@lists.osuosl.org
      Cc: Yisen Zhuang <yisen.zhuang@huawei.com>
      Cc: Salil Mehta <salil.mehta@huawei.com>
      Cc: Michael Chan <michael.chan@broadcom.com>
      Cc: Igor Russkikh <igor.russkikh@aquantia.com>
      Signed-off-by: Stanislav Fomichev <sdf@google.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
  30. 08 Apr, 2019 1 commit
    • drivers: Remove explicit invocations of mmiowb() · fb24ea52
      Will Deacon committed
      mmiowb() is now implied by spin_unlock() on architectures that require
      it, so there is no reason to call it from driver code. This patch was
      generated using coccinelle:
      
      	@mmiowb@
      	@@
      	- mmiowb();
      
      and invoked as:
      
      $ for d in drivers include/linux/qed sound; do \
      spatch --include-headers --sp-file mmiowb.cocci --dir $d --in-place; done
      
      NOTE: mmiowb() has only ever guaranteed ordering in conjunction with
      spin_unlock(). However, pairing each mmiowb() removal in this patch with
      the corresponding call to spin_unlock() is not at all trivial, so there
      is a small chance that this change may regress any drivers incorrectly
      relying on mmiowb() to order MMIO writes between CPUs using lock-free
      synchronisation. If you've ended up bisecting to this commit, you can
      reintroduce the mmiowb() calls using wmb() instead, which should restore
      the old behaviour on all architectures other than some esoteric ia64
      systems.
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  31. 02 Apr, 2019 1 commit
    • net: move skb->xmit_more hint to softnet data · 6b16f9ee
      Florian Westphal committed
      There are two reasons for this.
      
      First, the xmit_more flag conceptually doesn't fit into the skb, as
      xmit_more is not a property related to the skb.
      It's only a hint to the driver that the stack is about to transmit another
      packet immediately.
      
      Second, it was only done this way to not have to pass another argument
      to ndo_start_xmit().
      
      We can place xmit_more in the softnet data, next to the device recursion.
      The recursion counter is already written to on each transmit. The "more"
      indicator is placed right next to it.
      
      Drivers can use the netdev_xmit_more() helper instead of skb->xmit_more
      to check the "more packets coming" hint.
      
      skb->xmit_more is retained (but always 0) to not cause build breakage.
      
      This change takes care of the simple s/skb->xmit_more/netdev_xmit_more()/
      conversions.  Remaining drivers are converted in the next patches.
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
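
How a driver typically consumes the "more packets coming" hint can be modelled in userspace. Everything below is a hypothetical sketch: `struct softnet_sketch` stands in for the per-CPU softnet data, `xmit_more_hint()` plays the role of netdev_xmit_more(), and the doorbell counter replaces the MMIO write a real driver would issue.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for per-CPU softnet data, where the hint now lives. */
struct softnet_sketch {
    bool xmit_more;
};

/* Hypothetical Tx ring state. */
struct tx_ring_sketch {
    unsigned int pending;   /* descriptors queued but not yet announced */
    unsigned int doorbells; /* how often the hardware was notified */
    bool stopped;           /* queue is stopped, so the hardware must be kicked */
};

/* Plays the role of netdev_xmit_more(): the hint is not read from the skb. */
static bool xmit_more_hint(const struct softnet_sketch *sd)
{
    return sd->xmit_more;
}

/*
 * Queue one packet. The doorbell is deferred while the stack signals that
 * another packet follows immediately, unless the queue is stopped.
 */
static void start_xmit_sketch(struct tx_ring_sketch *ring,
                              const struct softnet_sketch *sd)
{
    assert(ring && sd);

    ring->pending++;
    if (ring->stopped || !xmit_more_hint(sd)) {
        ring->doorbells++; /* one notification covers all pending descriptors */
        ring->pending = 0;
    }
}
```

Sending three packets with the hint set for the first two results in a single doorbell, which is the batching effect the hint exists for.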