  1. 22 May 2020 (2 commits)
  2. 15 May 2020 (1 commit)
    • i40e: Add XDP frame size to driver · 24104024
      Committed by Jesper Dangaard Brouer
      This driver uses different memory models depending on PAGE_SIZE at
      compile time. For a 4K PAGE_SIZE it uses page splitting, meaning
      that for a normal MTU the frame size is 2048 bytes (with 192 bytes
      of headroom). For larger MTUs the driver still uses page splitting,
      by allocating order-1 pages (8192 bytes) for RX frames. For a
      PAGE_SIZE larger than 4K, the driver instead advances its
      rx_buffer->page_offset by the frame's "truesize".
      
      For XDP frame size calculations, this means that in the
      larger-than-4K PAGE_SIZE mode, frame_sz changes on a per-packet
      basis. For the page-split 4K PAGE_SIZE mode, xdp.frame_sz is more
      constant and can be updated once, outside the main NAPI loop (both
      strategies are sketched after this entry).
      
      The default setting in the driver uses build_skb(), which provides
      the necessary headroom and tailroom for XDP-redirect in the RX
      frame (in both modes).
      
      There is one complication, which is legacy-rx mode (configurable
      via ethtool priv-flags). There is zero headroom in this mode, while
      headroom is a requirement for XDP-redirect to work. The conversion
      to xdp_frame (convert_to_xdp_frame) will detect this insufficient
      space, and the xdp_do_redirect() call will fail. This is deemed
      acceptable, as it allows other XDP actions to still work in
      legacy mode. In legacy mode with a larger PAGE_SIZE we also accept,
      due to the lacking tailroom, that xdp_adjust_tail shrink doesn't
      work.
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Cc: intel-wired-lan@lists.osuosl.org
      Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Cc: Alexander Duyck <alexander.duyck@gmail.com>
      Link: https://lore.kernel.org/bpf/158945346494.97035.12809400414566061815.stgit@firesoul
      24104024
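      
      A minimal sketch of the two frame_sz strategies above, assuming the
      i40e_rx_frame_truesize() helper and the NAPI loop shape of this
      commit; constants and placement are illustrative, not verbatim
      driver code:
      
      	struct xdp_buff xdp;
      
      #if (PAGE_SIZE < 8192)
      	/* 4K page splitting: truesize is fixed by the ring config,
      	 * so frame_sz can be set once, before the RX loop. */
      	xdp.frame_sz = i40e_rx_frame_truesize(rx_ring, 0);
      #endif
      
      	while (likely(total_rx_packets < budget)) {
      		/* ... fetch descriptor, rx_buffer and packet size ... */
      #if (PAGE_SIZE >= 8192)
      		/* Larger pages: page_offset advances by a per-packet
      		 * truesize, so frame_sz must follow each frame. */
      		xdp.frame_sz = i40e_rx_frame_truesize(rx_ring, size);
      #endif
      		/* ... run the XDP program, build the skb ... */
      	}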
  3. 30 October 2019 (1 commit)
  4. 31 July 2019 (1 commit)
  5. 23 July 2019 (1 commit)
  6. 15 June 2019 (1 commit)
  7. 24 April 2019 (1 commit)
    • net: pass net_device argument to the eth_get_headlen · c43f1255
      Committed by Stanislav Fomichev
      Update all users of eth_get_headlen to pass the network device,
      fetch the network namespace from it, and pass it down to the flow
      dissector (the signature change is sketched after this entry).
      This commit is a no-op until an administrator attaches a BPF flow
      dissector program.
      
      Cc: Maxim Krasnyansky <maxk@qti.qualcomm.com>
      Cc: Saeed Mahameed <saeedm@mellanox.com>
      Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Cc: intel-wired-lan@lists.osuosl.org
      Cc: Yisen Zhuang <yisen.zhuang@huawei.com>
      Cc: Salil Mehta <salil.mehta@huawei.com>
      Cc: Michael Chan <michael.chan@broadcom.com>
      Cc: Igor Russkikh <igor.russkikh@aquantia.com>
      Signed-off-by: Stanislav Fomichev <sdf@google.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      c43f1255
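      
      A sketch of the signature change with a representative Intel-style
      call site; the ring and header-size names are assumptions for
      illustration:
      
      	/* before: u32 eth_get_headlen(void *data, unsigned int len);
      	 * after:  u32 eth_get_headlen(const struct net_device *dev,
      	 *                             void *data, unsigned int len);
      	 */
      	headlen = eth_get_headlen(rx_ring->netdev, xdp->data,
      				  I40E_RX_HDR_SIZE);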
  8. 08 April 2019 (1 commit)
    • drivers: Remove explicit invocations of mmiowb() · fb24ea52
      Committed by Will Deacon
      mmiowb() is now implied by spin_unlock() on architectures that require
      it, so there is no reason to call it from driver code. This patch was
      generated using coccinelle:
      
      	@mmiowb@
      	@@
      	- mmiowb();
      
      and invoked as:
      
      $ for d in drivers include/linux/qed sound; do \
      spatch --include-headers --sp-file mmiowb.cocci --dir $d --in-place; done
      
      NOTE: mmiowb() has only ever guaranteed ordering in conjunction with
      spin_unlock(). However, pairing each mmiowb() removal in this patch with
      the corresponding call to spin_unlock() is not at all trivial, so there
      is a small chance that this change may regress any drivers incorrectly
      relying on mmiowb() to order MMIO writes between CPUs using lock-free
      synchronisation. If you've ended up bisecting to this commit, you
      can reintroduce the mmiowb() calls using wmb() instead (a sketch
      follows this entry), which should restore the old behaviour on all
      architectures other than some esoteric ia64 systems.
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      fb24ea52
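      
      A minimal sketch of the suggested fallback, assuming a typical
      doorbell write in a driver's transmit path (register and variable
      names are hypothetical):
      
      	writel(val, tx_ring->tail);
      	wmb();	/* was: mmiowb(); wmb() is a stronger barrier here */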
  9. 02 April 2019 (1 commit)
    • net: move skb->xmit_more hint to softnet data · 6b16f9ee
      Committed by Florian Westphal
      There are two reasons for this.
      
      First, the xmit_more flag conceptually doesn't fit into the skb, as
      xmit_more is not a property of the skb itself. It is only a hint to
      the driver that the stack is about to transmit another packet
      immediately.
      
      Second, it was only done this way to avoid passing another argument
      to ndo_start_xmit().
      
      We can place xmit_more in the softnet data, next to the device
      recursion counter. The recursion counter is already written to on
      each transmit, and the "more" indicator is placed right next to it.
      
      Drivers can use the netdev_xmit_more() helper instead of
      skb->xmit_more to check the "more packets coming" hint (a typical
      call site is sketched after this entry).
      
      skb->xmit_more is retained (but always 0) so as not to cause build
      breakage.
      
      This change takes care of the simple
      s/skb->xmit_more/netdev_xmit_more()/ conversions. The remaining
      drivers are converted in the next patches.
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6b16f9ee
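      
      A sketch of the common tail-bump pattern after this change,
      assuming Intel-style ring helpers (txring_txq() and the tail
      register name follow the i40e sources; treat the exact shape as
      illustrative):
      
      	/* Only hit the doorbell when the queue stalled or the stack
      	 * has no more packets queued behind this one. */
      	if (netif_xmit_stopped(txring_txq(tx_ring)) ||
      	    !netdev_xmit_more())
      		writel(i, tx_ring->tail);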
  10. 22 February 2019 (1 commit)
    • i40e: fix XDP_REDIRECT/XDP xmit ring cleanup race · 59eb2a88
      Committed by Björn Töpel
      When the driver clears the XDP xmit ring due to re-configuration or
      teardown, in-progress ndo_xdp_xmit calls must be taken into
      consideration.
      
      The ndo_xdp_xmit function is typically called from a NAPI context
      that the driver does not control. Therefore, we must be careful not
      to clear the XDP ring while a call is ongoing. This patch adds a
      synchronize_rcu() to wait for NAPI contexts (preempt-disable
      regions and softirqs) prior to clearing the queue. Further, the
      __I40E_CONFIG_BUSY flag is checked in the ndo_xdp_xmit
      implementation to avoid touching the XDP xmit queue during
      re-configuration (sketched after this entry).
      
      Fixes: d9314c47 ("i40e: add support for XDP_REDIRECT")
      Fixes: 123cecd4 ("i40e: added queue pair disable/enable functions")
      Reported-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
      Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      59eb2a88
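      
      A sketch of the fix as described above; the flag, error code and
      signature follow the commit text and the i40e sources of that era,
      while the surrounding details are illustrative:
      
      	int i40e_xdp_xmit(struct net_device *dev, int n,
      			  struct xdp_frame **frames, u32 flags)
      	{
      		struct i40e_netdev_priv *np = netdev_priv(dev);
      
      		/* Reject xmit while queues are being reconfigured;
      		 * the teardown path sets this flag, then calls
      		 * synchronize_rcu() to wait out NAPI callers already
      		 * past this check before clearing the xmit ring. */
      		if (test_bit(__I40E_CONFIG_BUSY, np->vsi->back->state))
      			return -ENXIO;
      
      		/* ... validate the queue, transmit the n frames ... */
      		return n;
      	}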
  11. 13 December 2018 (2 commits)
  12. 22 November 2018 (1 commit)
    • ethernet/intel: consolidate NAPI and NAPI exit · 0bcd952f
      Committed by Jesse Brandeburg
      While reviewing code, I noticed that Eric Dumazet recommends that
      drivers check the return code of napi_complete_done, and use it to
      decide whether to re-enable interrupts when exiting poll. One of
      the Intel drivers was already fixed (ixgbe).
      
      Looking at the Intel drivers as a whole, we are handling our
      polling and NAPI exit in a few different ways based on whether we
      have multiqueue and whether we have Tx cleanup included. Several
      drivers had the bug of exiting NAPI with return 0, which appears
      to mess up the accounting in the stack.
      
      Consolidate all the NAPI routines to use the best known way of
      exiting and to mostly look like each other (the resulting pattern
      is sketched after this entry):
      1) check the return code of napi_complete_done to control
         interrupt enabling
      2) return the actual amount of work done
      3) return the full budget immediately if NAPI needs to poll again
      
      Tested the changes on e1000e with a high interrupt rate set; they
      show about an 8% reduction in CPU utilization when busy polling,
      because we aren't re-enabling interrupts when we're about to be
      polled.
      Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      0bcd952f
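      
      A sketch of the consolidated poll-exit pattern following the three
      rules above; the clean/enable helper names are hypothetical:
      
      	static int example_napi_poll(struct napi_struct *napi, int budget)
      	{
      		int work_done = example_clean_rings(napi, budget);
      
      		/* Rule 3: more work pending, stay in polling mode. */
      		if (work_done == budget)
      			return budget;
      
      		/* Rule 1: only re-arm interrupts if NAPI really
      		 * completed, i.e. we're not about to be busy-polled. */
      		if (likely(napi_complete_done(napi, work_done)))
      			example_enable_irq(napi);
      
      		/* Rule 2: report the actual amount of work done. */
      		return work_done;
      	}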
  13. 15 November 2018 (1 commit)
  14. 08 November 2018 (1 commit)
  15. 26 October 2018 (1 commit)
  16. 26 September 2018 (2 commits)
  17. 30 August 2018 (5 commits)
  18. 08 August 2018 (1 commit)
  19. 28 June 2018 (1 commit)
  20. 20 June 2018 (1 commit)
  21. 05 June 2018 (2 commits)
  22. 03 June 2018 (2 commits)
  23. 25 May 2018 (1 commit)
    • xdp: change ndo_xdp_xmit API to support bulking · 735fc405
      Committed by Jesper Dangaard Brouer
      This patch changes the ndo_xdp_xmit API to support bulking of
      xdp_frames (the new callback shape is sketched after this entry).
      
      When the kernel is compiled with CONFIG_RETPOLINE, XDP sees a huge
      slowdown. Most of the slowdown is caused by DMA API indirect
      function calls, but also by the net_device->ndo_xdp_xmit() call.
      
      Benchmarking the patch with CONFIG_RETPOLINE, using xdp_redirect_map
      with a single flow/core test (CPU E5-1650 v4 @ 3.60GHz), showed
      improved performance:
       for driver ixgbe: 6,042,682 pps -> 6,853,768 pps = +811,086 pps
       for driver i40e : 6,187,169 pps -> 6,724,519 pps = +537,350 pps
      
      With frames available as a bulk inside the driver's ndo_xdp_xmit
      call, further optimizations are possible, like bulk DMA-mapping
      for TX.
      
      Testing without CONFIG_RETPOLINE shows the same performance for
      physical NIC drivers.
      
      The virtual NIC driver tun sees a huge performance boost, as it can
      avoid per-frame producer locking and instead amortize the locking
      cost over the bulk.
      
      V2: Fix compile errors reported by kbuild test robot <lkp@intel.com>
      V4: Isolated ndo, driver changes and callers.
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      735fc405
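      
      A sketch of the callback change as of this patchset (the later
      flags argument is not yet present); the return-value note is an
      assumption drawn from the drivers converted here:
      
      	/* before: one frame per (retpoline-expensive) indirect call */
      	int (*ndo_xdp_xmit)(struct net_device *dev, struct xdp_frame *xdpf);
      
      	/* after: a bulk of n frames per call; the driver returns how
      	 * many of the n frames it accepted for transmit */
      	int (*ndo_xdp_xmit)(struct net_device *dev, int n,
      			    struct xdp_frame **xdp);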
  24. 01 May 2018 (1 commit)
  25. 28 April 2018 (1 commit)
  26. 17 April 2018 (3 commits)
    • xdp: transition into using xdp_frame for ndo_xdp_xmit · 44fa2dbd
      Committed by Jesper Dangaard Brouer
      Change the ndo_xdp_xmit API to take a struct xdp_frame instead of a
      struct xdp_buff. This brings xdp_return_frame and ndo_xdp_xmit in
      sync (the conversion step is sketched after this entry).
      
      This builds towards changing the API further to become a bulk API,
      because xdp_buff is not a queue-able object while xdp_frame is.
      
      V4: Adjust for commit 59655a5b ("tuntap: XDP_TX can use native XDP")
      V7: Adjust for commit d9314c47 ("i40e: add support for XDP_REDIRECT")
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      44fa2dbd
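      
      A sketch of the redirect transmit path after this change, assuming
      the convert_to_xdp_frame() helper of this era; the call site and
      error handling are illustrative:
      
      	struct xdp_frame *xdpf = convert_to_xdp_frame(xdp);
      
      	if (unlikely(!xdpf))
      		return -EOVERFLOW;	/* headroom too small for metadata */
      
      	err = dev->netdev_ops->ndo_xdp_xmit(dev, xdpf);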
    • xdp: transition into using xdp_frame for return API · 03993094
      Committed by Jesper Dangaard Brouer
      Changing the xdp_return_frame() API to take a struct xdp_frame as
      its argument seems like a natural choice, but there are some subtle
      performance details here that need extra care.
      
      Dereferencing the xdp_frame on a remote CPU during DMA-TX
      completion changes the cache line to the "Shared" state. Later,
      when the page is reused for RX, this xdp_frame cache line is
      written, which changes the state to "Modified".
      
      This situation already happens (naturally) for virtio_net, tun and
      cpumap, as the xdp_frame pointer is the queued object. In tun and
      cpumap, the ptr_ring is used for efficiently transferring cache
      lines (with pointers) between CPUs. Thus, the only option is to
      dereference the xdp_frame.
      
      It is only the ixgbe driver that had an optimization by which it
      could avoid dereferencing the xdp_frame. The driver already has a
      TX-ring queue, which (in the case of remote DMA-TX completion) has
      to be transferred between CPUs anyhow. In this data area we stored
      a struct xdp_mem_info and a data pointer, which allowed us to avoid
      dereferencing the xdp_frame.
      
      To compensate for this, a prefetchw is used to tell the cache
      coherency protocol about our access pattern (sketched after this
      entry). My benchmarks show that this prefetchw is enough to
      compensate for it in the ixgbe driver.
      
      V7: Adjust for commit d9314c47 ("i40e: add support for XDP_REDIRECT")
      V8: Adjust for commit bd658dda ("net/mlx5e: Separate dma base address
      and offset in dma_sync call")
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      03993094
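      
      A sketch of the compensating prefetch on the RX side: the
      xdp_frame metadata lives at the start of the packet headroom, so
      warming that cache line for write hides the Shared-to-Modified
      transition left behind by the remote TX-completion dereference.
      The exact placement is an assumption:
      
      	prefetchw(xdp.data_hard_start);	/* xdp_frame area will be written */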
    • i40e: convert to use generic xdp_frame and xdp_return_frame API · b411ef11
      Committed by Jesper Dangaard Brouer
      Also convert the i40e driver, which very recently got XDP_REDIRECT
      support in commit d9314c47 ("i40e: add support for XDP_REDIRECT").
      
      V7: This patch got added in V7 of this patchset.
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b411ef11
  27. 27 March 2018 (3 commits)