1. 02 Jul, 2020 (1 commit)
    • i40e: optimize AF_XDP Tx completion path · 5574ff7b
      Magnus Karlsson authored
      Improve the performance of the AF_XDP zero-copy Tx completion
      path. When there are no XDP buffers being sent using XDP_TX or
      XDP_REDIRECT, we do not have to go through the SW ring to clean up
      any entries since the AF_XDP path does not use these. In these
      cases, just fast forward the next-to-use counter and skip going
      through the SW ring. The limit on the maximum number of entries to
      complete is also removed since the algorithm is now O(1). To
      simplify the code path, the maximum number of entries to complete
      for the XDP path is therefore also increased from 256 to 512 (the
      default number of Tx HW descriptors). This should be fine since
      completion in the XDP path is faster than in the SKB path, which
      has 256 as its maximum. A simplified sketch of the fast-forward
      idea follows this entry.
      
      This patch provides around 4% throughput improvement for the l2fwd
      application in xdpsock on my machine.
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      5574ff7b
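      A rough, self-contained model of the O(1) fast-forward described
      above; the types, fields, and function name are simplified
      stand-ins, not the actual i40e structures.

      #include <stdint.h>

      /* Simplified Tx ring: only the counters the idea needs. */
      struct tx_ring {
              uint16_t count;          /* HW descriptors, e.g. 512 */
              uint16_t next_to_clean;  /* SW consumer index        */
      };

      /*
       * With no XDP_TX/XDP_REDIRECT buffers outstanding, nothing in the
       * SW ring needs per-entry cleanup: jump the counter forward by the
       * number of completed descriptors in one step (the real patch
       * applies the same idea to the driver's own ring counters).
       */
      static void xsk_fast_forward(struct tx_ring *ring, uint16_t completed)
      {
              ring->next_to_clean += completed;
              if (ring->next_to_clean >= ring->count)
                      ring->next_to_clean -= ring->count;
      }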
  2. 26 Jun, 2020 (1 commit)
  3. 22 May, 2020 (2 commits)
  4. 23 Jul, 2019 (1 commit)
  5. 30 Aug, 2018 (1 commit)
    • i40e: add AF_XDP zero-copy Rx support · 0a714186
      Björn Töpel authored
      This patch adds zero-copy Rx support for AF_XDP sockets. Instead of
      allocating buffers of type MEM_TYPE_PAGE_SHARED, the Rx frames are
      allocated as MEM_TYPE_ZERO_COPY when AF_XDP is enabled for a certain
      queue.
      
      All AF_XDP specific functions are added to a new file, i40e_xsk.c.
      
      Note that when AF_XDP zero-copy is enabled, the XDP action XDP_PASS
      will allocate a new buffer and copy the zero-copy frame prior to
      passing it to the kernel stack, as sketched after this entry.
      Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      0a714186
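      A rough illustration of the XDP_PASS copy path noted above; the
      function below is a user-space stand-in, not the i40e_xsk.c
      implementation (the real code allocates an skb and recycles the
      zero-copy frame).

      #include <stdlib.h>
      #include <string.h>

      /* On XDP_PASS the zero-copy frame cannot be handed to the stack
       * directly, so its contents are copied into a newly allocated
       * buffer and the original frame stays with the AF_XDP queue. */
      static void *copy_for_stack(const void *zc_frame, size_t len)
      {
              void *buf = malloc(len);   /* stands in for skb allocation */

              if (!buf)
                      return NULL;
              memcpy(buf, zc_frame, len);
              return buf;
      }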
  6. 05 Jun, 2018 (1 commit)
  7. 03 Jun, 2018 (1 commit)
  8. 25 May, 2018 (1 commit)
    • xdp: change ndo_xdp_xmit API to support bulking · 735fc405
      Jesper Dangaard Brouer authored
      This patch changes the API for ndo_xdp_xmit to support bulking
      xdp_frames; a rough sketch of a bulked driver xmit loop follows
      this entry.
      
      When the kernel is compiled with CONFIG_RETPOLINE, XDP sees a huge
      slowdown.
      Most of the slowdown is caused by DMA API indirect function calls, but
      also the net_device->ndo_xdp_xmit() call.
      
      Benchmarking the patch with CONFIG_RETPOLINE, using xdp_redirect_map
      in a single flow/core test (CPU E5-1650 v4 @ 3.60GHz), showed
      improved performance:
       for driver ixgbe: 6,042,682 pps -> 6,853,768 pps = +811,086 pps
       for driver i40e : 6,187,169 pps -> 6,724,519 pps = +537,350 pps
      
      With frames available as a bulk inside the driver's ndo_xdp_xmit call,
      further optimizations are possible, like bulk DMA-mapping for TX.
      
      Testing without CONFIG_RETPOLINE shows the same performance for
      physical NIC drivers.
      
      The virtual NIC driver tun sees a huge performance boost, as it can
      avoid per-frame producer locking and instead amortize the locking
      cost over the bulk.
      
      V2: Fix compile errors reported by kbuild test robot <lkp@intel.com>
      V4: Isolated ndo, driver changes and callers.
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      735fc405
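      A hedged sketch of the driver side after the change: frames arrive
      as an array per call (roughly int ndo_xdp_xmit(dev, n, frames) at
      this point in the tree), so per-call costs are paid once per bulk.
      The function and loop below are illustrative, not the ixgbe/i40e
      implementations.

      #include <stddef.h>

      struct xdp_frame;      /* opaque here; defined by the kernel */
      struct net_device;

      /* Bulked transmit: one indirect call and one ring/tail update for
       * the whole batch instead of per frame. Returns how many frames
       * were queued. */
      static int example_bulk_xmit(struct net_device *dev, int n,
                                   struct xdp_frame **frames)
      {
              int sent = 0;

              (void)dev;
              for (int i = 0; i < n; i++) {
                      if (!frames[i])          /* stop early on a bad frame */
                              break;
                      /* DMA-map and post frames[i] to the Tx ring here */
                      sent++;
              }
              /* single doorbell/tail bump for the bulk */
              return sent;
      }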
  9. 28 Apr, 2018 (1 commit)
  10. 17 Apr, 2018 (2 commits)
  11. 27 Mar, 2018 (1 commit)
  12. 24 Mar, 2018 (1 commit)
  13. 27 Feb, 2018 (1 commit)
    • i40e/i40evf: use SW variables for hang detection · 04d41051
      Alan Brady authored
      The i40e_detect_recover_hung function uses the i40e_get_tx_pending
      function to determine if there are packets stalled on the ring.
      i40e_get_tx_pending calculates the pending packets using the head
      writeback value and HW tail.  If the queue is stopped and we lose the
      interrupt that would update our next_to_clean, then we a) won't get
      another interrupt to clean because the queue is stopped and b) won't
      catch the problem with i40e_detect_recover_hung because the HW values
      look like there are no packets waiting to be transmitted.  Using the
      SW values we can catch the issue because next_to_clean will be out of
      sync with the head writeback (a simplified sketch follows this entry).
      
      This has the added benefit of being less CPU-intensive because we
      don't need to reach into the hardware to get the values.
      Signed-off-by: Alan Brady <alan.brady@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      04d41051
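      A simplified model of deriving pending work purely from SW ring
      state, as described above; the struct and arithmetic are
      illustrative, not the i40e_get_tx_pending implementation.

      #include <stdint.h>

      struct sw_ring {
              uint16_t count;
              uint16_t next_to_use;    /* SW producer index */
              uint16_t next_to_clean;  /* SW consumer index */
      };

      /* If next_to_clean has not caught up with next_to_use, descriptors
       * are still outstanding, even when a missed interrupt leaves the HW
       * head/tail values looking idle. */
      static uint16_t sw_tx_pending(const struct sw_ring *ring)
      {
              if (ring->next_to_use >= ring->next_to_clean)
                      return ring->next_to_use - ring->next_to_clean;
              return ring->count - ring->next_to_clean + ring->next_to_use;
      }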
  14. 13 Feb, 2018 (4 commits)
  15. 30 Jan, 2018 (1 commit)
  16. 24 Jan, 2018 (1 commit)
    • i40e/i40evf: Detect and recover hung queue scenario · 07d44190
      Sudheer Mogilappagari authored
      In VFs, there is a known issue which can cause writebacks
      to not occur when interrupts are disabled and there are
      fewer than 4 descriptors, resulting in a TX timeout. A timeout
      can also occur due to a lost interrupt.
      
      The current implementation for detecting and recovering
      from hung queues in the PF is problematic because it actively
      encourages lost interrupts.  By triggering a SW
      interrupt, interrupts are forced on.  If we are already in
      napi_poll and an interrupt fires, napi_poll will not be
      rescheduled and the interrupt is effectively lost; thereby
      potentially *causing* hung queues.
      
      This patch checks between watchdog cycles whether packets are being
      processed, determines which queues are potentially hung, and fires a
      SW interrupt only for those queues (a simplified sketch follows this
      entry).
      Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      07d44190
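      An illustrative per-queue check of the kind described above; the
      fields and the hung-detection rule are a sketch, not the driver's
      actual bookkeeping.

      #include <stdbool.h>
      #include <stdint.h>

      struct queue_stats {
              uint64_t packets;        /* packets processed so far            */
              uint64_t packets_last;   /* snapshot from the previous watchdog */
              bool     work_pending;   /* descriptors still outstanding       */
      };

      /* Run once per watchdog cycle: only a queue that has pending work
       * but made no progress since the last cycle is reported, so a SW
       * interrupt is fired for that queue alone. */
      static bool queue_looks_hung(struct queue_stats *q)
      {
              bool hung = q->work_pending && q->packets == q->packets_last;

              q->packets_last = q->packets;
              return hung;
      }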
  17. 06 Jan, 2018 (1 commit)
    • i40e: setup xdp_rxq_info · 87128824
      Jesper Dangaard Brouer authored
      The i40e driver has a special "FDIR" RX-ring (I40E_VSI_FDIR) which is
      a sideband channel for configuring/updating the flow director tables.
      This (i40e_vsi_)type does not invoke XDP-ebpf code.
      
      As suggested by Björn (V2): instead of marking this I40E_VSI_FDIR
      RX-ring as a special case, reverse the logic and register
      xdp_rxq_info only for RX-rings of type I40E_VSI_MAIN (a simplified
      sketch of this selection follows the entry).
      
      Driver hook points for xdp_rxq_info:
       * reg  : i40e_setup_rx_descriptors (via i40e_vsi_setup_rx_resources)
       * unreg: i40e_free_rx_resources    (via i40e_vsi_free_rx_resources)
      
      Tested on actual hardware with a samples/bpf program.
      
      V2: Fixed bug in i40e_set_ringparam (memset zero) + match on I40E_VSI_MAIN.
      V4: Update patch desc that got out-of-sync with code.
      
      Cc: intel-wired-lan@lists.osuosl.org
      Cc: Björn Töpel <bjorn.topel@intel.com>
      Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Cc: Paul Menzel <pmenzel@molgen.mpg.de>
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      87128824
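      A simplified sketch of the selection logic above; the enum, struct,
      and helpers are stand-ins (the real driver calls the kernel's
      xdp_rxq_info_reg()/xdp_rxq_info_unreg() at the hook points listed,
      and their argument lists have changed across kernel versions).

      #include <stdbool.h>

      enum vsi_type { VSI_MAIN, VSI_FDIR };   /* simplified */

      struct rx_ring {
              enum vsi_type vsi_type;
              bool xdp_rxq_registered;
      };

      /* reg hook point: only main-VSI rings register xdp_rxq_info, so the
       * FDIR sideband ring never carries state it would not use. */
      static void setup_rx_descriptors(struct rx_ring *ring)
      {
              if (ring->vsi_type == VSI_MAIN)
                      ring->xdp_rxq_registered = true;   /* xdp_rxq_info_reg() here */
      }

      /* unreg hook point mirrors the register side. */
      static void free_rx_resources(struct rx_ring *ring)
      {
              if (ring->xdp_rxq_registered)
                      ring->xdp_rxq_registered = false;  /* xdp_rxq_info_unreg() here */
      }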
  18. 14 Oct, 2017 (1 commit)
  19. 10 Oct, 2017 (2 commits)
  20. 06 Oct, 2017 (1 commit)
  21. 28 Aug, 2017 (2 commits)
    • i40e/i40evf: avoid dynamic ITR updates when polling or low packet rate · 742c9875
      Jacob Keller authored
      The dynamic ITR algorithm depends on a calculation of usecs which
      assumes that the interrupts have been firing constantly at the interrupt
      throttle rate. This is not guaranteed because we could have a low packet
      rate, or have been polling in software.
      
      We'll estimate whether this is the case by using jiffies to determine
      if it has been too long since the last update. If the jiffies
      difference is large, we are guaranteed to have an incorrect
      calculation; if it is small, we might have been polling some, but the
      difference shouldn't affect the calculation too much (a simplified
      check is sketched after this entry).
      
      This ensures that we don't get stuck in BULK latency during certain rare
      situations where we receive bursts of packets that force us into NAPI
      polling.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      742c9875
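      A simplified version of the staleness check described above;
      'jiffies' is modelled as a plain counter here and the threshold
      name is illustrative, not the driver's.

      #include <stdbool.h>

      /* If interrupts have not fired for longer than expected (software
       * polling or a low packet rate), the usecs-based estimate would be
       * wrong, so the dynamic ITR update is skipped for this sample. */
      static bool itr_sample_is_stale(unsigned long now_jiffies,
                                      unsigned long last_itr_jiffies,
                                      unsigned long max_gap)
      {
              return now_jiffies - last_itr_jiffies > max_gap;
      }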
    • i40e/i40evf: remove ULTRA latency mode · 0a2c7722
      Jacob Keller authored
      Since commit c56625d5 ("i40e/i40evf: change dynamic interrupt
      thresholds") a new higher latency ITR setting called I40E_ULTRA_LATENCY
      was added with a cryptic comment about how it was meant for adjusting Rx
      more aggressively when streaming small packets.
      
      This mode was attempting to calculate packets per second and then kick
      in when we have a huge number of small packets.
      
      Unfortunately, the ULTRA setting was kicking in for workloads it
      wasn't intended for, including single-thread UDP_STREAM workloads.
      
      This wasn't caught for a variety of reasons. First, the ip_defrag
      routines were improved somewhat which makes the UDP_STREAM test still
      reasonable at 10GbE, even when dropped down to 8k interrupts a second.
      Additionally, some other obvious workloads appear to work fine, such
      as TCP_STREAM.
      
      The 40k packets-per-second threshold doesn't make sense for a number
      of reasons. First, we absolutely can do more than 40k packets per
      second. Second, we calculate the value inline in an integer, which
      can sometimes overflow, resulting in incorrect values being used (an
      illustrative overflow is shown after this entry).
      
      If we fix this overflow it makes it even more likely that we'll enter
      ULTRA mode which is the opposite of what we want.
      
      The ULTRA mode was added originally as a way to reduce CPU utilization
      during a small packet workload where we weren't keeping up anyway. It
      should never have been kicking in during these other workloads.
      
      Given the issues outlined above, let's remove the ULTRA latency mode. If
      necessary, a better solution to the CPU utilization issue for small
      packet workloads will be added in a future patch.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      0a2c7722
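      An illustrative overflow of the kind the commit mentions; the
      numbers and formula are made up for demonstration, not the driver's
      actual arithmetic.

      #include <stdint.h>
      #include <stdio.h>

      int main(void)
      {
              uint32_t packets = 5000;        /* packets seen in the window */
              uint32_t window_usecs = 100;    /* 100 us sampling window     */

              /* 5000 * 1000000 does not fit in 32 bits, so the 32-bit
               * packets-per-second figure silently wraps. */
              uint32_t pps32 = packets * 1000000u / window_usecs;
              uint64_t pps64 = (uint64_t)packets * 1000000u / window_usecs;

              printf("32-bit: %u pps, 64-bit: %llu pps\n",
                     pps32, (unsigned long long)pps64);
              return 0;
      }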
  22. 26 Aug, 2017 (2 commits)
    • i40e: separate hw_features from runtime changing flags · d36e41dc
      Jacob Keller authored
      The number of flags found in pf->flags has grown quite large, and there
      are a lot of different types of flags. Most of the flags are simply
      hardware features which are enabled on certain firmware versions or
      MAC types.
      Other flags are dynamic run-time flags which enable or disable certain
      features of the driver.
      
      Separate these two types of flags into pf->hw_features and pf->flags;
      a simplified sketch of the split follows this entry.
      The hw_features list will contain a set of features which are enabled at
      init time. This will not contain toggles or otherwise dynamically
      changing features. These flags should not need atomic protections, as
      they will be set once during init and then be essentially read only.
      
      Everything else will remain in the flags variable. These flags may be
      modified at any time during run time. A future patch may wish to convert
      these flags into a set_bit/clear_bit/test_bit or similar approach to
      ensure atomic correctness.
      
      The I40E_FLAG_MFP_ENABLED flag may be a good fit for hw_features but
      currently is used by ethtool in the private flags settings, and thus has
      been left as part of flags.
      
      Additionally, I40E_FLAG_DCB_CAPABLE may be a good fit for the
      hw_features but this patch has not tried to untangle it yet.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      d36e41dc
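      A minimal sketch of the split described above; the bit names and
      struct are illustrative, not the driver's actual I40E_FLAG_* /
      I40E_HW_* definitions.

      #include <stdint.h>

      #define HW_FEAT_EXAMPLE_CAP   (1u << 0)   /* fixed at init time     */
      #define FLAG_EXAMPLE_TOGGLE   (1u << 0)   /* may change at run time */

      struct pf_state {
              uint32_t hw_features;   /* set once during init, then read-only */
              uint32_t flags;         /* runtime toggles (ethtool priv flags) */
      };

      /* Init-time capabilities land in hw_features and need no atomic
       * protection; user-togglable behaviour stays in flags. */
      static void example_init(struct pf_state *pf, int cap_present)
      {
              if (cap_present)
                      pf->hw_features |= HW_FEAT_EXAMPLE_CAP;
      }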
    • i40e/i40evf: adjust packet size to account for double VLANs · 1e3a5fd5
      Mitch Williams authored
      Now that the kernel supports double VLAN tags, we should at least play
      nice. Adjust the max packet size to account for two VLAN tags, not
      just one; the arithmetic is illustrated after this entry.
      Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      1e3a5fd5
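      The arithmetic involved, sketched with illustrative macro names
      (the real constants come from <linux/if_ether.h> and
      <linux/if_vlan.h>, and the driver's own max-frame define):

      #define ETH_HLEN_EX     14   /* Ethernet header      */
      #define ETH_FCS_LEN_EX   4   /* frame check sequence */
      #define VLAN_HLEN_EX     4   /* one 802.1Q tag       */

      /* Max frame size now leaves room for two stacked VLAN tags (QinQ)
       * instead of one. */
      #define MAX_FRAME_SIZE(mtu) \
              ((mtu) + ETH_HLEN_EX + ETH_FCS_LEN_EX + 2 * VLAN_HLEN_EX)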
  23. 21 Jun, 2017 (2 commits)
  24. 08 Apr, 2017 (3 commits)
  25. 29 Mar, 2017 (1 commit)
  26. 28 Mar, 2017 (2 commits)
  27. 15 Mar, 2017 (1 commit)
  28. 12 Feb, 2017 (1 commit)