1. 15 Apr, 2021 1 commit
    • ice: replace custom AIM algorithm with kernel's DIM library · cdf1f1f1
      Jacob Keller committed
      The ice driver has support for adaptive interrupt moderation, an
      algorithm for tuning the interrupt rate dynamically. This algorithm
      is based on various assumptions about ring size, socket buffer size,
      link speed, SKB overhead, ethernet frame overhead and more.
      
      The Linux kernel has support for a dynamic interrupt moderation
      algorithm known as "dimlib". Replace the custom driver-specific
      implementation of dynamic interrupt moderation with the kernel's
      algorithm.
      
      The Intel hardware has a different hardware implementation than the
      originators of the dimlib code had to work with, which requires the
      driver to use a slightly different set of inputs for the actual
      moderation values, while getting all the advice from dimlib of
      better/worse, shift left or right.
      
      The change made for this implementation is to use a pair of values
      for each of the 5 "slots" that the dimlib moderation expects, and
      the driver will program those pairs when dimlib recommends a slot to
      use. The current implementation uses two tables, one for receive
      and one for transmit, and the pairs of values in each slot set the
      maximum delay of an interrupt and a maximum number of interrupts per
      second (both expressed in microseconds).
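
The slot-to-pair mapping described above could be sketched roughly as follows. This is a minimal illustration, not the driver's code: the struct, the table values, and the names `dim_pair_model`, `rx_table_model`, and `pick_pair_model` are all hypothetical. Only the idea of five slots, each holding a pair of values that is programmed when dimlib recommends that slot, comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NUM_SLOTS 5 /* dimlib moderation expects 5 slots */

/* Hypothetical pair of values programmed for one slot. */
struct dim_pair_model {
	unsigned int max_delay_usecs;  /* maximum delay of an interrupt */
	unsigned int rate_limit_usecs; /* limit on the interrupt rate */
};

/* Illustrative values only; these are NOT the driver's real tables. */
static const struct dim_pair_model rx_table_model[MODEL_NUM_SLOTS] = {
	{ 2, 4 }, { 8, 8 }, { 16, 16 }, { 64, 32 }, { 128, 64 },
};

/* Return the pair for the slot dimlib recommended, clamping an
 * out-of-range index to the last (most aggressive) slot. */
static struct dim_pair_model
pick_pair_model(const struct dim_pair_model *table, size_t slot)
{
	assert(table != NULL);
	if (slot >= MODEL_NUM_SLOTS)
		slot = MODEL_NUM_SLOTS - 1;
	return table[slot];
}
```

A transmit table would have the same shape with its own five pairs; the caller passes whichever table matches the ring direction.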
      
      Using DIMLIB fixes two separate kinds of bugs: UDP single-stream
      send was too slow, and 8K ping-pong was selecting the most
      aggressive moderation, which resulted in much too high latency.
      
      The overall result of using DIMLIB is that we meet or exceed our
      performance expectations set based on the old algorithm.
      Co-developed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
  2. 08 Apr, 2021 1 commit
  3. 01 Apr, 2021 2 commits
  4. 13 Feb, 2021 2 commits
  5. 09 Feb, 2021 1 commit
    • ice: fix writeback enable logic · 1d9f7ca3
      Jesse Brandeburg committed
      The writeback enable logic was incorrectly implemented (due to
      misunderstanding what the side effects of the implementation would be
      during polling).
      
      Fix this logic issue, while implementing a new feature allowing the user
      to control the writeback frequency using the knobs for controlling
      interrupt throttling that we already have.  Basically if you leave
      adaptive interrupts enabled, the writeback frequency will be varied even
      if busy_polling or if napi-poll is in use.  If the interrupt rates are
      set to a fixed value by ethtool -C and adaptive is off, the driver will
      allow the user-set interrupt rate to guide how frequently the hardware
      will complete descriptors to the driver.
      
      Effectively, the user gets control over the hardware efficiency,
      choosing between immediate interrupts and interrupts delayed up to
      a maximum set by the interrupt rate, even when interrupts are
      disabled during polling.
      Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Co-developed-by: Brett Creeley <brett.creeley@intel.com>
      Signed-off-by: Brett Creeley <brett.creeley@intel.com>
      Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
  6. 26 Sep, 2020 1 commit
    • intel-ethernet: clean up W=1 warnings in kdoc · b50f7bca
      Jesse Brandeburg committed
      This takes care of all of the trivial W=1 fixes in the Intel
      Ethernet drivers, which allows developers and maintainers to
      build more of the networking tree with more complete warning
      checks.
      
      There are three classes of kdoc warnings fixed:
       - cannot understand function prototype: 'x'
       - Excess function parameter 'x' description in 'y'
       - Function parameter or member 'x' not described in 'y'
      
      All of the changes were trivial comment updates on
      function headers.
      
      Inspired by Lee Jones' series of wireless work to do the same.
      Compile tested only, and passes simple test of
      $ git ls-files *.[ch] | egrep drivers/net/ethernet/intel | \
        xargs scripts/kernel-doc -none
      Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Tested-by: Aaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 01 Sep, 2020 1 commit
  8. 01 Aug, 2020 2 commits
  9. 28 May, 2020 1 commit
  10. 23 May, 2020 2 commits
  11. 22 May, 2020 2 commits
  12. 20 Feb, 2020 1 commit
  13. 13 Feb, 2020 1 commit
  14. 04 Jan, 2020 1 commit
  15. 05 Nov, 2019 6 commits
  16. 21 Aug, 2019 1 commit
  17. 24 May, 2019 2 commits
  18. 02 May, 2019 1 commit
  19. 18 Apr, 2019 2 commits
  20. 27 Mar, 2019 3 commits
  21. 26 Mar, 2019 2 commits
  22. 20 Mar, 2019 1 commit
  23. 16 Jan, 2019 2 commits
  24. 07 Nov, 2018 1 commit
    • ice: Fix tx_timeout in PF driver · c585ea42
      Brett Creeley committed
      Prior to this commit the driver was running into tx_timeouts when a
      queue was stressed enough. This was happening because the HW tail
      and SW tail (NTU) were incorrectly out of sync. Consequently this was
      causing the HW head to collide with the HW tail, which to the hardware
      means that all descriptors posted for Tx have been processed.
      
      Due to the Tx logic used in the driver SW tail and HW tail are allowed
      to be out of sync. This is done as an optimization because it allows the
      driver to write HW tail as infrequently as possible, while still
      updating the SW tail index to keep track. However, there are situations
      where this results in the tail never getting updated, resulting in Tx
      timeouts.
      
      Tx HW tail write condition:
      	if (netif_xmit_stopped(txring_txq(tx_ring)) || !skb->xmit_more)
      		writel(sw_tail, tx_ring->tail);
      
      An issue was found in the Tx logic that was causing the aforementioned
      condition for updating HW tail to never happen, causing tx_timeouts.
      
      In ice_xmit_frame_ring we calculate how many descriptors we need for the
      Tx transaction based on the skb the kernel hands us. This is then passed
      into ice_maybe_stop_tx along with some extra padding to determine if we
      have enough descriptors available for this transaction. If we don't then
      we return -EBUSY to the stack, otherwise we move on and eventually
      prepare the Tx descriptors accordingly in ice_tx_map and set
      next_to_watch. In ice_tx_map we make another call to ice_maybe_stop_tx
      with a value of MAX_SKB_FRAGS + 4. The key here is that this value is
      possibly less than the value we sent in the first call to
      ice_maybe_stop_tx in ice_xmit_frame_ring. Now, if the number of unused
      descriptors is between MAX_SKB_FRAGS + 4 and the value used in the first
      call to ice_maybe_stop_tx in ice_xmit_frame_ring then we do not update
      the HW tail because of the "Tx HW tail write condition" above. This is
      because ice_maybe_stop_tx returns success instead of calling
      __ice_maybe_stop_tx and subsequently calling netif_stop_subqueue,
      which sets the __QUEUE_STATE_DEV_XOFF bit. This bit is then checked
      in the "Tx HW tail write condition" by calling netif_xmit_stopped,
      and HW tail is updated only if the aforementioned bit is set.
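
The interaction described above can be modelled in a few lines. This is a simplified sketch under stated assumptions, not the driver's code: `tx_queue_model` and the `model_*` functions are hypothetical stand-ins, the `stopped` flag stands in for __QUEUE_STATE_DEV_XOFF, and the re-check that a real slow path would perform before stopping the queue is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical model of one Tx queue. */
struct tx_queue_model {
	unsigned int unused_descs; /* descriptors still free in the ring */
	bool stopped;              /* stands in for __QUEUE_STATE_DEV_XOFF */
};

/* Model of ice_maybe_stop_tx: succeed when enough descriptors are free,
 * otherwise mark the queue as stopped. */
static bool model_maybe_stop_tx(struct tx_queue_model *q, unsigned int needed)
{
	assert(q != NULL);
	if (q->unused_descs >= needed)
		return true;
	q->stopped = true;
	return false;
}

/* The "Tx HW tail write condition" from the commit message. */
static bool model_should_write_tail(const struct tx_queue_model *q,
				    bool xmit_more)
{
	return q->stopped || !xmit_more;
}

/* Perform the second check (the one made in ice_tx_map) and report
 * whether HW tail would be written for this frame. */
static bool model_tail_written(unsigned int unused_descs,
			       unsigned int second_call_needed,
			       bool xmit_more)
{
	struct tx_queue_model q = {
		.unused_descs = unused_descs,
		.stopped = false,
	};

	model_maybe_stop_tx(&q, second_call_needed);
	return model_should_write_tail(&q, xmit_more);
}
```

With 30 unused descriptors, a second-call threshold of 21, and xmit_more set, the model does not write the tail. Raising the second-call threshold to 40 (at least as large as an assumed first-call value) stops the queue, and the tail is written.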
      
      In ice_clean_tx_irq, if next_to_watch is not NULL, we end up cleaning
      the descriptors that HW sets the DD bit on and we have the budget. The
      HW head will eventually run into the HW tail in response to the
      description in the paragraph above.
      
      The next time through ice_xmit_frame_ring we make the initial call to
      ice_maybe_stop_tx with another skb from the stack. This time we do not
      have enough descriptors available and we return NETDEV_TX_BUSY to the
      stack and end up setting next_to_watch to NULL.
      
      This is where we are stuck. In ice_clean_tx_irq we never clean anything
      because next_to_watch is always NULL and in ice_xmit_frame_ring we never
      update HW tail because we already return NETDEV_TX_BUSY to the stack and
      eventually we hit a tx_timeout.
      
      This issue was fixed by making sure that the second call to
      ice_maybe_stop_tx in ice_tx_map is passed a value that is >= the value
      that was used on the initial call to ice_maybe_stop_tx in
      ice_xmit_frame_ring. This was done by adding the following defines to
      make the logic more clear and to reduce the chance of mucking this up
      again:
      
      ICE_CACHE_LINE_BYTES		64
      ICE_DESCS_PER_CACHE_LINE	(ICE_CACHE_LINE_BYTES / \
      				 sizeof(struct ice_tx_desc))
      ICE_DESCS_FOR_CTX_DESC		1
      ICE_DESCS_FOR_SKB_DATA_PTR	1
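
As a rough illustration of how defines like these can be combined into a single worst-case descriptor count for the second call, consider the sketch below. Several parts are assumptions and not taken from the commit message: the 16-byte descriptor layout of `tx_desc_model`, the value 17 for MODEL_MAX_SKB_FRAGS, and the way the terms are summed in `model_descs_needed()`.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for struct ice_tx_desc; the 16-byte layout is an assumption. */
struct tx_desc_model {
	uint64_t buf_addr;
	uint64_t cmd_type_offset_bsz;
};

static_assert(sizeof(struct tx_desc_model) == 16,
	      "model assumes a 16-byte Tx descriptor");

#define MODEL_CACHE_LINE_BYTES		64
#define MODEL_DESCS_PER_CACHE_LINE	(MODEL_CACHE_LINE_BYTES / \
					 sizeof(struct tx_desc_model))
#define MODEL_DESCS_FOR_CTX_DESC	1
#define MODEL_DESCS_FOR_SKB_DATA_PTR	1
#define MODEL_MAX_SKB_FRAGS		17 /* assumed; depends on kernel config */

/* Hypothetical worst-case descriptor count for one frame: every
 * fragment, one context descriptor, one cache line of padding, and
 * one descriptor for the skb data pointer. */
static unsigned int model_descs_needed(void)
{
	return (unsigned int)(MODEL_MAX_SKB_FRAGS + MODEL_DESCS_FOR_CTX_DESC +
			      MODEL_DESCS_PER_CACHE_LINE +
			      MODEL_DESCS_FOR_SKB_DATA_PTR);
}
```

Under these assumptions a cache line holds 4 descriptors and the worst case comes to 23, which is larger than MAX_SKB_FRAGS + 4 = 21.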
      
      The ICE_CACHE_LINE_BYTES being 64 is an assumption being made so we
      don't have to figure this out on every pass through the Tx path. Instead
      I added a sanity check in ice_probe to verify cache line size and print
      a message if it's not 64 Bytes. This will make it easier to file issues
      if they are seen when the cache line size is not 64 Bytes when reading
      from the GLPCI_CNF2 register.
      Signed-off-by: Brett Creeley <brett.creeley@intel.com>
      Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>