1. 10 10月, 2017 1 次提交
  2. 28 8月, 2017 4 次提交
    • J
      i40e/i40evf: avoid dynamic ITR updates when polling or low packet rate · 742c9875
      Jacob Keller 提交于
      The dynamic ITR algorithm depends on a calculation of usecs which
      assumes that the interrupts have been firing constantly at the interrupt
      throttle rate. This is not guaranteed because we could have a low packet
      rate, or have been polling in software.
      
      We'll estimate whether this is the case by using jiffies to determine if
      we've been too long. If the time difference of jiffies is larger we are
      guaranteed to have an incorrect calculation. If the time difference of
      jiffies is smaller we might have been polling some but the difference
      shouldn't affect the calculation too much.
      
      This ensures that we don't get stuck in BULK latency during certain rare
      situations where we receive bursts of packets that force us into NAPI
      polling.
      Signed-off-by: NJacob Keller <jacob.e.keller@intel.com>
      Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      742c9875
    • J
      i40e/i40evf: remove ULTRA latency mode · 0a2c7722
      Jacob Keller 提交于
      Since commit c56625d5 ("i40e/i40evf: change dynamic interrupt
      thresholds") a new higher latency ITR setting called I40E_ULTRA_LATENCY
      was added with a cryptic comment about how it was meant for adjusting Rx
      more aggressively when streaming small packets.
      
      This mode was attempting to calculate packets per second and then kick
      in when we have a huge number of small packets.
      
      Unfortunately, the ULTRA setting was kicking in for workloads it wasn't
      intended for including single-thread UDP_STREAM workloads.
      
      This wasn't caught for a variety of reasons. First, the ip_defrag
      routines were improved somewhat which makes the UDP_STREAM test still
      reasonable at 10GbE, even when dropped down to 8k interrupts a second.
      Additionally, some other obvious workloads appear to work fine, such
      as TCP_STREAM.
      
      The number 40k doesn't make sense for a number of reasons. First, we
      absolutely can do more than 40k packets per second. Second, we calculate
      the value inline in an integer, which sometimes can overflow resulting
      in using incorrect values.
      
      If we fix this overflow it makes it even more likely that we'll enter
      ULTRA mode which is the opposite of what we want.
      
      The ULTRA mode was added originally as a way to reduce CPU utilization
      during a small packet workload where we weren't keeping up anyways. It
      should never have been kicking in during these other workloads.
      
      Given the issues outlined above, let's remove the ULTRA latency mode. If
      necessary, a better solution to the CPU utilization issue for small
      packet workloads will be added in a future patch.
      Signed-off-by: NJacob Keller <jacob.e.keller@intel.com>
      Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      0a2c7722
    • J
      i40e: invert logic for checking incorrect cpu vs irq affinity · 6d977729
      Jacob Keller 提交于
      In commit 96db776a ("i40e/vf: fix interrupt affinity bug")
      we added some code to force exit of polling in case we did
      not have the correct CPU. This is important since it was possible for
      the IRQ affinity to be changed while the CPU is pegged at 100%. This can
      result in the polling routine being stuck on the wrong CPU until
      traffic finally stops.
      
      Unfortunately, the implementation, "if the CPU is correct, exit as
      normal, otherwise, fall-through to the end-polling exit" is incredibly
      confusing to reason about. In this case, the normal flow looks like the
      exception, while the exception actually occurs far away from the if
      statement and comment.
      
      We recently discovered and fixed a bug in this code because we were
      incorrectly initializing the affinity mask.
      
      Re-write the code so that the exceptional case is handled at the check,
      rather than having the logic be spread through the regular exit flow.
      This does end up with minor code duplication, but the resulting code is
      much easier to reason about.
      
      The new logic is identical, but inverted. If we are running on a CPU not
      in our affinity mask, we'll exit polling. However, the code flow is much
      easier to understand.
      
      Note that we don't actually have to check for MSI-X, because in the MSI
      case we'll only have one q_vector, but its default affinity mask should
      be correct as it includes all CPUs when it's initialized. Further, we
      could at some point add code to setup the notifier for the non-MSI-X
      case and enable this workaround for that case too, if desired, though
      there isn't much gain since its unlikely to be the common case.
      Signed-off-by: NJacob Keller <jacob.e.keller@intel.com>
      Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      6d977729
    • J
      i40e: move enabling icr0 into i40e_update_enable_itr · 9254c0e3
      Jacob Keller 提交于
      If we don't have MSI-X enabled, we handle interrupts on all icr0. This
      is a special case, so let's move the conditional into
      i40e_update_enable_itr() in order to make i40e_napi_poll easier to
      read about.
      Signed-off-by: NJacob Keller <jacob.e.keller@intel.com>
      Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      9254c0e3
  3. 02 8月, 2017 1 次提交
  4. 26 7月, 2017 2 次提交
  5. 21 6月, 2017 2 次提交
  6. 13 6月, 2017 1 次提交
    • J
      i40e: fix handling of HW ATR eviction · 6964e53f
      Jacob Keller 提交于
      A recent commit to refactor the driver and remove the hw_disabled_flags
      field accidentally introduced two regressions. First, we overwrote
      pf->flags which removed various key flags including the MSI-X settings.
      
      Additionally, it was intended that we have now two flags,
      HW_ATR_EVICT_CAPABLE and HW_ATR_EVICT_ENABLED, but this was not done,
      and we accidentally were mis-using HW_ATR_EVICT_CAPABLE everywhere.
      
      This patch adds the missing piece, HW_ATR_EVICT_ENABLED, and safely
      updates pf->flags instead of overwriting it.
      
      Without this patch we will have many problems including disabling MSI-X
      support, and we'll attempt to use HW ATR eviction on devices which do
      not support it.
      
      Fixes: 47994c11 ("i40e: remove hw_disabled_flags in favor of using separate flag bits", 2017-04-19)
      Signed-off-by: NJacob Keller <jacob.e.keller@intel.com>
      Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6964e53f
  7. 06 6月, 2017 1 次提交
  8. 31 5月, 2017 3 次提交
  9. 30 4月, 2017 3 次提交
    • J
      i40e: remove hw_disabled_flags in favor of using separate flag bits · 47994c11
      Jacob Keller 提交于
      The hw_disabled_flags field was added as a way of signifying that
      a feature was automatically or temporarily disabled. However, we
      actually only use this for FDir features. Replace its use with new
      _AUTO_DISABLED flags instead. This is more readable, because you aren't
      setting an *_ENABLED flag to *disable* the feature.
      
      Additionally, clean up a few areas where we used these bits. First, we
      don't really need to set the auto-disable flag for ATR if we're fully
      disabling the feature via ethtool.
      
      Second, we should always clear the auto-disable bits in case they somehow
      got set when the feature was disabled. However, avoid displaying
      a message that we've re-enabled the feature.
      
      Third, we shouldn't be re-enabling ATR in the SB ntuple add flow,
      because it might have been disabled due to space constraints. Instead,
      we should just wait for the fdir_check_and_reenable to be called by the
      watchdog.
      
      Overall, this change allows us to simplify some code by removing an
      extra field we didn't need, and the result should make it more clear as
      to what we're actually doing with these flags.
      Signed-off-by: NJacob Keller <jacob.e.keller@intel.com>
      Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      47994c11
    • J
      i40e: use DECLARE_BITMAP for state fields · 0da36b97
      Jacob Keller 提交于
      Instead of assuming our flags fit within an unsigned long, use
      DECLARE_BITMAP which will ensure that we always allocate enough space.
      Additionally, use __I40E_STATE_SIZE__ markers as the last element of the
      enumeration so that the size of the BITMAP is compile-time assigned
      rather than programmer-time assigned. This ensures that potential future
      flag additions do not actually overrun the array. This is especially
      important as 32bit systems would only have 32bit longs instead of 64bit
      longs as we generally have assumed in the prior code.
      
      This change also removes a dereference of the state fields throughout
      the code, so it does have a bit of code churn. The conversions were
      automated using sed replacements with an alternation
      
        s/&(vsi->back|vsi|pf)->state/\1->state/
        s/&adapter->vsi.state/adapter->vsi.state/
      
      For debugfs, we modify the printing so that we can display chunks of the
      state value on new lines. This ensures that we can print the entire set
      of state values. Additionally, we now print them as 08lx to ensure that
      they display nicely.
      Signed-off-by: NJacob Keller <jacob.e.keller@intel.com>
      Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      0da36b97
    • J
      i40e: separate PF and VSI state flags · d19cb64b
      Jacob Keller 提交于
      Avoid using the same named flags for both vsi->state and pf->state. This
      makes code review easier, as it is more likely that future authors will
      use the correct state field when checking bits. Previous commits already
      found issues with at least one check, and possibly others may be
      incorrect.
      
      This reduces confusion as it is more clear what each flag represents,
      and which flags are valid for which state field.
      Signed-off-by: NJacob Keller <jacob.e.keller@intel.com>
      Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      d19cb64b
  10. 20 4月, 2017 2 次提交
    • S
      i40e/i40evf: Add tracepoints · ed0980c4
      Scott Peterson 提交于
      This patch adds tracepoints to the i40e and i40evf drivers to which
      BPF programs can be attached for feature testing and verification.
      It's expected that an attached BPF program will identify and count or
      log some interesting subset of traffic. The bcc-tools package is
      helpful there for containing all the BPF arcana in a handy Python
      wrapper. Though you can make these tracepoints log trace messages, the
      messages themselves probably won't be very useful (other to verify the
      tracepoint is being called while you're debugging your BPF program).
      
      The idea here is that tracepoints have such low performance cost when
      disabled that we can leave these in the upstream drivers. This may
      eventually enable the instrumentation of unmodified customer systems
      should the need arise to verify a NIC feature is working as expected.
      In general this enables one set of feature verification tools to be
      used on these drivers whether they're built with the kernel or
      separately.
      
      Users are advised against using these tracepoints for anything other
      than a diagnostic tool. They have a performance impact when enabled,
      and their exact placement and form may change as we see how well they
      work in practice for the purposes above.
      
      Change-ID: Id6014a7322c0e6d08068114dd20bd156f2f6435e
      Signed-off-by: NScott Peterson <scott.d.peterson@intel.com>
      Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      ed0980c4
    • A
      i40e: Fix support for flow director programming status · 0e626ff7
      Alexander Duyck 提交于
      This patch fixes an issue I introduced when I converted the code over to
      using the length field to determine if a descriptor was done or not. It
      turns out that we are also processing programming descriptors in the Rx
      path and need to have these processed even though the length field will be
      0 on these packets.  What will happen with a programming descriptor is that
      we will receive a descriptor that has the SPH bit set, and the header
      length and packet length fields cleared.
      
      To account for this we should be checking for the bit for split header
      being set even though we aren't actually using header split. This bit is
      set in the length field to indicate if a programming descriptor response is
      contained in the descriptor. Since we don't support header split we don't
      need to perform the extra checks of using a fixed value for the entire
      length field.
      
      In addition I am moving the function for checking if a filter is a
      programming status filter into the i40e_txrx.c file since there is no
      longer support for FCoE it doesn't make sense to keep this file in i40e.h.
      
      Change-ID: I12c359c3dc70adb9d6b92b27324bb2c7f04c1a06
      Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      0e626ff7
  11. 08 4月, 2017 6 次提交
  12. 29 3月, 2017 4 次提交
  13. 28 3月, 2017 4 次提交
  14. 24 3月, 2017 2 次提交
    • J
      i40e: add support for SCTPv4 FDir filters · f223c875
      Jacob Keller 提交于
      Enable FDir filters for SCTPv4 packets using the ethtool ntuple
      interface to enable filters. The ethtool API does not allow masking on
      the verification tag.
      
      Change-Id: I093e88a8143994c7e6f4b7b17a0bd5cf861d18e4
      Signed-off-by: NJacob Keller <jacob.e.keller@intel.com>
      Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      f223c875
    • J
      i40e: implement support for flexible word payload · 0e588de1
      Jacob Keller 提交于
      Add support for flexible payloads passed via ethtool user-def field.
      This support is somewhat limited due to hardware design. The input set
      can only be programmed once per filter type, and the flexible offset is
      part of this filter input set. This means that the user cannot program
      both a regular and a flexible filter at the same time for a given flow
      type. Additionally, the user may not program two flexible filters of the
      same flow type with different offsets, although they are allowed to
      configure different values at that offset location.
      
      We support a single flexible word (2byte) value per protocol type, and
      we handle the FLX_PIT register using a list of flexible entries so that
      each flow type may be configured separately.
      
      Due to hardware implementation, the flexible data is offset from the
      start of the packet payload, and thus may not be in part of the header
      data. For this reason, the offset provided by the user defined data is
      interpreted as a byte offset from the start of the matching payload.
      Previous implementations have tried to represent the offset as from the
      start of the frame, but this is not feasible because header sizes may
      change due to options.
      
      Change-Id: 36ed27995e97de63f9aea5ade5778ff038d6f811
      Signed-off-by: NJacob Keller <jacob.e.keller@intel.com>
      Tested-by: NAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      0e588de1
  15. 21 3月, 2017 4 次提交