1. 30 Sep, 2017 5 commits
    • i40e: refactor FW version checking · 22b96551
      Mitch Williams committed
      The i40e driver now supports two different devices with two different
      firmware versions. So be smart about how we handle these. Move the FW
      version macros to the appropriate header file, and add a convenience
      macro that checks the version based on the device. Then use this macro
      to check whether or not the driver can use the new link info API.
      Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
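
The device-dependent version check described in this commit can be sketched roughly as follows. This is an illustration only: the device families, the struct, the macro name, and the version thresholds are hypothetical placeholders, not the driver's actual definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical device families and minimum firmware API versions.
 * The real driver's names and thresholds are not shown in this log. */
enum dev_family { DEV_FAMILY_A, DEV_FAMILY_B };

#define FAMILY_A_MIN_MAJOR 1
#define FAMILY_A_MIN_MINOR 7
#define FAMILY_B_MIN_MAJOR 3
#define FAMILY_B_MIN_MINOR 1

struct fw_version {
	enum dev_family family;
	uint16_t api_major;
	uint16_t api_minor;
};

/* True when (maj, min) is at least (req_maj, req_min). */
#define FW_AT_LEAST(maj, min, req_maj, req_min) \
	((maj) > (req_maj) || ((maj) == (req_maj) && (min) >= (req_min)))

/* Convenience check that picks the threshold based on the device. */
static bool fw_supports_new_link_api(const struct fw_version *fw)
{
	switch (fw->family) {
	case DEV_FAMILY_A:
		return FW_AT_LEAST(fw->api_major, fw->api_minor,
				   FAMILY_A_MIN_MAJOR, FAMILY_A_MIN_MINOR);
	case DEV_FAMILY_B:
		return FW_AT_LEAST(fw->api_major, fw->api_minor,
				   FAMILY_B_MIN_MAJOR, FAMILY_B_MIN_MINOR);
	}
	return false;
}
```

Keeping the per-device thresholds next to one comparison macro means callers ask a single question ("is the new link info API usable?") without repeating version arithmetic.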
    • i40evf: fix ring to vector mapping · c97fc9b6
      Alan Brady committed
      The current implementation for mapping queues to vectors is broken
      because it attempts to map each Tx and Rx ring to its own vector.
      However, we use combined queues, so we should actually be mapping the
      Tx/Rx rings together on one vector.

      Also in the current implementation, in the case where we have more
      queues than vectors, we attempt to group the queues together into
      'chunks' and map each 'chunk' of queues to a vector. Chunking them
      together would be ideal if, and only if, we only had RSS, because of
      the way the hashing algorithm works. But for a future patch that
      enables VF ADq, round robin assignment is better and still works
      with RSS.

      This patch resolves both those issues and simplifies the code needed to
      accomplish this. Instead of treating the case where we have more queues
      than vectors as special, if we notice our vector index has run past the
      available vectors, reset the vector index to zero and continue mapping.
      This should ensure that in both cases, whether we have enough vectors
      for each queue or not, the queues get appropriately mapped.
      Signed-off-by: Alan Brady <alan.brady@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
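
The round robin assignment described in this commit can be sketched as below. The function name and the plain array used to record the mapping are assumptions made for illustration; the driver works on its own ring and q_vector structures.

```c
#include <stddef.h>

/* Assign each combined Tx/Rx queue pair to a vector in round robin order.
 * map[q] receives the vector index used by both rings of queue pair q.
 * Hypothetical helper: the real driver links ring and q_vector structs. */
static void map_queues_to_vectors(unsigned int num_queues,
				  unsigned int num_vectors,
				  unsigned int *map)
{
	unsigned int vector = 0;
	unsigned int q;

	for (q = 0; q < num_queues; q++) {
		map[q] = vector;

		/* Wrap to the first vector once we run out of vectors. */
		if (++vector >= num_vectors)
			vector = 0;
	}
}
```

With 5 queue pairs and 2 vectors this yields the mapping 0, 1, 0, 1, 0. With at least as many vectors as queues, the wrap never triggers and each pair gets its own vector, so no special case is needed.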
    • i40e: shutdown all IRQs and disable MSI-X when suspended · b980c063
      Jacob Keller committed
      On some platforms with a large number of CPUs, we will allocate many IRQ
      vectors. When hibernating, the system will attempt to migrate all of the
      vectors back to CPU0 when shutting down all the other CPUs. It is
      possible that we have so many vectors that it cannot re-assign them to
      CPU0. This is even more likely if we have many devices installed in one
      platform.
      
      The end result is failure to hibernate, as it is not possible to
      shut down the CPUs. We can avoid this by disabling MSI-X and clearing our
      interrupt scheme when the device is suspended. A better solution
      would be some method for the stack to properly handle this for all
      drivers, rather than leaving each driver to fix itself on a
      case-by-case basis.

      However, until this better solution exists, we can do our part and
      shut down our IRQs during suspend, which should allow systems with
      a large number of CPUs to safely suspend or hibernate.
      
      It may be worth investigating if we should shut down even further when
      we suspend as it may make the path cleaner, but this was the minimum fix
      for the hibernation issue mentioned here.
      
      Testing-hints:
        This affects systems with a large number of CPUs, and with multiple
        devices enabled. Without this change, those platforms are unable to
        hibernate at all.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • i40evf: lower message level · 905770fa
      Mitch Williams committed
      We see this message regularly on VF reset or unload (which invokes a
      reset). It's essentially meaningless unless it's happening constantly.
      To prevent consternation, lower the log level to debug so it's not seen
      under normal circumstances.
      Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • i40e/i40evf: rename bytes_per_int to bytes_per_usec · 2b634bb0
      Jacob Keller committed
      This value is not calculating bytes_per_int, which would actually just
      be bytes/ITR_COUNTDOWN_START, but rather it's calculating bytes/usecs.
      
      Rename the variable for clarity so that future developers understand
      what the value is actually calculating.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
  2. 22 Sep, 2017 1 commit
  3. 28 Aug, 2017 8 commits
    • i40e/i40evf: avoid dynamic ITR updates when polling or low packet rate · 742c9875
      Jacob Keller committed
      The dynamic ITR algorithm depends on a calculation of usecs which
      assumes that the interrupts have been firing constantly at the interrupt
      throttle rate. This is not guaranteed because we could have a low packet
      rate, or have been polling in software.
      
      We'll estimate whether this is the case by using jiffies to determine
      whether too much time has elapsed. If the jiffies difference is larger
      than expected, we are guaranteed to have an incorrect calculation. If
      the jiffies difference is smaller, we might have been polling some, but
      the difference shouldn't affect the calculation too much.
      
      This ensures that we don't get stuck in BULK latency during certain rare
      situations where we receive bursts of packets that force us into NAPI
      polling.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
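
A minimal sketch of the jiffies-based staleness check described in this commit. The struct, the function names, and the `max_gap` threshold are assumptions for illustration; the commit message does not state the threshold the driver uses. The comparison follows the wraparound-safe style of the kernel's time_after() helper.

```c
#include <stdbool.h>

/* Wraparound-safe "a is later than b" for a free-running tick counter,
 * in the style of the kernel's time_after() helper. */
static bool ticks_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Hypothetical state: the tick at which ITR was last recalculated. */
struct itr_state {
	unsigned long last_update;
};

/* Returns true when the dynamic ITR calculation should run, and false
 * when too many ticks have passed for the usecs estimate to be trusted.
 * The timestamp is refreshed either way (an assumption of this sketch). */
static bool itr_should_update(struct itr_state *st, unsigned long now,
			      unsigned long max_gap)
{
	bool stale = ticks_after(now, st->last_update + max_gap);

	st->last_update = now;
	return !stale;
}
```

Using a signed difference rather than a direct `now > last + max_gap` comparison keeps the check correct when the tick counter wraps around.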
    • i40e/i40evf: remove ULTRA latency mode · 0a2c7722
      Jacob Keller committed
      In commit c56625d5 ("i40e/i40evf: change dynamic interrupt
      thresholds"), a new higher-latency ITR setting called I40E_ULTRA_LATENCY
      was added with a cryptic comment about how it was meant for adjusting Rx
      more aggressively when streaming small packets.

      This mode was attempting to calculate packets per second and then kick
      in when we have a huge number of small packets.

      Unfortunately, the ULTRA setting was kicking in for workloads it wasn't
      intended for, including single-thread UDP_STREAM workloads.
      
      This wasn't caught for a variety of reasons. First, the ip_defrag
      routines were improved somewhat which makes the UDP_STREAM test still
      reasonable at 10GbE, even when dropped down to 8k interrupts a second.
      Additionally, some other obvious workloads appear to work fine, such
      as TCP_STREAM.
      
      The number 40k doesn't make sense for a number of reasons. First, we
      absolutely can do more than 40k packets per second. Second, we calculate
      the value inline in an integer, which sometimes can overflow resulting
      in using incorrect values.
      
      If we fix this overflow it makes it even more likely that we'll enter
      ULTRA mode which is the opposite of what we want.
      
      The ULTRA mode was originally added as a way to reduce CPU utilization
      during a small packet workload where we weren't keeping up anyway. It
      should never have been kicking in during these other workloads.
      
      Given the issues outlined above, let's remove the ULTRA latency mode. If
      necessary, a better solution to the CPU utilization issue for small
      packet workloads will be added in a future patch.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
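
The overflow risk mentioned in this commit can be illustrated with a generic packets-per-second calculation. The driver's actual expression is not shown in the commit message, so the formula and function names below are assumptions; the point is only that scaling a counter inside a 32-bit integer can wrap before the division happens.

```c
#include <stdint.h>

/* Packets per second computed entirely in 32 bits: the multiplication
 * wraps modulo 2^32 once packets * 1000000 no longer fits. */
static uint32_t pps_32bit(uint32_t packets, uint32_t usecs)
{
	if (usecs == 0)
		return 0;
	return (uint32_t)(packets * UINT32_C(1000000)) / usecs;
}

/* Same calculation with a 64-bit intermediate, which does not wrap
 * for any 32-bit packet count. */
static uint64_t pps_64bit(uint32_t packets, uint32_t usecs)
{
	if (usecs == 0)
		return 0;
	return (uint64_t)packets * UINT64_C(1000000) / usecs;
}
```

For 5000 packets in 1000 usecs, the 64-bit version gives 5000000, while the 32-bit version wraps and gives 705032, an underestimate of the real rate.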
    • i40e: invert logic for checking incorrect cpu vs irq affinity · 6d977729
      Jacob Keller committed
      In commit 96db776a ("i40e/vf: fix interrupt affinity bug")
      we added some code to force exit of polling in case we did
      not have the correct CPU. This is important since it was possible for
      the IRQ affinity to be changed while the CPU is pegged at 100%. This can
      result in the polling routine being stuck on the wrong CPU until
      traffic finally stops.
      
      Unfortunately, the implementation, "if the CPU is correct, exit as
      normal, otherwise, fall-through to the end-polling exit" is incredibly
      confusing to reason about. In this case, the normal flow looks like the
      exception, while the exception actually occurs far away from the if
      statement and comment.
      
      We recently discovered and fixed a bug in this code because we were
      incorrectly initializing the affinity mask.
      
      Re-write the code so that the exceptional case is handled at the check,
      rather than having the logic be spread through the regular exit flow.
      This does end up with minor code duplication, but the resulting code is
      much easier to reason about.
      
      The new logic is identical, but inverted. If we are running on a CPU not
      in our affinity mask, we'll exit polling. However, the code flow is much
      easier to understand.
      
      Note that we don't actually have to check for MSI-X, because in the MSI
      case we'll only have one q_vector, but its default affinity mask should
      be correct as it includes all CPUs when it's initialized. Further, we
      could at some point add code to set up the notifier for the non-MSI-X
      case and enable this workaround for that case too, if desired, though
      there isn't much gain since it's unlikely to be the common case.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
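
The inverted control flow described in this commit can be sketched as follows. The 64-bit mask, the enum, and the function names are simplifications invented for this illustration; the driver uses the kernel's cpumask API and its own NAPI exit paths.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcomes of finishing a poll cycle. */
enum poll_exit {
	POLL_EXIT_NORMAL,	/* complete polling and re-enable the IRQ */
	POLL_EXIT_WRONG_CPU	/* stop polling so the IRQ can move CPUs */
};

/* Simplified stand-in for a cpumask test, limited to 64 CPUs. */
static bool cpu_in_mask(uint64_t mask, unsigned int cpu)
{
	return cpu < 64 && ((mask >> cpu) & 1);
}

static enum poll_exit finish_poll(uint64_t affinity_mask, unsigned int cpu)
{
	/* Handle the exceptional case right at the check... */
	if (!cpu_in_mask(affinity_mask, cpu))
		return POLL_EXIT_WRONG_CPU;

	/* ...so the normal exit reads as the normal flow. */
	return POLL_EXIT_NORMAL;
}
```

The behavior matches the original description (exit polling when running on a CPU outside the affinity mask); only the placement of the exceptional branch changes.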
    • i40e: initialize our affinity_mask based on cpu_possible_mask · 759dc4a7
      Jacob Keller committed
      On older kernels a call to irq_set_affinity_hint does not guarantee that
      the IRQ affinity will be set. If nothing else on the system sets the IRQ
      affinity this can result in a bug in the i40e_napi_poll() routine where
      we notice that our interrupt fired on the "wrong" CPU according to our
      internal affinity_mask variable.
      
      This results in a bug where we continuously tell NAPI to stop polling to
      move the interrupt to a new CPU, but the CPU never changes because our
      affinity mask does not match the actual mask set up for the IRQ.

      The root problem is a mismatched affinity mask value. So let's initialize
      the value to cpu_possible_mask instead. This ensures that prior to the
      first time we get an IRQ affinity notification we'll have the mask set
      to include every possible CPU.

      We use cpu_possible_mask instead of cpu_online_mask since the former is
      almost certainly never going to change, while the latter might change
      after we've made a copy.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
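
A rough sketch of why initializing the mask to every possible CPU avoids the mismatch described in this commit. The struct, the 64-bit mask, and the function names are hypothetical simplifications; the driver copies cpu_possible_mask into a kernel cpumask instead.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-vector state with a simplified 64-bit CPU mask. */
struct vector_state {
	uint64_t affinity_mask;
};

/* Mask with one bit set for each of the first nr_cpus CPUs. */
static uint64_t all_cpus_mask(unsigned int nr_cpus)
{
	return nr_cpus >= 64 ? UINT64_MAX : (UINT64_C(1) << nr_cpus) - 1;
}

/* Start with every possible CPU allowed, so the "wrong CPU" check can
 * never trip before the first affinity notification arrives. */
static void vector_init(struct vector_state *vs, unsigned int nr_possible)
{
	vs->affinity_mask = all_cpus_mask(nr_possible);
}

/* Called when the IRQ affinity actually changes. */
static void vector_affinity_notify(struct vector_state *vs, uint64_t mask)
{
	vs->affinity_mask = mask;
}

static bool on_allowed_cpu(const struct vector_state *vs, unsigned int cpu)
{
	return cpu < 64 && ((vs->affinity_mask >> cpu) & 1);
}
```

Until a notification narrows the mask, every CPU passes the check, so the poll routine has no reason to keep exiting in an attempt to move the interrupt.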
    • i40e/i40evf: support for VF VLAN tag stripping control · 8774370d
      Mariusz Stachura committed
      This patch gives the VF the capability to control VLAN tag stripping via
      ethtool. As rx-vlan-offload was fixed before, the VF is now able to
      change it using "ethtool --offload <IF> rxvlan on/off" settings.
      Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • i40evf: fix possible snprintf truncation of q_vector->name · 696ac80a
      Jacob Keller committed
      The q_vector names are based on the interface name with a driver prefix,
      the type of q_vector setup, and the queue number. We previously set the
      size of this variable to IFNAMSIZ + 9, which is incorrect, because we
      actually include a minimum of 14 characters extra beyond the interface
      name size.
      
      GCC 7 and later include a new warning that detects this possible
      truncation and complains. We can fix this by increasing the size so
      that a long interface name is not truncated. We don't need to go
      beyond 14 because the compiler is smart enough to realize our values
      can never exceed one character in size. We do go up to 15 here because
      possible future changes may increase the number of queues beyond one
      digit.
      
      While we are here, also change some variables to be unsigned (since they
      are never negative) and stop using an extra unnecessary %s format
      specifier.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
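
The buffer sizing discussed in this commit can be checked with a small userspace sketch. The format string below is an assumption chosen so that the fixed text plus a one-digit queue number adds 14 characters to the interface name, as the commit message describes; the helper name is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#ifndef IFNAMSIZ
#define IFNAMSIZ 16	/* Linux interface name limit, including the NUL */
#endif

/* Format a vector name and report whether it fit without truncation.
 * snprintf() returns the length the full string would have had, so a
 * return value >= len means the output was cut short. */
static bool format_vector_name(char *buf, size_t len, const char *ifname,
			       unsigned int queue)
{
	int n = snprintf(buf, len, "i40evf-%s-TxRx-%u", ifname, queue);

	return n >= 0 && (size_t)n < len;
}
```

With a 15-character interface name, a buffer of IFNAMSIZ + 9 bytes truncates the name, while IFNAMSIZ + 15 bytes holds it even with a two-digit queue number.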
    • i40e: prevent snprintf format specifier truncation · b5d5504a
      Jacob Keller committed
      Increase the size of the prefix buffer so that it can hold enough
      characters for every possible input. Although 20 is enough for all
      expected inputs, it is possible for the values to be larger than
      expected, resulting in a possibly truncated string. Additionally, let's
      use sizeof(prefix) in order to ensure we use the correct size if we need
      to change the array length in the future.

      GCC 7 and later now include warnings about possible truncation unless
      you handle the return code. At most 27 bytes can be written here, so
      let's just increase the buffer size even if for all expected hw->bus.*
      values we only needed 20.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • i40e: Store the requested FEC information · ed601f66
      Mariusz Stachura committed
      Store information about the FEC modes that were requested. It will be
      used by the function that prints link status information, so there is no
      need to call the admin queue there.
      Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
  4. 26 Aug, 2017 7 commits
  5. 26 Jul, 2017 4 commits
  6. 21 Jun, 2017 2 commits
  7. 06 Jun, 2017 1 commit
  8. 02 Jun, 2017 6 commits
  9. 31 May, 2017 3 commits
  10. 30 Apr, 2017 3 commits