1. 29 Mar, 2017 3 commits
  2. 28 Mar, 2017 2 commits
  3. 24 Mar, 2017 5 commits
    • i40e: add support for SCTPv4 FDir filters · f223c875
      Jacob Keller committed
      Enable FDir filters for SCTPv4 packets using the ethtool ntuple
      interface to enable filters. The ethtool API does not allow masking on
      the verification tag.
      
      Change-Id: I093e88a8143994c7e6f4b7b17a0bd5cf861d18e4
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • i40e: implement support for flexible word payload · 0e588de1
      Jacob Keller committed
      Add support for flexible payloads passed via ethtool user-def field.
      This support is somewhat limited due to hardware design. The input set
      can only be programmed once per filter type, and the flexible offset is
      part of this filter input set. This means that the user cannot program
      both a regular and a flexible filter at the same time for a given flow
      type. Additionally, the user may not program two flexible filters of the
      same flow type with different offsets, although they are allowed to
      configure different values at that offset location.
      
      We support a single flexible word (2-byte) value per protocol type, and
      we handle the FLX_PIT register using a list of flexible entries so that
      each flow type may be configured separately.
      
      Due to hardware implementation, the flexible data is offset from the
      start of the packet payload, and thus may not be part of the header
      data. For this reason, the offset provided by the user defined data is
      interpreted as a byte offset from the start of the matching payload.
      Previous implementations have tried to represent the offset as from the
      start of the frame, but this is not feasible because header sizes may
      change due to options.
      
      Change-Id: 36ed27995e97de63f9aea5ade5778ff038d6f811
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
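The restrictions described in the flexible word payload commit can be sketched as a small compatibility check. This is only an illustration under stated assumptions: `struct flow_type_state` and `flow_type_add_filter` are hypothetical names, not the driver's real structures, and the check models just the two rules from the message (no mixing of regular and flexible filters per flow type, and one shared offset for all flexible filters of a flow type).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-flow-type bookkeeping; not the driver's real structures. */
struct flow_type_state {
	unsigned int regular_filters; /* filters that do not use a flexible word */
	unsigned int flex_filters;    /* filters that match on a flexible word */
	uint16_t flex_offset;         /* payload offset shared by all flex filters */
};

/*
 * Decide whether a new filter is compatible with what is already programmed
 * for this flow type. Returns 0 and records the filter on success, or -1 if
 * the request conflicts with the current input set.
 */
static int flow_type_add_filter(struct flow_type_state *st, bool is_flex,
				uint16_t offset)
{
	if (!is_flex) {
		/* A regular filter cannot coexist with flexible filters. */
		if (st->flex_filters)
			return -1;
		st->regular_filters++;
		return 0;
	}

	/* A flexible filter cannot coexist with regular filters. */
	if (st->regular_filters)
		return -1;

	/* All flexible filters of one flow type must use the same offset. */
	if (st->flex_filters && st->flex_offset != offset)
		return -1;

	st->flex_offset = offset;
	st->flex_filters++;
	return 0;
}
```

With this sketch, two flexible filters at offset 8 are accepted for the same flow type, while a third at offset 12 or any regular filter is rejected until the flexible filters are removed.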
    • i40e: add parsing of flexible filter fields from userdef · e793095e
      Jacob Keller committed
      Add code to parse the user-def field into a data structure format. This
      code is intended to allow future extensions of the user-def field by
      keeping all code that actually reads and writes the field into a single
      location. This ensures that we do not litter the driver with references
      to the user-def field and minimizes the amount of bitwise operations we
      need to do on the data.
      
      Add code which parses the lower 32 bits into a flexible word and its
      offset. This will be used in a future patch to enable flexible filters
      which can match on some arbitrary data in the packet payload. For now,
      we just return -EOPNOTSUPP when this is used.
      
      Add code to fill in the user-def field when reporting the filter back,
      even though we don't actually implement any user-def fields yet.
      
      Additionally, ensure that we mask the extended FLOW_EXT bit from the
      flow_type now that we will be accepting filters which have the FLOW_EXT
      bit set (and thus make use of the user-def field).
      
      Change-Id: I238845035c179380a347baa8db8223304f5f6dd7
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
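A minimal sketch of the idea in the userdef parsing commit, keeping all reads and writes of the user-def field in one place. The bit layout used here (flexible word in bits 15:0, offset in bits 31:16) and every name are assumptions made for illustration; the commit message only states that the lower 32 bits carry a flexible word and its offset.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: bits 15:0 hold the flexible word, bits 31:16 its offset. */
#define USERDEF_FLEX_WORD_MASK   0x000000000000FFFFULL
#define USERDEF_FLEX_OFFSET_MASK 0x00000000FFFF0000ULL
#define USERDEF_FLEX_MASK        0x00000000FFFFFFFFULL

/* Hypothetical parsed form of the 64-bit user-def field. */
struct userdef_info {
	bool flex_filter;     /* true if a flexible word was requested */
	uint16_t flex_word;   /* value to match */
	uint16_t flex_offset; /* where in the payload to match it */
};

/*
 * Parse the user-def value and mask into a structure. Returns 0 on success,
 * or -1 if the mask selects bits this sketch does not understand or only
 * part of the flexible field.
 */
static int parse_userdef(uint64_t value, uint64_t mask,
			 struct userdef_info *out)
{
	out->flex_filter = false;
	out->flex_word = 0;
	out->flex_offset = 0;

	if (mask & ~USERDEF_FLEX_MASK)
		return -1;
	if ((mask & USERDEF_FLEX_MASK) == 0)
		return 0; /* field not used by this filter */
	if ((mask & USERDEF_FLEX_MASK) != USERDEF_FLEX_MASK)
		return -1; /* partial masks are not modelled */

	out->flex_filter = true;
	out->flex_word = (uint16_t)(value & USERDEF_FLEX_WORD_MASK);
	out->flex_offset = (uint16_t)((value & USERDEF_FLEX_OFFSET_MASK) >> 16);
	return 0;
}

/* Rebuild the user-def value when reporting a filter back to the caller. */
static uint64_t fill_userdef(const struct userdef_info *in)
{
	if (!in->flex_filter)
		return 0;
	return ((uint64_t)in->flex_offset << 16) | in->flex_word;
}
```

Keeping the shifts and masks inside these two helpers means the rest of the code only ever touches `struct userdef_info`, which is the design goal the message describes.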
    • i40e: restore default input set for each flow type · 3bcee1e6
      Jacob Keller committed
      Ensure that the default input set is correctly reprogrammed when
      cleaning up after disabling flow director support. This ensures that the
      programmed value will be in a clean state.
      
      Although we do not yet have support for SCTPv4 filters, a future patch
      will add support for this protocol, so we will correctly restore the
      SCTPv4 input set here as well. Note that strictly speaking the default
      hardware value for SCTP includes matching the verification tag. However,
      the ethtool API does not have support for specifying this value, so
      there is no reason to keep the verification field enabled.
      
      This patch is the next step on the way to enabling partial tuple filters
      which will be implemented in a following patch.
      
      Change-Id: Ic22e1c267ae37518bb036aca4a5694681449f283
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • i40e: check current configured input set when adding ntuple filters · 36777d9f
      Jacob Keller committed
      Do not assume that hardware has been programmed with the default mask,
      but instead read the input set registers to determine what is currently
      programmed. This ensures that all programmed filters match exactly how
      the hardware will interpret them, avoiding confusion regarding filter
      behavior.
      
      This sets the initial ground-work for allowing custom input sets where
      some fields are disabled. A future patch will fully implement this
      feature.
      
      Instead of using bitwise negation, we'll just explicitly check for the
      correct value. htonl and htons are used to silence sparse warnings. The
      compiler should be able to handle the constant value and avoid actually
      performing a byteswap.
      
      Change-Id: I3d8db46cb28ea0afdaac8c5b31a2bfb90e3a4102
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
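The explicit comparison described in the input set commit might look roughly like the following. This is a sketch under assumptions: the `FIELD_*` bits and `check_ip4_field` are hypothetical, and it assumes a field is usable only when the user's mask is either all ones (match the field) or zero (ignore the field) and that choice agrees with what the hardware input set currently has enabled.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Hypothetical input set bits; the real register layout differs. */
#define FIELD_L3_SRC (1u << 0)
#define FIELD_L3_DST (1u << 1)

/*
 * Compare one user-supplied IPv4 mask (in network byte order) against the
 * input set read back from hardware. Returns 0 when the request agrees with
 * the programmed input set, -1 otherwise.
 */
static int check_ip4_field(uint32_t input_set, uint32_t field_bit,
			   uint32_t be_mask)
{
	int hw_matches = (input_set & field_bit) != 0;

	/* Compare against constants instead of negating the mask. */
	if (be_mask == htonl(0xFFFFFFFF))
		return hw_matches ? 0 : -1; /* user wants the field matched */
	if (be_mask == 0)
		return hw_matches ? -1 : 0; /* user wants the field ignored */

	return -1; /* partial masks are not supported in this sketch */
}
```

Wrapping the constant in `htonl` keeps both sides of the comparison in the same byte order annotation, which is what quiets sparse in kernel code; for an all-ones constant the swap is a no-op.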
  4. 21 Mar, 2017 3 commits
  5. 15 Mar, 2017 2 commits
    • i40e: rename auto_disable_flags to hw_disabled_flags · b77ac975
      Harshitha Ramamurthy committed
      A previous commit introduced a field that tracks the features
      that are disabled due to HW resource limitations as opposed
      to the features disabled by the user. This patch changes the
      name of the field to make it more readable since it might get
      confusing when looking at code containing both the flags
      field and the auto_disable_flags field together.
      
      Change-ID: Idcc9888659698f6fe3ccff17c8c3f09b5026f708
      Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • i40e: KISS the client interface · 0ef2d5af
      Mitch Williams committed
      (KISS is Keep It Simple, Stupid. Or is it?)
      
      The client interface is vastly overengineered for what it needs to do.
      It was originally designed to support multiple clients on multiple
      netdevs, possibly even with multiple drivers. None of this happened,
      and now we know that there will only ever be one client for i40e
      (i40iw) and one for i40evf (i40iwvf). So, time for some KISS. Since
      i40e and i40evf are a Dynasty, we'll simplify this one to match the
      VF interface.
      
      First, be a Destroyer and remove all of the lists and locks required
      to support multiple clients. Keep one static around to keep track of
      one client, and track the client instances for each netdev in the
      driver's pf (or adapter) struct. Now it's Almost Human.
      
      Since we already know the client type is iWarp, get rid of any checks
      for this. Same for VSI type - it's always going to be the same type,
      so it's just a Parasite.
      
      While we're at it, fix up some comments. This makes the function
      headers actually match the functions.
      
      These changes reduce code complexity, simplify maintenance,
      squash some lurking timing bugs, and allow us to Rock and Roll All
      Nite.
      
      Change-ID: I1ea79948ad73b8685272451440a34507f9a9012e
      Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
  6. 19 Feb, 2017 2 commits
    • i40e: Error handling for link event · ae136708
      Harshitha Ramamurthy committed
      There exists an intermittent bug which causes the 'Link Detected'
      field reported by the 'ethtool <iface>' command to be 'Yes' when
      in fact, there is no link. This patch fixes the problem by
      enabling temporary link polling when i40e_get_link_status returns
      an error. This causes the driver to remember that an admin queue
      command failed and to keep polling until the function returns successfully.
      
      Change-Id: I64c69b008db4017b8729f3fc27b8f65c8fe2eaa0
      Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • i40e: enable mc magic pkt wakeup during power down · 1d68005d
      Joshua Hay committed
      This patch adds a call to the mac_address_write admin q function during
      power down to update the PRTPM_SAH/SAL registers with the MC_MAG_EN bit
      thus enabling multicast magic packet wakeup.
      
      A FW workaround is needed to write the multicast magic wake up enable
      bit in the PRTPM_SAH register. The FW expects the mac address write
      admin q cmd to be called first with one of the WRITE_TYPE_LAA flags
      and then with the multicast relevant flags.
      
      *Note: This solution only works for X722 devices currently. A PFR will
      clear the previously mentioned bit by default, but X722 has support for a
      WOL_PRESERVE_ON_PFR flag which prevents the bit from being cleared. Once
      other devices support this flag, this solution should work as well.
      
      Change-ID: I51bd5b8535bd9051c2676e27c999c1657f786827
      Signed-off-by: Joshua Hay <joshua.a.hay@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
  7. 12 Feb, 2017 1 commit
    • i40e: avoid race condition when sending filters to firmware for addition · 671889e6
      Jacob Keller committed
      Refactor how we add new filters to firmware to avoid a race condition
      that can occur due to removing filters from the hash temporarily.
      
      To understand the race condition, suppose that you have a number of MAC
      filters, but have not yet added any VLANs. Now, add two VLANs in rapid
      succession. A possible resulting flow would look something like the
      following:
      
      (1) lock hash for add VLAN
      (2) add the new MAC/VLAN combos for each current MAC filter
      (3) unlock hash
      (4) lock hash for filter sync
      (5) notice that we have a VLAN, so prepare to update all MAC filters
          with VLAN=-1 to be VLAN=0.
      (6) move NEW and REMOVE filters to temporary list
      (7) unlock hash
      (8) lock hash for add VLAN
      (9) add new MAC/VLAN combos. Notice that no MAC filters are currently in
          the hash list, so we don't add any VLANs <--- BUG!
      (10) unlock hash
      (11) sync the temporary lists to firmware
      (12) lock hash for post-sync
      (13) move the temporary elements back to the main list
      ....
      
      Because we take filters out of the main hash into temporary lists, we
      introduce a narrow window where it is possible that other callers to the
      list will not see some of the filters which were previously added but
      have not yet been finalized. This results in sometimes dropping VLAN
      additions, and could also result in failing to add a MAC address on the
      newly added VLAN.
      
      One obvious way to avoid this race condition would be to lock the entire
      firmware process. Unfortunately this does not work because adminq
      firmware commands take a mutex which results in a sleep while atomic
      BUG(). So, we can't use the simplest approach.
      
      An alternative approach is to simply not remove the filters from the
      hash list while adding. Instead, add an i40e_new_mac_filter structure
      which we will use to track added filters. This avoids the need to remove
      the filter from the hash list. We'll store a pointer to the original
      i40e_mac_filter, along with our own copy of the state.
      
      We won't update the state directly, so as to avoid race with other code
      that may modify the state while under the lock. We are safe to read
      f->macaddr and f->vlan since these only change in two locations. The
      first is on filter creation, which must have already occurred. The
      second is inside i40e_correct_vlan_filters which was previously run
      after creation of this object and can't be run again until after. Thus,
      we should be safe to read the MAC address and VLAN while outside the
      lock.
      
      We also aren't going to run into a use-after-free issue because the only
      place where we free filters is when they are marked FAILED or when we
      remove them inside the sync subtask. Since the subtask has its own
      critical flag to prevent duplicate runs, we know this won't happen. We
      also know that the only location to transition a filter from NEW to
      FAILED is inside the subtask also, so we aren't worried about that
      either.
      
      Use the wrapper i40e_new_mac_filter for additions, and once we've
      finalized the addition to firmware, we will update the filter state
      inside a lock, and then free the wrapper structure.
      
      In order to avoid a possible race condition with filter deletion, we
      won't update the original filter state unless it is still
      I40E_FILTER_NEW when we finish the firmware sync.
      
      This approach is more complex, but avoids race conditions related to
      filters being temporarily removed from the list. We do not need the same
      behavior for deletion because we always unconditionally removed the
      filters from the list regardless of the firmware status.
      
      Change-Id: I14b74bc2301f8e69433fbe77ebca532db20c5317
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
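The wrapper approach from the race condition commit can be illustrated with a simplified sketch. The type and function names below are stand-ins chosen for this example (the commit names only `i40e_new_mac_filter`, `i40e_mac_filter`, and `I40E_FILTER_NEW`), and locking is left to the caller. The point shown is that the original filter stays in the hash the whole time, and its state is only overwritten if nothing else changed it during the firmware sync.

```c
/* Simplified stand-ins for the driver's filter states. */
enum filter_state { FILTER_NEW, FILTER_ACTIVE, FILTER_FAILED, FILTER_REMOVE };

/* Stand-in for the filter that lives in the hash list. */
struct mac_filter {
	unsigned char macaddr[6];
	int vlan;
	enum filter_state state;
};

/* Wrapper used while syncing: a pointer plus a private copy of the state. */
struct new_mac_filter {
	struct mac_filter *f;
	enum filter_state state;
};

/* Taken under the lock before the sync: snapshot the state, keep f in place. */
static void new_filter_init(struct new_mac_filter *n, struct mac_filter *f)
{
	n->f = f;
	n->state = f->state;
}

/*
 * Called with the lock held after firmware has answered. The result is only
 * written back if the filter is still NEW, so a deletion that raced with the
 * sync is not overwritten.
 */
static void new_filter_finalize(const struct new_mac_filter *n)
{
	if (n->f->state == FILTER_NEW)
		n->f->state = n->state;
}
```

During the sync, only `n->state` is updated with the firmware result; a filter that was marked for removal in the meantime keeps its removal state after `new_filter_finalize`.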
  8. 03 Feb, 2017 5 commits
  9. 09 Jan, 2017 1 commit
  10. 07 Dec, 2016 4 commits
  11. 03 Dec, 2016 1 commit
  12. 01 Nov, 2016 8 commits
    • i40e: removed unreachable code · 3aa7b74d
      Filip Sadowski committed
      Removed some unnecessary if statements and unreachable code found by
      a static code analysis tool.
      The return value of i40e_vsi_control_rings(..., false) is always 0. So,
      test for non-zero will never be true. The function has been split into
      "int i40e_vsi_start_rings()" and "void i40e_vsi_stop_rings()" for better
      understanding.
      Similarly, the function i40e_vsi_kill_vlan() never fails. So, checking
      for return value is also unnecessary. Function definition changed to void.
      The i40e_loopback_test() function is not implemented. The function and
      all references to loopback testing were removed.
      
      Change-ID: Id45cf66f6689ce2bc4e887de13f073e30e8431bd
      Signed-off-by: Filip Sadowski <filip.sadowski@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • i40e: Add common function for finding VSI by type · 4b816446
      Alexander Duyck committed
      This patch adds a common method for finding a VSI by type.  The main
      motivation for doing this is that the Flow Director path actually had two
      ways of handling this, one stopped on first match and one did not.  This
      patch makes it so that all callers of this function will get the same
      approach for finding a VSI.
      
      Change-ID: Ibf25de8acd8466582520694424aa87da66965fbd
      Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: Bimmy Pujari <bimmy.pujari@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • i40e: replace PTP Rx timestamp hang logic · 12490501
      Jacob Keller committed
      The current Rx timestamp hang logic is not very robust because it does
      not notice a register is hung until all four timestamps have been
      latched and we wait a full 5 seconds. Replace this logic with a newer Rx
      hang detection based on storing the jiffies when we first notice
      a receive timestamp event. We store each register's time separately,
      along with a flag indicating if it is currently latched. Upon first
      transitioning to latch, we will update the latch_events[i] jiffies
      value. This indicates the time we first noticed this event. The watchdog
      routine will simply check that either the flag has been cleared, or
      we have passed at least one second. In this case, it is able to clear
      the Rx timestamp register under the assumption that it was for a dropped
      frame. The benefit of this strategy is that we should be able to
      detect and clear out stalled RXTIME_H registers before we exhaust the
      supply of 4, and avoid complete stall of Rx timestamp events.
      
      Change-ID: Id55458c0cd7a5dd0c951ff2b8ac0b2509364131f
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
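The per-register hang detection described in the PTP Rx timestamp commit can be modelled in user space as follows. This is a sketch, not the driver code: `TICKS_PER_SEC` stands in for the kernel's HZ, plain tick counters stand in for jiffies, and all function names are hypothetical. Only `latch_events[i]` is taken from the commit message.

```c
#include <stdint.h>

#define RX_TS_REGS 4
#define TICKS_PER_SEC 1000UL /* stand-in for the kernel's HZ */

/* Hypothetical tracking state, one slot per Rx timestamp register. */
struct rx_ts_state {
	uint32_t latched;                       /* bit i: register i is latched */
	unsigned long latch_events[RX_TS_REGS]; /* tick when latch first seen */
};

/* Record the current hardware status and stamp newly latched registers. */
static void rx_ts_observe(struct rx_ts_state *s, uint32_t hw_status,
			  unsigned long now)
{
	uint32_t valid = hw_status & ((1u << RX_TS_REGS) - 1);
	uint32_t new_events = valid & ~s->latched;

	for (int i = 0; i < RX_TS_REGS; i++) {
		if (new_events & (1u << i))
			s->latch_events[i] = now;
	}
	s->latched = valid;
}

/*
 * Watchdog step: return a bitmask of registers that have stayed latched for
 * at least one second and can be cleared as belonging to a dropped frame.
 */
static uint32_t rx_ts_find_hung(const struct rx_ts_state *s, unsigned long now)
{
	uint32_t hung = 0;

	for (int i = 0; i < RX_TS_REGS; i++) {
		if (!(s->latched & (1u << i)))
			continue;
		/* Unsigned subtraction stays correct across counter wrap. */
		if (now - s->latch_events[i] >= TICKS_PER_SEC)
			hung |= 1u << i;
	}
	return hung;
}
```

Because each register carries its own timestamp, a single stalled register is reported on its own after one second, without waiting for all four to latch.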
    • i40e: use a mutex instead of spinlock in PTP user entry points · 19551262
      Jacob Keller committed
      We need a locking mechanism to protect the hardware SYSTIME register
      which is split over 2 values, and has internal hardware latching. We
      can't allow multiple accesses at the same time. However....
      
      The spinlock_t is overkill here, especially use of spin_lock_irqsave,
      since every PTP access will halt hardirqs. Notice that the only places
      which need the SYSTIME value are user context and are capable of sleeping.
      Thus, it is safe to use a mutex here instead of the spinlock.
      
      Change-ID: I971761a89b58c6aad953590162e85a327fbba232
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • i40e: avoid looping to check whether we're in VLAN mode · cbebb85f
      Jacob Keller committed
      We determine that a VSI is in vlan_mode whenever it has any filters
      with a VLAN other than -1 (I40E_VLAN_ALL). The previous method of doing
      so was to perform a loop whenever we needed the check. However, we can
      notice that the only place where filters are added (i40e_add_filter) can
      change the condition from false to true, and the only place we can
      return to false is in i40e_vsi_sync_filters_subtask. Thus, we can remove
      the loop and use a boolean directly.
      
      Doing this avoids looping over filters repeatedly especially while we're
      already inside a loop over all the filters. This should reduce the
      latency of filter operations throughout the driver.
      
      Change-ID: Iafde08df588da2a2ea666997d05e11fad8edc338
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • i40e: store MAC/VLAN filters in a hash with the MAC Address as key · 278e7d0b
      Jacob Keller committed
      Replace the mac_filter_list with a static size hash table of 8bits. The
      primary advantage of this is a decrease in latency of operations related
      to searching for specific MAC filters, including .set_rx_mode. Using
      a linked list resulted in several locations which were O(n^2). Using
      a hash table should give us latency growth closer to O(n*log(n)).
      
      Change-ID: I5330bd04053b880e670210933e35830b95948ebb
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
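To illustrate the data structure change in the hash table commit, here is a small user-space sketch of a fixed-size, 8-bit hash table keyed by MAC address. It does not use the kernel's hashtable helpers; the FNV-1a based hash, the chaining scheme, and all names are assumptions chosen for a self-contained example.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAC_LEN 6
#define HASH_BITS 8
#define HASH_SIZE (1u << HASH_BITS)

/* One MAC/VLAN filter, chained within its bucket. */
struct mac_entry {
	uint8_t mac[MAC_LEN];
	int vlan;
	struct mac_entry *next;
};

/* Static-size table: 2^8 buckets, keyed by the MAC address only. */
struct mac_table {
	struct mac_entry *buckets[HASH_SIZE];
};

/* Hypothetical hash: FNV-1a over the MAC bytes, folded down to 8 bits. */
static unsigned int mac_hash(const uint8_t *mac)
{
	uint32_t h = 2166136261u;

	for (int i = 0; i < MAC_LEN; i++) {
		h ^= mac[i];
		h *= 16777619u;
	}
	return (h ^ (h >> 8) ^ (h >> 16) ^ (h >> 24)) & (HASH_SIZE - 1);
}

/* Look up a filter; only the bucket for this MAC is walked. */
static struct mac_entry *mac_table_find(struct mac_table *t,
					const uint8_t *mac, int vlan)
{
	struct mac_entry *e;

	for (e = t->buckets[mac_hash(mac)]; e; e = e->next) {
		if (e->vlan == vlan && memcmp(e->mac, mac, MAC_LEN) == 0)
			return e;
	}
	return NULL;
}

/* Add a filter, or return the existing one. Returns NULL on allocation failure. */
static struct mac_entry *mac_table_add(struct mac_table *t,
				       const uint8_t *mac, int vlan)
{
	struct mac_entry *e = mac_table_find(t, mac, vlan);
	unsigned int b = mac_hash(mac);

	if (e)
		return e;
	e = calloc(1, sizeof(*e));
	if (!e)
		return NULL;
	memcpy(e->mac, mac, MAC_LEN);
	e->vlan = vlan;
	e->next = t->buckets[b];
	t->buckets[b] = e;
	return e;
}
```

Since the key is the MAC address alone, all VLAN variants of one address land in the same bucket, so a search for a specific MAC touches a short chain instead of the whole filter list.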
    • i40e: make use of __dev_uc_sync and __dev_mc_sync · 6622f5cd
      Jacob Keller committed
      The kernel provides __dev_uc_sync and __dev_mc_sync for drivers
      which need individual notification of add and delete for each filter.
      These functions allow us to vastly simplify our .set_rx_mode handler. We
      need to implement two functions for sync and unsync which add and remove
      filters respectively.
      
      This change avoids a very complex and inefficient algorithm which
      resulted in an abnormal latency for the .set_rx_mode NDO operation. The
      resulting code after this change is more readable, more efficient, and
      shorter.
      
      Due to the callback signature used by these functions we also must
      update several other functions to take a const u8 * pointer.
      
      Change-Id: I2ca7fd4e10c0c07ed2291db1ea41bf5987fc6474
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • i40e: drop is_vf and is_netdev fields in struct i40e_mac_filter · 1bc87e80
      Jacob Keller committed
      Originally the is_vf and is_netdev fields were added in order to
      distinguish between VF and netdev filters in a single VSI. However, it
      can be noted that we use separate VSI for SRIOV VFs and for netdev VSI.
      Thus, since a single VSI should only ever have one type of filter, we
      can simply remove the checks and remove the typing.
      
      In a similar fashion, we can note that the only remaining way to get
      multiple filters of a single type is through a debug command that was
      added to debugfs. This command is useless in practice, and causes
      bugs if we keep counter tracking but lose the is_vf and
      is_netdev protections as desired above.
      
      Since the only time we'd actually have a counter value besides 0 and
      1 is through use of this debugfs hook, we can remove this unnecessary
      command, and the entire counter logic it required.
      
      We vastly simplify mac filters by removing
      
      (a) the distinction between VF and netdev filters
      (b) counting logic
      (c) the ability to add and remove filters bypassing the stack via debugfs
      
      Change-ID: Idf916dd2a1159b1188ddbab5bef6b85ea6bf27d9
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
  13. 29 Oct, 2016 2 commits
    • i40e/i40evf: fix interrupt affinity bug · 96db776a
      Alan Brady committed
      There exists a bug in which a 'perfect storm' can occur and cause
      interrupts to fail to be correctly affinitized. This causes unexpected
      behavior and has a substantial impact on performance when it happens.
      
      The bug occurs if there is heavy traffic, any number of CPUs that have
      an i40e interrupt are pegged at 100%, and the interrupt affinity for
      those CPUs is changed.  Instead of moving to the new CPU, the interrupt
      continues to be polled while there is heavy traffic.
      
      The bug is most readily realized as the driver is first brought up and
      all interrupts start on CPU0. If there is heavy traffic and the
      interrupt starts polling before the interrupt is affinitized, the
      interrupt will be stuck on CPU0 until traffic stops. The bug, however,
      can also be brought out more simply by affinitizing all the interrupts
      to a single CPU and then attempting to move any of those interrupts off
      while there is heavy traffic.
      
      This patch fixes the bug by registering for update notifications from
      the kernel when the interrupt affinity changes. When that fires, we
      cache the intended affinity mask. Then, while polling, if the cpu is
      pegged at 100% and we failed to clean the rings, we check to make sure
      we have the correct affinity and stop polling if we're firing on the
      wrong CPU.  When the kernel successfully moves the interrupt, it will
      start polling on the correct CPU. The performance impact is minimal
      since the only time this section gets executed is when performance is
      already compromised by the CPU.
      
      Change-ID: I4410a880159b9dba1f8297aa72bef36dca34e830
      Signed-off-by: Alan Brady <alan.brady@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
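The decision made while polling in the interrupt affinity fix can be sketched as below. This is a user-space model with hypothetical names: a 64-bit bitmask stands in for the kernel's cpumask, `q_vector_affinity_notify` stands in for the affinity update notification, and the caller passes the current CPU explicitly.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-vector state; bit n set means CPU n is allowed. */
struct q_vector {
	uint64_t affinity_mask; /* cached from the affinity notification */
};

/* Stand-in for the notification callback: cache the intended affinity. */
static void q_vector_affinity_notify(struct q_vector *q, uint64_t new_mask)
{
	q->affinity_mask = new_mask;
}

/*
 * Decide whether the poll loop should keep polling on this CPU.
 * clean_complete is true when all rings were fully cleaned in this pass.
 */
static bool should_keep_polling(const struct q_vector *q, bool clean_complete,
				unsigned int cpu)
{
	/* Rings are clean: leave polling and re-enable the interrupt as usual. */
	if (clean_complete)
		return false;

	/*
	 * Work remains. Keep polling only if we are on a CPU the interrupt is
	 * supposed to run on; otherwise stop so the interrupt can fire again
	 * on the CPU selected by the new affinity.
	 */
	return cpu < 64 && ((q->affinity_mask >> cpu) & 1);
}
```

The affinity check only runs when the rings could not be cleaned, which matches the message's point that the extra cost is paid only when the CPU is already saturated.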
    • i40e: Remove unused function i40e_vsi_lookup · dc762120
      Alexander Duyck committed
      The function is not used so there is no need to carry it forward.  I have
      plans to add a slightly different function that can be inlined to handle
      the same kind of functionality.
      
      Change-ID: Ie2dfcb189dc75e5fbc156bac23003e3b4210ae0f
      Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
  14. 28 Oct, 2016 1 commit
    • i40e: Fix configure TCs after initial DCB disable · ea6acb7e
      David Ertman committed
      In commit a036244c a fix
      was put into place to avoid a kernel panic when a non-
      supported traffic class configuration was put into place
      and then lldp was enabled/disabled on the link partner
      switch.  This fix caused it to be necessary to
      unload/reload the driver to reenable DCB once a supported
      TC config was in place.
      
      The root cause of the original panic was that the function
      i40e_pf_get_default_tc was allowing for a default TC other
      than TC 0, and only TC 0 is supported as a default.
      
      This patch removes the get_default_tc function and replaces
      it with a #define since there is only one TC supported as
      a default.
      
      Change-Id: I448371974e946386d0a7718d73668b450b7c72ef
      Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
      Tested-by: Ronald Bynoe <ronald.j.bynoe@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>