1. 30 Apr, 2017 4 commits
  2. 20 Apr, 2017 5 commits
    • J
      i40e: use i40e_stop_rings_no_wait to implement PORT_SUSPENDED state · 3480756f
      Jacob Keller committed
      This state bit was added as a way for DCB to avoid having to wait for
      the queues to disable when handling LLDP events. The logic for this was
      buried deep within the Tx and Rx queue stop code. First, let's rename
      it so that it does not appear to only affect Tx when in fact it modifies
      both Tx and Rx flow. Second, we can move it up into the i40e_stop_rings()
      function, and we can simply re-use the i40e_stop_rings_no_wait() so that
      we don't have to bury the implementation as deep into the call stack.
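
A minimal Python sketch of the control flow described above, not the driver's C code. The `Vsi` class, the `PORT_SUSPENDED` string, and the 50 ms wait are illustrative stand-ins; only the function names mirror `i40e_stop_rings` and `i40e_stop_rings_no_wait` from the commit message.

```python
from dataclasses import dataclass, field

PORT_SUSPENDED = "PORT_SUSPENDED"  # stand-in for the driver's state bit


@dataclass
class Vsi:
    """Toy VSI that records whether the stop path waited on hardware."""
    state: set = field(default_factory=set)
    rings_enabled: bool = True
    waited_ms: int = 0


def stop_rings_no_wait(vsi: Vsi) -> None:
    # Request that Tx and Rx queues be disabled, without polling for completion.
    vsi.rings_enabled = False


def stop_rings(vsi: Vsi, disable_wait_ms: int = 50) -> None:
    # The suspended check lives at the top of the stop routine rather than
    # deep inside the per-queue Tx and Rx code.
    if PORT_SUSPENDED in vsi.state:
        stop_rings_no_wait(vsi)
        return
    vsi.rings_enabled = False
    vsi.waited_ms += disable_wait_ms  # placeholder for waiting on hardware
```

Because the suspended path reuses the no-wait helper, a change to how rings are stopped only has to be made in one place.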
      
      An alternative might be to remove the state bit and instead attempt to
      shut down everything directly in the DCB flow. This, however, is not ideal
      because it creates yet another separate shutdown routine that we'd have
      to maintain. In the current implementation any changes will be made to
      both flows.
      
      Change-ID: I68e1ccb901af320862bca395e9c9746f08e8b17c
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • J
      i40e: reset all VFs in parallel when rebuilding PF · e4b433f4
      Jacob Keller committed
      When there are a lot of active VFs, it can take multiple seconds to
      finish resetting all of them during certain flows, which can cause some
      VFs to fail to wait long enough for the reset to occur. The user might
      see messages like "Never saw reset" or "Reset never finished" and the VF
      driver will stop functioning properly.
      
      The naive solution would be to simply increase the wait timer. We can
      get much more clever. Notice that i40e_reset_vf is run in a serialized
      fashion, and includes lots of delays.
      
      There are two prominent delays which take most of the time. First, when
      we begin resetting VFs, we have multiple 10ms delays which accrue
      because we reset each VF in a serial fashion. These delays accumulate to
      almost 4 seconds when handling the maximum number of VFs (128).
      
      Secondly, there is a massive 50ms delay for each time we disable queues
      on a VSI. This delay is necessary to allow HW to finish disabling queues
      before we restore functionality. However, just like with the first case,
      we are paying the cost for each VF, rather than disabling all VFs and
      waiting once.
      
      Both of these can be fixed, but the fix requires some earlier refactoring to
      handle the special case. First, we will need the
      i40e_vsi_wait_queues_disabled function which was previously DCB
      specific. Second, we will need to implement our own
      i40e_vsi_stop_rings_no_wait function which will handle the stopping of
      rings without the delays.
      
      Finally, implement an i40e_reset_all_vfs function, which will first
      start the reset of all VFs, and pay the wait cost all at once, rather
      than serially waiting for each VF before we start processing the next
      one. After the VF has been reset, we'll disable all the VF queues, and
      then wait for them to disable. Again, we'll organize the flow such that
      we pay the wait cost only once.
      
      Finally, after we've disabled queues we'll go ahead and begin restoring
      VF functionality. The result is reducing the wait time by a large factor
      and ensuring that VFs do not time out when waiting in the VF driver.
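
The reordering described above can be modeled with a fake clock. This is a rough Python sketch, not the driver's logic. The 10 ms and 50 ms delays are the figures quoted in the message; `RESET_POLLS_PER_VF = 3` is an assumption chosen so that 128 VFs add up to roughly the "almost 4 seconds" mentioned, and the batched model optimistically assumes all waits overlap completely.

```python
RESET_POLL_MS = 10        # per-poll delay quoted in the commit message
RESET_POLLS_PER_VF = 3    # ASSUMPTION: chosen so 128 VFs cost about 3.84 s
QUEUE_DISABLE_MS = 50     # per-VSI queue disable delay quoted in the message


class FakeClock:
    """Accumulates simulated wait time instead of sleeping."""

    def __init__(self):
        self.elapsed_ms = 0

    def wait(self, ms):
        self.elapsed_ms += ms


def reset_vfs_serially(num_vfs, clock):
    """Old shape: every VF pays both waits before the next one starts."""
    for _ in range(num_vfs):
        clock.wait(RESET_POLLS_PER_VF * RESET_POLL_MS)  # wait for this VF's reset
        clock.wait(QUEUE_DISABLE_MS)                    # wait for its queues


def reset_all_vfs(num_vfs, clock):
    """New shape: trigger everything, then pay each wait once."""
    steps = []
    if num_vfs == 0:
        return steps
    steps.append("trigger reset on every VF")
    clock.wait(RESET_POLLS_PER_VF * RESET_POLL_MS)
    steps.append("stop rings on every VF without waiting")
    clock.wait(QUEUE_DISABLE_MS)
    steps.append("restore every VF")
    return steps
```

Under these assumed numbers the serial model spends 10240 ms on 128 VFs, while the batched model spends 80 ms regardless of the VF count. Real savings depend on how long the hardware actually takes.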
      
      Change-ID: Ia6e8cf8d98131b78aec89db78afb8d905c9b12be
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • J
      i40e: factor out queue control from i40e_vsi_control_(tx|rx) · c768e490
      Jacob Keller committed
      A future patch will need to be able to handle controlling queues without
      waiting until all VSIs are handled. Factor out the direct queue
      modification so that we can easily re-use this code. The result is also
      a bit easier to read since we don't embed multiple single-letter loop
      counters.
      
      Change-ID: Id923cbfa43127b1c24d8ed4f809b1012c736d9ac
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • J
      i40e: don't hold RTNL lock while waiting for VF reset to finish · 024b05f4
      Jacob Keller committed
      We made some effort to reduce the RTNL lock scope when resetting and
      rebuilding the PF. Unfortunately we still held the RTNL lock during the
      VF reset operation, which meant that multiple PFs could not reset in
      parallel due to the global lock. For now, further reduce the scope by
      not holding the RTNL lock while resetting VFs. This allows multiple PFs
      to reset in a timely manner.
      
      Change-ID: I2fbf823a0063f24dff67676cad09f0bbf83ee4ce
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • S
      i40e/i40evf: Add tracepoints · ed0980c4
      Scott Peterson committed
      This patch adds tracepoints to the i40e and i40evf drivers to which
      BPF programs can be attached for feature testing and verification.
      It's expected that an attached BPF program will identify and count or
      log some interesting subset of traffic. The bcc-tools package is
      helpful there for containing all the BPF arcana in a handy Python
      wrapper. Though you can make these tracepoints log trace messages, the
      messages themselves probably won't be very useful (other than to verify
      that the tracepoint is being called while you're debugging your BPF program).
      
      The idea here is that tracepoints have such low performance cost when
      disabled that we can leave these in the upstream drivers. This may
      eventually enable the instrumentation of unmodified customer systems
      should the need arise to verify a NIC feature is working as expected.
      In general this enables one set of feature verification tools to be
      used on these drivers whether they're built with the kernel or
      separately.
      
      Users are advised against using these tracepoints for anything other
      than a diagnostic tool. They have a performance impact when enabled,
      and their exact placement and form may change as we see how well they
      work in practice for the purposes above.
      
      Change-ID: Id6014a7322c0e6d08068114dd20bd156f2f6435e
      Signed-off-by: Scott Peterson <scott.d.peterson@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
  3. 08 Apr, 2017 8 commits
  4. 07 Apr, 2017 2 commits
  5. 29 Mar, 2017 4 commits
  6. 28 Mar, 2017 4 commits
  7. 25 Mar, 2017 1 commit
  8. 24 Mar, 2017 4 commits
    • J
      i40e: make use of hlist_for_each_entry_continue · 584a8870
      Jacob Keller committed
      Replace a complex if->continue->else->break construction in
      i40e_next_filter. We can simply use hlist_for_each_entry_continue
      instead. This drops a lot of confusing code. The intent of the resulting
      code is much easier to understand, and it follows the more normal pattern
      for using hlist loops. We could have also used a break with a "return
      next" at the end of the function, instead of return NULL, but the
      current implementation is explicitly clear that when you reach the end
      of the loop you get a NULL value. The alternative construction is less
      clear since the reader would have to know that next is NULL at the end
      of the loop.
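
The "continue from the current entry" pattern can be illustrated outside the kernel. This is a small Python analogue, not the driver's C code: `Node`, its `skip` flag, and `iter_continue` are hypothetical names standing in for the hlist entry, the driver's skip condition, and `hlist_for_each_entry_continue`.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Node:
    """Toy singly linked list entry; `skip` stands in for the driver's test."""
    name: str
    skip: bool = False
    next: Optional["Node"] = None


def iter_continue(node: Node) -> Iterator[Node]:
    """Yield the entries after `node`, like a 'continue from here' list walk."""
    cur = node.next
    while cur is not None:
        yield cur
        cur = cur.next


def next_filter(node: Node) -> Optional[Node]:
    """Return the first usable entry after `node`, or None at the end."""
    for candidate in iter_continue(node):
        if not candidate.skip:
            return candidate
    return None  # explicit: running off the end means there is no next entry
```

The explicit `return None` mirrors the choice argued for above: a reader does not need to know what the loop variable holds once the walk finishes.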
      
      Change-Id: Ife74ca451dd79d7f0d93c672bd42092d324d4a03
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • J
      i40e: add support for SCTPv4 FDir filters · f223c875
      Jacob Keller committed
      Enable FDir filters for SCTPv4 packets through the ethtool ntuple
      interface. The ethtool API does not allow masking on the verification
      tag.
      
      Change-Id: I093e88a8143994c7e6f4b7b17a0bd5cf861d18e4
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • J
      i40e: implement support for flexible word payload · 0e588de1
      Jacob Keller committed
      Add support for flexible payloads passed via ethtool user-def field.
      This support is somewhat limited due to hardware design. The input set
      can only be programmed once per filter type, and the flexible offset is
      part of this filter input set. This means that the user cannot program
      both a regular and a flexible filter at the same time for a given flow
      type. Additionally, the user may not program two flexible filters of the
      same flow type with different offsets, although they are allowed to
      configure different values at that offset location.
      
      We support a single flexible word (2-byte) value per protocol type, and
      we handle the FLX_PIT register using a list of flexible entries so that
      each flow type may be configured separately.
      
      Due to hardware implementation, the flexible data is offset from the
      start of the packet payload, and thus may not be part of the header
      data. For this reason, the offset provided by the user defined data is
      interpreted as a byte offset from the start of the matching payload.
      Previous implementations have tried to represent the offset as from the
      start of the frame, but this is not feasible because header sizes may
      change due to options.
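
The restrictions above can be summarized as a small validity check. This is a hedged Python sketch of the stated rules only, not the driver's implementation. `Filter`, `can_add`, and the flow type strings are hypothetical, and the sketch ignores any additional hardware limits the message does not mention.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Filter:
    flow_type: str                     # e.g. "tcp4"; the names are illustrative
    flex_offset: Optional[int] = None  # byte offset into the payload; None = regular
    flex_word: Optional[int] = None    # 2-byte value to match at that offset


def can_add(existing: Iterable[Filter], new: Filter) -> bool:
    """Check `new` against the per-flow-type input set restrictions."""
    if new.flex_offset is not None:
        # A flexible filter matches exactly one 16-bit word.
        if new.flex_word is None or not 0 <= new.flex_word <= 0xFFFF:
            return False
    for old in existing:
        if old.flow_type != new.flow_type:
            continue  # each flow type is configured separately
        # The offset is part of the input set, which is programmed once per
        # flow type: mixing regular and flexible filters, or two different
        # offsets, is rejected. Different values at the same offset are fine.
        if old.flex_offset != new.flex_offset:
            return False
    return True
```

For example, with one existing `tcp4` filter at offset 4, a second `tcp4` filter at offset 4 with a different value is accepted, while one at offset 6 or one with no flexible word is not.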
      
      Change-Id: 36ed27995e97de63f9aea5ade5778ff038d6f811
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • J
      i40e: restore default input set for each flow type · 3bcee1e6
      Jacob Keller committed
      Ensure that the default input set is correctly reprogrammed when
      cleaning up after disabling flow director support. This ensures that the
      programmed value will be in a clean state.
      
      Although we do not yet have support for SCTPv4 filters, a future patch
      will add support for this protocol, so we will correctly restore the
      SCTPv4 input set here as well. Note that strictly speaking the default
      hardware value for SCTP includes matching the verification tag. However,
      the ethtool API does not have support for specifying this value, so
      there is no reason to keep the verification field enabled.
      
      This patch is the next step on the way to enabling partial tuple filters
      which will be implemented in a following patch.
      
      Change-Id: Ic22e1c267ae37518bb036aca4a5694681449f283
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
  9. 21 Mar, 2017 5 commits
  10. 16 Mar, 2017 1 commit
    • A
      mqprio: Modify mqprio to pass user parameters via ndo_setup_tc. · 56f36acd
      Amritha Nambiar committed
      The configurable priority to traffic class mapping and the user specified
      queue ranges are used to configure the traffic class, overriding the
      hardware defaults when the 'hw' option is set to 0. However, when the 'hw'
      option is non-zero, the hardware QOS defaults are used.
      
      This patch makes it so that we can pass the data the user provided to
      ndo_setup_tc. This allows us to pull in the queue configuration if the
      user requested it as well as any additional hardware offload type
      requested by using a value other than 1 for the hw value.
      
      Finally it also provides a means for the device driver to return the level
      supported for the offload type via the qopt->hw value. Previously we
      always assumed the value to be 1; in the future, values beyond 1 may be
      supported.
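
The hand-off described above might look roughly like the following Python sketch. It is a loose model, not kernel code: `MqprioOpt`, `driver_setup_tc`, and `mqprio_init` are hypothetical names, and clamping the requested level with `min()` is an assumption about how a driver could report the level it supports through `hw`.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class MqprioOpt:
    """Toy version of the user-supplied options handed to the driver."""
    num_tc: int
    prio_tc_map: List[int]
    queues: List[Tuple[int, int]]  # (count, offset) per traffic class
    hw: int = 0                    # 0 = no offload requested


def driver_setup_tc(qopt: MqprioOpt, max_level: int = 1) -> None:
    # Hypothetical driver hook: it can read the user's queue layout from qopt
    # and reports the offload level it supports by rewriting qopt.hw.
    # ASSUMPTION: the supported level is the request clamped to max_level.
    qopt.hw = min(qopt.hw, max_level)


def mqprio_init(qopt: MqprioOpt, setup_tc=driver_setup_tc) -> int:
    # Only involve the driver when the user asked for hardware offload.
    if qopt.hw:
        setup_tc(qopt)
    return qopt.hw
```

A caller that requests level 2 from a driver supporting only level 1 would see `hw` come back as 1, which it could use to decide how to proceed.
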
      Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
      Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  11. 15 Mar, 2017 2 commits