  1. 15 Mar, 2022 4 commits
    • ice: rename ICE_MAX_VF_COUNT to avoid confusion · dc36796e
      Jacob Keller committed
      The ICE_MAX_VF_COUNT field is defined in ice_sriov.h. This count is true
      for SR-IOV but will not be true for all VF implementations, such as when
      the ice driver supports Scalable IOV.
      
      Rename this definition to clearly indicate ICE_MAX_SRIOV_VFS.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • ice: remove unused definitions from ice_sriov.h · 00a57e29
      Jacob Keller committed
      A few more macros exist in ice_sriov.h which are not used anywhere.
      These can be safely removed. Note that ICE_VIRTCHNL_VF_CAP_L2 capability
      is set but never checked anywhere in the driver. Thus it is also safe to
      remove.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • ice: convert vf->vc_ops to a const pointer · a7e11710
      Jacob Keller committed
      The vc_ops structure is used to allow different handlers for virtchnl
      commands when the driver is in representor mode. The current
      implementation uses a copy of the ops table in each VF, and modifies
      this copy dynamically.
      
      The usual practice in kernel code is to store the ops table in a
      constant structure and point to different versions. This has a number of
      advantages:
      
        1. Reduced memory usage. Each VF merely points to the correct table,
           so they're able to re-use the same constant lookup table in memory.
        2. Consistency. It becomes more difficult to accidentally update or
           edit only one op call. Instead, the code switches to the correct
           table by a single pointer write. In general this is atomic, either
           the pointer is updated or it's not.
        3. Code Layout. The VF structure can store a pointer to the table
           without needing to have the full structure definition defined prior
           to the VF structure definition. This will aid in future refactoring
           of code by allowing the VF pointer to be kept in ice_vf_lib.h while
           the virtchnl ops table can be maintained in ice_virtchnl.h.
      
      There is one major downside in the case of the vc_ops structure. Most of
      the operations in the table are the same between the two current
      implementations. This can appear to lead to duplication since each
      implementation must now fill in the complete table. It could make
      spotting the differences in the representor mode more challenging.
      Unfortunately, methods to make this less error prone either add
      complexity overhead (macros using CPP token concatenation) or don't work
      on all compilers we support (constant initializer from another constant
      structure).
      
      The cost of maintaining two structures does not outweigh the benefits
      of the constant table model.
      
      While we're making these changes, go ahead and rename the structure and
      implementations with "virtchnl" instead of "vc_vf_". This will more
      closely align with the planned file renaming, and avoid similar names when
      we later introduce a "vf ops" table for separating Scalable IOV and
      Single Root IOV implementations.
      
      Leave the accessor/assignment functions in order to avoid issues with
      compiling with options disabled. The interface makes it easier to handle
      when CONFIG_PCI_IOV is disabled in the kernel.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
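A minimal userspace sketch of the constant ops table pattern that the vc_ops commit describes. The type and function names here (`struct virtchnl_ops`, `vf_set_ops`, `add_mac_blocked`, and so on) are hypothetical stand-ins and are not taken from the ice driver; the behavior of the individual handlers is invented purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct vf;

/* Hypothetical, heavily reduced ops table. */
struct virtchnl_ops {
	int (*add_mac_addr)(struct vf *vf);
	int (*get_stats)(struct vf *vf);
};

struct vf {
	const struct virtchnl_ops *ops;	/* points at a shared read-only table */
	int mac_count;
};

static int add_mac(struct vf *vf) { vf->mac_count++; return 0; }
static int add_mac_blocked(struct vf *vf) { (void)vf; return -1; }
static int get_stats(struct vf *vf) { return vf->mac_count; }

/* Each mode fills in the complete table, even where the ops are identical. */
static const struct virtchnl_ops default_ops = {
	.add_mac_addr = add_mac,
	.get_stats = get_stats,
};

static const struct virtchnl_ops repr_ops = {
	.add_mac_addr = add_mac_blocked,
	.get_stats = get_stats,
};

/* Switching modes is one pointer write; no per-VF copy is edited. */
static void vf_set_ops(struct vf *vf, int repr_mode)
{
	assert(vf != NULL);
	vf->ops = repr_mode ? &repr_ops : &default_ops;
}
```

Because `struct vf` only holds a pointer, it could be declared with just a forward declaration of the ops structure, which is the code layout benefit the commit message mentions.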
    • ice: rename ice_virtchnl_pf.c to ice_sriov.c · 0deb0bf7
      Jacob Keller committed
      The ice_virtchnl_pf.c and ice_virtchnl_pf.h files are where most of the
      code for implementing Single Root IOV virtualization resides. This code
      includes support for bringing up and tearing down VFs, hooks into the
      kernel SR-IOV netdev operations, and for handling virtchnl messages from
      VFs.
      
      In the future, we plan to support Scalable IOV in addition to Single
      Root IOV as an alternative virtualization scheme. This implementation
      will re-use some but not all of the code in ice_virtchnl_pf.c.
      
      To prepare for this future, we want to refactor and split up the code in
      ice_virtchnl_pf.c into the following scheme:
      
       * ice_vf_lib.[ch]
      
         Basic VF structures and accessors. This is where scheme-independent
         code will reside.
      
       * ice_virtchnl.[ch]
      
         Virtchnl message handling. This is where the bulk of the logic for
         processing messages from VFs using the virtchnl messaging scheme will
         reside. This is separated from ice_vf_lib.c because it is distinct
         and has a bulk of the processing code.
      
       * ice_sriov.[ch]
      
         Single Root IOV implementation, including initialization and the
         routines for interacting with SR-IOV based netdev operations.
      
       * (future) ice_siov.[ch]
      
         Scalable IOV implementation.
      
      As a first step, let's assume that all of the code in
      ice_virtchnl_pf.[ch] is for Single Root IOV. Rename this file to
      ice_sriov.c and its header to ice_sriov.h.
      
      Future changes will further split out the code in these files following
      the plan outlined here.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
  2. 09 Mar, 2022 1 commit
    • ice: stop disabling VFs due to PF error responses · 79498d5a
      Jacob Keller committed
      The ice_vc_send_msg_to_vf function has logic to detect "failure"
      responses being sent to a VF. If a VF is sent more than
      ICE_DFLT_NUM_INVAL_MSGS_ALLOWED failure responses, the VF is marked as
      disabled.
      Almost identical logic also existed in the i40e driver.
      
      This logic was added to the ice driver in commit 1071a835 ("ice:
      Implement virtchnl commands for AVF support") which itself copied from
      the i40e implementation in commit 5c3c48ac ("i40e: implement virtual
      device interface").
      
      Neither commit provides a proper explanation or justification of the
      check. In fact, later commits to i40e changed the logic to allow
      bypassing the check in some specific instances.
      
      The "logic" for this seems to be that error responses somehow indicate a
      malicious VF. This is not really true. The PF might be sending an error
      for any number of reasons such as lack of resources, etc.
      
      Additionally, this causes the PF to log an info message for every failed
      VF response which may confuse users, and can spam the kernel log.
      
      This behavior is not documented as part of any requirement for our
      products and other operating system drivers such as the FreeBSD
      implementation of our drivers do not include this type of check.
      
      In fact, the change from dev_err to dev_info in i40e commit 18b7af57
      ("i40e: Lower some message levels") explains that these messages
      typically don't actually indicate a real issue. It is quite likely that
      a user who hits this in practice will be very confused as the VF will be
      disabled without an obvious way to recover.
      
      We already have robust malicious driver detection logic using actual
      hardware detection mechanisms that detect and prevent invalid device
      usage. Remove the logic since it's not a documented requirement and the
      behavior is not intuitive.
      
      Fixes: 1071a835 ("ice: Implement virtchnl commands for AVF support")
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
  3. 04 Mar, 2022 10 commits
    • ice: convert VF storage to hash table with krefs and RCU · 3d5985a1
      Jacob Keller committed
      The ice driver stores VF structures in a simple array which is allocated
      once at the time of VF creation. The VF structures are then accessed
      from the array by their VF ID. The ID must be between 0 and the number
      of allocated VFs.
      
      Multiple threads can access this table:
      
       * .ndo operations such as .ndo_get_vf_cfg or .ndo_set_vf_trust
       * interrupts, such as due to messages from the VF using the virtchnl
         communication
       * processing such as device reset
       * commands to add or remove VFs
      
      The current implementation does not keep track of when all threads are
      done operating on a VF and can potentially result in use-after-free
      issues caused by one thread accessing a VF structure after it has been
      released when removing VFs. Some of these are prevented with various
      state flags and checks.
      
      In addition, this structure is quite static and does not support a
      planned future where virtualization can be more dynamic. As we begin to
      look at supporting Scalable IOV with the ice driver (as opposed to just
      supporting Single Root IOV), this structure is not sufficient.
      
      In the future, VFs will be able to be added and removed individually and
      dynamically.
      
      To allow for this, and to better protect against a whole class of
      use-after-free bugs, replace the VF storage with a combination of a hash
      table and krefs to reference track all of the accesses to VFs through
      the hash table.
      
      A hash table still allows efficient look up of the VF given its ID, but
      also allows adding and removing VFs. It does not require contiguous VF
      IDs.
      
      The use of krefs allows the cleanup of the VF memory to be delayed until
      after all threads have released their reference (by calling ice_put_vf).
      
      To prevent corruption of the hash table, a combination of RCU and the
      mutex table_lock are used. Addition and removal from the hash table use
      the RCU-aware hash macros. This allows simple read-only lookups that
      iterate to locate a single VF to be fast using RCU. Accesses which
      modify the hash table, or which can't take RCU because they sleep, will
      hold the mutex lock.
      
      By using this design, we have a stronger guarantee that the VF structure
      can't be released until after all threads are finished operating on it.
      We also pave the way for the more dynamic Scalable IOV implementation in
      the future.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
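The lifetime rule described in the hash table commit (a VF's memory is released only after the last reference is dropped) can be sketched in plain userspace C. This is a single-threaded illustration under stated assumptions: a bare counter stands in for `struct kref`, a chained array stands in for the kernel hash table, and the RCU and mutex protection is omitted entirely. All names (`vf_add`, `vf_get_by_id`, `vf_put`, `vf_remove`, `VF_HASH_BUCKETS`) are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define VF_HASH_BUCKETS 8	/* hypothetical table size */

struct vf {
	unsigned int vf_id;
	unsigned int refcount;	/* stand-in for struct kref */
	struct vf *next;	/* bucket chain */
};

static struct vf *vf_table[VF_HASH_BUCKETS];
static unsigned int vfs_freed;	/* demonstration counter only */

/* Insert a new VF; the table itself owns the initial reference. */
static struct vf *vf_add(unsigned int vf_id)
{
	struct vf *vf = calloc(1, sizeof(*vf));

	if (!vf)
		return NULL;
	vf->vf_id = vf_id;
	vf->refcount = 1;
	vf->next = vf_table[vf_id % VF_HASH_BUCKETS];
	vf_table[vf_id % VF_HASH_BUCKETS] = vf;
	return vf;
}

/* Drop one reference; free the VF when the last one goes away. */
static void vf_put(struct vf *vf)
{
	assert(vf && vf->refcount > 0);
	if (--vf->refcount == 0) {
		free(vf);
		vfs_freed++;
	}
}

/* Look up a VF by ID and take a reference; the caller must vf_put() it. */
static struct vf *vf_get_by_id(unsigned int vf_id)
{
	struct vf *vf;

	for (vf = vf_table[vf_id % VF_HASH_BUCKETS]; vf; vf = vf->next) {
		if (vf->vf_id == vf_id) {
			vf->refcount++;
			return vf;
		}
	}
	return NULL;
}

/* Unlink the VF from the table and drop the table's reference. */
static void vf_remove(unsigned int vf_id)
{
	struct vf **pp = &vf_table[vf_id % VF_HASH_BUCKETS];

	for (; *pp; pp = &(*pp)->next) {
		if ((*pp)->vf_id == vf_id) {
			struct vf *vf = *pp;

			*pp = vf->next;
			vf_put(vf);
			return;
		}
	}
}
```

A caller that obtained a VF with `vf_get_by_id()` can keep using it after `vf_remove()`; the structure is freed only when that caller finally calls `vf_put()`. Non-contiguous IDs work because lookups go through the bucket chain rather than an array index.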
    • ice: introduce VF accessor functions · fb916db1
      Jacob Keller committed
      Before we switch the VF data structure storage mechanism to a hash,
      introduce new accessor functions to define the new interface.
      
      * ice_get_vf_by_id is a function used to obtain a reference to a VF from
        the table based on its VF ID
      * ice_has_vfs is used to quickly check if any VFs are configured
      * ice_get_num_vfs is used to get an exact count of how many VFs are
        configured
      
      We can drop the old ice_validate_vf_id function, since every caller was
      just going to immediately access the VF table to get a reference
      anyways. This way we simply use the single ice_get_vf_by_id to both
      validate the VF ID is within range and that there exists a VF with that
      ID.
      
      This change enables us to more easily convert the codebase to the hash
      table since most callers now properly use the interface.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • ice: factor VF variables to separate structure · 000773c0
      Jacob Keller committed
      We maintain a number of values for VFs within the ice_pf structure. This
      includes the VF table, the number of allocated VFs, the maximum number
      of supported SR-IOV VFs, the number of queue pairs per VF, the number of
      MSI-X vectors per VF, and a bitmap of the VFs with detected MDD events.
      
      We're about to add a few more variables to this list. Clean this up
      first by extracting these members out into a new ice_vfs structure
      defined in ice_virtchnl_pf.h.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • ice: convert ice_for_each_vf to include VF entry iterator · c4c2c7db
      Jacob Keller committed
      The ice_for_each_vf macro is intended to be used to loop over all VFs.
      The current implementation relies on an iterator that is the index into
      the VF array in the PF structure. This forces all users to perform a
      look up themselves.
      
      This abstraction forces a lot of duplicate work on callers and leaks the
      interface implementation to the caller. Replace this with an
      implementation that includes the VF pointer as the primary iterator. This
      version simplifies callers which just want to iterate over every VF, as
      they no longer need to perform their own lookup.
      
      The "i" iterator value is replaced with a new unsigned int "bkt"
      parameter, as this will match the necessary interface for replacing
      the VF array with a hash table. For now, the bkt is the VF ID, but in
      the future it will simply be the hash bucket index. Document that it
      should not be treated as a VF ID.
      
      This change aims to simplify switching from the array to a hash table. I
      considered alternative implementations such as an xarray but decided
      that the hash table was the simplest and most suitable implementation. I
      also looked at methods to hide the bkt iterator entirely, but I couldn't
      come up with a feasible solution that worked for hash table iterators.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
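One possible shape for an iterator macro that hands the caller a VF pointer together with an opaque bucket index, as the ice_for_each_vf commit describes. This is a userspace sketch, not the driver's macro: `for_each_vf`, `struct vfs`, `vfs_insert`, `vfs_sum_ids` and `VF_BUCKETS` are hypothetical names, and a chained array stands in for the kernel hash table.

```c
#include <assert.h>
#include <stddef.h>

#define VF_BUCKETS 4	/* hypothetical number of hash buckets */

struct vf {
	unsigned int vf_id;
	struct vf *next;	/* bucket chain */
};

struct vfs {
	struct vf *table[VF_BUCKETS];
};

/*
 * Iterate over every VF. "bkt" is the bucket index and must not be
 * treated as a VF ID; "vf" is the entry the loop body should use.
 * Note that a "break" in the body only leaves the inner loop.
 */
#define for_each_vf(vfs, bkt, vf)					\
	for ((bkt) = 0; (bkt) < VF_BUCKETS; (bkt)++)			\
		for ((vf) = (vfs)->table[(bkt)]; (vf); (vf) = (vf)->next)

/* Add a caller-owned VF to the head of its bucket chain. */
static void vfs_insert(struct vfs *vfs, struct vf *vf)
{
	unsigned int bkt;

	assert(vfs && vf);
	bkt = vf->vf_id % VF_BUCKETS;
	vf->next = vfs->table[bkt];
	vfs->table[bkt] = vf;
}

/* Example caller: no per-iteration lookup by index is needed. */
static unsigned int vfs_sum_ids(struct vfs *vfs)
{
	unsigned int bkt, sum = 0;
	struct vf *vf;

	for_each_vf(vfs, bkt, vf)
		sum += vf->vf_id;
	return sum;
}
```

The caller in `vfs_sum_ids` never converts `bkt` into a VF; it reads the ID from the entry itself, which is why the storage behind the macro can change without touching callers.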
    • ice: use ice_for_each_vf for iteration during removal · 19281e86
      Jacob Keller committed
      When removing VFs, the driver takes a weird approach of assigning
      pf->num_alloc_vfs to 0 before iterating over the VFs using a temporary
      variable.
      
      This logic has been in the driver for a long time, and seems to have
      been carried forward from i40e.
      
      We want to refactor the way VFs are stored, and iterating over the data
      structure without the ice_for_each_vf interface impedes this work.
      
      The logic relies on implicitly using the num_alloc_vfs as a sort of
      "safeguard" for accessing VF data.
      
      While this sort of guard makes sense for Single Root IOV where all VFs
      are added at once, the data structures don't work for VFs which can be
      added and removed dynamically. We also have a separate state flag,
      ICE_VF_DEINIT_IN_PROGRESS which is a stronger protection against
      concurrent removal and access.
      
      Avoid the custom tmp iteration and replace it with the standard
      ice_for_each_vf iterator. Delay the assignment of num_alloc_vfs until
      after this loop finishes.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • ice: remove checks in ice_vc_send_msg_to_vf · 59e1f857
      Jacob Keller committed
      The ice_vc_send_msg_to_vf function is used by the PF to send a response
      to a VF. This function has overzealous checks to ensure it's not passed a
      NULL VF pointer and to ensure that the passed in struct ice_vf has a
      valid vf_id sub-member.
      
      These checks have existed since commit 1071a835 ("ice: Implement
      virtchnl commands for AVF support") and function as simple sanity
      checks.
      
      We are planning to refactor the ice driver to use a hash table along
      with appropriate locks in a future refactor. This change will modify how
      the ice_validate_vf_id function works. Instead of a simple >= check to
      ensure the VF ID is between some range, it will check the hash table to
      see if the specified VF ID is actually in the table. This requires that
      the function properly lock the table to prevent race conditions.
      
      The checks may seem ok at first glance, but they don't really provide
      much benefit.
      
      In order for ice_vc_send_msg_to_vf to have these checks fail, the
      callers must either (1) pass NULL as the VF, (2) construct an invalid VF
      pointer manually, or (3) be using a VF pointer which becomes invalid
      after they obtain it properly using ice_get_vf_by_id.
      
      For (1), a cursory glance over callers of ice_vc_send_msg_to_vf can show
      that in most cases the functions already operate assuming their VF
      pointer is valid, such as by dereferencing vf->pf or other members.
      
      They obtain the VF pointer by accessing the VF array using the VF ID,
      which can never produce a NULL value (since it's a simple address
      operation on the array, it will not be NULL).
      
      The sole exception for (1) is that ice_vc_process_vf_msg will forward a
      NULL VF pointer to this function as part of its goto error handler
      logic. This requires some minor cleanup to simply exit immediately when
      an invalid VF ID is detected (rather than use the same error flow as
      the rest of the function).
      
      For (2), it is unexpected for a flow to construct a VF pointer manually
      instead of accessing the VF array. Defending against this is likely to
      just hide bad programming.
      
      For (3), it is definitely true that VF pointers could become invalid,
      for example if a thread is processing a VF message while the VF gets
      removed. However, the correct solution is not to add additional checks
      like this which do not guarantee to prevent the race. Instead we plan to
      solve the root of the problem by preventing the possibility entirely.
      
      This solution will require the change to a hash table with proper
      locking and reference counts of the VF structures. When this is done,
      ice_validate_vf_id will require locking of the hash table. This will be
      problematic because all of the callers of ice_vc_send_msg_to_vf will
      already have to take the lock to obtain the VF pointer anyways. With a
      mutex, this leads to a double lock that could hang the kernel thread.
      
      Avoid this by removing the checks which don't provide much value, so
      that we can safely add the necessary protections properly.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • ice: move VFLR acknowledge during ice_free_vfs · 44efe75f
      Jacob Keller committed
      After removing all VFs, the driver clears the VFLR indication for VFs.
      This has been in ice since the beginning of SR-IOV support in the ice
      driver.
      
      The implementation was copied from i40e, and the motivation for the VFLR
      indication clearing is described in the commit f7414531 ("i40e:
      acknowledge VFLR when disabling SR-IOV").
      
      The commit explains that we need to clear the VFLR indication because
      the virtual function undergoes a VFLR event. If we don't indicate that
      it is complete it can cause an issue when VFs are re-enabled due to
      a "phantom" VFLR.
      
      The register block read was added under a pci_vfs_assigned check
      originally. This was done because we added the check after calling
      pci_disable_sriov. This was later moved to disable SRIOV earlier in the
      flow so that the VF drivers could be torn down before we removed
      functionality.
      
      Move the VFLR acknowledge into the main loop that tears down VF
      resources. This avoids using the tmp value for iterating over VFs
      multiple times. The result will make it easier to refactor the VF array
      in a future change.
      
      It's possible we might want to modify this flow to also stop checking
      pci_vfs_assigned. However, it seems reasonable to keep this change: we
      should only clear the VFLR if we actually disabled SR-IOV.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • ice: move clear_malvf call in ice_free_vfs · 294627a6
      Jacob Keller committed
      The ice_mbx_clear_malvf function is used to clear the indication and
      count of how many times a VF was detected as malicious. During
      ice_free_vfs, we use this function to ensure that all removed VFs are
      reset to a clean state.
      
      The call currently is done at the end of ice_free_vfs() using a tmp
      value to iterate over all of the entries in the bitmap.
      
      This separate iteration using tmp is problematic for a planned refactor
      of the VF array data structure. To avoid this, let's move the call
      slightly higher into the function inside the loop where we tear down all
      of the VFs. This avoids one use of the tmp value used for iteration.
      We'll fix the other user in a future change.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • ice: pass num_vfs to ice_set_per_vf_res() · cd0f4f3b
      Jacob Keller committed
      We are planning to replace the simple array structure tracking VFs with
      a hash table. This change will also remove the "num_alloc_vfs" variable.
      
      Instead, new access functions to use the hash table as the source of
      truth will be introduced. These will generally be equivalent to existing
      checks, except during VF initialization.
      
      Specifically, ice_set_per_vf_res() cannot use the hash table as it will
      be operating prior to VF structures being inserted into the hash table.
      
      Instead of using pf->num_alloc_vfs, simply pass the num_vfs value in
      from the caller.
      
      Note that a sub-function of ice_set_per_vf_res, ice_determine_res, also
      implicitly depends on pf->num_alloc_vfs. Replace ice_determine_res with
      a simpler inline implementation based on rounddown_pow_of_two. Note that
      we must explicitly check that the argument is non-zero since it does not
      play well with zero as a value.
      
      Instead of using the function and while loop, simply calculate the
      number of queues we have available by dividing by num_vfs. Check if the
      desired queues are available. If not, round down to the nearest power of
      2 that fits within our available queues.
      
      This matches the behavior of ice_determine_res but is easier to follow
      as simple in-line logic. Remove ice_determine_res entirely.
      
      With this change, we no longer depend on the pf->num_alloc_vfs during
      the initialization phase of VFs. This will allow us to safely remove it
      in a future planned refactor of the VF data structures.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
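The per-VF queue calculation that the ice_set_per_vf_res() commit describes (divide the available queues by num_vfs, use the desired count if it fits, otherwise round down to a power of two) might look roughly like the following. This is a userspace sketch under assumptions: `rounddown_pow2` is a local stand-in for the kernel's `rounddown_pow_of_two()`, and `queues_per_vf` with its parameters is a hypothetical helper, not a function from the driver.

```c
#include <assert.h>

/*
 * Local stand-in for the kernel's rounddown_pow_of_two().
 * Like the kernel helper, it is not meaningful for zero,
 * so callers must check for that first.
 */
static unsigned int rounddown_pow2(unsigned int n)
{
	unsigned int p = 1;

	assert(n != 0);
	while (p <= n / 2)
		p *= 2;
	return p;
}

/*
 * Return how many queues each VF gets, or 0 if there are not
 * enough queues to give every VF at least one.
 */
static unsigned int queues_per_vf(unsigned int avail, unsigned int num_vfs,
				  unsigned int desired)
{
	unsigned int per_vf;

	if (num_vfs == 0)
		return 0;

	per_vf = avail / num_vfs;
	if (per_vf >= desired)
		return desired;		/* the desired count fits */
	if (per_vf == 0)
		return 0;		/* explicit zero check before rounding */

	return rounddown_pow2(per_vf);
}
```

For instance, with 100 available queues, 16 VFs and 16 desired queues per VF, the division yields 6, which is rounded down to 4.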
    • ice: store VF pointer instead of VF ID · b03d519d
      Jacob Keller committed
      The VSI structure contains a vf_id field used to associate a VSI with a
      VF. This is used mainly for ICE_VSI_VF as well as partially for
      ICE_VSI_CTRL associated with the VFs.
      
      This API was designed with the idea that VFs are stored in a simple
      array that was expected to be static throughout most of the driver's
      life.
      
      We plan on refactoring VF storage in a few key ways:
      
        1) converting from a simple static array to a hash table
        2) using krefs to track VF references obtained from the hash table
        3) using RCU to delay release of VF memory until after all references
           are dropped
      
      This is motivated by the goal to ensure that the lifetime of VF
      structures is accounted for, and prevent various use-after-free bugs.
      
      With the existing vsi->vf_id, the reference tracking for VFs would
      become somewhat convoluted, because each VSI maintains a vf_id field
      which will then require performing a look up. This means all these flows
      will require reference tracking and proper usage of rcu_read_lock, etc.
      
      We know that the VF VSI will always be backed by a valid VF structure,
      because the VSI is created during VF initialization and removed before
      the VF is destroyed. Rely on this and store a reference to the VF in the
      VSI structure instead of storing a VF ID. This will simplify the usage
      and avoid the need to perform lookups on the hash table in the future.
      
      For ICE_VSI_VF, it is expected that vsi->vf is always non-NULL after
      ice_vsi_alloc succeeds. Because of this, use WARN_ON when checking if a
      vsi->vf pointer is valid when dealing with VF VSIs. This will aid in
      debugging code which violates this assumption and avoid more disastrous
      panics.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
  4. 19 Feb, 2022 1 commit
    • ice: fix concurrent reset and removal of VFs · fadead80
      Jacob Keller committed
      Commit c503e632 ("ice: Stop processing VF messages during teardown")
      introduced a driver state flag, ICE_VF_DEINIT_IN_PROGRESS, which is
      intended to prevent some issues with concurrently handling messages from
      VFs while tearing down the VFs.
      
      This change was motivated by crashes caused while tearing down and
      bringing up VFs in rapid succession.
      
      It turns out that the fix actually introduces issues with the VF driver
      caused because the PF no longer responds to any messages sent by the VF
      during its .remove routine. This results in the VF potentially removing
      its DMA memory before the PF has shut down the device queues.
      
      Additionally, the fix doesn't actually resolve concurrency issues within
      the ice driver. It is possible for a VF to initiate a reset just prior
      to the ice driver removing VFs. This can result in the remove task
      concurrently operating while the VF is being reset. This results in
      similar memory corruption and panics purportedly fixed by that commit.
      
      Fix this concurrency at its root by protecting both the reset and
      removal flows using the existing VF cfg_lock. This ensures that we
      cannot remove the VF while any outstanding critical tasks such as a
      virtchnl message or a reset are occurring.
      
      This locking change also fixes the root cause originally fixed by commit
      c503e632 ("ice: Stop processing VF messages during teardown"), so we
      can simply revert it.
      
      Note that I kept these two changes together because simply reverting the
      original commit alone would leave the driver vulnerable to worse race
      conditions.
      
      Fixes: c503e632 ("ice: Stop processing VF messages during teardown")
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
  5. 10 Feb, 2022 10 commits
  6. 07 Jan, 2022 1 commit
    • ice: improve switchdev's slow-path · c1e5da5d
      Wojciech Drewek committed
      In the current switchdev implementation, every VF PR is assigned to
      an individual ring on the switchdev ctrl VSI. For slow-path traffic,
      there is a VF->ring mapping done in software based on the src_vsi value
      (by calling the ice_eswitch_get_target_netdev function).
      
      With this change, a HW solution is introduced which is more
      efficient. For each VF, a src MAC (VF's MAC) filter will be created,
      which forwards packets to the corresponding switchdev ctrl VSI queue
      based on src MAC address.
      
      This filter has to be removed and then replayed in case of
      resetting one VF. Keep information about this rule in repr->mac_rule,
      so that we know which rule has to be removed and replayed
      for a given VF.
      
      In case of a CORE/GLOBAL reset, all rules are removed
      automatically. We have to take care of re-adding them. This is done
      by ice_replay_vsi_adv_rule.
      
      When the driver leaves switchdev mode, remove all advanced rules
      from the switchdev ctrl VSI. This is done by ice_rem_adv_rule_for_vsi.
      
      The repr->rule_added flag is needed because in some cases a reset
      might be triggered before the VF sends a request to add a MAC.
      Co-developed-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
      Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
      Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
      Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
  7. 15 Dec, 2021 8 commits
  8. 08 Dec, 2021 2 commits
  9. 03 Nov, 2021 3 commits
    • ice: Fix race conditions between virtchnl handling and VF ndo ops · e6ba5273
      Brett Creeley committed
      The VF can be configured via the PF's ndo ops at the same time the PF is
      receiving/handling virtchnl messages. This has many issues, one of
      which is that the ndo op could be actively resetting a VF (i.e.
      resetting it to the default state and deleting/re-adding the VF's VSI)
      while a virtchnl message is being handled. The following error was seen
      because a VF ndo op was used to change a VF's trust setting while the
      VIRTCHNL_OP_CONFIG_VSI_QUEUES was ongoing:
      
      [35274.192484] ice 0000:88:00.0: Failed to set LAN Tx queue context, error: ICE_ERR_PARAM
      [35274.193074] ice 0000:88:00.0: VF 0 failed opcode 6, retval: -5
      [35274.193640] iavf 0000:88:01.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 6
      
      Fix this by making sure the virtchnl handling and VF ndo ops that
      trigger VF resets cannot run concurrently. This is done by adding a
      struct mutex cfg_lock to each VF structure. For VF ndo ops, the mutex
      will be locked around the critical operations and VFR. Since the ndo ops
      will trigger a VFR, the virtchnl thread will use mutex_trylock(). This
      is done because if any other thread (i.e. VF ndo op) has the mutex, then
      that means the current VF message being handled is no longer valid, so
      just ignore it.
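
      The locking scheme described above can be sketched in userspace C.
      This is only an illustration under stated assumptions, not the driver
      code: pthread_mutex_trylock() stands in for the kernel's
      mutex_trylock(), and struct vf_sketch, ndo_set_trust_sketch() and
      handle_virtchnl_msg_sketch() are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-VF structure; not the driver's struct. */
struct vf_sketch {
	pthread_mutex_t cfg_lock; /* plays the role of the per-VF cfg_lock */
	int msgs_handled;         /* virtchnl messages actually processed */
	int resets;               /* resets triggered by ndo ops */
};

static void vf_sketch_init(struct vf_sketch *vf)
{
	int rc = pthread_mutex_init(&vf->cfg_lock, NULL);

	assert(rc == 0);
	(void)rc;
	vf->msgs_handled = 0;
	vf->resets = 0;
}

/* ndo-op side: wait for the lock, then hold it across the change and reset. */
static void ndo_set_trust_sketch(struct vf_sketch *vf)
{
	pthread_mutex_lock(&vf->cfg_lock);
	vf->resets++; /* stands in for the critical operations and the VFR */
	pthread_mutex_unlock(&vf->cfg_lock);
}

/*
 * virtchnl side: never wait. If the lock is already held, a reset is in
 * progress and the message is stale, so it is ignored.
 * Returns true if the message was handled, false if it was dropped.
 */
static bool handle_virtchnl_msg_sketch(struct vf_sketch *vf)
{
	if (pthread_mutex_trylock(&vf->cfg_lock) != 0)
		return false;

	vf->msgs_handled++; /* stands in for the real message handling */
	pthread_mutex_unlock(&vf->cfg_lock);
	return true;
}
```

      The asymmetry is the point of the design: the ndo op blocks because
      its request must eventually run, while the virtchnl handler only
      tries the lock because a message that races with a reset no longer
      describes a valid VF state.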
      
      This issue can be seen using the following commands:
      
      for i in {0..50}; do
              rmmod ice
              modprobe ice
      
              sleep 1
      
              echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs
              echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs
      
              ip link set ens785f1 vf 0 trust on
              ip link set ens785f0 vf 0 trust on
      
              sleep 2
      
              echo 0 > /sys/class/net/ens785f0/device/sriov_numvfs
              echo 0 > /sys/class/net/ens785f1/device/sriov_numvfs
              sleep 1
              echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs
              echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs
      
              ip link set ens785f1 vf 0 trust on
              ip link set ens785f0 vf 0 trust on
      done
      
      Fixes: 7c710869 ("ice: Add handlers for VF netdevice operations")
      Signed-off-by: Brett Creeley <brett.creeley@intel.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      e6ba5273
    • B
      ice: Fix not stopping Tx queues for VFs · b385cca4
      Brett Creeley committed
      When a VF is removed and/or reset, its Tx queues need to be
      stopped from the PF. This is done by calling the ice_dis_vf_qs()
      function, which calls ice_vsi_stop_lan_tx_rings(). Currently
      ice_dis_vf_qs() is protected by the VF state bit ICE_VF_STATE_QS_ENA.
      Unfortunately, this causes the Tx queues to not be disabled in some
      cases, and when the VF then tries to re-enable/reconfigure its Tx
      queues over virtchnl, the op fails. This is because a VF can be reset
      and/or removed before the ICE_VF_STATE_QS_ENA bit is set but after the
      Tx queues were already configured via ice_vsi_cfg_single_txq() in the
      VIRTCHNL_OP_CONFIG_VSI_QUEUES op. The ICE_VF_STATE_QS_ENA bit is only
      set on a successful VIRTCHNL_OP_ENABLE_QUEUES, which always happens
      after the VIRTCHNL_OP_CONFIG_VSI_QUEUES op.
      
      This was causing the following error message when loading the ice
      driver, creating VFs, and modifying VF trust in an endless loop:
      
      [35274.192484] ice 0000:88:00.0: Failed to set LAN Tx queue context, error: ICE_ERR_PARAM
      [35274.193074] ice 0000:88:00.0: VF 0 failed opcode 6, retval: -5
      [35274.193640] iavf 0000:88:01.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 6
      
      Fix this by always calling ice_dis_vf_qs() and silencing the error
      message in ice_vsi_stop_tx_ring() since the calling code ignores the
      return anyway. Also, all other places that call ice_vsi_stop_tx_ring()
      catch the error, so this doesn't affect those flows since there was no
      change to the values the function returns.
      
      Other solutions were considered (e.g. tracking which VF queues had been
      "started/configured" in VIRTCHNL_OP_CONFIG_VSI_QUEUES), but they seemed
      more complicated than they were worth. Such tracking also brings in the
      chance for other unexpected conditions due to invalid state bit checks.
      So, the proposed solution seemed like the best option since there is no
      harm in failing to stop Tx queues that were never started.
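
      The difference between the gated and the unconditional disable path
      can be illustrated with a small model. This is a simplified sketch,
      not the driver code: struct vf_queues_sketch and every *_sketch
      function are hypothetical, and qs_ena merely plays the role of the
      ICE_VF_STATE_QS_ENA bit.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NUM_TXQ 4

/* Hypothetical model of a VF's queue state. */
struct vf_queues_sketch {
	bool qs_ena;                         /* set only by "enable queues" */
	bool txq_configured[SKETCH_NUM_TXQ]; /* set by "config queues" */
};

/* Returns 0 if the ring was stopped, -1 if it was never configured. */
static int stop_tx_ring_sketch(struct vf_queues_sketch *vf, int q)
{
	assert(q >= 0 && q < SKETCH_NUM_TXQ);
	if (!vf->txq_configured[q])
		return -1; /* nothing to stop: reported, but not logged */
	vf->txq_configured[q] = false;
	return 0;
}

/* Unconditional variant: always try every ring and ignore the result. */
static void dis_vf_qs_always_sketch(struct vf_queues_sketch *vf)
{
	for (int q = 0; q < SKETCH_NUM_TXQ; q++)
		(void)stop_tx_ring_sketch(vf, q);
	vf->qs_ena = false;
}

/* Gated variant: skips the work when the "enabled" bit was never set. */
static void dis_vf_qs_gated_sketch(struct vf_queues_sketch *vf)
{
	if (!vf->qs_ena)
		return; /* configured-but-not-enabled rings stay configured */
	dis_vf_qs_always_sketch(vf);
}

/* Helper for inspecting the model: number of rings still configured. */
static int count_configured_sketch(const struct vf_queues_sketch *vf)
{
	int n = 0;

	for (int q = 0; q < SKETCH_NUM_TXQ; q++)
		n += vf->txq_configured[q] ? 1 : 0;
	return n;
}
```

      In this model, a VF that is reset after its queues are configured but
      before they are enabled keeps a configured ring under the gated
      variant, while the unconditional variant clears it at the cost of a
      harmless failed stop on rings that were never started.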
      
      This issue can be seen using the following commands:
      
      for i in {0..50}; do
              rmmod ice
              modprobe ice
      
              sleep 1
      
              echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs
              echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs
      
              ip link set ens785f1 vf 0 trust on
              ip link set ens785f0 vf 0 trust on
      
              sleep 2
      
              echo 0 > /sys/class/net/ens785f0/device/sriov_numvfs
              echo 0 > /sys/class/net/ens785f1/device/sriov_numvfs
              sleep 1
              echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs
              echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs
      
              ip link set ens785f1 vf 0 trust on
              ip link set ens785f0 vf 0 trust on
      done
      
      Fixes: 77ca27c4 ("ice: add support for virtchnl_queue_select.[tx|rx]_queues bitmap")
      Signed-off-by: Brett Creeley <brett.creeley@intel.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      b385cca4
    • S
      ice: Fix replacing VF hardware MAC to existing MAC filter · ce572a5b
      Sylwester Dziedziuch committed
      The VF was not able to change its hardware MAC address if
      the new address was already present in the MAC filter list.
      Change the handling of the VF add MAC request so that it does not
      return early when the requested MAC address is already on the list,
      and instead checks whether the hardware MAC needs to be updated.
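
      The changed control flow can be sketched as follows. This is a
      minimal model under stated assumptions, not the driver code: struct
      vf_mac_sketch, add_mac_sketch() and the is_primary flag (used here to
      decide whether the hardware MAC should follow the request) are all
      hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define SKETCH_ETH_ALEN    6
#define SKETCH_MAX_FILTERS 8

/* Hypothetical model of a VF's MAC state. */
struct vf_mac_sketch {
	unsigned char hw_mac[SKETCH_ETH_ALEN];
	unsigned char filters[SKETCH_MAX_FILTERS][SKETCH_ETH_ALEN];
	int num_filters;
};

static bool filter_exists_sketch(const struct vf_mac_sketch *vf,
				 const unsigned char *mac)
{
	for (int i = 0; i < vf->num_filters; i++)
		if (memcmp(vf->filters[i], mac, SKETCH_ETH_ALEN) == 0)
			return true;
	return false;
}

/*
 * Handle an "add MAC" request. An address that is already on the filter
 * list is not treated as a reason to return early: the function still
 * falls through to the hardware MAC check.
 * Returns 0 on success, -1 if the filter list is full.
 */
static int add_mac_sketch(struct vf_mac_sketch *vf, const unsigned char *mac,
			  bool is_primary)
{
	assert(vf->num_filters >= 0 && vf->num_filters <= SKETCH_MAX_FILTERS);

	if (!filter_exists_sketch(vf, mac)) {
		if (vf->num_filters == SKETCH_MAX_FILTERS)
			return -1;
		memcpy(vf->filters[vf->num_filters++], mac, SKETCH_ETH_ALEN);
	}

	/* Reached for both new and pre-existing filters. */
	if (is_primary && memcmp(vf->hw_mac, mac, SKETCH_ETH_ALEN) != 0)
		memcpy(vf->hw_mac, mac, SKETCH_ETH_ALEN);

	return 0;
}
```

      With an early return on an existing filter, the second memcpy() would
      never run for an address that was first added as an ordinary filter,
      which is the failure the commit message describes.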
      
      Fixes: ed4c068d ("ice: Enable ip link show on the PF to display VF unicast MAC(s)")
      Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
      Tested-by: Tony Brelinski <tony.brelinski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      ce572a5b