1. 29 Mar, 2017 22 commits
  2. 28 Mar, 2017 18 commits
    • D
      Merge tag 'mlx5e-failsafe' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · cc628c96
      David S. Miller committed
      Saeed Mahameed says:
      
      ====================
      mlx5e-failsafe 27-03-2017
      
      This series provides a fail-safe mechanism that allows safely
      re-configuring the mlx5e netdevice and provides resiliency against
      sporadic configuration failures.
      
      To enable this we do some refactoring and code reorganizing to allow
      breaking the driver's open/close flows into stages:
            open -> activate -> deactivate -> close.
      
      In addition we need to allow creating fresh HW ring resources
      (mlx5e_channels) with their own "new" set of parameters, while keeping
      the current ones running and active until the new channels are
      successfully created with the new configuration. Only then can we
      safely replace (switch) the old channels with the new ones.
      
      For that we introduce the mlx5e_channels object and an API to manage it:
       - channels = open_channels(new_params):
         open fresh TX/RX channels
       - activate_channels(channels):
         redirect traffic to them and attach them to the netdev
       - deactivate_channels(channels):
         stop traffic and detach from the netdev
       - close_channels(channels):
         free the TX/RX HW resources of those channels
      
      With the above strategy it is straightforward to achieve the desired
      behavior of fail-safe configuration.  In pseudo code:
      
      make_new_config(new_params)
      {
      	old_channels = current_active_channels;
      	new_channels = open_channels(new_params);
      	if (!new_channels)
      		return "Failed, but current channels are still active :)";
      
      	deactivate_channels(old_channels); /* Can't fail */
      	set_hw_new_state();                /* If needed  */
      	activate_channels(new_channels);   /* Can't fail */
      	close_channels(old_channels);
      	current_active_channels = new_channels;
      
      	return "SUCCESS";
      }
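
The pseudo code above can be turned into a small user-space C sketch. This is only an illustration of the open/activate/deactivate/close ordering, not the mlx5e implementation: `struct params`, `struct channels`, the validity check in `open_channels()` and the integer return codes are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver's types; not the real mlx5e structures. */
struct params { int num_channels; int ring_size; };
struct channels { struct params params; bool active; };

static struct channels *current_active_channels;

/* Open fresh channels. This is the only step that may fail. */
static struct channels *open_channels(const struct params *p)
{
	struct channels *chs;

	if (p->num_channels <= 0 || p->ring_size <= 0)
		return NULL;            /* assumed validity check */
	chs = malloc(sizeof(*chs));
	if (!chs)
		return NULL;
	chs->params = *p;
	chs->active = false;
	return chs;
}

static void activate_channels(struct channels *chs)   { chs->active = true; }
static void deactivate_channels(struct channels *chs) { chs->active = false; }
static void close_channels(struct channels *chs)      { free(chs); }

/* Returns 0 on success, -1 if the new configuration could not be opened. */
static int make_new_config(const struct params *new_params)
{
	struct channels *old_channels = current_active_channels;
	struct channels *new_channels = open_channels(new_params);

	if (!new_channels)
		return -1;              /* old channels are untouched and still active */

	if (old_channels)
		deactivate_channels(old_channels);
	activate_channels(new_channels);
	if (old_channels)
		close_channels(old_channels);
	current_active_channels = new_channels;

	assert(current_active_channels->active);
	return 0;
}
```

The point of the ordering is that every step after `open_channels()` cannot fail, so a failed reconfiguration never leaves the device without working channels.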
      
      At the top of this series, we change the following flows to be fail-safe:
      ethtool:
         - ring parameters
         - coalesce parameters
         - tx copy break parameters
         - cqe compressing/moderation mode setting (priv flags)
      ndos:
         - tc setup
         - set features: LRO
         - change mtu
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cc628c96
    • D
      Merge branch 'bond-link-status-fixes' · 95ed0edd
      David S. Miller committed
      Mahesh Bandewar says:
      
      ====================
      link-status fixes for mii-monitoring
      
      The mii monitoring is divided into two phases - inspect and commit. The
      inspect phase technically should not make any changes to the state and
      should defer them to the commit phase. However, we detected link state
      inconsistencies on several machines and discovered that they are the
      result of inconsistent updates to link states and the assumption that
      you *always* get the rtnl-mutex. In reality, when trylock() fails to
      acquire the rtnl-mutex, the commit phase is postponed until the next
      mii-mon run. In the next round, because of the state change performed in
      the previous inspect run, no change is detected and the commit phase is
      skipped. This results in an inconsistent state until the next link event
      happens (if it ever happens).
      
      During the commit phase, it's assumed that the speed and duplex fetch
      is always successful, but that's not always the case. However, the
      slave state is marked UP irrespective of the result of the speed /
      duplex fetch operation. If the fetch results in insane values for either
      of these two fields, then keeping the internal link state UP is not
      going to provide fruitful results either.
      
      Please see the individual patches for more details.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      95ed0edd
    • M
      e292dcae
    • M
      bonding: correctly update link status during mii-commit phase · b5bf0f5b
      Mahesh Bandewar committed
      bond_miimon_commit() marks the link UP after attempting to get the speed
      and duplex settings for the link. There is a possibility that
      bond_update_speed_duplex() could fail. This is another place where it
      could result in an inconsistent bonding link state.
      
      With this patch the link will be marked UP, and processed further, only
      if the retrieved speed and duplex values are sane.
      Signed-off-by: Mahesh Bandewar <maheshb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b5bf0f5b
    • M
      bonding: make speed, duplex setting consistent with link state · c4adfc82
      Mahesh Bandewar committed
      bond_update_speed_duplex() retrieves speed and duplex settings. There
      is a possibility of failure in retrieving these values, but the caller
      has to assume it's always successful. This leads to inconsistent
      slave link settings. If these (speed, duplex) values cannot be
      retrieved, then keeping the link UP causes problems.
      
      The updated bond_update_speed_duplex() returns 0 on success if it
      retrieves sane values for speed and duplex. On failure it returns 1
      and marks the link down.
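
The return-value contract described above can be sketched in plain C. This is a simplified model, not the bonding driver code: the `struct slave` layout, the `DEMO_*` constants (chosen to mirror ethtool's unknown-speed and duplex conventions) and the idea of passing the reported values as parameters are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum link_state { LINK_DOWN, LINK_UP };

/* Hypothetical, simplified slave state. */
struct slave {
	uint32_t speed;
	uint8_t duplex;
	enum link_state link;
};

#define DEMO_SPEED_UNKNOWN  ((uint32_t)-1)
#define DEMO_DUPLEX_HALF    0x00
#define DEMO_DUPLEX_FULL    0x01
#define DEMO_DUPLEX_UNKNOWN 0xff

/*
 * Store the reported speed/duplex if they are sane and return 0.
 * Otherwise record "unknown", mark the link down and return 1.
 */
static int update_speed_duplex(struct slave *s, uint32_t speed, uint8_t duplex)
{
	assert(s != NULL);

	if (speed == 0 || speed == DEMO_SPEED_UNKNOWN ||
	    (duplex != DEMO_DUPLEX_HALF && duplex != DEMO_DUPLEX_FULL)) {
		s->speed = DEMO_SPEED_UNKNOWN;
		s->duplex = DEMO_DUPLEX_UNKNOWN;
		s->link = LINK_DOWN;
		return 1;
	}

	s->speed = speed;
	s->duplex = duplex;
	return 0;
}

/* Commit-phase helper: the link goes UP only when the update succeeded. */
static void commit_link_up(struct slave *s, uint32_t speed, uint8_t duplex)
{
	if (update_speed_duplex(s, speed, duplex) == 0)
		s->link = LINK_UP;
}
```

With this shape the caller can no longer end up with a link that is UP while its speed and duplex are unknown.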
      Signed-off-by: Mahesh Bandewar <maheshb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c4adfc82
    • M
      bonding: improve link-status update in mii-monitoring · de77ecd4
      Mahesh Bandewar committed
      The primary issue is that the mii-inspect phase updates the link state
      and expects the changes to be committed during the mii-commit phase. If
      it fails to acquire the rtnl-mutex after the inspect phase, the commit
      phase (bond_miimon_commit) doesn't get to run. This partially updated
      state stays and makes the internal state inconsistent.
      
      e.g. setup bond0 => slaves: eth1, eth2
      eth1 goes DOWN -> UP
         mii_monitor()
      	mii-inspect()
      	    bond_set_slave_link_state(eth1, UP, DontNotify)
      	rtnl_trylock() <- fails!
      
      Next mii-monitor round
      eth1: No change
         mii_monitor()
      	mii-inspect()
      	    eth1->link == current-status (ethtool_ops->get_link)
      	    no-change-detected
      
      End result:
          eth1:
            Link = BOND_LINK_UP
            Speed = 0xfffff  [SpeedUnknown]
            Duplex = 0xff    [DuplexUnknown]
      
      This doesn't always happen but for some unlucky machines in a large set
      of machines it creates problems.
      
      The fix for this is to avoid making changes during inspect phase and
      postpone them until acquiring the rtnl-mutex / invoking commit phase.
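
A minimal single-threaded model of that fix is sketched below. It is not the bonding code: the `proposed` field, the `NO_CHANGE` marker and the `lock_acquired` flag (standing in for the result of rtnl_trylock()) are assumptions used to show why a failed lock attempt no longer loses the link event.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum link_state { LINK_DOWN, LINK_UP };
#define NO_CHANGE (-1)

/* Hypothetical, simplified slave state. */
struct slave {
	enum link_state link;   /* committed state, changed only in commit() */
	int proposed;           /* proposed new state, or NO_CHANGE */
};

/* Inspect phase: only record a proposal. Returns nonzero if a change is pending. */
static int inspect(struct slave *s, bool carrier_up)
{
	enum link_state now = carrier_up ? LINK_UP : LINK_DOWN;

	s->proposed = (now != s->link) ? (int)now : NO_CHANGE;
	return s->proposed != NO_CHANGE;
}

/* Commit phase: apply the proposal. Must run with the lock held. */
static void commit(struct slave *s)
{
	if (s->proposed != NO_CHANGE) {
		s->link = (enum link_state)s->proposed;
		s->proposed = NO_CHANGE;
	}
}

/* One monitor round. Returns true if the commit phase ran. */
static bool monitor_round(struct slave *s, bool carrier_up, bool lock_acquired)
{
	assert(s != NULL);

	if (!inspect(s, carrier_up))
		return false;           /* nothing to do */
	if (!lock_acquired)
		return false;           /* committed state untouched; retry next round */
	commit(s);
	return true;
}
```

Because the inspect phase compares against the committed state and leaves it alone, the next round still sees the difference and retries the commit.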
      Signed-off-by: Mahesh Bandewar <maheshb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      de77ecd4
    • M
      bonding: split bond_set_slave_link_state into two parts · f307668b
      Mahesh Bandewar committed
      Split the function into two phases, (a) propose and (b) commit, without
      changing the semantics of the original API.
      Signed-off-by: Mahesh Bandewar <maheshb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f307668b
    • D
      Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue · 205ed44e
      David S. Miller committed
      Jeff Kirsher says:
      
      ====================
      40GbE Intel Wired LAN Driver Updates 2017-03-27
      
      This series contains updates to i40e and i40evf only.
      
      Alex updates the driver code so that we can do bulk updates of the page
      reference count instead of just incrementing it by one reference at a
      time.  Fixed an issue where we were not resetting skb back to NULL when
      we have freed it.  Cleaned up i40e_process_skb_fields() to align with
      other Intel drivers.  Removed FCoE code, since it is not supported in any
      of the Fortville/Fortpark hardware, so there is not much point in carrying
      the code around, especially if it is broken and untested.
      
      Harshitha fixes a bug in the driver where the calculation of the RSS size
      was not taking into account the number of traffic classes enabled.
      
      Robert fixes potential race conditions during VF reset: IOMMU DMAR
      faults caused by VF hardware, and the case where the OS initiates a VF
      reset and modifies the VF's settings before the reset has finished.
      
      Bimmy removes a delay that is no longer needed, since it was only needed
      for preproduction hardware.
      
      Colin King fixes a null pointer dereference, where VSI was being
      dereferenced before the VSI NULL check.
      
      Jake fixes an issue with the recent addition of the "client code" to the
      driver, where we attempt to use an uninitialized variable, so correctly
      initialize the params variable by calling i40e_client_get_params().
      
      v2: dropped patch 5 of the original series from Carolyn since we need
          more documentation and justification for the added delay, so Carolyn
          is taking the time to update the patch before we re-submit it for
          kernel inclusion.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      205ed44e
    • J
      i40e: initialize params before notifying of l2_param_changes · 7be147dc
      Jacob Keller committed
      Fix a bug, probably caused by some mis-merging, associated with commits
      d7ce6422 ("i40e: don't check params until after checking for client
      instance", 2017-02-09) and 3140aa9a78c9 ("i40e: KISS the client
      interface", 2017-03-14).
      
      The first commit tried to move the initialization of the params
      structure so that we didn't bother doing this if we didn't have a client
      interface. You can already see that it looks fishy because of the
      indentation. The second commit refactors a bunch of the interface, and
      incorrectly drops the params initialization.
      
      I believe what occurred is that internally the two patches were
      re-ordered, and the merge conflicts as a result were performed
      incorrectly.
      
      Fix the use of an uninitialized variable by correctly initializing the
      params variable via i40e_client_get_params().
      Reported-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      7be147dc
    • C
      i40evf: dereference VSI after VSI has been null checked · 703ba885
      Colin Ian King committed
      VSI is being dereferenced before the VSI null check; if VSI is
      null we end up with a null pointer dereference.  Fix this by
      performing the VSI dereference after the VSI null check.  Also remove
      the need for using adapter by using vsi->back->cinst.
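
The pattern being fixed can be illustrated with a short C sketch. The structures below are simplified, hypothetical stand-ins that only borrow the `back` and `cinst` field names from the message above, and `get_cinst()` is an invented helper; the assertion on `vsi->back` is an assumed invariant of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver structures. */
struct client_instance { int id; };
struct adapter { struct client_instance *cinst; };
struct vsi { struct adapter *back; };

/*
 * The buggy shape reads vsi->back in a declaration at the top of the
 * function, before the "if (!vsi)" test. Here nothing is read through
 * vsi until it is known to be non-NULL.
 */
static struct client_instance *get_cinst(struct vsi *vsi)
{
	if (!vsi)
		return NULL;

	/* Assumed invariant in this sketch: a valid VSI has a back pointer. */
	assert(vsi->back != NULL);

	/* No separate adapter variable is needed. */
	return vsi->back->cinst;
}
```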
      
      Detected by CoverityScan, CID#1419696, CID#1419697
      ("Dereference before null check")
      
      Fixes: ed0e894d ("i40evf: add client interface")
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      703ba885
    • A
      i40e: Drop FCoE code that always evaluates to false or 0 · c76cb6ed
      Alexander Duyck committed
      Since FCoE isn't supported by the i40e products there isn't much point in
      carrying around code that will always evaluate to false. This patch goes
      through and strips out the code in several spots so that we don't go around
      carrying variables and/or code that is always going to evaluate to false or
      0.
      
      Change-ID: I39d1d779c66c638b75525839db2b6208fdc809d7
      Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      c76cb6ed
    • A
      i40e: Drop FCoE code from core driver files · 9eed69a9
      Alexander Duyck committed
      Looking over the code for FCoE it looks like the Rx path has been broken at
      least since the last major Rx refactor almost a year ago.  It seems like
      FCoE isn't supported for any of the Fortville/Fortpark hardware so there
      isn't much point in carrying the code around, especially if it is broken
      and untested.
      
      Change-ID: I892de8fa551cb129ce2361e738ff82ce55fa229e
      Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      9eed69a9
    • A
      i40e/i40evf: Clean-up process_skb_fields · a5b268e4
      Alexander Duyck committed
      This is a minor clean-up to make the i40e/i40evf process_skb_fields
      function look a little more like what we have in igb.  The Rx checksum
      function called out a need for skb->protocol but I can't see where it
      actually needs it.  I am assuming this is something that was likely
      refactored out some time ago as the Rx checksum code has gone through a few
      rewrites.
      
      Change-ID: I0b4668a34d90b61b66ded7c7c26e19a3e2d06251
      Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      a5b268e4
    • B
      i40e: removed no longer needed delays · 0a25b731
      Bimmy Pujari committed
      Removed delays that are no longer needed.  They were needed at the
      preproduction stage but are not needed now.
      Signed-off-by: Bimmy Pujari <bimmy.pujari@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      0a25b731
    • R
      i40e: Fixed race conditions in VF reset · beff3e9d
      Robert Konklewski committed
      First, this patch eliminates IOMMU DMAR Faults caused by VF hardware.
      This is done by enabling VF hardware only after VSI resources are
      freed. Otherwise, hardware could DMA into memory that is being (or has
      just been) freed.
      
      Then, the VF driver is activated only after VSI resources have been
      reallocated. That's because the VF driver can request resources
      immediately after it's activated. So they need to be ready at that
      point.
      
      The second race condition happens when the OS initiates a VF reset,
      and then, before it has finished, modifies the VF's settings by changing
      its MAC, VLAN ID, bandwidth allocation, anti-spoof checking, etc. These
      functions need to be blocked while the VF is undergoing reset.
      Otherwise, they could operate on data structures that had just been
      freed or not yet fully initialized.
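
The second fix amounts to gating configuration calls on the VF's state. The sketch below is a single-threaded illustration only: `struct vf`, its `initialized` flag, the helper names and the choice of `-EAGAIN` are assumptions, and a real driver would also need proper synchronization around the flag.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified VF state. */
struct vf {
	bool initialized;   /* false while a reset is in progress */
	int vlan_id;
};

/* Reset starts: mark the VF unusable before its resources are freed. */
static void vf_reset_begin(struct vf *vf)
{
	vf->initialized = false;
}

/* Reset ends: resources are reallocated, then the VF is marked usable again. */
static void vf_reset_end(struct vf *vf)
{
	vf->vlan_id = 0;
	vf->initialized = true;
}

/* Configuration entry point: refuse to touch a VF that is mid-reset. */
static int vf_set_vlan(struct vf *vf, int vlan_id)
{
	assert(vlan_id >= 0 && vlan_id < 4096);

	if (!vf->initialized)
		return -EAGAIN;     /* caller may retry after the reset completes */

	vf->vlan_id = vlan_id;
	return 0;
}
```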
      
      Change-ID: I43ba5a7ae2c9a1cce3911611ffc4598ae33ae3ff
      Signed-off-by: Robert Konklewski <robertx.konklewski@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      beff3e9d
    • A
      i40e/i40evf: Fix use after free in Rx cleanup path · 741b8b83
      Alexander Duyck committed
      We need to reset skb back to NULL when we have freed it in the Rx cleanup
      path.  I found one spot where this wasn't occurring so this patch fixes it.
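
Why the reset matters can be shown with a user-space model of an Rx cleanup loop. Everything here is hypothetical (`struct buf` stands in for an skb, and the descriptor and ring layouts are invented); it only demonstrates that a local pointer which is reused across loop iterations must be cleared after the buffer is freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct buf { int len; };                 /* hypothetical stand-in for an skb */
struct rx_desc { int len; bool eop; bool error; };

struct rx_ring {
	struct buf *skb;    /* partially built packet carried to the next poll */
	int dropped;
};

/* Process n descriptors; returns the number of packets delivered. */
static int rx_clean(struct rx_ring *ring, const struct rx_desc *descs, int n)
{
	struct buf *skb = ring->skb;
	int delivered = 0;

	assert(n >= 0);
	for (int i = 0; i < n; i++) {
		if (!skb) {
			skb = malloc(sizeof(*skb));
			if (!skb)
				break;
			skb->len = 0;
		}
		skb->len += descs[i].len;

		if (!descs[i].eop)
			continue;           /* packet continues in the next descriptor */

		if (descs[i].error) {
			free(skb);
			skb = NULL;         /* without this, the next iteration writes to freed memory */
			ring->dropped++;
			continue;
		}

		delivered++;
		free(skb);                  /* stands in for handing the packet to the stack */
		skb = NULL;
	}

	ring->skb = skb;                    /* NULL or a valid partial packet, never dangling */
	return delivered;
}
```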
      
      Change-ID: Iaca68934200732cd4a63eb0bd83b539c95f8c4dd
      Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      741b8b83
    • H
      i40e: fix configuration of RSS table with DCB · f25571b5
      Harshitha Ramamurthy committed
      There exists a bug in the driver where the calculation of the
      RSS size was not taking into account the number of traffic classes
      enabled. This patch factors in the traffic classes both in
      the initial configuration of the table as well as reconfiguration.
      
      Change-ID: I34dcd345ce52faf1d6b9614bea28d450cfd5f621
      Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      f25571b5
    • A
      i40e/i40evf: Update code to better handle incrementing page count · 1793668c
      Alexander Duyck committed
      Update the driver code so that we do bulk updates of the page reference
      count instead of just incrementing it by one reference at a time.  The
      advantage to doing this is that we cut down on atomic operations and
      this in turn should give us a slight improvement in cycles per packet.
      In addition if we eventually move this over to using build_skb the gains
      will be more noticeable.
      
      I also found and fixed a store forwarding stall from where we were
      assigning "*new_buff = *old_buff".  By breaking it up into individual
      copies we can avoid this and as a result the performance is slightly
      improved.
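
One way to picture the bulk update is a per-buffer "bias" counter, modelled here with C11 atomics. This is a sketch under assumptions, not the i40e code: `struct page`, `struct rx_buffer`, `REF_BATCH` and all function names are invented, and the `atomic_adds` counter exists only to make the saving visible. The field-by-field copy in `rx_buffer_move()` mirrors the second paragraph; whether it actually avoids a stall depends on the compiler and CPU.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define REF_BATCH 1024                   /* assumed batch size */

struct page { atomic_int refcount; };    /* hypothetical stand-in */

struct rx_buffer {
	struct page *page;
	int bias;           /* references the driver still holds in refcount */
	int atomic_adds;    /* demo only: how many atomic adds were issued */
};

static void rx_buffer_init(struct rx_buffer *b, struct page *p)
{
	atomic_init(&p->refcount, 1);
	b->page = p;
	b->bias = 1;
	b->atomic_adds = 0;
}

/* Hand one page reference to the stack; usually just a local decrement. */
static void rx_buffer_give_ref(struct rx_buffer *b)
{
	assert(b->bias >= 1);
	if (b->bias == 1) {
		/* Running low: take a whole batch with a single atomic add. */
		atomic_fetch_add(&b->page->refcount, REF_BATCH);
		b->bias += REF_BATCH;
		b->atomic_adds++;
	}
	b->bias--;
}

/* The stack drops a reference it was given. */
static void page_put(struct page *p)
{
	atomic_fetch_sub(&p->refcount, 1);
}

/* The page can be recycled once only the driver's own references remain. */
static bool rx_buffer_can_reuse(const struct rx_buffer *b)
{
	return atomic_load(&b->page->refcount) == b->bias;
}

/* Copy member by member instead of "*new_buff = *old_buff". */
static void rx_buffer_move(struct rx_buffer *new_buff, const struct rx_buffer *old_buff)
{
	new_buff->page = old_buff->page;
	new_buff->bias = old_buff->bias;
	new_buff->atomic_adds = old_buff->atomic_adds;
}
```

In this model, handing out three references costs one atomic add instead of three, and the gap widens as more references are handed out per batch.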
      
      Change-ID: I1d3880dece4133eca3c32423b04a5467321ccc52
      Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      1793668c