1. 22 4月, 2016 1 次提交
  2. 19 3月, 2016 1 次提交
  3. 03 3月, 2016 1 次提交
  4. 02 3月, 2016 2 次提交
  5. 17 2月, 2016 1 次提交
    • H
      net/mlx4_core: Set UAR page size to 4KB regardless of system page size · 85743f1e
      Huy Nguyen 提交于
      problem description:
      
      The current code sets UAR page size equal to system page size.
      The ConnectX-3 and ConnectX-3 Pro HWs require minimum 128 UAR pages.
      The mlx4 kernel drivers are not loaded if there is less than 128 UAR pages.
      
      solution:
      
      Always set UAR page to 4KB. This allows more UAR pages if the OS
      has PAGE_SIZE larger than 4KB. For example, PowerPC kernel use 64KB
      system page size, with 4MB uar region, there are 4MB/2/64KB = 32
      uars (half for uar, half for blueflame). This does not meet minimum 128
      UAR pages requirement. With 4KB UAR page, there are 4MB/2/4KB = 512 uars
      which meet the minimum requirement.
      
      Note that only codes in mlx4_core that deal with firmware know that uar
      page size is 4KB. Codes that deal with usr page in cq and qp context
      (mlx4_ib, mlx4_en and part of mlx4_core) still have the same assumption
      that uar page size equals to system page size.
      
      Note that with this implementation, on 64KB system page size kernel, there
      are 16 uars per system page but only one uars is used. The other 15
      uars are ignored because of the above assumption.
      
      Regarding SR-IOV, mlx4_core in hypervisor will set the uar page size
      to 4KB and mlx4_core code in virtual OS will obtain the uar page size from
      firmware.
      
      Regarding backward compatibility in SR-IOV, if hypervisor has this new code,
      the virtual OS must be updated. If hypervisor has old code, and the virtual
      OS has this new code, the new code will be backward compatible with the
      old code. If the uar size is big enough, this new code in VF continues to
      work with 64 KB uar page size (on PowerPc kernel). If the uar size does not
      meet 128 uars requirement, this new code not loaded in VF and print the same
      error message as the old code in Hypervisor.
      Signed-off-by: NHuy Nguyen <huyn@mellanox.com>
      Reviewed-by: NYishai Hadas <yishaih@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      85743f1e
  6. 07 12月, 2015 1 次提交
  7. 16 11月, 2015 1 次提交
  8. 15 10月, 2015 1 次提交
    • J
      net/mlx4_core: Replace VF zero mac with random mac in mlx4_core · 2b3ddf27
      Jack Morgenstein 提交于
      By design, when no default MAC addresses are set in the Hypervisor for VFs,
      the VFs are passed zero-macs. When such a MAC is received by the VF, it
      generates a random MAC address and registers that MAC address
      with the Hypervisor.
      
      This random mac generation is currently done in the mlx4_en module.
      There is a problem, though, if the mlx4_ib module is loaded by a VF before
      the mlx4_en module. In this case, for RoCE, mlx4_ib will see the un-replaced
      zero-mac and register that zero-mac as part of QP1 initialization.
      
      Having a zero-mac in the port's MAC table creates problems for a
      Baseboard Management Console. The BMC occasionally sends packets with a
      zero-mac destination MAC. If there is a zero-mac present in the port's
      MAC table, the FW will send such BMC packets to the host driver rather than
      to the wire, and BMC will stop working.
      
      To address this problem, we move the replacement of zero-mac addresses
      with random-mac addresses to procedure mlx4_slave_cap(), which is part of the
      driver startup for VFs, and is before activation of mlx4_ib and mlx4_en.
      As a result, zero-mac addresses will never be registered in the port MAC table
      by the driver.
      
      In addition, when mlx4_en does initialize the net device, it needs to set
      the NET_ADDR_RANDOM flag in the netdev structure if the address was
      randomly generated. This is done so that udev on the VM does not create
      a new device name after each VF probe (VM boot and such). To accomplish this,
      we add a per-port flag in mlx4_dev which gets set whenever mlx4_core replaces
      a zero-mac with a randomly-generated mac. This flag is examined when mlx4_en
      initializes the net-device.
      
      Fix was suggested by Matan Barak <matanb@mellanox.com>
      Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2b3ddf27
  9. 08 10月, 2015 1 次提交
  10. 28 8月, 2015 1 次提交
  11. 28 7月, 2015 1 次提交
  12. 27 7月, 2015 1 次提交
  13. 09 7月, 2015 1 次提交
  14. 16 6月, 2015 6 次提交
  15. 13 6月, 2015 1 次提交
  16. 04 6月, 2015 2 次提交
  17. 31 5月, 2015 2 次提交
    • I
      net/mlx4_core: Move affinity hints to mlx4_core ownership · de161803
      Ido Shamay 提交于
      Now that EQs management is in the sole responsibility of mlx4_core,
      the IRQ affinity hints configuration should be in its hands as well.
      request_irq is called only once by the first consumer (maybe mlx4_ib),
      so mlx4_en passes the affinity mask too late. We also need to request
      vectors according to the cores we want to run on.
      
      mlx4_core distribution of IRQs to cores is straight forward,
      EQ(i)->IRQ will set affinity hint to core i.
      Consumers need to request EQ vectors, according to their cores
      considerations (NUMA).
      Signed-off-by: NIdo Shamay <idos@mellanox.com>
      Signed-off-by: NMatan Barak <matanb@mellanox.com>
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      de161803
    • M
      net/mlx4: Add EQ pool · c66fa19c
      Matan Barak 提交于
      Previously, mlx4_en allocated EQs and used them exclusively.
      This affected RoCE performance, as applications which are
      events sensitive were limited to use only the legacy EQs.
      
      Change that by introducing an EQ pool. This pool is managed
      by mlx4_core. EQs are assigned to ports (when there are limited
      number of EQs, multiple ports could be assigned to the same EQs).
      
      An exception to this rule is the ASYNC EQ which handles various events.
      
      Legacy EQs are completely removed as all EQs could be shared.
      
      When a consumer (mlx4_ib/mlx4_en) requests an EQ, it asks for
      EQ serving on a specific port. The core driver calculates which
      EQ should be assigned to that request.
      
      Because IRQs are shared between IB and Ethernet modules, their
      names only include the PCI device BDF address.
      Signed-off-by: NMatan Barak <matanb@mellanox.com>
      Signed-off-by: NIdo Shamay <idos@mellanox.com>
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c66fa19c
  18. 25 5月, 2015 1 次提交
  19. 16 4月, 2015 2 次提交
  20. 03 4月, 2015 3 次提交
  21. 19 3月, 2015 1 次提交
  22. 05 2月, 2015 2 次提交
  23. 28 1月, 2015 2 次提交
  24. 26 1月, 2015 4 次提交
    • Y
      net/mlx4_core: Reset flow activation upon SRIOV fatal command cases · 0cd93027
      Yishai Hadas 提交于
      When SRIOV commands are executed over the comm-channel and get
      a fatal error (e.g. timeout, closing command failure) the VF enters
      into error state and reset flow is activated.
      
      To be able to recognize whether the failure was on a closing command, the
      operational code for the given VHCR command is used. Once the device entered
      into an error state we prevent redundant error messages from being printed.
      Signed-off-by: NYishai Hadas <yishaih@mellanox.com>
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0cd93027
    • Y
      net/mlx4_core: Enable device recovery flow with SRIOV · 55ad3592
      Yishai Hadas 提交于
      In SRIOV, both the PF and the VF may attempt device recovery whenever they
      assume that the device is not functioning.  When the PF driver resets the
      device, the VF should detect this and attempt to reinitialize itself.
      
      The VF must be able to reset itself under all circumstances, even
      if the PF is not responsive.
      
      The VF shall reset itself in the following cases:
      
      1. Commands are not processed within reasonable time over the communication channel.
      This is done considering device state and the correct return code based on
      the command as was done in the native mode, done in the next patch.
      
      2. The VF driver receives an internal error event reported by the PF on the
      communication channel. This occurs when the PF driver resets the device or
      when VF is out of sync with the PF.
      
      Add 'VF reset' capability, which allows the VF to reinitialize itself even when the
      PF is not responsive.
      
      As PF and VF may run their reset flow simulantanisly, there are several cases
      that are handled:
      - Prevent freeing VF resources upon FLR, when PF is in its unloading stage.
      - Prevent PF getting VF commands before it has finished initializing its resources.
      - Upon VF startup, check that comm-channel is online before sending
        commands to the PF and getting timed-out.
      Signed-off-by: NYishai Hadas <yishaih@mellanox.com>
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      55ad3592
    • Y
      net/mlx4_core: Handle AER flow properly · 2ba5fbd6
      Yishai Hadas 提交于
      Fix AER callbacks to work properly, it includes:
      - Refractoring AER to be aligned with Reset flow support.
      - Sync with concurrent catas flow.
      
      In addition, fix the shutdown PCI callback to sync with
      concurrent catas flow.
      Signed-off-by: NYishai Hadas <yishaih@mellanox.com>
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2ba5fbd6
    • Y
      net/mlx4_core: Manage interface state for Reset flow cases · c69453e2
      Yishai Hadas 提交于
      We need to manage interface state to sync between reset flow and some other
      relative cases such as remove_one. This has to be done to prevent certain
      races. For example in case software stack is down as a result of unload call,
      the remove_one should skip the unload phase.
      
      Implement the remove_one case, handling AER and other cases comes next.
      
      The interface can be up/down, upon remove_one, the state will include an extra
      bit indicating that the device is cleaned-up, forcing other tasks to finish
      before the final cleanup.
      Signed-off-by: NYishai Hadas <yishaih@mellanox.com>
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c69453e2