1. 17 2月, 2016 1 次提交
    • H
      net/mlx4_core: Set UAR page size to 4KB regardless of system page size · 85743f1e
      Huy Nguyen 提交于
      problem description:
      
      The current code sets UAR page size equal to system page size.
      The ConnectX-3 and ConnectX-3 Pro HWs require minimum 128 UAR pages.
      The mlx4 kernel drivers are not loaded if there is less than 128 UAR pages.
      
      solution:
      
      Always set UAR page to 4KB. This allows more UAR pages if the OS
      has PAGE_SIZE larger than 4KB. For example, PowerPC kernel use 64KB
      system page size, with 4MB uar region, there are 4MB/2/64KB = 32
      uars (half for uar, half for blueflame). This does not meet minimum 128
      UAR pages requirement. With 4KB UAR page, there are 4MB/2/4KB = 512 uars
      which meet the minimum requirement.
      
      Note that only codes in mlx4_core that deal with firmware know that uar
      page size is 4KB. Codes that deal with usr page in cq and qp context
      (mlx4_ib, mlx4_en and part of mlx4_core) still have the same assumption
      that uar page size equals to system page size.
      
      Note that with this implementation, on 64KB system page size kernel, there
      are 16 uars per system page but only one uars is used. The other 15
      uars are ignored because of the above assumption.
      
      Regarding SR-IOV, mlx4_core in hypervisor will set the uar page size
      to 4KB and mlx4_core code in virtual OS will obtain the uar page size from
      firmware.
      
      Regarding backward compatibility in SR-IOV, if hypervisor has this new code,
      the virtual OS must be updated. If hypervisor has old code, and the virtual
      OS has this new code, the new code will be backward compatible with the
      old code. If the uar size is big enough, this new code in VF continues to
      work with 64 KB uar page size (on PowerPc kernel). If the uar size does not
      meet 128 uars requirement, this new code not loaded in VF and print the same
      error message as the old code in Hypervisor.
      Signed-off-by: NHuy Nguyen <huyn@mellanox.com>
      Reviewed-by: NYishai Hadas <yishaih@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      85743f1e
  2. 20 1月, 2016 3 次提交
  3. 09 12月, 2015 1 次提交
  4. 22 10月, 2015 1 次提交
  5. 15 10月, 2015 1 次提交
    • J
      net/mlx4_core: Replace VF zero mac with random mac in mlx4_core · 2b3ddf27
      Jack Morgenstein 提交于
      By design, when no default MAC addresses are set in the Hypervisor for VFs,
      the VFs are passed zero-macs. When such a MAC is received by the VF, it
      generates a random MAC address and registers that MAC address
      with the Hypervisor.
      
      This random mac generation is currently done in the mlx4_en module.
      There is a problem, though, if the mlx4_ib module is loaded by a VF before
      the mlx4_en module. In this case, for RoCE, mlx4_ib will see the un-replaced
      zero-mac and register that zero-mac as part of QP1 initialization.
      
      Having a zero-mac in the port's MAC table creates problems for a
      Baseboard Management Console. The BMC occasionally sends packets with a
      zero-mac destination MAC. If there is a zero-mac present in the port's
      MAC table, the FW will send such BMC packets to the host driver rather than
      to the wire, and BMC will stop working.
      
      To address this problem, we move the replacement of zero-mac addresses
      with random-mac addresses to procedure mlx4_slave_cap(), which is part of the
      driver startup for VFs, and is before activation of mlx4_ib and mlx4_en.
      As a result, zero-mac addresses will never be registered in the port MAC table
      by the driver.
      
      In addition, when mlx4_en does initialize the net device, it needs to set
      the NET_ADDR_RANDOM flag in the netdev structure if the address was
      randomly generated. This is done so that udev on the VM does not create
      a new device name after each VF probe (VM boot and such). To accomplish this,
      we add a per-port flag in mlx4_dev which gets set whenever mlx4_core replaces
      a zero-mac with a randomly-generated mac. This flag is examined when mlx4_en
      initializes the net-device.
      
      Fix was suggested by Matan Barak <matanb@mellanox.com>
      Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2b3ddf27
  6. 31 8月, 2015 1 次提交
    • M
      IB/mlx4: Implement ib_device callbacks · e26be1bf
      Moni Shoua 提交于
      get_netdev: get the net_device on the physical port of the IB transport port. In
      port aggregation mode it is required to return the netdev of the active port.
      
      modify_gid: note for a change in the RoCE gid cache. Handle this by writing to
      the harsware GID table. It is possible that indexes in cahce and hardware tables
      won't match so a translation is required when modifying a QP or creating an
      address handle.
      Signed-off-by: NMoni Shoua <monis@mellanox.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      e26be1bf
  7. 28 7月, 2015 1 次提交
  8. 16 6月, 2015 3 次提交
  9. 13 6月, 2015 1 次提交
  10. 31 5月, 2015 1 次提交
    • M
      net/mlx4: Add EQ pool · c66fa19c
      Matan Barak 提交于
      Previously, mlx4_en allocated EQs and used them exclusively.
      This affected RoCE performance, as applications which are
      events sensitive were limited to use only the legacy EQs.
      
      Change that by introducing an EQ pool. This pool is managed
      by mlx4_core. EQs are assigned to ports (when there are limited
      number of EQs, multiple ports could be assigned to the same EQs).
      
      An exception to this rule is the ASYNC EQ which handles various events.
      
      Legacy EQs are completely removed as all EQs could be shared.
      
      When a consumer (mlx4_ib/mlx4_en) requests an EQ, it asks for
      EQ serving on a specific port. The core driver calculates which
      EQ should be assigned to that request.
      
      Because IRQs are shared between IB and Ethernet modules, their
      names only include the PCI device BDF address.
      Signed-off-by: NMatan Barak <matanb@mellanox.com>
      Signed-off-by: NIdo Shamay <idos@mellanox.com>
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c66fa19c
  11. 16 4月, 2015 2 次提交
  12. 03 4月, 2015 7 次提交
  13. 01 4月, 2015 2 次提交
  14. 19 3月, 2015 1 次提交
  15. 07 3月, 2015 1 次提交
  16. 10 2月, 2015 1 次提交
    • Y
      IB/mlx4: Reset flow support for IB kernel ULPs · 35f05dab
      Yishai Hadas 提交于
      The driver exposes interfaces that directly relate to HW state. Upon fatal
      error, consumers of these interfaces (ULPs) that rely on completion of
      all their posted work-request could hang, thereby introducing dependencies
      in shutdown order.  To prevent this from happening, we manage the
      relevant resources (CQs, QPs) that are used by the device. Upon a fatal error,
      we now generate simulated completions for outstanding WQEs that were not
      completed at the time the HW was reset.
      
      It includes invoking the completion event handler for all involved CQs so that
      the ULPs will poll those CQs. When polled we return simulated CQEs with
      IB_WC_WR_FLUSH_ERR return code enabling ULPs to clean up their resources and
      not wait forever for completions upon receiving remove_one.
      
      The above change requires an extra check in the data path to make sure that when
      device is in error state, the simulated CQEs will be returned and no further
      WQEs will be posted.
      Signed-off-by: NYishai Hadas <yishaih@mellanox.com>
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      35f05dab
  17. 05 2月, 2015 2 次提交
  18. 03 2月, 2015 1 次提交
  19. 28 1月, 2015 1 次提交
  20. 26 1月, 2015 6 次提交
  21. 31 12月, 2014 1 次提交
  22. 12 12月, 2014 1 次提交
    • M
      net/mlx4: Add support for A0 steering · 7d077cd3
      Matan Barak 提交于
      Add the required firmware commands for A0 steering and a way to enable
      that. The firmware support focuses on INIT_HCA, QUERY_HCA, QUERY_PORT,
      QUERY_DEV_CAP and QUERY_FUNC_CAP commands. Those commands are used
      to configure and query the device.
      
      The different A0 DMFS (steering) modes are:
      
      Static - optimized performance, but flow steering rules are
      limited. This mode should be choosed explicitly by the user
      in order to be used.
      
      Dynamic - this mode should be explicitly choosed by the user.
      In this mode, the FW works in optimized steering mode as long as
      it can and afterwards automatically drops to classic (full) DMFS.
      
      Disable - this mode should be explicitly choosed by the user.
      The user instructs the system not to use optimized steering, even if
      the FW supports Dynamic A0 DMFS (and thus will be able to use optimized
      steering in Default A0 DMFS mode).
      
      Default - this mode is implicitly choosed. In this mode, if the FW
      supports Dynamic A0 DMFS, it'll work in this mode. Otherwise, it'll
      work at Disable A0 DMFS mode.
      
      Under SRIOV configuration, when the A0 steering mode is enabled,
      older guest VF drivers who aren't using the RX QP allocation flag
      (MLX4_RESERVE_A0_QP) will get a QP from the general range and
      fail when attempting to register a steering rule. To avoid that,
      the PF context behaviour is changed once on A0 static mode, to
      require support for the allocation flag in VF drivers too.
      
      In order to enable A0 steering, we use log_num_mgm_entry_size param.
      If the value of the parameter is not positive, we treat the absolute
      value of log_num_mgm_entry_size as a bit field. Setting bit 2 of this
      bit field enables static A0 steering.
      Signed-off-by: NMatan Barak <matanb@mellanox.com>
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7d077cd3