1. 17 2月, 2016 1 次提交
    • H
      net/mlx4_core: Set UAR page size to 4KB regardless of system page size · 85743f1e
      Huy Nguyen 提交于
      problem description:
      
      The current code sets UAR page size equal to system page size.
      The ConnectX-3 and ConnectX-3 Pro HWs require minimum 128 UAR pages.
      The mlx4 kernel drivers are not loaded if there is less than 128 UAR pages.
      
      solution:
      
      Always set UAR page to 4KB. This allows more UAR pages if the OS
      has PAGE_SIZE larger than 4KB. For example, PowerPC kernel use 64KB
      system page size, with 4MB uar region, there are 4MB/2/64KB = 32
      uars (half for uar, half for blueflame). This does not meet minimum 128
      UAR pages requirement. With 4KB UAR page, there are 4MB/2/4KB = 512 uars
      which meet the minimum requirement.
      
      Note that only codes in mlx4_core that deal with firmware know that uar
      page size is 4KB. Codes that deal with usr page in cq and qp context
      (mlx4_ib, mlx4_en and part of mlx4_core) still have the same assumption
      that uar page size equals to system page size.
      
      Note that with this implementation, on 64KB system page size kernel, there
      are 16 uars per system page but only one uars is used. The other 15
      uars are ignored because of the above assumption.
      
      Regarding SR-IOV, mlx4_core in hypervisor will set the uar page size
      to 4KB and mlx4_core code in virtual OS will obtain the uar page size from
      firmware.
      
      Regarding backward compatibility in SR-IOV, if hypervisor has this new code,
      the virtual OS must be updated. If hypervisor has old code, and the virtual
      OS has this new code, the new code will be backward compatible with the
      old code. If the uar size is big enough, this new code in VF continues to
      work with 64 KB uar page size (on PowerPc kernel). If the uar size does not
      meet 128 uars requirement, this new code not loaded in VF and print the same
      error message as the old code in Hypervisor.
      Signed-off-by: NHuy Nguyen <huyn@mellanox.com>
      Reviewed-by: NYishai Hadas <yishaih@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      85743f1e
  2. 20 1月, 2016 6 次提交
  3. 09 12月, 2015 1 次提交
  4. 07 12月, 2015 1 次提交
  5. 22 10月, 2015 1 次提交
  6. 15 10月, 2015 1 次提交
    • J
      net/mlx4_core: Replace VF zero mac with random mac in mlx4_core · 2b3ddf27
      Jack Morgenstein 提交于
      By design, when no default MAC addresses are set in the Hypervisor for VFs,
      the VFs are passed zero-macs. When such a MAC is received by the VF, it
      generates a random MAC address and registers that MAC address
      with the Hypervisor.
      
      This random mac generation is currently done in the mlx4_en module.
      There is a problem, though, if the mlx4_ib module is loaded by a VF before
      the mlx4_en module. In this case, for RoCE, mlx4_ib will see the un-replaced
      zero-mac and register that zero-mac as part of QP1 initialization.
      
      Having a zero-mac in the port's MAC table creates problems for a
      Baseboard Management Console. The BMC occasionally sends packets with a
      zero-mac destination MAC. If there is a zero-mac present in the port's
      MAC table, the FW will send such BMC packets to the host driver rather than
      to the wire, and BMC will stop working.
      
      To address this problem, we move the replacement of zero-mac addresses
      with random-mac addresses to procedure mlx4_slave_cap(), which is part of the
      driver startup for VFs, and is before activation of mlx4_ib and mlx4_en.
      As a result, zero-mac addresses will never be registered in the port MAC table
      by the driver.
      
      In addition, when mlx4_en does initialize the net device, it needs to set
      the NET_ADDR_RANDOM flag in the netdev structure if the address was
      randomly generated. This is done so that udev on the VM does not create
      a new device name after each VF probe (VM boot and such). To accomplish this,
      we add a per-port flag in mlx4_dev which gets set whenever mlx4_core replaces
      a zero-mac with a randomly-generated mac. This flag is examined when mlx4_en
      initializes the net-device.
      
      Fix was suggested by Matan Barak <matanb@mellanox.com>
      Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2b3ddf27
  7. 31 8月, 2015 2 次提交
    • M
      IB/mlx4: Implement ib_device callbacks · e26be1bf
      Moni Shoua 提交于
      get_netdev: get the net_device on the physical port of the IB transport port. In
      port aggregation mode it is required to return the netdev of the active port.
      
      modify_gid: note for a change in the RoCE gid cache. Handle this by writing to
      the harsware GID table. It is possible that indexes in cahce and hardware tables
      won't match so a translation is required when modifying a QP or creating an
      address handle.
      Signed-off-by: NMoni Shoua <monis@mellanox.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      e26be1bf
    • M
      net/mlx4: Postpone the registration of net_device · 79857cd3
      Moni Shoua 提交于
      The mlx4 network driver was registered in the context of the 'add'
      function of the core driver (called when HW should be registered).
      This makes the netdev event NETDEV_REGISTER to be sent in a context
      where the answer to get_protocol_dev() callback returns NULL. This may
      be confusing to listeners of netdev events.
      This patch is a preparation to the patch that implements the
      get_netdev() callback in the IB/mlx4 driver.
      Signed-off-by: NMoni Shoua <monis@mellanox.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      79857cd3
  8. 28 7月, 2015 3 次提交
  9. 16 6月, 2015 4 次提交
  10. 13 6月, 2015 1 次提交
  11. 31 5月, 2015 1 次提交
    • M
      net/mlx4: Add EQ pool · c66fa19c
      Matan Barak 提交于
      Previously, mlx4_en allocated EQs and used them exclusively.
      This affected RoCE performance, as applications which are
      events sensitive were limited to use only the legacy EQs.
      
      Change that by introducing an EQ pool. This pool is managed
      by mlx4_core. EQs are assigned to ports (when there are limited
      number of EQs, multiple ports could be assigned to the same EQs).
      
      An exception to this rule is the ASYNC EQ which handles various events.
      
      Legacy EQs are completely removed as all EQs could be shared.
      
      When a consumer (mlx4_ib/mlx4_en) requests an EQ, it asks for
      EQ serving on a specific port. The core driver calculates which
      EQ should be assigned to that request.
      
      Because IRQs are shared between IB and Ethernet modules, their
      names only include the PCI device BDF address.
      Signed-off-by: NMatan Barak <matanb@mellanox.com>
      Signed-off-by: NIdo Shamay <idos@mellanox.com>
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c66fa19c
  12. 16 4月, 2015 2 次提交
  13. 03 4月, 2015 12 次提交
  14. 01 4月, 2015 2 次提交
  15. 19 3月, 2015 1 次提交
  16. 07 3月, 2015 1 次提交