1. 01 10月, 2012 9 次提交
    • R
      mlx4_core: Disable SENSE_PORT for multifunction devices · aadf4f3f
      Roland Dreier 提交于
      In the current driver, the SENSE_PORT firmware command is issued as a
      "wrapped" command, but the command handling code doesn't have a
      wrapper, so it will never do anything other than log an error message.
      The latest ConnectX-3 2.11.500 firmware reports the SENSE_PORT
      capability even in multi-function (SR-IOV) mode, so the driver will
      try to issue the command.
      
      At least until the driver has a proper wrapper for SENSE_PORT, make
      sure we disable the command for multi-function devices.
      Signed-off-by: NRoland Dreier <roland@purestorage.com>
      aadf4f3f
    • R
      mlx4_core: Clean up enabling of SENSE_PORT for older (ConnectX-1/-2) HCAs · ca3e57a5
      Roland Dreier 提交于
      Instead of having a hard-coded "PCI device ID != 0x1003" (which
      obviously breaks as newer devices with ID != 0x1003 become available),
      instead let's set a flag in our PCI device table for the older devices
      where we're supposed to force using SENSE_PORT.  This also avoids
      enabling SENSE_PORT for virtual functions by mistake.
      Signed-off-by: NRoland Dreier <roland@purestorage.com>
      ca3e57a5
    • R
      mlx4_core: Stash PCI ID driver_data in mlx4_priv structure · 839f1243
      Roland Dreier 提交于
      That way we can check flags later on, when we've finished with the
      pci_device_id structure.  Also convert the "is VF" flag to an enum:
      "Never do in the preprocessor what can be done in C."
      Signed-off-by: NRoland Dreier <roland@purestorage.com>
      839f1243
    • R
      mlx4_core: Fix crash on uninitialized priv->cmd.slave_sem · f3d4c89e
      Roland Dreier 提交于
      On an SR-IOV master device, __mlx4_init_one() calls mlx4_init_hca()
      before mlx4_multi_func_init().  However, for unlucky configurations,
      mlx4_init_hca() might call mlx4_SENSE_PORT() (via mlx4_dev_cap()), and
      that calls mlx4_cmd_imm() with MLX4_CMD_WRAPPED set.
      
      However, on a multifunction device with MLX4_CMD_WRAPPED, __mlx4_cmd()
      calls into mlx4_slave_cmd(), and that immediately tries to do
      
      	down(&priv->cmd.slave_sem);
      
      but priv->cmd.slave_sem isn't initialized until mlx4_multi_func_init()
      (which we haven't called yet).  The next thing it tries to do is access
      priv->mfunc.vhcr, but that hasn't been allocated yet.
      
      Fix this by moving the initialization of slave_sem and vhcr up into
      mlx4_cmd_init(). Also, since slave_sem is really just being used as a
      mutex, convert it into a slave_cmd_mutex.
      Signed-off-by: NRoland Dreier <roland@purestorage.com>
      f3d4c89e
    • R
      mlx4_core: Trivial cleanups to driver log messages · 84b1f153
      Roland Dreier 提交于
      Also put format string onto one line.
      Signed-off-by: NRoland Dreier <roland@purestorage.com>
      84b1f153
    • J
      mlx4: Modify proxy/tunnel QP mechanism so that guests do no calculations · 47605df9
      Jack Morgenstein 提交于
      Previously, the structure of a guest's proxy QPs followed the
      structure of the PPF special qps (qp0 port 1, qp0 port 2, qp1 port 1,
      qp1 port 2, ...).  The guest then did offset calculations on the
      sqp_base qp number that the PPF passed to it in QUERY_FUNC_CAP().
      
      This is now changed so that the guest does no offset calculations
      regarding proxy or tunnel QPs to use.  This change frees the PPF from
      needing to adhere to a specific order in allocating proxy and tunnel
      QPs.
      
      Now QUERY_FUNC_CAP provides each port individually with its proxy
      qp0, proxy qp1, tunnel qp0, and tunnel qp1 QP numbers, and these are
      used directly where required (with no offset calculations).
      
      To accomplish this change, several fields were added to the phys_caps
      structure for use by the PPF and by non-SR-IOV mode:
      
          base_sqpn -- in non-sriov mode, this was formerly sqp_start.
          base_proxy_sqpn -- the first physical proxy qp number -- used by PPF
          base_tunnel_sqpn -- the first physical tunnel qp number -- used by PPF.
      
      The current code in the PPF still adheres to the previous layout of
      sqps, proxy-sqps and tunnel-sqps.  However, the PPF can change this
      layout without affecting VF or (paravirtualized) PF code.
      Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: NRoland Dreier <roland@purestorage.com>
      47605df9
    • J
      mlx4: Paravirtualize Node Guids for slaves · afa8fd1d
      Jack Morgenstein 提交于
      This is necessary in order to support > 1 VF/PF in a VM for software
      that uses the node guid as a discriminator, such as librdmacm.
      Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: NRoland Dreier <roland@purestorage.com>
      afa8fd1d
    • J
      mlx4: Implement QP paravirtualization and maintain phys_pkey_cache for smp_snoop · 54679e14
      Jack Morgenstein 提交于
      This requires:
      
      1. Replacing the paravirtualized P_Key index (inserted by the guest)
         with the real P_Key index.
      
      2. For UD QPs, placing the guest's true source GID index in the
         address path structure mgid field, and setting the ud_force_mgid
         bit so that the mgid is taken from the QP context and not from the
         WQE when posting sends.
      
      3. For UC and RC QPs, placing the guest's true source GID index in the
         address path structure mgid field.
      
      4. For tunnel and proxy QPs, setting the Q_Key value reserved for that
         proxy/tunnel pair.
      
      Since not all the above adjustments occur in all the QP transitions,
      the QP transitions require separate wrapper functions.
      
      Secondly, initialize the P_Key virtualization table to its default
      values: Master virtualized table is 1-1 with the real P_Key table,
      guest virtualized table has P_Key index 0 mapped to the real P_Key
      index 0, and all the other P_Key indices mapped to the reserved
      (invalid) P_Key at index 127.
      
      Finally, add logic in smp_snoop for maintaining the phys_P_Key_cache.
      and generating events on the master only if a P_Key actually changed.
      Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: NRoland Dreier <roland@purestorage.com>
      54679e14
    • J
      mlx4_core: Add proxy and tunnel QPs to the reserved QP area · e2c76824
      Jack Morgenstein 提交于
      In addition, pass the proxy and tunnel QP numbers to slaves so the
      driver can perform special QP paravirtualization.
      Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: NRoland Dreier <roland@purestorage.com>
      e2c76824
  2. 08 9月, 2012 3 次提交
  3. 04 8月, 2012 1 次提交
  4. 26 7月, 2012 1 次提交
    • K
      mlx4: Add support for EEH error recovery · 57dbf29a
      Kleber Sacilotto de Souza 提交于
      Currently the mlx4 drivers don't have the necessary callbacks to
      implement EEH errors detection and recovery, so the PCI layer uses the
      probe and remove callbacks to try to recover the device after an error on
      the bus. However, these callbacks have race conditions with the internal
      catastrophic error recovery functions, which will also detect the error
      and this can cause the system to crash if both EEH and catas functions
      try to reset the device.
      
      This patch adds the necessary error recovery callbacks and makes sure
      that the internal catastrophic error functions will not try to reset the
      device in such scenarios. It also adds some calls to
      pci_channel_offline() to suppress reads/writes on the bus when the slot
      cannot accept I/O operations so we prevent unnecessary accesses to the
      bus and speed up the device removal.
      Signed-off-by: NKleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
      Acked-by: NShlomo Pongratz <shlomop@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      57dbf29a
  5. 12 7月, 2012 3 次提交
    • J
      mlx4: Put physical GID and P_Key table sizes in mlx4_phys_caps struct and paravirtualize them · 6634961c
      Jack Morgenstein 提交于
      To allow easy paravirtualization of P_Key and GID table sizes, keep
      paravirtualized sizes in mlx4_dev->caps, but save the actual physical
      sizes from FW in struct: mlx4_dev->phys_cap.
      
      In addition, in SR-IOV mode, do the following:
      
      1. Reduce reported P_Key table size by 1.
         This is done to reserve the highest P_Key index for internal use,
         for declaring an invalid P_Key in P_Key paravirtualization.
         We require a P_Key index which always contain an invalid P_Key
         value for this purpose (i.e., one which cannot be modified by
         the subnet manager).  The way to do this is to reduce the
         P_Key table size reported to the subnet manager by 1, so that
         it will not attempt to access the P_Key at index #127.
      
      2. Paravirtualize the GID table size to 1. Thus, each guest sees
         only a single GID (at its paravirtualized index 0).
      
      In addition, since we are paravirtualizing the GID table size to 1, we
      add paravirtualization of the master GID event here (i.e., we do not
      do ib_dispatch_event() for the GUID change event on the master, since
      its (only) GUID never changes).
      Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: NRoland Dreier <roland@purestorage.com>
      6634961c
    • J
      mlx4_core: Allow guests to have IB ports · 105c320f
      Jack Morgenstein 提交于
      Modify mlx4_dev_cap to allow IB support when SR-IOV is active.  Modify
      mlx4_slave_cap to set the "rdma-supported" bit in its flags area, and
      pass that to the guests (this is done in QUERY_FUNC_CAP and its
      wrapper).
      
      However, we don't activate IB support quite yet -- we leave the error
      return at the start of mlx4_ib_add in the mlx4_ib driver.
      
      In addition, set "protected fmr supported" bit to zero in the
      QUERY_FUNC_CAP wrapper.
      
      Finally, in the QUERY_FUNC_CAP wrapper, we needed to add code which
      checks for the port type (IB or Ethernet).  Previously, this was not
      an issue, since only Ethernet ports were supported.
      Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NRoland Dreier <roland@purestorage.com>
      105c320f
    • J
      mlx4_core: Implement mechanism for reserved Q_Keys · 396f2feb
      Jack Morgenstein 提交于
      The SR-IOV special QP tunneling mechanism uses proxy special QPs
      (instead of the real special QPs) for MADs on guests.  These proxy QPs
      send their packets to a "tunnel" QP owned by the master.  The master
      then forwards the MAD (after any required paravirtualization) to the
      real special QP, which sends out the MAD.
      
      For security reasons (i.e., to prevent guests from sending MADs to
      tunnel QPs belonging to other guests), each proxy-tunnel QP pair is
      assigned a unique, reserved, Q_Key.  These Q_Keys are available only
      for proxy and tunnel QPs -- if the guest tries to use these Q_Keys
      with other QPs, it will fail.
      
      This patch introduces a mechanism for reserving a block of 64K Q_Keys
      for proxy/tunneling use.
      
      The patch introduces also two new fields into mlx4_dev: base_sqpn and
      base_tunnel_sqpn.
      
      In SR-IOV mode, the QP numbers for the "real," proxy, and tunnel sqps
      are added to the reserved QPN area (so that they will not change).
      There are 8 special QPs per port in the HCA, and each of them is
      assigned both a proxy and a tunnel QP, for each VF and for the PF as
      well in SR-IOV mode.
      
      The QPNs for these QPs are arranged as follows:
       1. The real SQP numbers (8)
       2. The proxy SQPs (8 * (max number of VFs + max number of PFs)
       3. The tunnel SQPs (8 * (max number of VFs + max number of PFs)
      
      To support these QPs, two new fields are added to struct mlx4_dev:
      
        base_sqp:  this is the QP number of the first of the real SQPs
        base_tunnel_sqp: this is the qp number of the first qp in the tunnel
                         sqp region. (On guests, this is the first tunnel
                         sqp of the 8 which are assigned to that guest).
      
      In addition, in SR-IOV mode, sqp_start is the number of the first
      proxy SQP in the proxy SQP region.  (In guests, this is the first
      proxy SQP of the 8 which are assigned to that guest)
      
      Note that in non-SR-IOV mode, there are no proxies and no tunnels.
      In this case, sqp_start is set to sqp_base -- which minimizes code
      changes.
      Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NRoland Dreier <roland@purestorage.com>
      396f2feb
  6. 11 7月, 2012 1 次提交
  7. 08 7月, 2012 2 次提交
    • H
      {NET, IB}/mlx4: Add device managed flow steering firmware API · 0ff1fb65
      Hadar Hen Zion 提交于
      The driver is modified to support three operation modes.
      
      If supported by firmware use the device managed flow steering
      API, that which we call device managed steering mode. Else, if
      the firmware supports the B0 steering mode use it, and finally,
      if none of the above, use the A0 steering mode.
      
      When the steering mode is device managed, the code is modified
      such that L2 based rules set by the mlx4_en driver for Ethernet
      unicast and multicast, and the IB stack multicast attach calls
      done through the mlx4_ib driver are all routed to use the device
      managed API.
      
      When attaching rule using device managed flow steering API,
      the firmware returns a 64 bit registration id, which is to be
      provided during detach.
      
      Currently the firmware is always programmed during HCA initialization
      to use standard L2 hashing. Future work should be done to allow
      configuring the flow-steering hash function with common, non
      proprietary means.
      Signed-off-by: NHadar Hen Zion <hadarh@mellanox.co.il>
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0ff1fb65
    • H
      net/mlx4: Set steering mode according to device capabilities · c96d97f4
      Hadar Hen Zion 提交于
      Instead of checking the firmware supported steering mode in various
      places in the code, add a dedicated field in the mlx4 device capabilities
      structure which is written once during the initialization flow and read
      across the code.
      
      This also set the grounds for add new steering modes. Currently two modes
      are supported, and are named after the ConnectX HW versions A0 and B0.
      
      A0 steering uses mac_index, vlan_index and priority to steer traffic
      into pre-defined range of QPs.
      
      B0 steering uses Ethernet L2 hashing rules and is enabled only
      if the firmware supports both unicast and multicast B0 steering,
      
      The current steering modes are relevant for Ethernet traffic only,
      such that Infiniband steering remains untouched.
      Signed-off-by: NHadar Hen Zion <hadarh@mellanox.co.il>
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c96d97f4
  8. 05 7月, 2012 1 次提交
  9. 26 6月, 2012 1 次提交
  10. 01 6月, 2012 3 次提交
  11. 16 5月, 2012 3 次提交
  12. 09 5月, 2012 1 次提交
    • S
      mlx4_core: Add second capabilities flags field · b3416f44
      Shlomo Pongratz 提交于
      This patch adds a 64-bit flags2 features member to struct mlx4_dev to
      export further features of the hardware.  The original flags field
      tracks features whose support bits are advertised by the firmware in
      offsets 0x40 and 0x44 of the query device capabilities command.
      flags2 will track features whose support bits are scattered at various
      offsets.
      
      RSS support is the first feature to be exported through flags2.  RSS
      capabilities are located at offset 0x2e.  The size of the RSS
      indirection table is also given in this offset.
      Signed-off-by: NShlomo Pongratz <shlomop@mellanox.com>
      Signed-off-by: NRoland Dreier <roland@purestorage.com>
      b3416f44
  13. 13 3月, 2012 2 次提交
    • O
      mlx4_core: Allow dynamic MTU configuration for IB ports · 096335b3
      Or Gerlitz 提交于
      Set the MTU for IB ports in the driver instead of using the firmware
      default of 2KB (the driver defaults to 4KB).  Allow for dynamic mtu
      configuration through a new, per-port sysfs entry.
      
      Since there's a dependency between the port MTU and the max number of
      HW VLs the port can support, apply a mim/max approach, using a loop
      that goes down from the highest possible number of VLs to the lowest,
      using the firmware return status to know whether the requested number
      of VLs is possible with a given MTU.
      
      For now, as with the dynamic link type change / VPI support, the sysfs
      entry to change the mtu is exposed only when NOT running in SR-IOV
      mode.  To allow changing the MTU for the master in SR-IOV mode,
      primary-function-initiated FLR (Function Level Reset) needs to be
      implemented.
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NRoland Dreier <roland@purestorage.com>
      096335b3
    • R
      mlx4_core: Fix one more static exported function · e10903b0
      Roland Dreier 提交于
      Commit 22c8bff6 ("mlx4_core: Exported functions can't be static")
      fixed most of this up, but forgot about mlx4_is_slave_active().  Fix
      this one too.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NRoland Dreier <roland@purestorage.com>
      e10903b0
  14. 07 3月, 2012 1 次提交
    • O
      mlx4_core: Get rid of redundant ext_port_cap flags · 8154c07f
      Or Gerlitz 提交于
      While doing the work for commit a6f7feae ("IB/mlx4: pass SMP
      vendor-specific attribute MADs to firmware") we realized that the
      firmware would respond on all sorts of vendor-specific MADs.
      Therefore commit 97285b78 ("mlx4_core: Add extended port
      capabilities support") adds redundant code into the driver, since
      there's no real reaon to maintain the extended capabilities of the
      port, as they can be queried on demand (e.g the FDR10 capability).
      
      This patch reverts commit 97285b78 and removes the check for
      extended caps from the mlx4_ib driver port query flow.
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NRoland Dreier <roland@purestorage.com>
      8154c07f
  15. 26 2月, 2012 1 次提交
  16. 24 2月, 2012 1 次提交
  17. 22 2月, 2012 2 次提交
    • Y
      mlx4: Setting new port types after all interfaces unregistered · 3d8f9308
      Yevgeny Petrilin 提交于
      In port type change flow, need to set the new port types only after
      all interfaces have finished the unregister process.
      Otherwise, during unregister, one of the interfaces might issue a SET_PORT
      command with wrong port types, it can cause bad FW behavior.
      Signed-off-by: NYevgeny Petrilin <yevgenyp@mellanox.co.il>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3d8f9308
    • Y
      mlx4: Replacing pool_lock with mutex · 730c41d5
      Yevgeny Petrilin 提交于
      Under the spinlock we call request_irq(), which allocates memory with GFP_KERNEL,
      This causes the following trace when DEBUG_SPINLOCK is enabled, it can cause
      the following trace:
      
       BUG: spinlock wrong CPU on CPU#2, ethtool/2595
       lock: ffff8801f9cbc2b0, .magic: dead4ead, .owner: ethtool/2595, .owner_cpu: 0
       Pid: 2595, comm: ethtool Not tainted 3.0.18 #2
       Call Trace:
       spin_bug+0xa2/0xf0
       do_raw_spin_unlock+0x71/0xa0
       _raw_spin_unlock+0xe/0x10
       mlx4_assign_eq+0x12b/0x190 [mlx4_core]
       mlx4_en_activate_cq+0x252/0x2d0 [mlx4_en]
       ? mlx4_en_activate_rx_rings+0x227/0x370 [mlx4_en]
       mlx4_en_start_port+0x189/0xb90 [mlx4_en]
       mlx4_en_set_ringparam+0x29a/0x340 [mlx4_en]
       dev_ethtool+0x816/0xb10
       ? dev_get_by_name_rcu+0xa4/0xe0
       dev_ioctl+0x2b5/0x470
       handle_mm_fault+0x1cd/0x2d0
       sock_do_ioctl+0x5d/0x70
       sock_ioctl+0x79/0x2f0
       do_vfs_ioctl+0x8c/0x340
       sys_ioctl+0xa1/0xb0
       system_call_fastpath+0x16/0x1b
      
      Replacing with mutex, which is enough in this case.
      Signed-off-by: NYevgeny Petrilin <yevgenyp@mellanox.co.il>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      730c41d5
  18. 21 2月, 2012 1 次提交
  19. 15 2月, 2012 1 次提交
  20. 23 1月, 2012 2 次提交