1. 01 Oct, 2012 2 commits
    • J
      IB/mlx4: Initialize SR-IOV IB support for slaves in master context · fc06573d
      Committed by Jack Morgenstein
      Allocate SR-IOV paravirtualization resources and MAD demuxing contexts
      on the master.
      
      This has two parts.  The first part is to initialize the structures to
      contain the contexts.  This is done at master startup time in
      mlx4_ib_init_sriov().
      
      The second part is to actually create the tunneling resources required
      on the master to support a slave.  This is performed when the master
      detects that a slave has started up (MLX4_DEV_EVENT_SLAVE_INIT event
      generated when a slave initializes its comm channel).
      
      For the master, there is no such startup event, so it creates its own
      tunneling resources when it starts up.  In addition, the master also
      creates the real special QPs.  The ib_core layer on the master causes
      creation of proxy special QPs, since the master is also
      paravirtualized at the ib_core layer.
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • J
      IB/mlx4: SR-IOV IB context objects and proxy/tunnel SQP support · 1ffeb2eb
      Committed by Jack Morgenstein
      1. Introduce the basic SR-IOV paravirtualization context objects for
         multiplexing and demultiplexing MADs.
      2. Introduce support for the new proxy and tunnel QP types.
      
      This patch introduces the objects required by the master for managing
      QP paravirtualization for guests.
      
      struct mlx4_ib_sriov is created by the master only.
      It is a container for the following:
      
      1. All the info required by the PPF to multiplex and de-multiplex MADs
         (including those from the PF). (struct mlx4_ib_demux_ctx demux)
      2. All the info required to manage alias GUIDs (i.e., the GUID at
         index 0 that each guest perceives; this is not the GUID which is
         actually at index 0, but the GUID which is at index[<VF number>]
         in the physical table).
      3. Structures which are used to manage CM paravirtualization.
      4. Structures for managing the real special QPs when running in SR-IOV
         mode.  The real SQPs are controlled by the PPF in this case.  All
         SQPs created and controlled by the ib core layer are proxy SQPs.
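
The alias GUID mapping in item 2 can be sketched in a few lines of C. This is only an illustration of the index translation described above, not the driver's code: the table size, the struct, and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_ALIAS_GUIDS 128  /* hypothetical table size, for illustration only */

/* Hypothetical per-port physical GUID table. */
struct phys_guid_table {
    uint64_t guid[NUM_ALIAS_GUIDS];
};

/*
 * Map the GUID index a guest sees to an index in the physical table.
 * Only guest index 0 is modeled: it aliases physical entry [vf_number].
 * Returns -1 for anything this sketch does not cover.
 */
int guest_to_phys_guid_index(int vf_number, int guest_index)
{
    if (guest_index != 0)
        return -1;
    if (vf_number < 0 || vf_number >= NUM_ALIAS_GUIDS)
        return -1;
    return vf_number;
}

/* Return the GUID the guest perceives at its index 0, or 0 if out of range. */
uint64_t guest_guid0(const struct phys_guid_table *t, int vf_number)
{
    int idx;

    assert(t != NULL);
    idx = guest_to_phys_guid_index(vf_number, 0);
    return idx < 0 ? 0 : t->guid[idx];
}
```

With this model, a guest running on VF 3 that reads "its" GUID at index 0 is actually handed entry 3 of the physical table.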
      
      struct mlx4_ib_demux_ctx contains the information per port needed
      to manage paravirtualization:
      
      1. All multicast paravirt info
      2. All tunnel-qp paravirt info for the port.
      3. GUID-table and GUID-prefix for the port
      4. work queues.
      
      struct mlx4_ib_demux_pv_ctx contains all the info for managing the
      paravirtualized QPs for one slave/port.
      
      struct mlx4_ib_demux_pv_qp contains the info needed to run an individual
      QP (either tunnel QP or real SQP).
      
      Note:  We made use of the 2 most significant bits in enum
      mlx4_ib_qp_flags (based on enum ib_qp_create_flags in ib_verbs.h).
      We need these bits in the low-level driver for internal purposes.
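
As a rough illustration of the note above, the sketch below reserves the two most significant bits of a 32-bit flag word for driver-internal use and keeps caller-supplied flags out of them. All names here are hypothetical stand-ins and are not claimed to match the enum values in the actual patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flags visible to the generic layer occupy the low bits (names hypothetical). */
enum generic_qp_create_flags {
    GENERIC_QP_FLAG_A = 1u << 0,
    GENERIC_QP_FLAG_B = 1u << 1,
};

/* Driver-internal flags take the two most significant bits of a 32-bit word. */
#define DRV_QP_TUNNEL        (UINT32_C(1) << 30)
#define DRV_QP_SQP           (UINT32_C(1) << 31)
#define DRV_QP_INTERNAL_MASK (DRV_QP_TUNNEL | DRV_QP_SQP)

/* True if a caller-supplied flag word strays into the driver-internal bits. */
bool uses_internal_bits(uint32_t create_flags)
{
    return (create_flags & DRV_QP_INTERNAL_MASK) != 0;
}

/* Combine caller flags with internal ones; caller flags must be clean. */
uint32_t make_qp_flags(uint32_t create_flags, uint32_t internal)
{
    assert(!uses_internal_bits(create_flags));
    assert((internal & ~DRV_QP_INTERNAL_MASK) == 0);
    return create_flags | internal;
}
```

The internal bits are written as unsigned macros rather than enum constants in this sketch because `1u << 31` does not fit in an `int`, which is the type of an enumeration constant in C before C23.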
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
  2. 17 Aug, 2012 1 commit
  3. 11 Aug, 2012 1 commit
    • J
      IB/mlx4: Fix possible deadlock on sm_lock spinlock · df7fba66
      Committed by Jack Morgenstein
      The sm_lock spinlock is taken in the process context by
      mlx4_ib_modify_device, and in the interrupt context by update_sm_ah,
      so we need to take that spinlock with irqsave, and release it with
      irqrestore.
      
      Lockdep reports this as follows:
      
          [ INFO: inconsistent lock state ]
          3.5.0+ #20 Not tainted
          inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
          swapper/0/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
          (&(&ibdev->sm_lock)->rlock){?.+...}, at: [<ffffffffa028af1d>] update_sm_ah+0xad/0x100 [mlx4_ib]
          {HARDIRQ-ON-W} state was registered at:
            [<ffffffff810b84a0>] mark_irqflags+0x120/0x190
            [<ffffffff810b9ce7>] __lock_acquire+0x307/0x4c0
            [<ffffffff810b9f51>] lock_acquire+0xb1/0x150
            [<ffffffff815523b1>] _raw_spin_lock+0x41/0x50
            [<ffffffffa028d563>] mlx4_ib_modify_device+0x63/0x240 [mlx4_ib]
            [<ffffffffa026d1fc>] ib_modify_device+0x1c/0x20 [ib_core]
            [<ffffffffa026c353>] set_node_desc+0x83/0xc0 [ib_core]
            [<ffffffff8136a150>] dev_attr_store+0x20/0x30
            [<ffffffff81201fd6>] sysfs_write_file+0xe6/0x170
            [<ffffffff8118da38>] vfs_write+0xc8/0x190
            [<ffffffff8118dc01>] sys_write+0x51/0x90
            [<ffffffff8155b869>] system_call_fastpath+0x16/0x1b
      
          ...
          *** DEADLOCK ***
      
          1 lock held by swapper/0/0:
      
          stack backtrace:
          Pid: 0, comm: swapper/0 Not tainted 3.5.0+ #20
          Call Trace:
          <IRQ>  [<ffffffff810b7bea>] print_usage_bug+0x18a/0x190
          [<ffffffff810b7370>] ? print_irq_inversion_bug+0x210/0x210
          [<ffffffff810b7fb2>] mark_lock_irq+0xf2/0x280
          [<ffffffff810b8290>] mark_lock+0x150/0x240
          [<ffffffff810b84ef>] mark_irqflags+0x16f/0x190
          [<ffffffff810b9ce7>] __lock_acquire+0x307/0x4c0
          [<ffffffffa028af1d>] ? update_sm_ah+0xad/0x100 [mlx4_ib]
          [<ffffffff810b9f51>] lock_acquire+0xb1/0x150
          [<ffffffffa028af1d>] ? update_sm_ah+0xad/0x100 [mlx4_ib]
          [<ffffffff815523b1>] _raw_spin_lock+0x41/0x50
          [<ffffffffa028af1d>] ? update_sm_ah+0xad/0x100 [mlx4_ib]
          [<ffffffffa026b2fa>] ? ib_create_ah+0x1a/0x40 [ib_core]
          [<ffffffffa028af1d>] update_sm_ah+0xad/0x100 [mlx4_ib]
          [<ffffffff810c27c3>] ? is_module_address+0x23/0x30
          [<ffffffffa028b05b>] handle_port_mgmt_change_event+0xeb/0x150 [mlx4_ib]
          [<ffffffffa028c177>] mlx4_ib_event+0x117/0x160 [mlx4_ib]
          [<ffffffff81552501>] ? _raw_spin_lock_irqsave+0x61/0x70
          [<ffffffffa022718c>] mlx4_dispatch_event+0x6c/0x90 [mlx4_core]
          [<ffffffffa0221b40>] mlx4_eq_int+0x500/0x950 [mlx4_core]
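
The rule behind the report above can be modeled in userspace. The following is a toy model only, not lockdep itself and not kernel code: it tracks whether a lock has ever been acquired with hard IRQs enabled and whether it has ever been acquired from hard-IRQ context, and flags the combination. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Usage states a lock has been seen in (toy version of lockdep's bookkeeping). */
struct lock_usage {
    bool held_with_hardirqs_on;  /* acquired while hard IRQs were enabled */
    bool held_in_hardirq;        /* acquired from hard-IRQ context */
};

/*
 * Record one acquisition.  Returns false once the lock has been seen both
 * with hard IRQs enabled and inside a hard-IRQ handler: an interrupt that
 * arrives while the process-context holder owns the lock would then spin
 * on it forever on the same CPU.
 */
bool record_acquire(struct lock_usage *u, bool in_hardirq, bool hardirqs_enabled)
{
    assert(u != NULL);
    if (in_hardirq)
        u->held_in_hardirq = true;
    else if (hardirqs_enabled)
        u->held_with_hardirqs_on = true;
    return !(u->held_with_hardirqs_on && u->held_in_hardirq);
}
```

In this model, a plain spin_lock in process context corresponds to `hardirqs_enabled == true`, while the irqsave variant corresponds to `hardirqs_enabled == false`, so the inconsistent state is never recorded.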
      
      Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
      Tested-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
  4. 19 Jul, 2012 1 commit
  5. 12 Jul, 2012 2 commits
  6. 11 Jul, 2012 1 commit
    • J
      mlx4: Use port management change event instead of smp_snoop · 00f5ce99
      Committed by Jack Morgenstein
      The port management change event can replace smp_snoop.  If the
      capability bit for this event is set in dev-caps, the event is used
      (by the driver setting the PORT_MNG_CHG_EVENT bit in the async event
      mask in the MAP_EQ fw command).  In this case, when the driver passes
      incoming SMP PORT_INFO SET MADs to the FW, the FW generates port
      management change events to signal any changes to the driver.
      
      If the FW generates these events, smp_snoop shouldn't be invoked in
      ib_process_mad(), or duplicate events will occur (once from the
      FW-generated event, and once from smp_snoop).
      
      In the case where the FW does not generate port management change
      events, smp_snoop needs to be invoked to create these events.  The flow
      in smp_snoop has been modified to make use of the same procedures as
      in the FW-generated-event case to generate the port management
      events (LID change, Client-rereg, Pkey change, and/or GID change).
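
The either/or choice described above can be sketched as a small decision function. This is a simplified model under stated assumptions, not the driver's implementation: `struct dev_caps`, its field, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of the device capabilities. */
struct dev_caps {
    bool port_mng_chg_event;  /* FW can raise port management change events */
};

enum port_event_source {
    EVENT_SOURCE_FIRMWARE,   /* events come from the FW-generated EQE */
    EVENT_SOURCE_SMP_SNOOP,  /* driver derives them by snooping SMPs */
};

/* Pick exactly one source so that no change is reported twice. */
enum port_event_source choose_event_source(const struct dev_caps *caps)
{
    assert(caps != NULL);
    return caps->port_mng_chg_event ? EVENT_SOURCE_FIRMWARE
                                    : EVENT_SOURCE_SMP_SNOOP;
}

/* Should the MAD-processing path call the snoop routine for this device? */
bool should_snoop(const struct dev_caps *caps)
{
    return choose_event_source(caps) == EVENT_SOURCE_SMP_SNOOP;
}
```

Funneling both sources into the same event-generation procedures, as the commit describes, means only this one decision differs between the two cases.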
      
      Port management change event handling required changing the
      mlx4_ib_event and mlx4_dispatch_event prototypes; the "param" argument
      (last argument) had to be changed to unsigned long in order to
      accommodate passing the EQE pointer.
      
      We also needed to move the definition of struct mlx4_eqe from
      net/mlx4.h to device.h, to make it available to the IB driver
      for handling port management change events.
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
  7. 09 Jul, 2012 1 commit
  8. 08 Jul, 2012 1 commit
    • H
      {NET, IB}/mlx4: Add device managed flow steering firmware API · 0ff1fb65
      Committed by Hadar Hen Zion
      The driver is modified to support three operation modes.
      
      If supported by firmware, use the device managed flow steering
      API, which we call device managed steering mode.  Else, if
      the firmware supports the B0 steering mode, use it; finally,
      if neither is supported, use the A0 steering mode.
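
The fallback order above amounts to a simple priority check. The sketch below shows that ordering only; the capability struct, enum, and function name are hypothetical and do not reproduce the driver's identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum steering_mode {
    STEER_A0,              /* last-resort mode */
    STEER_B0,              /* used when the firmware supports B0 steering */
    STEER_DEVICE_MANAGED,  /* preferred: device managed flow steering API */
};

/* Hypothetical view of what the firmware reports. */
struct fw_caps {
    bool device_managed_fs;
    bool b0_steering;
};

/* Pick the most capable mode the firmware supports, in priority order. */
enum steering_mode choose_steering_mode(const struct fw_caps *caps)
{
    assert(caps != NULL);
    if (caps->device_managed_fs)
        return STEER_DEVICE_MANAGED;
    if (caps->b0_steering)
        return STEER_B0;
    return STEER_A0;
}
```

Note that device managed mode wins even when B0 is also available, which matches the "if supported ... else ... finally" wording of the commit.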
      
      When the steering mode is device managed, the code is modified
      such that L2 based rules set by the mlx4_en driver for Ethernet
      unicast and multicast, and the IB stack multicast attach calls
      done through the mlx4_ib driver are all routed to use the device
      managed API.
      
      When attaching a rule using the device managed flow steering API,
      the firmware returns a 64-bit registration ID, which must be
      provided during detach.
      
      Currently the firmware is always programmed during HCA initialization
      to use standard L2 hashing.  Future work should allow configuring
      the flow-steering hash function through common, non-proprietary
      means.
      Signed-off-by: Hadar Hen Zion <hadarh@mellanox.co.il>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. 07 Jun, 2012 1 commit
  10. 04 Jun, 2012 1 commit
  11. 19 May, 2012 2 commits
  12. 09 May, 2012 3 commits
  13. 25 Apr, 2012 1 commit
  14. 03 Apr, 2012 1 commit
  15. 13 Mar, 2012 2 commits
  16. 09 Mar, 2012 1 commit
    • O
      IB: Change CQE "csum_ok" field to a bit flag · d927d505
      Committed by Or Gerlitz
      Use a bit in wc_flags rather than a whole integer to hold the
      "checksum OK" flag.  By itself, this change doesn't reduce the size of
      struct ib_wc on 64-bit machines; it stays at 56 bytes because of
      padding.  However, it will allow adding more fields in the future
      without enlarging the struct.  Also, it will let us have a unified
      approach with future libibverbs checksum offload reporting, because a
      bit flag doesn't break the library ABI.
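
To illustrate the idea of folding a boolean field into an existing flags word, here is a minimal sketch. The struct, the bit positions, and the names are hypothetical; they are chosen for the example and are not claimed to match struct ib_wc or its flag values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits for a completion's flags word. */
#define WC_FLAG_GRH        (1u << 0)
#define WC_FLAG_WITH_IMM   (1u << 1)
#define WC_FLAG_IP_CSUM_OK (1u << 2)  /* replaces a separate "csum_ok" integer */

/* Hypothetical, heavily trimmed work completion. */
struct completion {
    uint32_t byte_len;
    uint32_t flags;
};

/* Record the checksum result without touching the other flag bits. */
void set_csum_ok(struct completion *wc, bool ok)
{
    assert(wc != NULL);
    if (ok)
        wc->flags |= WC_FLAG_IP_CSUM_OK;
    else
        wc->flags &= ~WC_FLAG_IP_CSUM_OK;
}

/* Read the checksum result back as a boolean. */
bool csum_ok(const struct completion *wc)
{
    assert(wc != NULL);
    return (wc->flags & WC_FLAG_IP_CSUM_OK) != 0;
}
```

A consumer that previously tested an integer field would test the bit instead, and further status bits can later be added to the same word without changing the struct layout.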
      
      This patch was suggested during conversation with Liran Liss
      <liranl@mellanox.com>.
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Reviewed-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
  17. 07 Mar, 2012 1 commit
    • O
      mlx4_core: Get rid of redundant ext_port_cap flags · 8154c07f
      Committed by Or Gerlitz
      While doing the work for commit a6f7feae ("IB/mlx4: pass SMP
      vendor-specific attribute MADs to firmware") we realized that the
      firmware would respond to all sorts of vendor-specific MADs.
      Therefore commit 97285b78 ("mlx4_core: Add extended port
      capabilities support") adds redundant code into the driver, since
      there's no real reason to maintain the extended capabilities of the
      port, as they can be queried on demand (e.g., the FDR10 capability).
      
      This patch reverts commit 97285b78 and removes the check for
      extended caps from the mlx4_ib driver port query flow.
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
  18. 06 Mar, 2012 1 commit
  19. 26 Feb, 2012 2 commits
  20. 31 Jan, 2012 1 commit
    • J
      IB/mlx4: pass SMP vendor-specific attribute MADs to firmware · a6f7feae
      Committed by Jack Morgenstein
      In the current code, vendor-specific MADs (e.g., with the FDR-10
      attribute) are silently dropped by the driver, resulting in timeouts
      at the sending side and inability to query/configure the relevant
      feature.  However, the ConnectX firmware is able to handle such MADs.
      For unsupported attributes, the firmware returns a GET_RESPONSE MAD
      containing an error status.
      
      For example, for an FDR-10 node with LID 11:
      
          # ibstat mlx4_0 1
      
          CA: 'mlx4_0'
          Port 1:
          State: Active
          Physical state: LinkUp
          Rate: 40 (FDR10)
          Base lid: 11
          LMC: 0
          SM lid: 24
          Capability mask: 0x02514868
          Port GUID: 0x0002c903002e65d1
          Link layer: InfiniBand
      
      The Extended Port Info (EPI) vendor MAD query times out before the patch:
      
          # smpquery MEPI 11 -d
      
          ibwarn: [4196] smp_query_via: attr 0xff90 mod 0x0 route Lid 11
          ibwarn: [4196] _do_madrpc: retry 1 (timeout 1000 ms)
          ibwarn: [4196] _do_madrpc: retry 2 (timeout 1000 ms)
          ibwarn: [4196] _do_madrpc: timeout after 3 retries, 3000 ms
          ibwarn: [4196] mad_rpc: _do_madrpc failed; dport (Lid 11)
          smpquery: iberror: [pid 4196] main: failed: operation EPI: ext port info query failed
      
      EPI query works OK with the patch:
      
          # smpquery MEPI 11 -d
      
          ibwarn: [6548] smp_query_via: attr 0xff90 mod 0x0 route Lid 11
          ibwarn: [6548] mad_rpc: data offs 64 sz 64
          mad data
          0000 0000 0000 0001 0000 0001 0000 0001
          0000 0000 0000 0000 0000 0000 0000 0000
          0000 0000 0000 0000 0000 0000 0000 0000
          0000 0000 0000 0000 0000 0000 0000 0000
          # Ext Port info: Lid 11 port 0
          StateChangeEnable:...............0x00
          LinkSpeedSupported:..............0x01
          LinkSpeedEnabled:................0x01
          LinkSpeedActive:.................0x01
      Signed-off-by: Jack Morgenstein <jackm@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Acked-by: Ira Weiny <weiny2@llnl.gov>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
  21. 04 Jan, 2012 1 commit
    • O
      IB/mlx4: Fix SL to 802.1Q priority-bits mapping for IBoE · 9106c410
      Committed by Or Gerlitz
      For IBoE, SLs 0-7 are mapped to Ethernet 802.1Q user priority bits
      (pbits), which are part of the VLAN tag; SLs 8-15 are reserved.
      
      Under Ethernet, the ConnectX firmware treats (decodes/encodes) the
      four-bit SL field in various constructs such as QPC / UD WQE / CQE as
      PPP0 and not as 0PPP.  This correlates well to the fact that within
      the VLAN tag the pbits are located in bits 15-13 and not 14-12.
      
      The current code wasn't consistent in this area: the
      encoding was correct for the IBoE QPC.path.schedule_queue field,
      but was wrong for IBoE CQEs and when the MLX header was built.
      
      These inconsistencies resulted in wrong SL <--> wire 802.1Q pbits
      mapping, which is fixed by using SL <--> PPP0 throughout.
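
The PPP0 versus 0PPP distinction is easy to get wrong, so a small worked sketch may help. It assumes only what the text states (a four-bit field holding the three priority bits in its upper positions) plus the standard 802.1Q layout in which the priority bits occupy bits 15-13 of the 16-bit tag control field. The function names are hypothetical and are not the driver's helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Encode an SL (0-7) into the four-bit field as PPP0. */
uint8_t sl_to_field(uint8_t sl)
{
    assert(sl <= 7);  /* SLs 8-15 are reserved for IBoE */
    return (uint8_t)(sl << 1);
}

/* Decode a four-bit PPP0 field back into an SL. */
uint8_t field_to_sl(uint8_t field)
{
    return (uint8_t)((field >> 1) & 0x7);
}

/* The mistaken 0PPP reading, shown only for contrast. */
uint8_t field_to_sl_as_0ppp(uint8_t field)
{
    return (uint8_t)(field & 0x7);
}

/* Extract the priority bits from a 16-bit VLAN tag control field. */
uint8_t tci_to_pbits(uint16_t tci)
{
    return (uint8_t)(tci >> 13);
}
```

For SL 5 the field is binary 1010. Reading it as PPP0 gives 5 back, while reading it as 0PPP gives 2, which is the kind of mismatch the commit describes. The top nibble of a VLAN tag control field has the same shape, so `field_to_sl(tci >> 12)` and `tci_to_pbits(tci)` agree.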
      Signed-off-by: Oren Duer <oren@mellanox.co.il>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
  22. 14 Dec, 2011 3 commits
  23. 07 Dec, 2011 1 commit
  24. 01 Nov, 2011 2 commits
  25. 29 Oct, 2011 1 commit
  26. 14 Oct, 2011 5 commits