1. 16 Apr, 2015 3 commits
    • IB/mlx4: Request alias GUID on demand · ee59fa0d
      Committed by Yishai Hadas
      Request GIDs from the SM on demand, i.e., when a VF actually needs them,
      and release them when the GIDs are no longer in use.
      
      In cloud environments, this is useful for GID migrations, in which a
      GID is assigned to a VF on the destination HCA, while the VF on the
      source HCA is shut down (but the GID was not administratively released).
      Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/mlx4: Change init flow to request alias GUIDs for active VFs · f5479601
      Committed by Yishai Hadas
      Change the init flow to request GUIDs only for active VFs. This is done for
      both SM and HOST modes, so there is no longer any need to maintain the
      ownership record type.

      In SM mode, the initial value is 0, which asks the SM to assign a GUID;
      in HOST mode, the initial value is the HOST-generated GUID.

      This enables an out-of-the-box experience for both probed and attached VFs.
      Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/mlx4: Alias GUID adding persistency support · 99ee4df6
      Committed by Yishai Hadas
      If the SM rejects an alias GUID request the PF driver keeps trying to acquire
      the specified GUID indefinitely, utilizing an exponential backoff scheme.
      
      Retrying is managed per GUID entry. Each entry that wasn't applied holds its
      next retry information. Retry requests to the SM consist of records of 8
      consecutive GUIDs. Each record that contains GUIDs requiring retries holds its
      next time-to-run based on the retry information of all its GUID entries. The
      record having the lowest retry time will run first when that retry time
      arrives.
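
The scheduling described above can be modeled with a short Python sketch. It is an illustration, not the driver's code: the function names, the base delay, and the cap are assumptions, and so is the rule that a record's time-to-run is the earliest retry time among its pending entries.

```python
GUIDS_PER_RECORD = 8  # from the commit text: a record holds 8 consecutive GUIDs

def next_delay(attempts, base=1.0, cap=64.0):
    """Exponential backoff delay; base and cap are assumed values."""
    return min(base * (2 ** attempts), cap)

def record_run_time(entry_retry_times):
    """Earliest retry time among entries that still need a retry.

    entry_retry_times holds one timestamp per GUID entry, or None when the
    entry needs no retry. Returns None if nothing in the record is pending.
    """
    pending = [t for t in entry_retry_times if t is not None]
    return min(pending) if pending else None

def pick_next_record(records):
    """Return (run_time, record_index) for the record that runs first, or None."""
    candidates = []
    for index, times in enumerate(records):
        run_at = record_run_time(times)
        if run_at is not None:
            candidates.append((run_at, index))
    return min(candidates) if candidates else None
```

With three records where only the second and third hold pending entries, `pick_next_record` returns the record whose pending entry is due soonest.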
      
      Since the method (SET or DELETE) as sent to the SM applies to all the GUIDs in
      the record, we must handle SET requests and DELETE requests in separate SM
      messages (one for SETs and the other for DELETEs).
      
      To avoid race conditions where a GUID entry request (set or delete) was
      modified after the SM request was sent, we save the method and the requested
      indices as part of the callback's context. Thus, only the requested indices
      are evaluated when the response is received.

      When a GUID entry is approved, we turn off its retry-required bit. This
      prevents redundant SM retries from occurring on that record.
      
      The port down event should be sent only if the port was previously up. Likewise,
      the port up event should be sent only if the port was previously down.
      
      Synchronization was added around the flows that change entries and record state
      to prevent race conditions.
      Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
  2. 10 Feb, 2015 1 commit
    • IB/mlx4: Reset flow support for IB kernel ULPs · 35f05dab
      Committed by Yishai Hadas
      The driver exposes interfaces that directly relate to HW state. Upon fatal
      error, consumers of these interfaces (ULPs) that rely on completion of
      all their posted work requests could hang, thereby introducing dependencies
      in shutdown order.  To prevent this from happening, we manage the
      relevant resources (CQs, QPs) that are used by the device. Upon a fatal error,
      we now generate simulated completions for outstanding WQEs that were not
      completed at the time the HW was reset.
      
      This includes invoking the completion event handler for all involved CQs so that
      the ULPs will poll those CQs. When polled, we return simulated CQEs with the
      IB_WC_WR_FLUSH_ERR status, enabling ULPs to clean up their resources and
      not wait forever for completions upon receiving remove_one.
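
As a rough illustration of the behavior described above, the following Python model flushes outstanding work requests once the device enters an error state. The class and method names are hypothetical and only the status name `IB_WC_WR_FLUSH_ERR` comes from the commit text; the real driver works on hardware queues, which this sketch does not attempt to represent.

```python
from collections import deque

IB_WC_WR_FLUSH_ERR = "IB_WC_WR_FLUSH_ERR"  # status name taken from the commit text

class SimCQ:
    """Toy completion queue that simulates flush completions after a fatal error."""

    def __init__(self):
        self.outstanding = deque()    # IDs of posted, not yet completed work requests
        self.device_in_error = False
        self.handler_calls = 0        # stands in for the completion event handler

    def post(self, wr_id):
        # Data-path check: nothing new may be posted in the error state.
        if self.device_in_error:
            raise RuntimeError("device in error state")
        self.outstanding.append(wr_id)

    def fatal_error(self):
        self.device_in_error = True
        self.handler_calls += 1       # nudge the consumer to poll this CQ

    def poll(self, max_entries):
        if not self.device_in_error:
            return []                 # normal HW completions are out of scope here
        completions = []
        while self.outstanding and len(completions) < max_entries:
            completions.append((self.outstanding.popleft(), IB_WC_WR_FLUSH_ERR))
        return completions
```

A consumer that keeps polling until the queue is empty sees one flush completion per outstanding request and can then release its resources.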
      
      The above change requires an extra check in the data path to make sure that when
      the device is in an error state, the simulated CQEs are returned and no further
      WQEs are posted.
      Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 05 Feb, 2015 2 commits
  4. 23 Sep, 2014 1 commit
    • IB/mlx4: Avoid accessing netdevice when building RoCE qp1 header · 3e0629cb
      Committed by Jack Morgenstein
      The source MAC is needed in RoCE when building the QP1 header.
      
      Currently, this is obtained from the source net device. However, the net
      device may not yet exist, or can be destroyed in parallel to this QP1 send
      operation (e.g., through the VPI port change flow), so accessing it may cause
      a kernel crash.
      
      To fix this, we maintain a source MAC cache per port for the net device in
      struct mlx4_ib_roce.  This cached MAC is initialized to be the default MAC
      address obtained during HCA initialization via QUERY_PORT. This cached MAC
      is updated via the netdev event notifier handler.
      
      Since the cached MAC is held in an atomic64 object, we do not need locking
      when accessing it.
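
The reason a single 64-bit object suffices is that a MAC address is only 48 bits wide, so the whole address can be read or written as one value. A small Python sketch of the packing (the helper names are hypothetical and this is not the driver's code):

```python
def mac_to_u64(mac):
    """Pack a colon-separated MAC string into an integer that fits in 64 bits."""
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("a MAC address has exactly 6 octets")
    return int.from_bytes(octets, "big")

def u64_to_mac(value):
    """Unpack the low 48 bits of an integer back into a MAC string."""
    return ":".join(f"{octet:02x}" for octet in value.to_bytes(6, "big"))
```

Because the full address lives in one word, a reader never observes a half-updated MAC, which is the property the commit relies on.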
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
  5. 02 Aug, 2014 1 commit
  6. 03 Jun, 2014 1 commit
  7. 17 May, 2014 1 commit
    • IB/mlx4: Invoke UPDATE_QP for proxy QP1 on MAC changes · 9433c188
      Committed by Matan Barak
      When we receive a netdev event indicating a netdev change and/or
      a netdev address change, we must change the MAC index used by the
      proxy QP1 (in the QP context), otherwise RoCE CM packets sent by the
      VF will not carry the same source MAC address as the non-CM packets.
      
      We use the UPDATE_QP command to perform this change.
      
      To avoid modifying a QP context based on a netdev event
      while the driver attempts to destroy this QP (e.g., when either the mlx4_ib
      or ib_mad module is unloaded), we use mutex locking in both flows.

      Since the relevant mlx4 proxy GSI QP is created indirectly by the
      mad module when it creates its GSI QP, mlx4 did not need to
      keep track of that QP prior to this change.

      Now that QP modifications to this QP are needed from within the
      driver, we add a reference to it.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  8. 13 Mar, 2014 2 commits
    • mlx4: Implement IP based gids support for RoCE/SRIOV · 5ea8bbfc
      Committed by Jack Morgenstein
      Since there is no connection between the MAC/VLAN and the GID
      when using IP-based addressing, the proxy QP1 (running on the
      slave) must pass the source-mac, destination-mac, and vlan_id
      information separately from the GID. Additionally, the Host
      must pass the remote source-mac and vlan_id back to the slave.
      
      This is achieved as follows:
      Outgoing MADs:
          1. Source MAC: obtained from the CQ completion structure
             (struct ib_wc, smac field).
          2. Destination MAC: obtained from the tunnel header
          3. vlan_id: obtained from the tunnel header.
      Incoming MADs:
          1. The source (i.e., remote) MAC and vlan_id are passed in
             the tunnel header to the proxy QP1.
      
      VST mode support:
          For outgoing MADs, the vlan_id obtained from the header is
             discarded, and the vlan_id specified by the Hypervisor is used
             instead.
          For incoming MADs, the incoming vlan_id (in the wc) is discarded, and the
             "invalid" vlan (0xffff) is substituted when forwarding to the slave.
      Signed-off-by: Moni Shoua <monis@mellanox.co.il>
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlx4: Add ref counting to port MAC table for RoCE · 2f5bb473
      Committed by Jack Morgenstein
      The IB side of RoCE requires the MAC table index of the
      MAC address used by its QPs.
      
      To obtain the real MAC index, the IB side registers the
      MAC (increasing its ref count, and also returning the
      real MAC index) during the modify-qp sequence.
      
      This protects against the ETH side deleting or modifying
      that MAC table entry while the QP is active.
      
      Note that until the modify-qp command returns success,
      the MAC and VLAN information only has "candidate" status.
      If the modify-qp succeeds, the "candidate" info is promoted
      to the operational MAC/VLAN info for the qp. If the modify fails,
      the candidate MAC/VLAN is unregistered, and the old qp info
      is preserved.
      
      The patch is a bit complex, because there are multiple qp
      transitions where the primary-path information may be
      modified:  INIT-to-RTR, and SQD-to-SQD.
      
      Similarly for the alternate path information.
      
      Therefore the code must handle cases where path information
      has already been entered into the QP context by previous
      qp transitions.
      
      For the MAC address, the success logic is as follows:
      1. If there was no previous MAC, simply move the candidate
         MAC information to the operational information, and reset
         the candidate MAC info.
      2. If there was a previous MAC, unregister it.  Then move
         the MAC information from candidate to operational, and
         reset the candidate info (as in 1. above).
      
      The MAC address failure logic is the same for all cases:
       - Unregister the candidate MAC, and reset the candidate MAC info.
      
      For VLAN registration, the logic is similar.
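
The candidate/operational handling of the MAC can be sketched as a small state machine. The Python below is a simplified model under stated assumptions: `MacTable`, `QpMacState`, and their methods are hypothetical names, and the MAC table is reduced to a reference-count dictionary.

```python
class MacTable:
    """Toy port MAC table that only tracks reference counts."""

    def __init__(self):
        self.refs = {}

    def register(self, mac):
        self.refs[mac] = self.refs.get(mac, 0) + 1

    def unregister(self, mac):
        self.refs[mac] -= 1
        if self.refs[mac] == 0:
            del self.refs[mac]

class QpMacState:
    """Tracks the operational MAC of one QP and a candidate during modify-qp."""

    def __init__(self, table):
        self.table = table
        self.operational = None
        self.candidate = None

    def begin_modify(self, mac):
        # Registering takes a reference before the modify-qp command is issued.
        self.table.register(mac)
        self.candidate = mac

    def finish_modify(self, success):
        if success:
            # Success: drop the previous MAC (if any), then promote the candidate.
            if self.operational is not None:
                self.table.unregister(self.operational)
            self.operational = self.candidate
        else:
            # Failure: release the candidate and keep the old QP info.
            self.table.unregister(self.candidate)
        self.candidate = None
```

In both outcomes the candidate slot is reset, so a later transition always starts from a clean state.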
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. 19 Jan, 2014 2 commits
  10. 15 Jan, 2014 2 commits
  11. 29 Aug, 2013 1 commit
  12. 26 Feb, 2013 3 commits
  13. 27 Nov, 2012 1 commit
    • mlx4: 64-byte CQE/EQE support · 08ff3235
      Committed by Or Gerlitz
      ConnectX-3 devices can use either 64- or 32-byte completion queue
      entries (CQEs) and event queue entries (EQEs).  Using 64-byte
      EQEs/CQEs performs better because each entry is aligned to a complete
      cacheline.  This patch queries the HCA's capabilities, and if it
      supports 64-byte CQEs and EQEs, the driver will configure the HW to
      work in 64-byte mode.
      
      The 32-byte vs 64-byte mode is global per HCA and not per CQ or EQ.
      
      Since this mode is global, userspace (libmlx4) must be updated to work
      with the configured CQE size, and guests using SR-IOV virtual
      functions need to know both EQE and CQE size.
      
      In case one of the 64-byte CQE/EQE capabilities is activated, the
      patch makes sure that older guest drivers that use the QUERY_DEV_FUNC
      command (e.g., as done in mlx4_core of Linux 3.3..3.6) will notice that
      they need an update to be able to work with the PPF. This is done by
      changing the returned pf_context_behaviour not to be zero any more. In
      case none of these capabilities is activated that value remains zero
      and older guest drivers can run OK.
      
      The SR-IOV related flow is as follows:
      
      1. the PPF does the detection of the new capabilities using
         QUERY_DEV_CAP command.
      
      2. the PPF activates the new capabilities using INIT_HCA.
      
      3. the VF detects if the PPF activated the capabilities using
         QUERY_HCA, and if this is the case activates them for itself too.
      
      Note that the VF detects that it must be aware of the new PF behaviour
      using QUERY_FUNC_CAP.  Steps 1 and 2 apply also for native mode.
      
      User space notification is done through a new field introduced in
      struct mlx4_ib_ucontext which holds device capabilities for which user
      space must take action. This changes the binary interface, so the ABI
      towards libmlx4 exposed through uverbs is bumped from 3 to 4, but only
      when **needed**, i.e., only when the driver does use 64-byte CQEs or
      future device capabilities that user space must be in sync with. This
      practice allows unmodified libmlx4 to keep working on older devices (e.g.,
      A0, B0) which don't support 64-byte CQEs.
      
      In order to keep existing systems functional when they update to a
      newer kernel that contains these changes in VF and userspace ABI, a
      module parameter enable_64b_cqe_eqe must be set to enable 64-byte
      mode; the default is currently false.
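
The opt-in and ABI rules above boil down to two small decisions, sketched here in Python. The function names are hypothetical, and the sketch collapses the separate CQE and EQE capabilities into one flag for brevity; only the parameter name `enable_64b_cqe_eqe` and the ABI values 3 and 4 come from the commit text.

```python
def select_entry_size(hca_supports_64b, enable_64b_cqe_eqe=False):
    """Return the entry size in bytes: 64 only if supported and opted in."""
    return 64 if (hca_supports_64b and enable_64b_cqe_eqe) else 32

def uverbs_abi_version(entry_size):
    """ABI is 4 only when 64-byte entries are in use, otherwise it stays at 3."""
    return 4 if entry_size == 64 else 3
```

With the default parameter value, even a capable HCA stays in 32-byte mode and keeps ABI 3, so an unmodified libmlx4 continues to work.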
      Signed-off-by: Eli Cohen <eli@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
  14. 01 Oct, 2012 8 commits
    • mlx4: Paravirtualize Node Guids for slaves · afa8fd1d
      Committed by Jack Morgenstein
      This is necessary in order to support > 1 VF/PF in a VM for software
      that uses the node guid as a discriminator, such as librdmacm.
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • IB/mlx4: Add iov directory in sysfs under the ib device · c1e7e466
      Committed by Jack Morgenstein
      This directory is added only for the master -- slaves do not have it.
      
      The sysfs iov directory is used to manage and examine the port P_Key
      and guid paravirtualization.
      
      Under iov/ports, the administrator may examine the gid and P_Key tables
      as they are present in the device (and as are seen in the "network
      view" presented to the SM).
      
      Under the iov/<pci slot number> directories, the admin may map the
      index numbers in the physical tables (as under iov/ports) to the
      paravirtualized index numbers that guests see.
      
      For example, if the administrator, for port 1 on guest 2 maps physical
      pkey index 10 to virtual index 1, then that guest, whenever it uses
      its pkey index 1, will actually be using the real pkey index 10.
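
The mapping in this example is a per-port, per-guest lookup from virtual to physical index. A minimal Python model, with a hypothetical class name and an in-memory dictionary instead of the sysfs files:

```python
class PkeyIndexMap:
    """Maps (port, guest, virtual index) to a physical P_Key table index."""

    def __init__(self):
        self._virt_to_phys = {}

    def map(self, port, guest, phys_idx, virt_idx):
        self._virt_to_phys[(port, guest, virt_idx)] = phys_idx

    def resolve(self, port, guest, virt_idx):
        """Return the physical index the guest is really using."""
        return self._virt_to_phys[(port, guest, virt_idx)]
```

Reproducing the example from the text, mapping physical index 10 to virtual index 1 for port 1 on guest 2 makes `resolve(1, 2, 1)` yield 10.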
      
      Based on patch from Erez Shitrit <erezsh@mellanox.com>
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • mlx4: Add alias_guid mechanism · a0c64a17
      Committed by Jack Morgenstein
      For IB ports, we paravirtualize the GUID at index 0 on slaves.  The
      GUID at index 0 seen by a slave is the actual GUID occupying the GUID
      table at the slave-id index.
      
      The driver, by default, requests at startup time that the subnet manager
      populate its entire GUID table with GUIDs. These GUIDs are then mapped
      (paravirtualized) to the slaves, and appear for each slave as its GUID
      at index 0.
      
      Until each slave has such a guid, its port status is DOWN.
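
Put as a lookup, a slave's view of GUID index 0 is the physical table entry at its own slave ID. The Python sketch below assumes that a zero entry means "no GUID assigned yet" and uses "ACTIVE" as a placeholder for any non-DOWN state; both are assumptions, as are the function names.

```python
def slave_guid_at_index0(physical_guid_table, slave_id):
    """GUID that a slave sees at index 0: the physical entry at its slave ID."""
    return physical_guid_table[slave_id]

def slave_port_state(physical_guid_table, slave_id):
    """Port stays DOWN until the slave's GUID entry is populated (assumed: 0 = unset)."""
    return "DOWN" if physical_guid_table[slave_id] == 0 else "ACTIVE"
```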
      
      The guid table is cached to support special QP paravirtualization, and
      event propagation to slaves on guid change (we test to see if the guid
      really changed before propagating an event to the slave).
      
      To support this caching, add capability to __mlx4_ib_query_gid() to
      obtain the network view (i.e., physical view) gid at index X, not just
      the host (paravirtualized) view.
      
      Based on a patch from Erez Shitrit <erezsh@mellanox.com>
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • IB/mlx4: Add CM paravirtualization · 3cf69cc8
      Committed by Amir Vadai
      In CM para-virtualization:
      
      1. Incoming requests are steered to the correct vHCA according to the
         embedded GID.
      2. Communication IDs on outgoing requests are replaced by a globally
         unique ID, generated by the PPF, since there is no synchronization
         of ID generation between guests (and so these IDs are not
         guaranteed to be globally unique).  The guest's comm ID is stored,
         and is returned to the response MAD when it arrives.
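
The ID substitution in point 2 can be modeled as a translation table kept by the PPF. This Python sketch uses hypothetical names and a simple counter for ID generation; it shows the idea only and is not how the driver stores its state.

```python
import itertools

class CmIdMap:
    """Replaces per-guest comm IDs with globally unique ones and maps them back."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._by_global = {}

    def outgoing(self, slave, local_comm_id):
        """Allocate a globally unique ID for a request leaving a guest."""
        global_id = next(self._next_id)
        self._by_global[global_id] = (slave, local_comm_id)
        return global_id

    def incoming_response(self, global_id):
        """Recover the destination slave and the guest's original comm ID."""
        return self._by_global[global_id]
```

Two guests that happen to pick the same local comm ID receive different global IDs, and each response is routed back with the guest's own ID restored.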
      Signed-off-by: Amir Vadai <amirv@mellanox.co.il>
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • IB/mlx4: Add multicast group (MCG) paravirtualization for SR-IOV · b9c5d6a6
      Committed by Oren Duer
      MCG paravirtualization support includes:
      - Creating multicast groups by VFs, and keeping accounting of them
      - Leaving multicast groups by VFs
      - Updating the SM only with real changes in the overall MCG status
      - Creation of MGID=0 groups (let SM choose MGID)
      
      Note that the MCG module maintains its own internal MCG object
      reference counts.  The reason for this is that the IB core is used to
      track only the multicast group joins generated by the PF it runs
      over.  The PF IB core layer is unaware of slaves, so it cannot be used
      to keep track of MCG joins they generate.
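
One way to picture "updating the SM only with real changes" is membership tracking per group, where only the first join and the last leave are forwarded. The Python model below is a sketch with hypothetical names; the real module tracks far more state (join state bits, timeouts, MGID=0 handling).

```python
class McgProxy:
    """Tracks which slaves joined each group and records what the SM would see."""

    def __init__(self):
        self.members = {}       # mgid -> set of slave IDs
        self.sm_messages = []   # messages that would be sent to the SM

    def join(self, mgid, slave):
        group = self.members.setdefault(mgid, set())
        if not group:
            # First member: the overall picture changes, so tell the SM.
            self.sm_messages.append(("JOIN", mgid))
        group.add(slave)

    def leave(self, mgid, slave):
        group = self.members.get(mgid)
        if group is None:
            return
        group.discard(slave)
        if not group:
            # Last member left: the overall picture changes again.
            del self.members[mgid]
            self.sm_messages.append(("LEAVE", mgid))
```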
      Signed-off-by: Oren Duer <oren@mellanox.co.il>
      Signed-off-by: Eli Cohen <eli@mellanox.com>
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • mlx4: MAD_IFC paravirtualization · 0a9a0188
      Committed by Jack Morgenstein
      The MAD_IFC firmware command fulfills two functions.
      
      First, it is used in the QP0/QP1 MAD-handling flow to obtain
      information from the FW (for answering queries), and for setting
      variables in the HCA (MAD SET packets).
      
      For this, MAD_IFC should provide the FW (physical) view of the data.
      This is the view that OpenSM needs.  We call this the "network view".
      
      In the second case, MAD_IFC is used by various verbs to obtain data
      regarding the local HCA (e.g., ib_query_device()).  We call this the
      "host view".
      
      This data needs to be paravirtualized.
      
      MAD_IFC therefore needs a wrapper function, and also needs another
      flag indicating whether it should provide the network view (when it is
      called by ib_process_mad in special-qp packet handling), or the host
      view (when it is called while implementing a verb).
      
      There are currently 2 flag parameters in mlx4_MAD_IFC already:
      ignore_bkey and ignore_mkey.  These two parameters are replaced by a
      single "mad_ifc_flags" parameter, with different bits set for each
      flag.  A third flag is added: "network-view/host-view".
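
Folding several boolean parameters into one flags argument can be illustrated with a Python `IntFlag`. The member names and bit positions below are assumptions chosen for the example and should not be read as the driver's actual definitions.

```python
from enum import IntFlag

class MadIfcFlags(IntFlag):
    """Illustrative bit flags; names and bit positions are assumed."""
    IGNORE_MKEY = 1 << 0
    IGNORE_BKEY = 1 << 1
    NET_VIEW = 1 << 2   # set: network (physical) view, clear: host view

def requested_view(flags):
    """Return which view of the data the caller asked for."""
    return "network" if flags & MadIfcFlags.NET_VIEW else "host"
```

A caller handling special-QP packets would pass the network-view bit, while a verb implementation would leave it clear to get the paravirtualized host view.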
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • IB/mlx4: Initialize SR-IOV IB support for slaves in master context · fc06573d
      Committed by Jack Morgenstein
      Allocate SR-IOV paravirtualization resources and MAD demuxing contexts
      on the master.
      
      This has two parts.  The first part is to initialize the structures to
      contain the contexts.  This is done at master startup time in
      mlx4_ib_init_sriov().
      
      The second part is to actually create the tunneling resources required
      on the master to support a slave.  This is performed when the master
      detects that a slave has started up (MLX4_DEV_EVENT_SLAVE_INIT event
      generated when a slave initializes its comm channel).
      
      For the master, there is no such startup event, so it creates its own
      tunneling resources when it starts up.  In addition, the master also
      creates the real special QPs.  The ib_core layer on the master causes
      creation of proxy special QPs, since the master is also
      paravirtualized at the ib_core layer.
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • IB/mlx4: SR-IOV IB context objects and proxy/tunnel SQP support · 1ffeb2eb
      Committed by Jack Morgenstein
      1. Introduce the basic SR-IOV paravirtualization context objects for
         multiplexing and demultiplexing MADs.
      2. Introduce support for the new proxy and tunnel QP types.
      
      This patch introduces the objects required by the master for managing
      QP paravirtualization for guests.
      
      struct mlx4_ib_sriov is created by the master only.
      It is a container for the following:
      
      1. All the info required by the PPF to multiplex and de-multiplex MADs
         (including those from the PF). (struct mlx4_ib_demux_ctx demux)
      2. All the info required to manage alias GUIDs (i.e., the GUID at
         index 0 that each guest perceives.  This is not the GUID
         which is actually at index 0, but rather the GUID which is at
         index[<VF number>] in the physical table).
      3. structures which are used to manage CM paravirtualization
      4. structures for managing the real special QPs when running in SR-IOV
         mode.  The real SQPs are controlled by the PPF in this case.  All
         SQPs created and controlled by the ib core layer are proxy SQP.
      
      struct mlx4_ib_demux_ctx contains the information per port needed
      to manage paravirtualization:
      
      1. All multicast paravirt info
      2. All tunnel-qp paravirt info for the port.
      3. GUID-table and GUID-prefix for the port
      4. work queues.
      
      struct mlx4_ib_demux_pv_ctx contains all the info for managing the
      paravirtualized QPs for one slave/port.
      
      struct mlx4_ib_demux_pv_qp contains the info needed to run an individual
      QP (either tunnel qp or real SQP).
      
      Note:  We made use of the 2 most significant bits in enum
      mlx4_ib_qp_flags (based on enum ib_qp_create_flags in ib_verbs.h).
      We need these bits in the low-level driver for internal purposes.
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
  15. 11 Jul, 2012 1 commit
    • mlx4: Use port management change event instead of smp_snoop · 00f5ce99
      Committed by Jack Morgenstein
      The port management change event can replace smp_snoop.  If the
      capability bit for this event is set in dev-caps, the event is used
      (by the driver setting the PORT_MNG_CHG_EVENT bit in the async event
      mask in the MAP_EQ fw command).  In this case, when the driver passes
      incoming SMP PORT_INFO SET mads to the FW, the FW generates port
      management change events to signal any changes to the driver.
      
      If the FW generates these events, smp_snoop shouldn't be invoked in
      ib_process_mad(), or duplicate events will occur (once from the
      FW-generated event, and once from smp_snoop).
      
      In the case where the FW does not generate port management change
      events, smp_snoop needs to be invoked to create these events.  The flow
      in smp_snoop has been modified to make use of the same procedures as
      in the FW-generated-event case to generate the port management
      events (LID change, Client-rereg, Pkey change, and/or GID change).
      
      Port management change event handling required changing the
      mlx4_ib_event and mlx4_dispatch_event prototypes; the "param" argument
      (last argument) had to be changed to unsigned long in order to
      accommodate passing the EQE pointer.
      
      We also needed to move the definition of struct mlx4_eqe from
      net/mlx4.h to file device.h -- to make it available to the IB driver,
      to handle port management change events.
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
  16. 09 Jul, 2012 1 commit
  17. 08 Jul, 2012 1 commit
    • {NET, IB}/mlx4: Add device managed flow steering firmware API · 0ff1fb65
      Committed by Hadar Hen Zion
      The driver is modified to support three operation modes.
      
      If supported by firmware, use the device managed flow steering
      API; we call this device managed steering mode. Otherwise, if
      the firmware supports the B0 steering mode, use it. Finally,
      if neither is supported, use the A0 steering mode.
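
The three-way fallback can be written as a short selection function. The Python below is a sketch: the function name, parameter names, and the returned mode strings are hypothetical labels for the modes named in the text.

```python
def select_steering_mode(fw_device_managed, fw_b0):
    """Pick the steering mode in priority order: device managed, then B0, then A0."""
    if fw_device_managed:
        return "DEVICE_MANAGED"
    if fw_b0:
        return "B0"
    return "A0"
```

Device managed mode wins whenever the firmware offers it, regardless of B0 support.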
      
      When the steering mode is device managed, the code is modified
      such that L2 based rules set by the mlx4_en driver for Ethernet
      unicast and multicast, and the IB stack multicast attach calls
      done through the mlx4_ib driver are all routed to use the device
      managed API.
      
      When attaching a rule using the device managed flow steering API,
      the firmware returns a 64-bit registration id, which is to be
      provided during detach.
      
      Currently the firmware is always programmed during HCA initialization
      to use standard L2 hashing. Future work should be done to allow
      configuring the flow-steering hash function with common,
      non-proprietary means.
      Signed-off-by: Hadar Hen Zion <hadarh@mellanox.co.il>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  18. 07 Jun, 2012 1 commit
  19. 19 May, 2012 1 commit
  20. 14 Oct, 2011 2 commits
  21. 19 Jul, 2011 1 commit
  22. 26 Oct, 2010 1 commit
    • IB/mlx4: Add support for IBoE · fa417f7b
      Committed by Eli Cohen
      Add support for IBoE to mlx4_ib.  The bulk of the code is handling the
      new address vector fields; mlx4 needs the MAC address of a remote node
      to include it in a WQE (for datagrams) or in the QP context (for
      connected QPs).  Address resolution is done by assuming all unicast
      GIDs are link-local IPv6 addresses.
      
      Multicast group attach/detach needs to update the NIC's multicast
      filters; but since attaching a QP to a multicast group can be done
      before the QP is bound to a port, for IBoE we need to keep track of
      all multicast groups that a QP is attached to before it transitions
      from INIT to RTR (since it does not have a port in the INIT state).
      Signed-off-by: Eli Cohen <eli@mellanox.co.il>

      [ Many things cleaned up and otherwise monkeyed with; hope I didn't
        introduce too many bugs.  - Roland ]
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
  23. 06 Sep, 2009 1 commit
    • IB/mlx4: Don't allow userspace open while recovering from catastrophic error · 3b4a8cd5
      Committed by Jack Morgenstein
      Userspace apps are supposed to release all ib device resources if they
      receive a fatal async event (IBV_EVENT_DEVICE_FATAL).  However, the
      app has no way of knowing when the device has come back up, except to
      repeatedly attempt ibv_open_device() until it succeeds.
      
      However, currently there is no protection against the open succeeding
      while the device is being removed following the fatal event.  In
      this case, the open will succeed, but as a result the device waits in
      the middle of its removal until the new app releases its resources --
      and the new app will not do so, since the open succeeded at a point
      following the fatal event generation.
      
      This patch adds an "active" flag to the device. The active flag is set
      to false (in the fatal event flow) before the "fatal" event is
      generated, so any subsequent ibv_open_device() call to the device will
      fail until the device comes back up, thus preventing the above
      deadlock.
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
  24. 08 May, 2009 1 commit