1. 09 Dec, 2015 1 commit
  2. 29 Oct, 2015 2 commits
  3. 22 Oct, 2015 5 commits
  4. 08 Oct, 2015 1 commit
    • IB: split struct ib_send_wr · e622f2f4
      Committed by Christoph Hellwig
      This patch splits up struct ib_send_wr so that all non-trivial verbs
      use their own structure which embeds struct ib_send_wr.  This dramatically
      shrinks the size of a WR for most common operations:
      
      sizeof(struct ib_send_wr) (old):	96
      
      sizeof(struct ib_send_wr):		48
      sizeof(struct ib_rdma_wr):		64
      sizeof(struct ib_atomic_wr):		96
      sizeof(struct ib_ud_wr):		88
      sizeof(struct ib_fast_reg_wr):		88
      sizeof(struct ib_bind_mw_wr):		96
      sizeof(struct ib_sig_handover_wr):	80
      
      And with Sagi's pending MR rework the fast registration WR will also be
      down to a reasonable size:
      
      sizeof(struct ib_fastreg_wr):		64
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> [srp, srpt]
      Reviewed-by: Chuck Lever <chuck.lever@oracle.com> [sunrpc]
      Tested-by: Haggai Eran <haggaie@mellanox.com>
      Tested-by: Sagi Grimberg <sagig@mellanox.com>
      Tested-by: Steve Wise <swise@opengridcomputing.com>
      e622f2f4
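The embedding described in this commit can be illustrated with a minimal userspace sketch. The structure and function names below (`send_wr`, `rdma_wr`, `to_rdma_wr`, `remote_addr_of`) are simplified stand-ins invented for illustration, not the kernel's actual definitions; the real structures carry more fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum wr_opcode { OP_SEND, OP_RDMA_WRITE };

/* Generic work request: only what every operation needs (hypothetical layout). */
struct send_wr {
    struct send_wr *next;
    uint64_t wr_id;
    enum wr_opcode opcode;
};

/* Operation-specific work request that embeds the generic one. */
struct rdma_wr {
    struct send_wr wr;
    uint64_t remote_addr;
    uint32_t rkey;
};

/* Userspace equivalent of the kernel's container_of() idiom. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Recover the enclosing RDMA WR from a pointer to its embedded generic WR. */
static inline struct rdma_wr *to_rdma_wr(struct send_wr *wr)
{
    return container_of(wr, struct rdma_wr, wr);
}

/* A consumer that only receives the generic WR and downcasts by opcode. */
static uint64_t remote_addr_of(struct send_wr *wr)
{
    assert(wr->opcode == OP_RDMA_WRITE);
    return to_rdma_wr(wr)->remote_addr;
}
```

Callers that post a plain send allocate only the small generic structure, while RDMA callers allocate the larger one and pass a pointer to its embedded `wr` member.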
  5. 31 Aug, 2015 1 commit
  6. 16 Jun, 2015 2 commits
  7. 16 Apr, 2015 2 commits
  8. 18 Feb, 2015 1 commit
  9. 10 Feb, 2015 1 commit
    • IB/mlx4: Reset flow support for IB kernel ULPs · 35f05dab
      Committed by Yishai Hadas
      The driver exposes interfaces that directly relate to HW state. Upon a fatal
      error, consumers of these interfaces (ULPs) that rely on completion of
      all their posted work requests could hang, thereby introducing dependencies
      in shutdown order.  To prevent this from happening, we manage the
      relevant resources (CQs, QPs) that are used by the device. Upon a fatal error,
      we now generate simulated completions for outstanding WQEs that were not
      completed at the time the HW was reset.
      
      This includes invoking the completion event handler for all involved CQs so that
      the ULPs will poll those CQs. When polled, we return simulated CQEs with the
      IB_WC_WR_FLUSH_ERR status, enabling ULPs to clean up their resources and
      not wait forever for completions upon receiving remove_one.
      
      The above change requires an extra check in the data path to make sure that, when
      the device is in an error state, the simulated CQEs are returned and no further
      WQEs are posted.
      Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      35f05dab
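The reset flow in this commit can be sketched as a toy model: once the device is marked as being in an error state, posting is refused and polling returns a flush-error completion for every outstanding work request. All names here (`toy_dev`, `toy_qp`, `toy_post_send`, `toy_poll`) are hypothetical, and the sketch does not model real hardware completions or the completion event handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum wc_status { WC_SUCCESS, WC_WR_FLUSH_ERR };

struct wc {
    uint64_t wr_id;
    enum wc_status status;
};

#define MAX_WQES 8

/* Ring of work requests that were posted but not yet completed. */
struct toy_qp {
    uint64_t pending[MAX_WQES];
    int head;   /* index of the oldest outstanding entry */
    int count;  /* number of outstanding entries */
};

struct toy_dev {
    bool in_error;
};

/* Data-path check: nothing may be posted once the device is in error state. */
static int toy_post_send(struct toy_dev *dev, struct toy_qp *qp, uint64_t wr_id)
{
    if (dev->in_error)
        return -1;
    if (qp->count == MAX_WQES)
        return -2;
    qp->pending[(qp->head + qp->count) % MAX_WQES] = wr_id;
    qp->count++;
    return 0;
}

/* In error state, drain outstanding entries as simulated flush completions. */
static int toy_poll(struct toy_dev *dev, struct toy_qp *qp, struct wc *wc, int max)
{
    int n = 0;

    assert(max >= 0);
    if (!dev->in_error)
        return 0; /* real hardware completions are not modelled here */
    while (n < max && qp->count > 0) {
        wc[n].wr_id = qp->pending[qp->head];
        wc[n].status = WC_WR_FLUSH_ERR;
        qp->head = (qp->head + 1) % MAX_WQES;
        qp->count--;
        n++;
    }
    return n;
}
```

With this shape, a consumer waiting for its work requests to finish sees each one come back with a flush error and can release its resources instead of waiting indefinitely.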
  10. 05 Feb, 2015 2 commits
  11. 12 Dec, 2014 2 commits
    • net/mlx4: Add A0 hybrid steering · d57febe1
      Committed by Matan Barak
      A0 hybrid steering is a form of high-performance flow steering.
      In this mode, mlx4 cards use fast, limited, table-based steering
      to steer unicast packets to a QP quickly.
      
      In order to implement A0 hybrid steering we allocate resources
      from different zones:
      (1) General range
      (2) Special MAC-assigned QPs [RSS, Raw-Ethernet], each with its own region.
      
      When we create an RSS QP or a raw Ethernet (A0 steerable and BF ready) QP,
      we try hard to allocate the QP from range (2). Otherwise, we try hard not
      to allocate from this range. However, when the system is pushed to its
      limits and one needs every resource, the allocator uses every region it can.
      
      That is, when we run out of raw-eth QPs, the allocator allocates from the
      general range (and the special-A0 area is no longer active). If we run out
      of RSS QPs, the mechanism tries to allocate from the raw-eth QP zone. If that
      is also exhausted, the allocator will allocate from the general range
      (and the A0 region is no longer active).
      
      Note that if a raw-eth QP is allocated from the general range, the allocator
      attempts to pick a QP number in which bits 6 and 7 (the blueflame bits)
      are not set.
      
      When the feature is used in SRIOV, the VF has to notify the PF what
      kind of QP attributes it needs. In order to do that, along with the
      "Eth QP blueflame" bit, we reserve a new "A0 steerable QP" bit. According
      to the combination of these bits, the PF tries to allocate a suitable QP.
      
      In order to maintain backward compatibility (with older PFs), the PF
      notifies which QP attributes it supports via QUERY_FUNC_CAP command.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d57febe1
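The zone fallback order described in this commit can be sketched as a preference table per QP kind. This is a simplified model with hypothetical names (`zones`, `alloc_qp_zone`): each zone is reduced to a free counter, and the last-resort entries for general and raw-eth QPs are assumptions, since the message only says the allocator "uses every region it can".

```c
#include <assert.h>
#include <stddef.h>

enum zone { ZONE_GENERAL, ZONE_RSS, ZONE_RAW_ETH, ZONE_COUNT };
enum qp_kind { QP_GENERAL, QP_RSS, QP_RAW_ETH };

/* Each zone is modelled only by how many QPs it still has free. */
struct zones {
    int free_qps[ZONE_COUNT];
};

/*
 * Try the zones in order of preference for this kind of QP and take the
 * first one with space. Returns the zone used, or -1 if all are exhausted.
 */
static int alloc_qp_zone(struct zones *z, enum qp_kind kind)
{
    static const enum zone order[][ZONE_COUNT] = {
        /* General QPs avoid the special zones until nothing else is left
         * (the order among the special zones is an assumption). */
        [QP_GENERAL] = { ZONE_GENERAL, ZONE_RAW_ETH, ZONE_RSS },
        /* RSS QPs: own zone, then the raw-eth zone, then the general range. */
        [QP_RSS]     = { ZONE_RSS, ZONE_RAW_ETH, ZONE_GENERAL },
        /* Raw-eth QPs: own zone, then the general range
         * (the final RSS fallback is an assumption). */
        [QP_RAW_ETH] = { ZONE_RAW_ETH, ZONE_GENERAL, ZONE_RSS },
    };

    assert(z != NULL);
    for (size_t i = 0; i < ZONE_COUNT; i++) {
        enum zone zn = order[kind][i];

        if (z->free_qps[zn] > 0) {
            z->free_qps[zn]--;
            return (int)zn;
        }
    }
    return -1;
}
```

The sketch leaves out the side effect mentioned in the message, namely that the special A0 area stops being active once allocations spill over into other zones.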
    • net/mlx4: Change QP allocation scheme · ddae0349
      Committed by Eugenia Emantayev
      When using BF (Blue-Flame), the QPN overrides the VLAN, CV, and SV fields
      in the WQE. Thus, BF may only be used for QPNs with bits 6,7 unset.
      
      The current Ethernet driver code reserves a Tx QP range with 256b alignment.
      
      This is wrong because if there are more than 64 Tx QPs in use,
      QPNs >= base + 64 will have bits 6/7 set.
      
      This problem is not specific to the Ethernet driver: any entity that
      tries to reserve more than 64 BF-enabled QPs should fail. Also, using
      ranges is not necessary here and is wasteful.
      
      The new mechanism introduced here will support reservation for
      "Eth QPs eligible for BF" for all drivers: bare-metal, multi-PF, and VFs
      (when hypervisors support WC in VMs). The flow we use is:
      
      1. In mlx4_en, allocate Tx QPs one by one instead of a range allocation,
         and request "BF enabled QPs" if BF is supported for the function
      
      2. In the ALLOC_RES FW command, change param1 to:
         a. param1[23:0]  - number of QPs
         b. param1[31:24] - flags controlling QPs reservation
      
      Bit 31 refers to Eth blueflame supported QPs. Those QPs must have
      bits 6 and 7 unset in order to be used in Ethernet.
      
      Bits 24-30 of the flags are currently reserved.
      
      When a function tries to allocate a QP, it states the required attributes
      for this QP. Those attributes are considered "best-effort". If an attribute,
      such as Ethernet BF enabled QP, is a must-have attribute, the function has
      to check that attribute is supported before trying to do the allocation.
      
      In a lower layer of the code, mlx4_qp_reserve_range masks out the bits
      which are unsupported. If SRIOV is used, the PF validates those attributes
      and masks out unsupported attributes as well. To learn which attributes
      are supported, the VF uses the QUERY_FUNC_CAP command. This command's
      mailbox is filled by the PF, which reports which QP allocation attributes
      it supports.
      Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il>
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ddae0349
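Two details of this commit lend themselves to a short sketch: the BF eligibility rule (bits 6 and 7 of the QPN must be clear) and the split of param1 into a 24-bit count and 8 flag bits, with bit 31 requesting BF-capable QPs. The macro and function names below are hypothetical and are not the driver's identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BF_QPN_MASK     0xC0u          /* bits 6 and 7 of the QP number */
#define RES_COUNT_MASK  0x00FFFFFFu    /* param1[23:0]: number of QPs */
#define RES_FLAG_ETH_BF (1u << 31)     /* param1[31]: Eth blueflame QPs */

/* A QPN can use blueflame in Ethernet only if bits 6 and 7 are both clear. */
static bool qpn_bf_eligible(uint32_t qpn)
{
    return (qpn & BF_QPN_MASK) == 0;
}

/* Pack the QP count and the blueflame request flag into one 32-bit word. */
static uint32_t pack_alloc_param1(uint32_t count, bool want_bf)
{
    assert(count <= RES_COUNT_MASK);
    return count | (want_bf ? RES_FLAG_ETH_BF : 0u);
}

static uint32_t param1_count(uint32_t param1)
{
    return param1 & RES_COUNT_MASK;
}

static bool param1_wants_bf(uint32_t param1)
{
    return (param1 & RES_FLAG_ETH_BF) != 0;
}
```

For a range whose base is aligned to 256, only offsets 0 through 63 pass `qpn_bf_eligible`, which is why a contiguous range cannot hold more than 64 BF-enabled QPs.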
  12. 23 Sep, 2014 3 commits
  13. 11 Sep, 2014 1 commit
  14. 30 Aug, 2014 1 commit
  15. 10 Jun, 2014 1 commit
  16. 03 Jun, 2014 1 commit
  17. 30 May, 2014 2 commits
    • mlx4: Add infrastructure for selecting VFs to enable QP0 via MLX proxy QPs · 99ec41d0
      Committed by Jack Morgenstein
      This commit adds the infrastructure for enabling selected VFs to
      operate SMI (QP0) MADs without restriction.
      
      Additionally, for these enabled VFs, their QP0 proxy and tunnel QPs
      are MLX QPs.  As such, they operate over VL15.  Therefore, they are
      not affected by "credit" problems or changes in the VLArb table (which
      may shut down VL0).
      
      Non-enabled VFs may only create UD proxy QP0 QPs (which are forced by
      the hypervisor to send packets using the q-key it assigns and places
      in the QP context).  Thus, non-enabled VFs will not pose a security
      risk.  The hypervisor discards any privileged MADs it receives from
      these non-enabled VFs.
      
      By default, all VFs are NOT enabled, and must explicitly be enabled
      by the administrator.
      
      The sysfs interface which operates the VF enablement infrastructure
      is provided in the next commit.
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      99ec41d0
    • IB/mlx4: Preparation for VFs to issue/receive SMI (QP0) requests/responses · 97982f5a
      Committed by Jack Morgenstein
      Currently, SRIOV VFs are denied QP0 access.  The main reason
      for this decision is security, since Subnet Management Packets
      (SMPs) are not restricted by network partitioning and may affect the
      physical network topology.  Moreover, even the SM may be denied access
      to portions of the network by setting management keys unknown to the
      SM.
      
      However, it is desirable to grant SMI access to certain privileged
      VFs, so that certain network management activities may be conducted
      within virtual machines instead of the hypervisor.
      
      This commit does the following:
      
      1. Create QP0 tunnel QPs for all VFs.
      
      2. Discard SMI MADs sent from or received for non-privileged VFs in the
         hypervisor MAD multiplex/demultiplex logic.  SMI MADs from/for
         privileged VFs are allowed to pass.
      
      3. MAD_IFC wrapper changes/fixes.  For non-privileged VFs, only
         host-view MAD_IFC commands are allowed, and only for SMI LID-Routed
         GET MADs.  For privileged VFs, there are no restrictions.
      
      This commit does not allow privileged VFs as yet.  To determine if a VF
      is privileged, it calls function mlx4_vf_smi_enabled().  This function
      returns 0 unconditionally for now.
      
      The next two commits allow defining and activating privileged VFs.
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      97982f5a
  18. 17 May, 2014 1 commit
    • IB/mlx4: Invoke UPDATE_QP for proxy QP1 on MAC changes · 9433c188
      Committed by Matan Barak
      When we receive a netdev event indicating a netdev change and/or
      a netdev address change, we must change the MAC index used by the
      proxy QP1 (in the QP context), otherwise RoCE CM packets sent by the
      VF will not carry the same source MAC address as the non-CM packets.
      
      We use the UPDATE_QP command to perform this change.
      
      In order to avoid modifying a QP context based on a netdev event
      while the driver attempts to destroy this QP (e.g., when either the mlx4_ib
      or ib_mad module is unloaded), we use mutex locking in both flows.
      
      Since the relevant mlx4 proxy GSI QP is created indirectly by the
      MAD module when it creates its GSI QP, mlx4 did not need to
      keep track of that QP prior to this change.
      
      Now that QP modifications to this QP are needed from within the
      driver, we keep a reference to it.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9433c188
  19. 18 Mar, 2014 1 commit
  20. 13 Mar, 2014 3 commits
    • mlx4: Implement IP based gids support for RoCE/SRIOV · 5ea8bbfc
      Committed by Jack Morgenstein
      Since there is no connection between the MAC/VLAN and the GID
      when using IP-based addressing, the proxy QP1 (running on the
      slave) must pass the source-mac, destination-mac, and vlan_id
      information separately from the GID. Additionally, the Host
      must pass the remote source-mac and vlan_id back to the slave.
      
      This is achieved as follows:
      Outgoing MADs:
          1. Source MAC: obtained from the CQ completion structure
             (struct ib_wc, smac field).
          2. Destination MAC: obtained from the tunnel header.
          3. vlan_id: obtained from the tunnel header.
      Incoming MADs:
          1. The source (i.e., remote) MAC and vlan_id are passed in
             the tunnel header to the proxy QP1.
      
      VST mode support:
          For outgoing MADs, the vlan_id obtained from the header is
          discarded, and the vlan_id specified by the Hypervisor is used
          instead.
          For incoming MADs, the incoming vlan_id (in the wc) is discarded, and the
          "invalid" vlan (0xffff) is substituted when forwarding to the slave.
      Signed-off-by: Moni Shoua <monis@mellanox.co.il>
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5ea8bbfc
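The VST rules in this commit reduce to two small selection functions. The sketch below uses hypothetical names (`vf_policy`, `outgoing_vlan`, `incoming_vlan_for_slave`) and models only the vlan_id choice, not the tunnel header or MAC handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VLAN_INVALID 0xffffu  /* the "invalid" vlan value named in the message */

/* Per-VF policy set by the hypervisor (hypothetical layout). */
struct vf_policy {
    bool vst;           /* true when the VF runs in VST mode */
    uint16_t vst_vlan;  /* vlan_id imposed by the hypervisor in VST mode */
};

/* Outgoing MADs: in VST mode the hypervisor's vlan replaces the header's. */
static uint16_t outgoing_vlan(const struct vf_policy *p, uint16_t tunnel_hdr_vlan)
{
    assert(p != 0);
    return p->vst ? p->vst_vlan : tunnel_hdr_vlan;
}

/* Incoming MADs: in VST mode the slave is shown the "invalid" vlan instead. */
static uint16_t incoming_vlan_for_slave(const struct vf_policy *p, uint16_t wc_vlan)
{
    assert(p != 0);
    return p->vst ? (uint16_t)VLAN_INVALID : wc_vlan;
}
```

Outside VST mode both functions pass the vlan_id through unchanged, which matches the non-VST flow listed above.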
    • mlx4: Add ref counting to port MAC table for RoCE · 2f5bb473
      Committed by Jack Morgenstein
      The IB side of RoCE requires the MAC table index of the
      MAC address used by its QPs.
      
      To obtain the real MAC index, the IB side registers the
      MAC (increasing its ref count, and also returning the
      real MAC index) during the modify-qp sequence.
      
      This protects against the ETH side deleting or modifying
      that MAC table entry while the QP is active.
      
      Note that until the modify-qp command returns success,
      the MAC and VLAN information only has "candidate" status.
      If the modify-qp succeeds, the "candidate" info is promoted
      to the operational MAC/VLAN info for the qp. If the modify fails,
      the candidate MAC/VLAN is unregistered, and the old qp info
      is preserved.
      
      The patch is a bit complex, because there are multiple qp
      transitions where the primary-path information may be
      modified:  INIT-to-RTR, and SQD-to-SQD.
      
      Similarly for the alternate path information.
      
      Therefore the code must handle cases where path information
      has already been entered into the QP context by previous
      qp transitions.
      
      For the MAC address, the success logic is as follows:
      1. If there was no previous MAC, simply move the candidate
         MAC information to the operational information, and reset
         the candidate MAC info.
      2. If there was a previous MAC, unregister it.  Then move
         the MAC information from candidate to operational, and
         reset the candidate info (as in 1. above).
      
      The MAC address failure logic is the same for all cases:
       - Unregister the candidate MAC, and reset the candidate MAC info.
      
      For VLAN registration, the logic is similar.
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2f5bb473
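The candidate/operational logic for the MAC address can be sketched with a small reference-counted table. Everything here (`mac_table`, `qp_mac`, `qp_set_candidate`, `qp_modify_done`) is a hypothetical model of the success and failure rules listed above, not the driver's code; VLAN handling and the alternate path are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAC_TABLE_SIZE 4

/* Toy port MAC table with a reference count per entry. */
struct mac_table {
    uint64_t mac[MAC_TABLE_SIZE];
    int refs[MAC_TABLE_SIZE];
};

/* Take a reference on a MAC, adding it if needed. Returns its index or -1. */
static int mac_register(struct mac_table *t, uint64_t mac)
{
    int free_idx = -1;

    for (int i = 0; i < MAC_TABLE_SIZE; i++) {
        if (t->refs[i] > 0 && t->mac[i] == mac) {
            t->refs[i]++;
            return i;
        }
        if (t->refs[i] == 0 && free_idx < 0)
            free_idx = i;
    }
    if (free_idx < 0)
        return -1;
    t->mac[free_idx] = mac;
    t->refs[free_idx] = 1;
    return free_idx;
}

static void mac_unregister(struct mac_table *t, int idx)
{
    assert(idx >= 0 && idx < MAC_TABLE_SIZE && t->refs[idx] > 0);
    t->refs[idx]--;
}

/* Per-QP MAC state; an index of -1 means "none". */
struct qp_mac {
    int oper_idx;  /* operational MAC index */
    int cand_idx;  /* candidate MAC index, valid only during modify-qp */
};

/* Before issuing modify-qp: register the new MAC as a candidate. */
static int qp_set_candidate(struct mac_table *t, struct qp_mac *q, uint64_t mac)
{
    int idx = mac_register(t, mac);

    if (idx >= 0)
        q->cand_idx = idx;
    return idx;
}

/* After modify-qp returns: promote the candidate or roll it back. */
static void qp_modify_done(struct mac_table *t, struct qp_mac *q, bool success)
{
    if (q->cand_idx < 0)
        return;
    if (success) {
        if (q->oper_idx >= 0)
            mac_unregister(t, q->oper_idx);  /* drop the previous MAC */
        q->oper_idx = q->cand_idx;
    } else {
        mac_unregister(t, q->cand_idx);      /* old info is preserved */
    }
    q->cand_idx = -1;
}
```

Because the candidate holds its own reference until the outcome is known, the table entry cannot disappear while the modify-qp command is in flight.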
    • mlx4: Adjust QP1 multiplexing for RoCE/SRIOV · 6ee51a4e
      Committed by Jack Morgenstein
      This requires the following modifications:
      1. Fix build_mlx4_header to properly fill in the ETH fields
      2. Adjust mux and demux QP1 flow to support RoCE.
      
      This commit still assumes only one GID per slave for RoCE.
      The commit enabling multiple GIDs is a subsequent commit, and
      is done separately because of its complexity.
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6ee51a4e
  21. 19 Jan, 2014 1 commit
  22. 15 Jan, 2014 2 commits
    • IB/core: Ethernet L2 attributes in verbs/cm structures · dd5f03be
      Committed by Matan Barak
      This patch adds support for Ethernet L2 attributes in the
      verbs/cm/cma structures.
      
      When dealing with L2 Ethernet, we should use smac, dmac, vlan ID and priority
      in a manner similar to how the IB L2 (and the L4 PKEY) attributes are used.
      
      Thus, those attributes were added to the following structures:
      
      * ib_ah_attr - added dmac
      * ib_qp_attr - added smac and vlan_id (sl remains vlan priority)
      * ib_wc - added smac, vlan_id
      * ib_sa_path_rec - added smac, dmac, vlan_id
      * cm_av - added smac and vlan_id
      
      For the path record structure, extra care was taken to avoid the new
      fields when packing it into wire format, so we don't break the IB CM
      and SA wire protocol.
      
      On the active side, the CM fills its internal structures from the
      path provided by the ULP.  There we add code that takes the ETH L2 attributes
      and places them into the CM Address Handle (struct cm_av).
      
      On the passive side, the CM fills its internal structures from the WC
      associated with the REQ message.  There we add code that takes the ETH L2
      attributes from the WC.
      
      When the HW driver provides the required ETH L2 attributes in the WC,
      it sets the IB_WC_WITH_SMAC and IB_WC_WITH_VLAN flags. The IB core
      code checks for the presence of these flags, and in their absence does
      address resolution from the ib_init_ah_from_wc() helper function.
      
      ib_modify_qp_is_ok is also updated to consider the link layer. Some
      parameters are mandatory for Ethernet link layer, while they are
      irrelevant for IB.  Vendor drivers are modified to support the new
      function signature.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      dd5f03be
    • IB/mlx4: Add support for steerable IB UD QPs · c1c98501
      Committed by Matan Barak
      This patch adds support for steerable (NETIF) QP creation.  When we
      create the device, we allocate a range of steerable QPs.
      
      Afterward, when a QP is created with the NETIF flag, it is allocated
      from this range.  Allocation is managed by a bitmap allocator.
      
      Internal steering rules for those QPs are automatically generated on
      their creation.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      c1c98501
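Allocating QP numbers from a reserved range with a bitmap can be sketched as follows. The names (`qp_range`, `range_alloc`, `range_free`) are hypothetical, and the range is capped at 64 entries so that a single 64-bit word can serve as the bitmap; the driver's own allocator is not shown here.

```c
#include <assert.h>
#include <stdint.h>

/* A reserved range of QP numbers tracked by a one-word bitmap. */
struct qp_range {
    uint32_t base;   /* first QP number in the range */
    uint32_t count;  /* number of QPs in the range, at most 64 */
    uint64_t used;   /* bit i set means base + i is allocated */
};

/* Return the lowest free QP number in the range, or -1 if it is full. */
static int64_t range_alloc(struct qp_range *r)
{
    assert(r->count <= 64);
    for (uint32_t i = 0; i < r->count; i++) {
        uint64_t bit = UINT64_C(1) << i;

        if (!(r->used & bit)) {
            r->used |= bit;
            return (int64_t)r->base + i;
        }
    }
    return -1;
}

/* Give a QP number back to the range. */
static void range_free(struct qp_range *r, uint32_t qpn)
{
    assert(qpn >= r->base && qpn - r->base < r->count);
    r->used &= ~(UINT64_C(1) << (qpn - r->base));
}
```

A QP created with the steerable flag would call `range_alloc` at creation time and `range_free` on destruction, with steering rules attached in between.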
  23. 25 Apr, 2013 2 commits
  24. 26 Feb, 2013 1 commit