1. 31 Aug, 2015 4 commits
  2. 13 Jun, 2015 2 commits
  3. 02 Jun, 2015 1 commit
  4. 19 May, 2015 3 commits
  5. 16 Dec, 2014 1 commit
  6. 05 Jun, 2014 1 commit
    • R
      IB/core: Fix sparse warnings about redeclared functions · 8385fd84
      Roland Dreier committed
      Fix a few functions that are declared with __attribute_const__ in the
      ib_verbs.h header file but defined without it in verbs.c.  This gets rid
      of the following sparse warnings:
      
          drivers/infiniband/core/verbs.c:51:5: error: symbol 'ib_rate_to_mult' redeclared with different type (originally declared at include/rdma/ib_verbs.h:469) - different modifiers
          drivers/infiniband/core/verbs.c:68:14: error: symbol 'mult_to_ib_rate' redeclared with different type (originally declared at include/rdma/ib_verbs.h:607) - different modifiers
          drivers/infiniband/core/verbs.c:85:5: error: symbol 'ib_rate_to_mbps' redeclared with different type (originally declared at include/rdma/ib_verbs.h:476) - different modifiers
          drivers/infiniband/core/verbs.c:111:1: error: symbol 'rdma_node_get_transport' redeclared with different type (originally declared at include/rdma/ib_verbs.h:84) - different modifiers
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      8385fd84
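The mismatch that sparse reports can be illustrated with a minimal sketch. The names below (`DEMO_ATTRIBUTE_CONST`, `demo_rate_to_mult`, `enum demo_rate`) are hypothetical stand-ins, not the kernel's definitions; the point is only that the attribute appears on both the declaration and the definition.

```c
#include <assert.h>

/* Stand-in for the kernel's __attribute_const__ macro (GCC/Clang syntax). */
#define DEMO_ATTRIBUTE_CONST __attribute__((__const__))

/* Hypothetical rate codes for this sketch, not the kernel's enum ib_rate. */
enum demo_rate {
    DEMO_RATE_2_5_GBPS,
    DEMO_RATE_5_GBPS,
    DEMO_RATE_10_GBPS
};

/* Declaration, as it would appear in a header file. */
int demo_rate_to_mult(enum demo_rate rate) DEMO_ATTRIBUTE_CONST;

/* The definition repeats the attribute so both sites carry the same modifiers. */
DEMO_ATTRIBUTE_CONST int demo_rate_to_mult(enum demo_rate rate)
{
    switch (rate) {
    case DEMO_RATE_2_5_GBPS: return 1;
    case DEMO_RATE_5_GBPS:   return 2;
    case DEMO_RATE_10_GBPS:  return 4;
    default:                 return -1;
    }
}

/* Usage: the result depends only on the argument value. */
void demo_rate_example(void)
{
    assert(demo_rate_to_mult(DEMO_RATE_10_GBPS) == 4);
}
```

Dropping `DEMO_ATTRIBUTE_CONST` from the definition would still compile with GCC, which is why a checker such as sparse is needed to flag the difference.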
  7. 08 Mar, 2014 2 commits
    • S
      IB/core: Introduce signature verbs API · 1b01d335
      Sagi Grimberg committed
      Introduce a verbs interface for signature-related operations.  A
      signature handover operation configures the layouts of data and
      protection attributes both in memory and wire domains.
      
      Signature operations are:
      
      - INSERT:
        Generate and insert protection information when handing over
        data from input space to output space.
      - validate and STRIP:
        Validate protection information and remove it when handing over
        data from input space to output space.
      - validate and PASS:
        Validate protection information and pass it when handing over
        data from input space to output space.
      
      Once the signature handover operation is done, the HCA will offload
      data integrity generation/validation while performing the actual data
      transfer.
      
      Additions:
      
      1. HCA signature capabilities in device attributes
          Verbs provider supporting signature handover operations fills
          relevant fields in device attributes structure returned by
          ib_query_device.
      
      2. QP creation flag IB_QP_CREATE_SIGNATURE_EN
          Creating a QP that will carry signature handover operations may
          require some special preparations from the verbs provider.  So we
          add QP creation flag IB_QP_CREATE_SIGNATURE_EN to declare that the
          created QP may carry out signature handover operations.  Expose
          signature support to verbs layer (no support for now).
      
      3. New send work request IB_WR_REG_SIG_MR
          Signature handover work request. This WR will define the signature
          handover properties of the memory/wire domains as well as the
          domains layout. The purpose of this work request is to bind all
          the needed information for the signature operation:
      
          - data to be transferred:  wr->sg_list (ib_sge).
            * The raw data, pre-registered to a single MR (normally, before
              signature, this MR would have been used directly for the data
              transfer)
          - data protection guards: sig_handover.prot (ib_sge).
            * The data protection buffer, pre-registered to a single MR, which
              contains the data integrity guards of the raw data blocks.
              Note that it may not always exist, only in cases where the user is
              interested in storing protection guards in memory.
          - signature operation attributes: sig_handover.sig_attrs.
            * Tells the HCA how to validate/generate the protection information.
      
          Once the work request is executed, the memory region that will
          describe the signature transaction will be the sig_mr.  The
          application can now go ahead and send the sig_mr.rkey or use the
          sig_mr.lkey for data transfer.
      
      4. New Verb ib_check_mr_status
          check_mr_status verb checks the status of the memory region post
          transaction.  The first check that may be used is
          IB_MR_CHECK_SIG_STATUS, which will indicate if any signature
          errors are pending for a specific signature-enabled ib_mr.  This
          verb is a lightweight check and may be called from
          interrupt context.  An application must call this verb after it is
          known that the actual data transfer has finished.
      Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      1b01d335
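The three signature operations described in the commit above (INSERT, validate and STRIP, validate and PASS) can be modeled in software with a toy sketch. Everything here is an assumption made for illustration: the 8-byte block size, the one-byte XOR guard, and the `demo_` function names. A real HCA performs this in hardware with a proper integrity format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_BLOCK 8  /* toy block size in bytes (assumption) */

/* Toy guard: XOR of the block's bytes. Real offloads use stronger checks. */
static uint8_t demo_guard(const uint8_t *blk)
{
    uint8_t g = 0;
    for (size_t i = 0; i < DEMO_BLOCK; i++)
        g ^= blk[i];
    return g;
}

/* INSERT: input holds raw blocks; output receives block + generated guard. */
size_t demo_sig_insert(const uint8_t *in, size_t nblocks, uint8_t *out)
{
    assert(in != NULL && out != NULL);
    for (size_t b = 0; b < nblocks; b++) {
        memcpy(out, in, DEMO_BLOCK);
        out[DEMO_BLOCK] = demo_guard(in);
        in += DEMO_BLOCK;
        out += DEMO_BLOCK + 1;
    }
    return nblocks * (DEMO_BLOCK + 1);
}

/*
 * Validate and STRIP (keep_guard == 0) or validate and PASS (keep_guard != 0).
 * Input holds block + guard pairs. Returns the number of bytes written,
 * or -1 as soon as a guard does not match its block.
 */
long demo_sig_validate(const uint8_t *in, size_t nblocks, uint8_t *out,
                       int keep_guard)
{
    size_t stride = DEMO_BLOCK + (keep_guard ? 1 : 0);

    assert(in != NULL && out != NULL);
    for (size_t b = 0; b < nblocks; b++) {
        if (demo_guard(in) != in[DEMO_BLOCK])
            return -1;
        memcpy(out, in, stride);
        in += DEMO_BLOCK + 1;
        out += stride;
    }
    return (long)(nblocks * stride);
}
```

The output of `demo_sig_insert()` is one byte longer per block than its input, which mirrors the idea that the memory and wire domains can have different layouts.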
    • S
      IB/core: Introduce protected memory regions · 17cd3a2d
      Sagi Grimberg committed
      This commit introduces verbs for creating/destroying memory
      regions which will allow new types of memory key operations such
      as protected memory registration.
      
      Indirect memory registration is registering several (one
      or more) pre-registered memory regions in a specific layout.
      The indirect region may potentially describe several regions
      and some repetition format between them.
      
      Protected Memory registration is registering a memory region
      with various data integrity attributes which will describe protection
      schemes that will be handled by the HCA in an offloaded manner.
      These memory regions will be applicable for a new REG_SIG_MR
      work request introduced later in this patchset.
      
      In the future these routines may replace or implement the existing
      memory region creation routines:
      - ib_reg_user_mr
      - ib_alloc_fast_reg_mr
      - ib_get_dma_mr
      - ib_dereg_mr
      Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      17cd3a2d
  8. 20 Jan, 2014 1 commit
  9. 19 Jan, 2014 1 commit
  10. 15 Jan, 2014 1 commit
    • M
      IB/core: Ethernet L2 attributes in verbs/cm structures · dd5f03be
      Matan Barak committed
      This patch adds support for Ethernet L2 attributes in the
      verbs/cm/cma structures.
      
      When dealing with L2 Ethernet, we should use smac, dmac, vlan ID and priority
      in the same way that the IB L2 (and the L4 PKEY) attributes are used.
      
      Thus, those attributes were added to the following structures:
      
      * ib_ah_attr - added dmac
      * ib_qp_attr - added smac and vlan_id, (sl remains vlan priority)
      * ib_wc - added smac, vlan_id
      * ib_sa_path_rec - added smac, dmac, vlan_id
      * cm_av - added smac and vlan_id
      
      For the path record structure, extra care was taken to avoid the new
      fields when packing it into wire format, so we don't break the IB CM
      and SA wire protocol.
      
      On the active side, the CM fills its internal structures from the
      path provided by the ULP.  There we add code that takes the ETH L2
      attributes and places them into the CM Address Handle (struct cm_av).
      
      On the passive side, the CM fills its internal structures from the WC
      associated with the REQ message.  There we add code that takes the
      ETH L2 attributes from the WC.
      
      When the HW driver provides the required ETH L2 attributes in the WC,
      it sets the IB_WC_WITH_SMAC and IB_WC_WITH_VLAN flags. The IB core
      code checks for the presence of these flags, and in their absence does
      address resolution from the ib_init_ah_from_wc() helper function.
      
      ib_modify_qp_is_ok is also updated to consider the link layer. Some
      parameters are mandatory for Ethernet link layer, while they are
      irrelevant for IB.  Vendor drivers are modified to support the new
      function signature.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      dd5f03be
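The flag check described in the commit above (use the L2 attributes from the work completion when the driver supplied them, otherwise fall back to address resolution) could look roughly like the sketch below. The structs, flag values, and the `demo_resolve()` placeholder are hypothetical and far simpler than the kernel's `ib_wc` and address handle types.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag bits; the kernel's IB_WC_WITH_* values are not shown. */
#define DEMO_WC_WITH_SMAC (1u << 0)
#define DEMO_WC_WITH_VLAN (1u << 1)
#define DEMO_NO_VLAN      0xffff

struct demo_wc {
    unsigned int flags;
    uint8_t      smac[6];
    uint16_t     vlan_id;
};

struct demo_av {
    uint8_t  smac[6];
    uint16_t vlan_id;
    int      resolved;  /* 1 when the fallback resolver filled the fields */
};

/* Placeholder for real address resolution; fills fixed dummy values. */
static void demo_resolve(struct demo_av *av)
{
    memset(av->smac, 0, sizeof(av->smac));
    av->vlan_id = DEMO_NO_VLAN;
    av->resolved = 1;
}

/* Take L2 attributes from the WC only when both flags are present. */
void demo_init_av_from_wc(const struct demo_wc *wc, struct demo_av *av)
{
    assert(wc != NULL && av != NULL);
    if ((wc->flags & DEMO_WC_WITH_SMAC) && (wc->flags & DEMO_WC_WITH_VLAN)) {
        memcpy(av->smac, wc->smac, sizeof(av->smac));
        av->vlan_id = wc->vlan_id;
        av->resolved = 0;
    } else {
        demo_resolve(av);
    }
}
```

Requiring both flags before trusting the WC is a choice made for this sketch; it matches the wording "in their absence" only loosely.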
  11. 14 Jan, 2014 1 commit
  12. 16 Nov, 2013 1 commit
  13. 09 Nov, 2013 1 commit
  14. 29 Aug, 2013 1 commit
    • H
      IB/core: Add receive flow steering support · 319a441d
      Hadar Hen Zion committed
      The RDMA stack allows for applications to create IB_QPT_RAW_PACKET
      QPs, which receive plain Ethernet packets, specifically packets that
      don't carry any QPN to be matched by the receiving side.  Applications
      using these QPs must be provided with a method to program some
      steering rule with the HW so packets arriving at the local port can be
      routed to them.
      
      This patch adds ib_create_flow(), which allows providing a flow
      specification for a QP.  When there's a match between the
      specification and a received packet, the packet is forwarded to that
      QP, in the same way one uses ib_attach_multicast() for IB UD
      multicast handling.
      
      Flow specifications are provided as instances of struct ib_flow_spec_yyy,
      which describe L2, L3 and L4 headers.  Currently specs for Ethernet, IPv4,
      TCP and UDP are defined.  Flow specs are made of values and masks.
      
      The input to ib_create_flow() is a struct ib_flow_attr, which contains
      a few mandatory control elements and optional flow specs.
      
          struct ib_flow_attr {
                  enum ib_flow_attr_type type;
                  u16      size;
                  u16      priority;
                  u32      flags;
                  u8       num_of_specs;
                  u8       port;
                  /* Following are the optional layers according to user request
                   * struct ib_flow_spec_yyy
                   * struct ib_flow_spec_zzz
                   */
          };
      
      As these specs are eventually coming from user space, they are defined and
      used in a way which allows adding new spec types without kernel/user ABI
      change, just with a little API enhancement which defines the newly added spec.
      
      The flow spec structures are defined with TLV (Type-Length-Value)
      entries, which allows calling ib_create_flow() with a list of variable
      length of optional specs.
      
      For the actual processing of ib_flow_attr the driver uses the number
      of specs and the size mandatory fields along with the TLV nature of
      the specs.
      
      Steering rules are processed in an order determined by the domain
      over which the rule is set and by the rule priority.  All rules set
      by user space applications fall into the IB_FLOW_DOMAIN_USER domain;
      other domains could be used by future IPoIB RFS and ethtool
      flow-steering interface implementations.  A lower numerical value
      for the priority field means higher priority.
      
      The returned value from ib_create_flow() is a struct ib_flow, which
      contains a database pointer (handle) provided by the HW driver to be
      used when calling ib_destroy_flow().
      
      Applications that offload TCP/IP traffic can also be written over IB
      UD QPs.  The ib_create_flow() / ib_destroy_flow() API is designed to
      support UD QPs too.  A HW driver can set IB_DEVICE_MANAGED_FLOW_STEERING
      to denote support for flow steering.
      
      The ib_flow_attr enum type supports usage of flow steering for promiscuous
      and sniffer purposes:
      
          IB_FLOW_ATTR_NORMAL - "regular" rule, steering according to rule specification
      
          IB_FLOW_ATTR_ALL_DEFAULT - default unicast and multicast rule, receive
              all Ethernet traffic which isn't steered to any QP
      
          IB_FLOW_ATTR_MC_DEFAULT - same as IB_FLOW_ATTR_ALL_DEFAULT but only for multicast
      
          IB_FLOW_ATTR_SNIFFER - sniffer rule, receive all port traffic
      
      ALL_DEFAULT and MC_DEFAULT rules options are valid only for Ethernet link type.
      Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      319a441d
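The TLV processing mentioned in the commit above (walking the optional specs using the mandatory `num_of_specs` and `size` fields) can be sketched as follows. The header layout `struct demo_spec_hdr` and the reduced `struct demo_flow_attr` are assumptions made for this example; they are not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical TLV header: each spec starts with its type and total size. */
struct demo_spec_hdr {
    uint16_t type;
    uint16_t size;  /* bytes, header included */
};

/* Hypothetical fixed part, reduced to the two fields the walk needs. */
struct demo_flow_attr {
    uint16_t size;          /* bytes, fixed part plus all specs */
    uint8_t  num_of_specs;
    /* specs follow in memory */
};

/*
 * Walk the specs that follow the fixed part. Stores up to max_types spec
 * types in types[] and returns the number of specs, or -1 if the declared
 * lengths are inconsistent with each other.
 */
int demo_walk_specs(const void *buf, uint16_t *types, int max_types)
{
    struct demo_flow_attr attr;
    const uint8_t *p;
    size_t left;

    assert(buf != NULL && (types != NULL || max_types == 0));
    memcpy(&attr, buf, sizeof(attr));
    if (attr.size < sizeof(attr))
        return -1;
    p = (const uint8_t *)buf + sizeof(attr);
    left = attr.size - sizeof(attr);

    for (int i = 0; i < attr.num_of_specs; i++) {
        struct demo_spec_hdr hdr;

        if (left < sizeof(hdr))
            return -1;
        memcpy(&hdr, p, sizeof(hdr));
        if (hdr.size < sizeof(hdr) || hdr.size > left)
            return -1;
        if (i < max_types)
            types[i] = hdr.type;
        p += hdr.size;      /* skip the value, even for unknown types */
        left -= hdr.size;
    }
    return attr.num_of_specs;
}
```

Because the walk advances by the size stored in each header, a spec type the code does not recognize can still be skipped, which is what lets new spec types be added later.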
  15. 14 Aug, 2013 1 commit
  16. 17 Apr, 2013 1 commit
  17. 22 Feb, 2013 1 commit
    • S
      IB/core: Add "type 2" memory windows support · 7083e42e
      Shani Michaeli committed
      This patch enhances the IB core support for Memory Windows (MWs).
      
      MWs give an application finer, more flexible control over remote
      access to memory.
      
      Two types of MWs are supported, with the second type having two flavors:
      
          Type 1  - associated with PD only
          Type 2A - associated with QPN only
          Type 2B - associated with PD and QPN
      
      Applications can allocate a MW once, and then repeatedly bind the MW
      to different ranges in MRs that are associated with the same PD. Type 1
      windows are bound through a verb, while type 2 windows are bound by
      posting a work request.
      
      The 32-bit memory key is composed of a 24-bit index and an 8-bit
      key. The key is changed with each bind, thus allowing more control
      over the peer's use of the memory key.
      
      The changes introduced are the following:
      
      * add memory window type enum and a corresponding parameter to ib_alloc_mw.
      * type 2 memory window bind work request support.
      * factor the fields common to the bind verb struct ibv_mw_bind and
        the bind work request into a single struct.
      * add the ib_inc_rkey helper function to advance the tag part of an rkey.
      
      Consumer interface details:
      
      * new device capability flags IB_DEVICE_MEM_WINDOW_TYPE_2A and
        IB_DEVICE_MEM_WINDOW_TYPE_2B are added to indicate device support
        for these features.
      
        A device can set either IB_DEVICE_MEM_WINDOW_TYPE_2A or
        IB_DEVICE_MEM_WINDOW_TYPE_2B if it supports type 2A or type 2B
        memory windows. It can set neither to indicate it doesn't support
        type 2 windows at all.
      
      * modify existing provider and consumer code to use the new parameter
        of ib_alloc_mw and the ib_mw_bind_info structure
      Signed-off-by: Haggai Eran <haggaie@mellanox.com>
      Signed-off-by: Shani Michaeli <shanim@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      7083e42e
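The key layout described in the commit above (a 24-bit index plus an 8-bit key that changes with each bind) suggests a helper along these lines. This is a sketch in the spirit of ib_inc_rkey, written under the assumption that the 8-bit key occupies the low byte; `demo_inc_rkey` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Advance the 8-bit key (tag) of a 32-bit rkey and keep the 24-bit index.
 * Assumption: the key is the low byte and the index is the upper 24 bits.
 */
uint32_t demo_inc_rkey(uint32_t rkey)
{
    const uint32_t mask = 0x000000ffu;

    return ((rkey + 1) & mask) | (rkey & ~mask);
}

/* Usage: the tag wraps from 0xff to 0x00 without touching the index bits. */
void demo_inc_rkey_example(void)
{
    assert(demo_inc_rkey(0x123456ffu) == 0x12345600u);
}
```

Masking after the increment is what keeps a carry out of the low byte from leaking into the index.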
  18. 09 May, 2012 2 commits
  19. 28 Jan, 2012 1 commit
  20. 01 Nov, 2011 1 commit
  21. 14 Oct, 2011 6 commits
    • S
      RDMA/core: Export ib_open_qp() to share XRC TGT QPs · 0e0ec7e0
      Sean Hefty committed
      XRC TGT QPs are shared resources among multiple processes.  Since the
      creating process may exit, allow other processes which share the same
      XRC domain to open an existing QP.  This allows us to transfer
      ownership of an XRC TGT QP to another process.
      Signed-off-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      0e0ec7e0
    • S
      RDMA/uverbs: Export XRC domains to user space · 53d0bd1e
      Sean Hefty committed
      Allow user space to create XRC domains.  Because XRCDs are expected to
      be shared among multiple processes, we use inodes to identify an XRCD.
      
      Based on patches by Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      53d0bd1e
    • S
      RDMA/verbs: Cleanup XRC TGT QPs when destroying XRCD · d3d72d90
      Sean Hefty committed
      XRC TGT QPs are intended to be shared among multiple users and
      processes.  Allow the destruction of an XRC TGT QP to be done explicitly
      through ib_destroy_qp() or when the XRCD is destroyed.
      
      To support destroying an XRC TGT QP, we need to track TGT QPs with the
      XRCD.  When the XRCD is destroyed, all tracked XRC TGT QPs are also
      cleaned up.
      
      To avoid stale reference issues, if a user is holding a reference on a
      TGT QP, we increment a reference count on the QP.  The user releases the
      reference by calling ib_release_qp.  This releases any access to the QP
      from a user above verbs, but allows the QP to continue to exist until
      destroyed by the XRCD.
      Signed-off-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      d3d72d90
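The ownership rules in the commit above (users take and release references, while the XRCD tracks TGT QPs and cleans them up when it is destroyed) can be modeled with a small sketch. The fixed-size array, the `alive` flag, and all `demo_` names are hypothetical simplifications; the kernel uses its own list and locking primitives.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_TGT_QPS 8  /* arbitrary capacity for this sketch */

struct demo_qp {
    int users;  /* references held by consumers above verbs */
    int alive;  /* 1 until the QP is actually destroyed */
};

struct demo_xrcd {
    struct demo_qp *tgt_qps[DEMO_MAX_TGT_QPS];  /* tracked TGT QPs */
    size_t count;
};

/* Register a newly created TGT QP with its XRCD. */
int demo_track_qp(struct demo_xrcd *xrcd, struct demo_qp *qp)
{
    assert(xrcd != NULL && qp != NULL);
    if (xrcd->count == DEMO_MAX_TGT_QPS)
        return -1;
    qp->users = 0;
    qp->alive = 1;
    xrcd->tgt_qps[xrcd->count++] = qp;
    return 0;
}

/* A user takes a reference on an existing TGT QP. */
void demo_open_qp(struct demo_qp *qp)
{
    assert(qp->alive);
    qp->users++;
}

/* A user drops its reference; the QP itself keeps existing. */
void demo_release_qp(struct demo_qp *qp)
{
    assert(qp->users > 0);
    qp->users--;
}

/* Destroying the XRCD cleans up every TGT QP it still tracks. */
void demo_destroy_xrcd(struct demo_xrcd *xrcd)
{
    for (size_t i = 0; i < xrcd->count; i++)
        xrcd->tgt_qps[i]->alive = 0;
    xrcd->count = 0;
}
```

Releasing the last user reference deliberately does not destroy the QP here, which is the behavior the commit message describes for ib_release_qp.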
    • S
      RDMA/core: Add XRC QPs · b42b63cf
      Sean Hefty committed
      XRC ("eXtended reliable connected") is an IB transport that provides
      better scalability by allowing senders to specify which shared receive
      queue (SRQ) should be used to receive a message, which essentially
      allows one transport context (QP connection) to serve multiple
      destinations (as long as they share an adapter, of course).
      
      XRC communication is between an initiator (INI) QP and a target (TGT)
      QP.  Target QPs are associated with SRQs through an XRCD.  An XRC TGT QP
      behaves like a receive-only RD QP.  XRC INI QPs behave similarly to RC
      QPs, except that work requests posted to an XRC INI QP must specify the
      remote SRQ that is the target of the work request.
      
      We define two new QP types for XRC, to distinguish between INI and TGT
      QPs, and update the core layer to support XRC QPs.
      
      This patch is derived from work by Jack Morgenstein
      <jackm@dev.mellanox.co.il>
      Signed-off-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      b42b63cf
    • S
      RDMA/core: Add XRC SRQ type · 418d5130
      Sean Hefty committed
      XRC ("eXtended reliable connected") is an IB transport that provides
      better scalability by allowing senders to specify which shared receive
      queue (SRQ) should be used to receive a message, which essentially
      allows one transport context (QP connection) to serve multiple
      destinations (as long as they share an adapter, of course).
      
      XRC defines SRQs that are specifically used by XRC connections.  Expand
      the SRQ code to support XRC SRQs.  An XRC SRQ is currently restricted to
      only XRC use according to the IB XRC Annex.
      
      Portions of this patch were derived from work by
      Jack Morgenstein <jackm@dev.mellanox.co.il>.
      Signed-off-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      418d5130
    • S
      RDMA/core: Add SRQ type field · 96104eda
      Sean Hefty committed
      Currently, there is only a single ("basic") type of SRQ, but with XRC
      support we will add a second.  Prepare for this by defining an SRQ type
      and setting all current users to IB_SRQT_BASIC.
      Signed-off-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      96104eda
  22. 13 Oct, 2011 1 commit
    • S
      RDMA/core: Add XRC domain support · 59991f94
      Sean Hefty committed
      XRC ("eXtended reliable connected") is an IB transport that provides
      better scalability by allowing senders to specify which shared receive
      queue (SRQ) should be used to receive a message, which essentially
      allows one transport context (QP connection) to serve multiple
      destinations (as long as they share an adapter, of course).
      
      A few new concepts are introduced to support this.  This patch adds:
      
       - A new device capability flag, IB_DEVICE_XRC, which low-level
         drivers set to indicate that a device supports XRC.
       - A new object type, XRC domains (struct ib_xrcd), and new verbs
         ib_alloc_xrcd()/ib_dealloc_xrcd().  XRCDs are used to limit which
         XRC SRQs an incoming message can target.
      
      This patch is derived from work by Jack Morgenstein <jackm@dev.mellanox.co.il>.
      Signed-off-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      59991f94
  23. 12 Oct, 2011 1 commit
  24. 28 Sep, 2010 1 commit
    • E
      IB/core: Add link layer property to ports · a3f5adaf
      Eli Cohen committed
      This patch allows ports to have different link layers:
      IB_LINK_LAYER_INFINIBAND or IB_LINK_LAYER_ETHERNET.  This is required
      for adding IBoE (InfiniBand-over-Ethernet, aka RoCE) support.  For
      devices that do not provide an implementation for querying the link
      layer property of a port, we return a default value based on the
      transport: RDMA_TRANSPORT_IB nodes will return IB_LINK_LAYER_INFINIBAND
      and RDMA_TRANSPORT_IWARP nodes will return IB_LINK_LAYER_ETHERNET.
      Signed-off-by: Eli Cohen <eli@mellanox.co.il>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
      a3f5adaf
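The fallback rule in the commit above (ask the driver if it implements the query, otherwise derive a default from the transport) can be sketched as below. The enums, the `get_link_layer` hook, and `demo_always_ethernet()` are hypothetical names for illustration only.

```c
#include <assert.h>
#include <stddef.h>

enum demo_transport {
    DEMO_TRANSPORT_IB,
    DEMO_TRANSPORT_IWARP
};

enum demo_link_layer {
    DEMO_LINK_LAYER_UNSPECIFIED,
    DEMO_LINK_LAYER_INFINIBAND,
    DEMO_LINK_LAYER_ETHERNET
};

struct demo_device {
    enum demo_transport transport;
    /* Optional driver hook; NULL when the driver does not implement it. */
    enum demo_link_layer (*get_link_layer)(const struct demo_device *dev,
                                           int port);
};

/* Example driver hook for an IB-transport device with Ethernet ports. */
enum demo_link_layer demo_always_ethernet(const struct demo_device *dev,
                                          int port)
{
    (void)dev;
    (void)port;
    return DEMO_LINK_LAYER_ETHERNET;
}

/* Prefer the driver's answer; otherwise default from the transport type. */
enum demo_link_layer demo_port_link_layer(const struct demo_device *dev,
                                          int port)
{
    assert(dev != NULL);
    if (dev->get_link_layer != NULL)
        return dev->get_link_layer(dev, port);

    switch (dev->transport) {
    case DEMO_TRANSPORT_IB:
        return DEMO_LINK_LAYER_INFINIBAND;
    case DEMO_TRANSPORT_IWARP:
        return DEMO_LINK_LAYER_ETHERNET;
    default:
        return DEMO_LINK_LAYER_UNSPECIFIED;
    }
}
```

A device that sets the hook models the IBoE case: its transport is IB, yet the port reports an Ethernet link layer.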
  25. 05 Aug, 2010 1 commit
  26. 15 Jul, 2008 2 commits
    • R
      IB/core: Reset to error QP state transition is not allowed · e5a5e7d5
      Ralph Campbell committed
      I was reviewing the QP state transition diagram in the IB 1.2.1 spec
      and the code for qp_state_table[], and noticed that the code allows a
      QP to be modified from IB_QPS_RESET to IB_QPS_ERR whereas the notes
      for figure 124 (pg 457) specifically say that this transition isn't
      allowed.  This is a clarification from earlier versions of the IB
      spec, which were ambiguous in this area and suggested that the RESET
      to ERR transition was allowed.
      
      Fix up the qp_state_table[] to make RESET->ERR not allowed.
      Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
      e5a5e7d5
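A table-driven transition check of the kind the commit above modifies can be sketched with a reduced model. Only three states are modeled here (RESET, INIT, ERR), and the table is an illustration written for this example, not a copy of qp_state_table[]; the one property taken from the commit is that RESET to ERR is rejected.

```c
#include <assert.h>

/* Reduced state set for this sketch; the real QP state machine has more. */
enum demo_qp_state {
    DEMO_QPS_RESET,
    DEMO_QPS_INIT,
    DEMO_QPS_ERR,
    DEMO_QPS_COUNT
};

/* demo_valid[cur][next] is 1 when the transition is allowed. */
static const unsigned char demo_valid[DEMO_QPS_COUNT][DEMO_QPS_COUNT] = {
    /* RESET -> ERR is deliberately absent. */
    [DEMO_QPS_RESET] = { [DEMO_QPS_RESET] = 1, [DEMO_QPS_INIT] = 1 },
    [DEMO_QPS_INIT]  = { [DEMO_QPS_RESET] = 1, [DEMO_QPS_INIT] = 1,
                         [DEMO_QPS_ERR] = 1 },
    [DEMO_QPS_ERR]   = { [DEMO_QPS_RESET] = 1, [DEMO_QPS_ERR] = 1 },
};

/* Return 1 if moving from cur to next is allowed in this reduced model. */
int demo_transition_ok(enum demo_qp_state cur, enum demo_qp_state next)
{
    assert(cur < DEMO_QPS_COUNT && next < DEMO_QPS_COUNT);
    return demo_valid[cur][next];
}
```

With designated initializers, disallowing a transition amounts to leaving its entry out, since unlisted entries are zero-initialized.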
    • S
      RDMA/core: Add memory management extensions support · 00f7ec36
      Steve Wise committed
      This patch adds support for the IB "base memory management extension"
      (BMME) and the equivalent iWARP operations (which the iWARP verbs
      mandate all devices must implement).  The new operations are:
      
       - Allocate an ib_mr for use in fast register work requests.
      
       - Allocate/free physical buffer lists for use in fast register work
         requests.  This allows device drivers to allocate this memory as
         needed for use in posting send requests (e.g. via dma_alloc_coherent).
      
       - New send queue work requests:
         * send with remote invalidate
         * fast register memory region
         * local invalidate memory region
         * RDMA read with invalidate local memory region (iWARP only)
      
      Consumer interface details:
      
       - A new device capability flag IB_DEVICE_MEM_MGT_EXTENSIONS is added
         to indicate device support for these features.
      
       - New send work request opcodes IB_WR_FAST_REG_MR, IB_WR_LOCAL_INV,
         IB_WR_RDMA_READ_WITH_INV are added.
      
       - A new consumer API function, ib_alloc_mr() is added to allocate
         fast register memory regions.
      
       - New consumer API functions, ib_alloc_fast_reg_page_list() and
         ib_free_fast_reg_page_list() are added to allocate and free
         device-specific memory for fast registration page lists.
      
       - A new consumer API function, ib_update_fast_reg_key(), is added to
         allow the key portion of the R_Key and L_Key of a fast registration
         MR to be updated.  Consumers call this if desired before posting
         an IB_WR_FAST_REG_MR work request.
      
      Consumers can use this as follows:
      
       - MR is allocated with ib_alloc_mr().
      
       - Page list memory is allocated with ib_alloc_fast_reg_page_list().
      
       - MR R_Key/L_Key "key" field is updated with ib_update_fast_reg_key().
      
       - MR made VALID and bound to a specific page list via
         ib_post_send(IB_WR_FAST_REG_MR)
      
       - MR made INVALID via ib_post_send(IB_WR_LOCAL_INV),
         ib_post_send(IB_WR_RDMA_READ_WITH_INV) or an incoming send with
         invalidate operation.
      
       - MR is deallocated with ib_dereg_mr()
      
       - Page lists are deallocated via ib_free_fast_reg_page_list().
      
      Applications can allocate a fast register MR once, and then can
      repeatedly bind the MR to different physical block lists (PBLs) via
      posting work requests to a send queue (SQ).  For each outstanding
      MR-to-PBL binding in the SQ pipe, a fast_reg_page_list needs to be
      allocated (the fast_reg_page_list is owned by the low-level driver
      from the consumer posting a work request until the request completes).
      Thus pipelining can be achieved while still allowing device-specific
      page_list processing.
      
      The 32-bit fast register memory key/STag is composed of a 24-bit index
      and an 8-bit key.  The application can change the key each time it
      fast registers thus allowing more control over the peer's use of the
      key/STag (i.e. it can effectively be changed each time the rkey is
      rebound to a page list).
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
      00f7ec36
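The consumer sequence listed in the commit above (update the key, fast register to make the MR VALID, invalidate to make it INVALID, then rebind) can be modeled as a tiny state machine. This is a user-space sketch with hypothetical `demo_` names; it assumes the 8-bit key is the low byte of the rkey and that registering an already VALID MR is rejected, neither of which is stated in the commit text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum demo_mr_state { DEMO_MR_INVALID, DEMO_MR_VALID };

struct demo_mr {
    uint32_t rkey;              /* 24-bit index + 8-bit key (low byte) */
    enum demo_mr_state state;
    const uint64_t *page_list;  /* currently bound page list, or NULL */
};

/* Change the 8-bit key before the next fast register request. */
void demo_update_key(struct demo_mr *mr, uint8_t newkey)
{
    assert(mr != NULL && mr->state == DEMO_MR_INVALID);
    mr->rkey = (mr->rkey & 0xffffff00u) | newkey;
}

/* Model of a fast register request: INVALID -> VALID, bound to a list. */
int demo_fast_reg(struct demo_mr *mr, const uint64_t *page_list)
{
    assert(mr != NULL);
    if (mr->state != DEMO_MR_INVALID || page_list == NULL)
        return -1;
    mr->page_list = page_list;
    mr->state = DEMO_MR_VALID;
    return 0;
}

/* Model of a local invalidate request: VALID -> INVALID. */
int demo_local_inv(struct demo_mr *mr)
{
    assert(mr != NULL);
    if (mr->state != DEMO_MR_VALID)
        return -1;
    mr->page_list = NULL;
    mr->state = DEMO_MR_INVALID;
    return 0;
}
```

Changing the key between bindings means a peer that still holds the old rkey value no longer matches the MR after it is rebound.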