1. 16 Dec, 2014 5 commits
  2. 14 Oct, 2014 2 commits
  3. 09 Oct, 2014 1 commit
  4. 23 Sep, 2014 1 commit
  5. 20 Sep, 2014 1 commit
    • IB: ib_umem_release() should decrement mm->pinned_vm from ib_umem_get · 87773dd5
      Committed by Shawn Bohrer
      In debugging an application that receives -ENOMEM from ib_reg_mr(), I
      found that ib_umem_get() can fail because the pinned_vm count has
      wrapped causing it to always be larger than the lock limit even with
      RLIMIT_MEMLOCK set to RLIM_INFINITY.
      
      The wrapping of pinned_vm occurs because the process that calls
      ib_reg_mr() will have its mm->pinned_vm count incremented.  Later a
      different process with a different mm_struct than the one that
      allocated the ib_umem struct ends up releasing it which results in
      decrementing the new process's mm->pinned_vm count past zero and
      wrapping.
      
      I'm not entirely sure what circumstances cause a different process to
      release the ib_umem than the one that allocated it but the kernel
      stack trace of the freeing process from my situation looks like the
      following:
      
          Call Trace:
           [<ffffffff814d64b1>] dump_stack+0x19/0x1b
           [<ffffffffa0b522a5>] ib_umem_release+0x1f5/0x200 [ib_core]
           [<ffffffffa0b90681>] mlx4_ib_destroy_qp+0x241/0x440 [mlx4_ib]
           [<ffffffffa0b4d93c>] ib_destroy_qp+0x12c/0x170 [ib_core]
           [<ffffffffa0cc7129>] ib_uverbs_close+0x259/0x4e0 [ib_uverbs]
           [<ffffffff81141cba>] __fput+0xba/0x240
           [<ffffffff81141e4e>] ____fput+0xe/0x10
           [<ffffffff81060894>] task_work_run+0xc4/0xe0
           [<ffffffff810029e5>] do_notify_resume+0x95/0xa0
           [<ffffffff814e3dd0>] int_signal+0x12/0x17
      
      The following patch fixes the issue by storing the pid struct of the
      process that calls ib_umem_get() so that ib_umem_release and/or
      ib_umem_account() can properly decrement the pinned_vm count of the
      correct mm_struct.
      Signed-off-by: Shawn Bohrer <sbohrer@rgmadvisors.com>
      Reviewed-by: Shachar Raindel <raindel@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      87773dd5
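
The wrap described in this commit can be reproduced in a few lines of userspace C. This is a minimal sketch, not the kernel code: `mm_sketch`, `pin_pages`, `unpin_pages`, and `pin_would_exceed` are hypothetical names, and the limit check assumes the byte limit is converted to pages with a 12-bit shift (4 KiB pages).

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define PAGE_SHIFT_SKETCH 12   /* assumption: 4 KiB pages */

/* Hypothetical stand-in for the one mm_struct field the commit discusses. */
struct mm_sketch {
    unsigned long pinned_vm;   /* pages pinned on behalf of this address space */
};

static void pin_pages(struct mm_sketch *mm, unsigned long npages)
{
    mm->pinned_vm += npages;
}

/* No underflow guard on purpose: this is the decrement that wraps. */
static void unpin_pages(struct mm_sketch *mm, unsigned long npages)
{
    mm->pinned_vm -= npages;
}

/* Shape of the limit check: would pinning npages more exceed the limit? */
static bool pin_would_exceed(const struct mm_sketch *mm, unsigned long npages,
                             unsigned long memlock_bytes)
{
    unsigned long lock_limit = memlock_bytes >> PAGE_SHIFT_SKETCH;

    return mm->pinned_vm + npages > lock_limit;
}

/* Returns true if releasing against the wrong mm makes later pins fail. */
static bool wrong_mm_release_breaks_pinning(void)
{
    struct mm_sketch owner = { 0 }, other = { 0 };

    pin_pages(&owner, 16);     /* process A registers memory */
    unpin_pages(&other, 16);   /* process B releases it: 0 - 16 wraps */
    assert(other.pinned_vm == ULONG_MAX - 15);

    /* ULONG_MAX stands in for an unlimited RLIMIT_MEMLOCK here. */
    return pin_would_exceed(&other, 1, ULONG_MAX);
}
```

Because the wrapped counter is far larger than any page-granular limit, even a one-page request is rejected for the second process, which matches the -ENOMEM symptom reported above. Remembering which process did the pinning, as the patch does with the pid struct, lets the decrement go to the counter that was incremented.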
  6. 11 Aug, 2014 5 commits
  7. 05 Aug, 2014 1 commit
  8. 02 Aug, 2014 1 commit
  9. 11 Jun, 2014 1 commit
    • RDMA/core: Add support for iWARP Port Mapper user space service · 30dc5e63
      Committed by Tatyana Nikolova
      This patch adds iWARP Port Mapper (IWPM) Version 2 support.  The iWARP
      Port Mapper implementation is based on the port mapper specification
      section in the Sockets Direct Protocol paper -
      http://www.rdmaconsortium.org/home/draft-pinkerton-iwarp-sdp-v1.0.pdf
      
      Existing iWARP RDMA providers use the same IP address as the native
      TCP/IP stack when creating RDMA connections.  They need a mechanism to
      claim the TCP ports used for RDMA connections to prevent TCP port
      collisions when other host applications use TCP ports.  The iWARP Port
      Mapper provides a standard mechanism to accomplish this.  Without this
      service, an RDMA application can bind to or listen on a port that is
      already in use by a native TCP host application.  If that happens,
      incoming TCP connection data can be passed to the RDMA stack in
      error.
      
      The iWARP Port Mapper solution doesn't contain any changes to the
      existing network stack in the kernel space.  All the changes are
      contained within the infiniband tree and also in user space.
      
      The iWARP Port Mapper service is implemented as a user space daemon
      process.  Source for the IWPM service is located at
      http://git.openfabrics.org/git?p=~tnikolova/libiwpm-1.0.0/.git;a=summary
      
      The iWARP driver (port mapper client) sends to the IWPM service the
      local IP address and TCP port it has received from the RDMA
      application, when starting a connection.  The IWPM service performs a
      socket bind from user space to get an available TCP port, called a
      mapped port, and communicates it back to the client.  In that sense,
      the IWPM service maps the TCP port that the RDMA application uses to
      any port available from the host TCP port space.  The mapped ports
      are used in iWARP RDMA connections to avoid collisions with the native
      TCP stack, which is aware that these ports are taken.  When an RDMA
      connection using a mapped port is terminated, the
      client notifies the IWPM service, which then releases the TCP port.
      
      The message exchange between the IWPM service and the iWARP drivers
      (between user space and kernel space) is implemented using netlink
      sockets.
      
      1) Netlink interface functions are added: ibnl_unicast() and
         ibnl_multicast() for sending netlink messages to user space
      
      2) The signature of the existing ibnl_put_msg() is changed to be more
         generic
      
      3) Two netlink clients are added: RDMA_NL_NES, RDMA_NL_C4IW
         corresponding to the two iWARP drivers - nes and cxgb4 which use
         the IWPM service
      
      4) Enums are added to enumerate the attributes in the netlink
         messages, which are exchanged between the user space IWPM service
         and the iWARP drivers
      Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Reviewed-by: PJ Waskiewicz <pj.waskiewicz@solidfire.com>
      
      [ Fold in range checking fixes and nlh_next removal as suggested by Dan
        Carpenter and Steve Wise.  Fix sparse endianness in hash.  - Roland ]
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      30dc5e63
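
The "socket bind from user space to get an available TCP port" step described in this commit can be sketched with the standard sockets API. This is only an illustration of the idea, not the IWPM daemon's code: `claim_mapped_port` and `release_mapped_port` are hypothetical helpers, and binding to the loopback address is an assumption made to keep the example local.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: bind a TCP socket to port 0 so the host TCP stack
 * picks a free port, then read the chosen port back. The port stays
 * reserved for as long as the returned descriptor is open.
 * Returns the descriptor, or -1 on failure.
 */
static int claim_mapped_port(unsigned short *mapped_port)
{
    struct sockaddr_in addr = { 0 };
    socklen_t len = sizeof(addr);
    int fd;

    assert(mapped_port != NULL);

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);  /* assumption: local only */
    addr.sin_port = htons(0);                       /* 0 = let the stack choose */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }

    *mapped_port = ntohs(addr.sin_port);
    return fd;
}

/* Hypothetical helper: give the port back when the connection is torn down. */
static void release_mapped_port(int fd)
{
    close(fd);
}
```

Holding the bound socket open is what tells the native TCP stack the port is taken; closing it corresponds to the release step performed when the RDMA connection terminates.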
  10. 07 Jun, 2014 1 commit
    • IB/umad: Fix use-after-free on close · 60e1751c
      Committed by Bart Van Assche
      Prevent closing /dev/infiniband/umad<n> or /dev/infiniband/issm<n>
      from triggering a use-after-free.  __fput() invokes f_op->release()
      before it invokes cdev_put().  Make sure that the ib_umad_device
      structure is freed by the cdev_put() call instead of by
      f_op->release().  This prevents the following kernel oops, which was
      triggered by changing the port mode from IB to Ethernet and back to
      IB and then restarting opensmd:
      
          general protection fault: 0000 [#1] PREEMPT SMP
          RIP: 0010:[<ffffffff810cc65c>]  [<ffffffff810cc65c>] module_put+0x2c/0x170
          Call Trace:
           [<ffffffff81190f20>] cdev_put+0x20/0x30
           [<ffffffff8118e2ce>] __fput+0x1ae/0x1f0
           [<ffffffff8118e35e>] ____fput+0xe/0x10
           [<ffffffff810723bc>] task_work_run+0xac/0xe0
           [<ffffffff81002a9f>] do_notify_resume+0x9f/0xc0
           [<ffffffff814b8398>] int_signal+0x12/0x17
      
      Reference: https://bugzilla.kernel.org/show_bug.cgi?id=75051
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Reviewed-by: Yann Droneaud <ydroneaud@opteya.com>
      Cc: <stable@vger.kernel.org> # 3.x: 8ec0a0e6: IB/umad: Fix error handling
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      60e1751c
  11. 06 Jun, 2014 1 commit
    • IB/core: Fix kobject leak on device register error flow · 584482ac
      Committed by Haggai Eran
      The ports kobject isn't being released during error flow in device
      registration.  This patch refactors the ports kobject cleanup into a
      single function called from both the error flow in device registration
      and from the unregistration function.
      
      A couple of attributes aren't being deleted (iw_stats_group, and
      ib_class_attributes).  While this may be handled implicitly by the
      destruction of their kobjects, it seems better to handle all the
      attributes the same way.
      Signed-off-by: Haggai Eran <haggaie@mellanox.com>
      
      [ Make free_port_list_attributes() static.  - Roland ]
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      584482ac
  12. 05 Jun, 2014 3 commits
    • IB/core: Fix port kobject deletion during error flow · cad6d02a
      Committed by Haggai Eran
      When encountering an error during the add_port function, adding a port
      to sysfs, the port kobject is freed without being deleted from sysfs.
      
      Instead of freeing it directly, the patch uses kobject_put to release
      the kobject and delete it.
      Signed-off-by: Haggai Eran <haggaie@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      cad6d02a
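
The difference between freeing an object directly and dropping a reference to it can be shown with a small userspace model. This is a sketch under stated assumptions, not the kobject implementation: `obj_sketch`, `obj_create`, and `obj_put` are hypothetical, and the `registered` counter merely stands in for entries visible in sysfs.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted object; `registered` models visible sysfs entries. */
struct obj_sketch {
    int refcount;
    int *registered;
    void (*release)(struct obj_sketch *obj);
};

/* Release callback: unregister first, then free the memory. */
static void obj_release(struct obj_sketch *obj)
{
    (*obj->registered)--;
    free(obj);
}

static struct obj_sketch *obj_create(int *registered)
{
    struct obj_sketch *obj = malloc(sizeof(*obj));

    if (!obj)
        return NULL;
    obj->refcount = 1;
    obj->registered = registered;
    obj->release = obj_release;
    (*registered)++;            /* the object is now visible to others */
    return obj;
}

/* Drop one reference; the last put runs the release callback. */
static void obj_put(struct obj_sketch *obj)
{
    assert(obj->refcount > 0);
    if (--obj->refcount == 0)
        obj->release(obj);
}
```

Calling `free()` on the object directly would skip `obj_release()` and leave the `registered` count stale, which is the userspace analogue of a kobject that was freed while still present in sysfs. Routing the error path through the put operation keeps unregistering and freeing in one place.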
    • IB/core: Remove unneeded kobject_get/put calls · 373c0ea1
      Committed by Haggai Eran
      The ib_core module will call kobject_get on the parent object of each
      kobject it creates.  This is redundant since kobject_add does that
      anyway.
      
      As a side effect, this patch should fix leaking the ports kobject and
      the device kobject during unregister flow, since the previous code
      didn't seem to take into account the kobject_get calls on behalf of
      the child kobjects.
      Signed-off-by: Haggai Eran <haggaie@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      373c0ea1
    • IB/core: Fix sparse warnings about redeclared functions · 8385fd84
      Committed by Roland Dreier
      Fix a few functions that are declared with __attribute_const__ in the
      ib_verbs.h header file but defined without it in verbs.c.  This gets rid
      of the following sparse warnings:
      
          drivers/infiniband/core/verbs.c:51:5: error: symbol 'ib_rate_to_mult' redeclared with different type (originally declared at include/rdma/ib_verbs.h:469) - different modifiers
          drivers/infiniband/core/verbs.c:68:14: error: symbol 'mult_to_ib_rate' redeclared with different type (originally declared at include/rdma/ib_verbs.h:607) - different modifiers
          drivers/infiniband/core/verbs.c:85:5: error: symbol 'ib_rate_to_mbps' redeclared with different type (originally declared at include/rdma/ib_verbs.h:476) - different modifiers
          drivers/infiniband/core/verbs.c:111:1: error: symbol 'rdma_node_get_transport' redeclared with different type (originally declared at include/rdma/ib_verbs.h:84) - different modifiers
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      8385fd84
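
The mismatch behind these warnings is a declaration that carries a `const` function attribute and a definition that does not. The sketch below shows the consistent form using the GCC/Clang spelling `__attribute__((const))`. The enum, the function names, and the rate table are hypothetical stand-ins, not the contents of ib_verbs.h or verbs.c.

```c
#include <assert.h>

/* Hypothetical rate codes used only for this illustration. */
enum rate_sketch {
    RATE_SKETCH_2_5_GBPS,
    RATE_SKETCH_5_GBPS,
    RATE_SKETCH_10_GBPS
};

/* "Header" declarations: the attribute is part of what a checker compares. */
__attribute__((const)) int rate_to_mult_sketch(enum rate_sketch rate);
__attribute__((const)) enum rate_sketch mult_to_rate_sketch(int mult);

/* Definitions repeat the attribute so declaration and definition agree. */
__attribute__((const)) int rate_to_mult_sketch(enum rate_sketch rate)
{
    switch (rate) {
    case RATE_SKETCH_2_5_GBPS: return 1;
    case RATE_SKETCH_5_GBPS:   return 2;
    case RATE_SKETCH_10_GBPS:  return 4;
    default:                   return -1;
    }
}

__attribute__((const)) enum rate_sketch mult_to_rate_sketch(int mult)
{
    switch (mult) {
    case 2:  return RATE_SKETCH_5_GBPS;
    case 4:  return RATE_SKETCH_10_GBPS;
    default: return RATE_SKETCH_2_5_GBPS;
    }
}

/* Small self-check: the two pure functions round-trip a known rate. */
static void rate_round_trip_check(void)
{
    assert(mult_to_rate_sketch(rate_to_mult_sketch(RATE_SKETCH_10_GBPS)) ==
           RATE_SKETCH_10_GBPS);
}
```

A compiler accepts the definition with or without the repeated attribute; it is a stricter checker such as sparse that flags the differing modifiers, so repeating the attribute at the definition is the low-risk fix.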
  13. 04 Jun, 2014 1 commit
    • IB/mad: Fix sparse warning about gfp_t use · 5343c00d
      Committed by Roland Dreier
      Properly convert gfp_t & result to bool to fix:
      
          drivers/infiniband/core/sa_query.c:621:33: warning: incorrect type in initializer (different base types)
          drivers/infiniband/core/sa_query.c:621:33:    expected bool [unsigned] [usertype] preload
          drivers/infiniband/core/sa_query.c:621:33:    got restricted gfp_t
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      5343c00d
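
A common way to turn a masked flag into a boolean is the `!!` idiom, which is presumably what "properly convert" refers to here. The sketch below uses a hypothetical flag type and bit (`gfp_sketch_t`, `GFP_SKETCH_WAIT`) rather than the kernel's `gfp_t`, and adds a deliberately lossy variant to show why explicit normalization is the safer habit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag type and bit; the kernel's gfp_t is a restricted type. */
typedef unsigned int gfp_sketch_t;
#define GFP_SKETCH_WAIT 0x100u   /* assumption: a flag above bit 7 */

/* Explicit normalization: any set bit becomes exactly 1. */
static bool flag_to_bool(gfp_sketch_t mask)
{
    return !!(mask & GFP_SKETCH_WAIT);
}

/* Lossy variant: narrowing to 8 bits silently drops the flag. */
static unsigned char flag_to_u8_lossy(gfp_sketch_t mask)
{
    return (unsigned char)(mask & GFP_SKETCH_WAIT);
}

/* Small self-check contrasting the two conversions. */
static void flag_conversion_check(void)
{
    assert(flag_to_bool(GFP_SKETCH_WAIT) == true);
    assert(flag_to_u8_lossy(GFP_SKETCH_WAIT) == 0);
}
```

In standard C a conversion to `bool` already yields 0 or 1, so the compiled behavior is unchanged by `!!`; the benefit is that the expression no longer has the restricted flag type, which is what sparse complains about.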
  14. 30 May, 2014 1 commit
    • IB/umad: Fix error handling · 8ec0a0e6
      Committed by Bart Van Assche
      Avoid leaking a kref count in ib_umad_open() if port->ib_dev == NULL
      or if nonseekable_open() fails.
      
      If nonseekable_open() fails in ib_umad_sm_open(), avoid leaking a
      kref count, leaving sm_sem held, and leaving the IB_PORT_SM
      capability mask uncleared.
      
      Since container_of() never returns NULL, remove the code that tests
      whether container_of() returns NULL.
      
      Moving the kref_get() call from the start of ib_umad_*open() to the
      end is safe since it is the responsibility of the caller of these
      functions to ensure that the cdev pointer remains valid until at least
      when these functions return.
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Cc: <stable@vger.kernel.org>
      
      [ydroneaud@opteya.com: rework a bit to reduce the amount of code changed]
      Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
      
      [ nonseekable_open() can't actually fail, but....  - Roland ]
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      8ec0a0e6
  15. 02 Apr, 2014 2 commits
  16. 08 Mar, 2014 2 commits
    • IB/core: Introduce signature verbs API · 1b01d335
      Committed by Sagi Grimberg
      Introduce a verbs interface for signature-related operations.  A
      signature handover operation configures the layouts of data and
      protection attributes both in memory and wire domains.
      
      Signature operations are:
      
      - INSERT:
        Generate and insert protection information when handing over
        data from input space to output space.
      - validate and STRIP:
        Validate protection information and remove it when handing over
        data from input space to output space.
      - validate and PASS:
        Validate protection information and pass it when handing over
        data from input space to output space.
      
      Once the signature handover operation is done, the HCA will offload
      data integrity generation/validation while performing the actual data
      transfer.
      
      Additions:
      
      1. HCA signature capabilities in device attributes
          Verbs provider supporting signature handover operations fills
          relevant fields in device attributes structure returned by
          ib_query_device.
      
      2. QP creation flag IB_QP_CREATE_SIGNATURE_EN
          Creating a QP that will carry signature handover operations may
          require some special preparations from the verbs provider.  So we
          add QP creation flag IB_QP_CREATE_SIGNATURE_EN to declare that the
          created QP may carry out signature handover operations.  Expose
          signature support to verbs layer (no support for now).
      
      3. New send work request IB_WR_REG_SIG_MR
          Signature handover work request. This WR will define the signature
          handover properties of the memory/wire domains as well as the
          domains layout. The purpose of this work request is to bind all
          the needed information for the signature operation:
      
          - data to be transferred:  wr->sg_list (ib_sge).
            * The raw data, pre-registered to a single MR (normally, before
              signature, this MR would have been used directly for the data
              transfer)
          - data protection guards: sig_handover.prot (ib_sge).
            * The data protection buffer, pre-registered to a single MR, which
              contains the data integrity guards of the raw data blocks.
              Note that it may not always exist, only in cases where the user is
              interested in storing protection guards in memory.
          - signature operation attributes: sig_handover.sig_attrs.
            * Tells the HCA how to validate/generate the protection information.
      
          Once the work request is executed, the memory region that will
          describe the signature transaction will be the sig_mr.  The
          application can now go ahead and send the sig_mr.rkey or use the
          sig_mr.lkey for data transfer.
      
      4. New Verb ib_check_mr_status
          check_mr_status verb checks the status of the memory region post
          transaction.  The first check that may be used is
          IB_MR_CHECK_SIG_STATUS, which will indicate if any signature
          errors are pending for a specific signature-enabled ib_mr.  This
          verb is a lightweight check and is allowed to be taken from
          interrupt context.  An application must call this verb after it is
          known that the actual data transfer has finished.
      Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      1b01d335
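
The three signature operations listed in this commit differ in how protection information changes the size of the data as it moves from input space to output space. The sketch below models only that size arithmetic. The enum and `sig_output_len` are hypothetical, and the block and protection sizes are caller-supplied assumptions rather than values taken from the verbs API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for the three operations described above. */
enum sig_op_sketch {
    SIG_SKETCH_INSERT,   /* generate protection info and add it */
    SIG_SKETCH_STRIP,    /* validate protection info and remove it */
    SIG_SKETCH_PASS      /* validate protection info and keep it */
};

/*
 * Length of the output for a given input length, assuming one protection
 * field of pi_size bytes per data block of block_size bytes.
 */
static size_t sig_output_len(enum sig_op_sketch op, size_t in_len,
                             size_t block_size, size_t pi_size)
{
    assert(block_size > 0);

    switch (op) {
    case SIG_SKETCH_INSERT:
        /* input holds bare data blocks; each gains a protection field */
        return in_len + (in_len / block_size) * pi_size;
    case SIG_SKETCH_STRIP:
        /* input holds data + protection; each block loses its field */
        return in_len - (in_len / (block_size + pi_size)) * pi_size;
    case SIG_SKETCH_PASS:
    default:
        /* protection is checked but the layout is unchanged */
        return in_len;
    }
}
```

For example, with 512-byte blocks and an assumed 8-byte protection field, inserting over 4096 bytes of data yields 4160 bytes, and stripping those 4160 bytes yields 4096 again.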
    • IB/core: Introduce protected memory regions · 17cd3a2d
      Committed by Sagi Grimberg
      This commit introduces verbs for creating/destroying memory
      regions which will allow new types of memory key operations such
      as protected memory registration.
      
      Indirect memory registration is registering several (one
      or more) pre-registered memory regions in a specific layout.
      The indirect region may potentially describe several regions
      and some repetition format between them.
      
      Protected Memory registration is registering a memory region
      with various data integrity attributes which will describe protection
      schemes that will be handled by the HCA in an offloaded manner.
      These memory regions will be applicable for a new REG_SIG_MR
      work request introduced later in this patchset.
      
      In the future these routines may replace or implement current memory
      regions creation routines existing today:
      - ib_reg_user_mr
      - ib_alloc_fast_reg_mr
      - ib_get_dma_mr
      - ib_dereg_mr
      Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      17cd3a2d
  17. 05 Mar, 2014 1 commit
  18. 23 Jan, 2014 2 commits
  19. 20 Jan, 2014 3 commits
  20. 19 Jan, 2014 2 commits
  21. 15 Jan, 2014 2 commits
    • net: replace macros net_random and net_srandom with direct calls to prandom · 63862b5b
      Committed by Aruna-Hewapathirane
      This patch removes the net_random and net_srandom macros and replaces
      them with direct calls to the prandom ones. As new commits only seem to
      use prandom_u32, there is no reason to keep them around.
      This change makes it easier to grep for users of prandom_u32.
      Signed-off-by: Aruna-Hewapathirane <aruna.hewapathirane@gmail.com>
      Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      63862b5b
    • IB/core: Ethernet L2 attributes in verbs/cm structures · dd5f03be
      Committed by Matan Barak
      This patch adds support for Ethernet L2 attributes in the
      verbs/cm/cma structures.
      
      When dealing with L2 Ethernet, we should use smac, dmac, vlan ID and priority
      in a manner similar to how the IB L2 (and the L4 PKEY) attributes are used.
      
      Thus, those attributes were added to the following structures:
      
      * ib_ah_attr - added dmac
      * ib_qp_attr - added smac and vlan_id, (sl remains vlan priority)
      * ib_wc - added smac, vlan_id
      * ib_sa_path_rec - added smac, dmac, vlan_id
      * cm_av - added smac and vlan_id
      
      For the path record structure, extra care was taken to avoid the new
      fields when packing it into wire format, so we don't break the IB CM
      and SA wire protocol.
      
      On the active side, the CM fills its internal structures from the
      path provided by the ULP.  This patch adds taking the ETH L2
      attributes from that path and placing them into the CM Address
      Handle (struct cm_av).
      
      On the passive side, the CM fills its internal structures from the WC
      associated with the REQ message.  This patch adds taking the ETH L2
      attributes from the WC.
      
      When a HW driver provides the required ETH L2 attributes in the WC,
      it sets the IB_WC_WITH_SMAC and IB_WC_WITH_VLAN flags.  The IB core
      code checks for the presence of these flags, and in their absence does
      address resolution from the ib_init_ah_from_wc() helper function.
      
      ib_modify_qp_is_ok is also updated to consider the link layer. Some
      parameters are mandatory for Ethernet link layer, while they are
      irrelevant for IB.  Vendor drivers are modified to support the new
      function signature.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      dd5f03be
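
The flag check described in this commit, where the core falls back to address resolution when the driver did not supply the L2 attributes, can be sketched as follows. The structure, flag values, and `wc_needs_address_resolution` are hypothetical stand-ins for illustration, not the definitions in ib_verbs.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits modeled on IB_WC_WITH_SMAC / IB_WC_WITH_VLAN. */
#define WC_SKETCH_WITH_SMAC (1u << 0)
#define WC_SKETCH_WITH_VLAN (1u << 1)

/* Hypothetical, trimmed-down work completion. */
struct wc_sketch {
    unsigned int wc_flags;
    unsigned char smac[6];
    unsigned short vlan_id;
};

/*
 * True when the completion lacks either attribute, meaning the caller
 * would have to resolve the L2 address itself instead of trusting the WC.
 */
static bool wc_needs_address_resolution(const struct wc_sketch *wc)
{
    const unsigned int need = WC_SKETCH_WITH_SMAC | WC_SKETCH_WITH_VLAN;

    assert(wc != NULL);
    return (wc->wc_flags & need) != need;
}
```

Treating the attributes as valid only when both flags are set keeps drivers that do not report them working unchanged, at the cost of an extra resolution step on that path.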
  22. 14 Jan, 2014 1 commit