1. 31 Aug, 2015 4 commits
  2. 29 Aug, 2015 2 commits
    • RDMA/cma: fix IPv6 address resolution · 6c26a771
      Spencer Baugh committed
      Resolving a link-local IPv6 address with an unspecified source address
      was broken by commit 5462eddd, which prevented the IPv6 stack from
      learning the scope id of the link-local IPv6 address, causing random
      failures as the IP stack chose a random link to resolve the address on.
      
      Commit 5462eddd made us bail out of cma_check_linklocal early if
      the address passed in was not an IPv6 link-local address. On the address
      resolution path, the address passed in is the source address; if the
      source address is the unspecified address, which is not link-local, we
      will bail out early.
      
      This is mostly correct, but if the destination address is a link-local
      address, then we will be following a link-local route, and we'll need to
      tell the IPv6 stack what the scope id of the destination address is.
      This used to be done by the last line of cma_check_linklocal, which is
      skipped when bailing out early:
      
      	dev_addr->bound_dev_if = sin6->sin6_scope_id;
      
      (In cma_bind_addr, the sin6_scope_id of the source address is set to the
      sin6_scope_id of the destination address, so this is correct.)
      This line is required in turn for the following line, L279 of
      addr6_resolve, to actually inform the IPv6 stack of the scope id:
      
            fl6.flowi6_oif = addr->bound_dev_if;
      
      Since we can only know we are in this failure case when we have access
      to both the source IPv6 address and destination IPv6 address, we have to
      deal with this further up the stack. So detect this failure case in
      cma_bind_addr, and set bound_dev_if to the destination address scope id
      to correct it.
      Signed-off-by: Spencer Baugh <sbaugh@catern.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
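The failure case described above can be sketched in user-space C. This is only an illustration, not the kernel code: `pick_bound_dev_if` and `example_addr` are hypothetical names, and the return value 0 is assumed to mean "let the stack choose".

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: build an IPv6 sockaddr from its first two bytes,
 * its last byte and a scope id. All other address bytes are zero. */
static struct sockaddr_in6 example_addr(unsigned char b0, unsigned char b1,
                                        unsigned char last,
                                        unsigned int scope_id)
{
    struct sockaddr_in6 a;

    memset(&a, 0, sizeof(a));
    a.sin6_family = AF_INET6;
    a.sin6_addr.s6_addr[0] = b0;
    a.sin6_addr.s6_addr[1] = b1;
    a.sin6_addr.s6_addr[15] = last;
    a.sin6_scope_id = scope_id;
    return a;
}

/* Hypothetical helper (not the kernel function): choose the interface
 * index to bind to before resolving dst. 0 means "no binding". */
static unsigned int pick_bound_dev_if(const struct sockaddr_in6 *src,
                                      const struct sockaddr_in6 *dst)
{
    /* A link-local source already names its link via its scope id. */
    if (IN6_IS_ADDR_LINKLOCAL(&src->sin6_addr))
        return src->sin6_scope_id;

    /* The failure case from the commit message: unspecified source,
     * link-local destination. Use the destination's scope id. */
    if (IN6_IS_ADDR_UNSPECIFIED(&src->sin6_addr) &&
        IN6_IS_ADDR_LINKLOCAL(&dst->sin6_addr))
        return dst->sin6_scope_id;

    return 0;
}
```

Both addresses are needed to detect the case, which is why the check cannot live in a function that only sees the source address.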
    • IB/ucma: Fix theoretical user triggered use-after-free · 7e967fd0
      Jason Gunthorpe committed
      Something like this:
      
      CPU A                         CPU B
      ========================      ================================
      ucma_destroy_id()
       wait_for_completion()
                                    .. anything
                                      ucma_put_ctx()
                                        complete()
       .. continues ...
                                    ucma_leave_multicast()
                                     mutex_lock(mut)
                                       atomic_inc(ctx->ref)
                                     mutex_unlock(mut)
       ucma_free_ctx()
        ucma_cleanup_multicast()
         mutex_lock(mut)
           kfree(mc)
                                     rdma_leave_multicast(mc->ctx->cm_id,..
      
      Fix it by latching the ref at 0. Once it goes to 0, mc and ctx cannot
      leave the mutex(mut) protection.
      
      The other atomic_inc in ucma_get_ctx is OK because mutex(mut) protects
      it from racing with ucma_destroy_id.
      Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Acked-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
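"Latching the ref at 0" can be illustrated with C11 atomics in user space. This is a minimal sketch, not the ucma code: `struct ctx_ref`, `ref_get_unless_zero` and `ref_put` are hypothetical names. In kernel code the same pattern is available as atomic_inc_not_zero().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a context's reference count. */
struct ctx_ref {
    atomic_int count;
};

/* Take a reference only while the count is non-zero. Once the count
 * has reached 0 it stays there, so a late caller cannot revive it. */
static bool ref_get_unless_zero(struct ctx_ref *r)
{
    int old = atomic_load(&r->count);

    while (old != 0) {
        /* On failure, old is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&r->count, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference. Returns true when this call released the last one. */
static bool ref_put(struct ctx_ref *r)
{
    return atomic_fetch_sub(&r->count, 1) == 1;
}
```

A caller that gets `false` from `ref_get_unless_zero` must treat the object as already being destroyed and back off.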
  3. 15 Jul, 2015 7 commits
    • IB/core: Destroy multcast_idr on module exit · 45d25420
      Johannes Thumshirn committed
      Destroy multcast_idr on module exit, reclaiming the allocated memory.
      
      This was detected by the following semantic patch (written by Luis Rodriguez
      <mcgrof@suse.com>)
      <SmPL>
      @ defines_module_init @
      declarer name module_init, module_exit;
      declarer name DEFINE_IDR;
      identifier init;
      @@
      
      module_init(init);
      
      @ defines_module_exit @
      identifier exit;
      @@
      
      module_exit(exit);
      
      @ declares_idr depends on defines_module_init && defines_module_exit @
      identifier idr;
      @@
      
      DEFINE_IDR(idr);
      
      @ on_exit_calls_destroy depends on declares_idr && defines_module_exit @
      identifier declares_idr.idr, defines_module_exit.exit;
      @@
      
      exit(void)
      {
       ...
       idr_destroy(&idr);
       ...
      }
      
      @ missing_module_idr_destroy depends on declares_idr && defines_module_exit && !on_exit_calls_destroy @
      identifier declares_idr.idr, defines_module_exit.exit;
      @@
      
      exit(void)
      {
       ...
       +idr_destroy(&idr);
      }
      
      </SmPL>
      Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/ucm: Fix bitmap wrap when devnum > IB_UCM_MAX_DEVICES · 59d40dd9
      Carol L Soto committed
      ib_ucm_release_dev clears the wrong bit if devnum is greater
      than IB_UCM_MAX_DEVICES.
      Signed-off-by: Carol L Soto <clsoto@linux.vnet.ibm.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
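One way such a wrong-bit bug can arise is when device numbers beyond a fixed limit are tracked in a second bitmap. The sketch below assumes that layout. It is not taken from the commit: `EX_MAX_DEVICES`, `dev_map`, `overflow_map` and both functions are hypothetical.

```c
#include <stdint.h>

#define EX_MAX_DEVICES 32u      /* hypothetical limit */

static uint32_t dev_map;        /* device numbers 0..31 */
static uint32_t overflow_map;   /* device numbers 32..63 */

/* Allocate the lowest free device number, or -1 if none is left. */
static int alloc_devnum(void)
{
    for (unsigned int i = 0; i < EX_MAX_DEVICES; i++) {
        if (!(dev_map & (1u << i))) {
            dev_map |= 1u << i;
            return (int)i;
        }
    }
    for (unsigned int i = 0; i < EX_MAX_DEVICES; i++) {
        if (!(overflow_map & (1u << i))) {
            overflow_map |= 1u << i;
            return (int)(EX_MAX_DEVICES + i);
        }
    }
    return -1;
}

/* Release a device number. The bit index must be rebased AND cleared
 * in the bitmap that actually holds it. */
static void release_devnum(unsigned int devnum)
{
    if (devnum < EX_MAX_DEVICES)
        dev_map &= ~(1u << devnum);
    else if (devnum - EX_MAX_DEVICES < EX_MAX_DEVICES)
        overflow_map &= ~(1u << (devnum - EX_MAX_DEVICES));
}
```

Clearing the rebased index in `dev_map` instead of `overflow_map` would free a number that is still in use, which is the kind of mistake the commit title describes.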
    • IB/ucma: Fix lockdep warning in ucma_lock_files · 31b57b87
      Haggai Eran committed
      ucma_lock_files() locks the mut mutex on two files, e.g. when migrating
      an ID. Use mutex_lock_nested() to prevent the warning below.
      
       =============================================
       [ INFO: possible recursive locking detected ]
       4.1.0-rc6-hmm+ #40 Tainted: G           O
       ---------------------------------------------
       pingpong_rpc_se/10260 is trying to acquire lock:
        (&file->mut){+.+.+.}, at: [<ffffffffa047ac55>] ucma_migrate_id+0xc5/0x248 [rdma_ucm]
      
       but task is already holding lock:
        (&file->mut){+.+.+.}, at: [<ffffffffa047ac4b>] ucma_migrate_id+0xbb/0x248 [rdma_ucm]
      
       other info that might help us debug this:
        Possible unsafe locking scenario:
      
              CPU0
              ----
         lock(&file->mut);
         lock(&file->mut);
      
        *** DEADLOCK ***
      
        May be due to missing lock nesting notation
      
       1 lock held by pingpong_rpc_se/10260:
        #0:  (&file->mut){+.+.+.}, at: [<ffffffffa047ac4b>] ucma_migrate_id+0xbb/0x248 [rdma_ucm]
      
       stack backtrace:
       CPU: 0 PID: 10260 Comm: pingpong_rpc_se Tainted: G           O    4.1.0-rc6-hmm+ #40
       Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
        ffff8801f85b63d0 ffff880195677b58 ffffffff81668f49 0000000000000001
        ffffffff825cbbe0 ffff880195677c38 ffffffff810bb991 ffff880100000000
        ffff880100000000 ffff880100000001 ffff8801f85b7010 ffffffff8121bee9
       Call Trace:
        [<ffffffff81668f49>] dump_stack+0x4f/0x6e
        [<ffffffff810bb991>] __lock_acquire+0x741/0x1820
        [<ffffffff8121bee9>] ? dput+0x29/0x320
        [<ffffffff810bcb38>] lock_acquire+0xc8/0x240
        [<ffffffffa047ac55>] ? ucma_migrate_id+0xc5/0x248 [rdma_ucm]
        [<ffffffff8166b901>] ? mutex_lock_nested+0x291/0x3e0
        [<ffffffff8166b6d5>] mutex_lock_nested+0x65/0x3e0
        [<ffffffffa047ac55>] ? ucma_migrate_id+0xc5/0x248 [rdma_ucm]
        [<ffffffff810baeed>] ? trace_hardirqs_on+0xd/0x10
        [<ffffffff8166b66e>] ? mutex_unlock+0xe/0x10
        [<ffffffffa047ac55>] ucma_migrate_id+0xc5/0x248 [rdma_ucm]
        [<ffffffffa0478474>] ucma_write+0xa4/0xb0 [rdma_ucm]
        [<ffffffff81200674>] __vfs_write+0x34/0x100
        [<ffffffff8112427c>] ? __audit_syscall_entry+0xac/0x110
        [<ffffffff810ec055>] ? current_kernel_time+0xc5/0xe0
        [<ffffffff812aa4d3>] ? security_file_permission+0x23/0x90
        [<ffffffff8120088d>] ? rw_verify_area+0x5d/0xe0
        [<ffffffff812009bb>] vfs_write+0xab/0x120
        [<ffffffff81201519>] SyS_write+0x59/0xd0
        [<ffffffff8112427c>] ? __audit_syscall_entry+0xac/0x110
        [<ffffffff8166ffee>] system_call_fastpath+0x12/0x76
      Signed-off-by: Haggai Eran <haggaie@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • RDMA/core: Fixes for port mapper client registration · a7f2f24c
      Tatyana Nikolova committed
      Fixes to allow clients to make remove mapping requests after the
      user space service is restarted, once they have provided the service
      with the mapping information they are using.
      
      1) Adding IWPM_REG_VALID, IWPM_REG_INCOMPL and IWPM_REG_UNDEF
         registration types for the port mapper clients and functions
         to set/check the registration type.
      2) If the port mapper user space service is not available to register
         the client, then its registration stays IWPM_REG_UNDEF and the
         registration isn't checked until the service becomes available
         (no mappings are possible, if the user space service isn't running).
      3) After the service is restarted, the user space port mapper pid is set
         to valid and the client registration is set to IWPM_REG_INCOMPL
         to allow the client to make remove mapping requests.
      Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
      Reviewed-by: Steve Wise <swise@opengridcomputing.com>
      Tested-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/cm: Do not queue work to a device that's going away · be4b4993
      Erez Shitrit committed
      Whenever ib_cm gets a remove_one call, such as on a hot-unplug
      event, the driver should mark itself as going_down and ensure that no
      new work items are queued for that device.
      So the order of actions is:
      1. mark the going_down bit.
      2. flush the wq.
      3. [make sure no new work items are queued for that device.]
      4. unregister the mad agent.
      
      Otherwise, work items that are already queued can be scheduled after
      the mad agent was freed.
      Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
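The ordering in the steps above can be sketched as a single-threaded model. Everything here is hypothetical (`struct ex_dev` and its functions are not the ib_cm code), and a real implementation would also need a lock around the flag check and the queueing.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device state. */
struct ex_dev {
    bool going_down;        /* set first during removal */
    size_t queued;          /* number of pending work items */
    bool agent_registered;  /* stands in for the MAD agent */
};

/* Queue a work item unless the device is being removed. */
static bool ex_queue_work(struct ex_dev *d)
{
    if (d->going_down)
        return false;       /* step 3: refuse new work */
    d->queued++;
    return true;
}

/* Run (here: simply drop) everything that is already queued. */
static void ex_flush(struct ex_dev *d)
{
    d->queued = 0;
}

/* Removal in the order given by the commit message. */
static void ex_remove_one(struct ex_dev *d)
{
    d->going_down = true;           /* 1. mark the going_down bit */
    ex_flush(d);                    /* 2. flush the work queue */
    d->agent_registered = false;    /* 4. unregister the agent last */
}
```

Because the flag is set before the flush, nothing can be queued between the flush and the unregistration.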
    • IB/mad: Fix compare between big endian and cpu endian · cd4cd565
      Ira Weiny committed
      The define OPA_LID_PERMISSIVE is big endian and was compared to the
      cpu endian variable opa_drslid.
      
      Problem caught by 0-day build infrastructure.
      
      Fixes: 8e4349d1 (IB/mad: Add final OPA MAD processing)
      Signed-off-by: Ira Weiny <ira.weiny@intel.com>
      Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Reviewed-by: John, Jubin <jubin.john@intel.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
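A user-space sketch of this class of bug, using htonl()/ntohl() in place of the kernel's byte-order helpers. `EXAMPLE_LID_BE` is a hypothetical constant chosen so that the two byte orders differ. It is not the value of OPA_LID_PERMISSIVE.

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical wire-format constant, stored big endian. */
#define EXAMPLE_LID_BE htonl(0x00000102u)

/* Correct: bring the constant into CPU byte order before comparing
 * it with a CPU-endian variable. */
static bool lid_matches(uint32_t lid_cpu)
{
    return lid_cpu == ntohl(EXAMPLE_LID_BE);
}

/* Buggy: compares a CPU-endian value with a big-endian constant.
 * This only works by accident on big-endian machines. */
static bool lid_matches_buggy(uint32_t lid_cpu)
{
    return lid_cpu == EXAMPLE_LID_BE;
}
```

In the kernel, annotated types such as __be32 let static checkers flag the buggy form at build time.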
    • IB: Add rdma_cap_ib_switch helper and use where appropriate · 4139032b
      Hal Rosenstock committed
      Pursuant to Liran's comments on node_type on the linux-rdma
      mailing list:
      
      In an effort to reform the RDMA core and ULPs to minimize use of
      node_type in struct ib_device, an additional bit is added to
      struct ib_device for is_switch (IB switch). This bit must
      be initialized by any IB switch device driver. This is a
      NEW requirement on such device drivers, which are all
      "out of tree".
      
      In addition, an rdma_cap_ib_switch helper was added to ib_verbs.h
      based on the is_switch device bit rather than node_type
      (although those should be consistent).
      
      The RDMA core (MAD, SMI, agent, sa_query, multicast, sysfs)
      as well as the IPoIB and SRP ULPs are updated where
      appropriate to use this new helper. In some cases,
      the helper is now used under the covers via
      rdma_start_port and rdma_end_port rather than the open coding
      previously used.
      Reviewed-by: Sean Hefty <sean.hefty@intel.com>
      Reviewed-By: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Reviewed-by: Ira Weiny <ira.weiny@intel.com>
      Tested-by: Ira Weiny <ira.weiny@intel.com>
      Signed-off-by: Hal Rosenstock <hal@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
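A minimal sketch of a switch helper driven by a device bit, and of port-range helpers built on top of it. The `example_` names and the struct are stand-ins, not the definitions in ib_verbs.h. The port numbering is an assumption: a switch is addressed through port 0 only, other devices through ports 1..phys_port_cnt.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant part of struct ib_device. */
struct example_device {
    uint8_t phys_port_cnt;
    unsigned int is_switch : 1;   /* set by the switch device driver */
};

/* Capability helper keyed on the device bit, not on node_type. */
static bool example_cap_ib_switch(const struct example_device *dev)
{
    return dev->is_switch;
}

/* First valid port number for this device. */
static uint8_t example_start_port(const struct example_device *dev)
{
    return example_cap_ib_switch(dev) ? 0 : 1;
}

/* Last valid port number for this device. */
static uint8_t example_end_port(const struct example_device *dev)
{
    return example_cap_ib_switch(dev) ? 0 : dev->phys_port_cnt;
}
```

Callers can then loop from the start port to the end port without open-coding a node type check.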
  4. 13 Jun, 2015 18 commits
  5. 11 Jun, 2015 1 commit
  6. 02 Jun, 2015 2 commits
  7. 21 May, 2015 6 commits
    • IB/cma: Fix broken AF_IB UD support · c07678bb
      Matthew Finlay committed
      Support for using UD and AF_IB is currently broken.  The
      IB_CM_SIDR_REQ_RECEIVED message is not handled properly in
      cma_save_net_info() and we end up falling into code that will try and
      process the request as ipv4/ipv6, which will end up failing.
      
      The resolution is to add a check for the SIDR_REQ and call
      cma_save_ib_info() with a NULL path record.  Change cma_save_ib_info()
      to copy the src sib info from the listen_id when the path record is NULL.
      Reported-by: Hari Shankar <Hari.Shankar@netapp.com>
      Signed-off-by: Matt Finlay <matt@mellanox.com>
      Acked-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/core: Change rdma_protocol_iboe to roce · 5d9fb044
      Ira Weiny committed
      After discussion upstream, it was agreed to transition the usage of iboe
      in the kernel to roce.  This keeps our terminology consistent with what
      was finalized in the IBTA Annex 16 and IBTA Annex 17 publications.
      Signed-off-by: Ira Weiny <ira.weiny@intel.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • ib/cm: Change reject message type when destroying cm_id · c29ed5a4
      Ted Kim committed
      Problem reported by: Ted Kim <ted.h.kim@oracle.com>:
      
      We have a case where a Linux system and a non-Linux system are
      trying to interoperate.  The Linux host is the active side and
      starts the connection establishment, but later decides to not go
      through with the connection setup and does rdma_destroy_id().
      
      The rdma_destroy_id() eventually works its way down to cm_destroy_id()
      in core/cm.c, where a REJ is sent. The non-Linux system
      has some trouble recognizing the REJ because of:
      
      A. CM states which can't receive the REJ
      B. Some issues about REJ formatting (missing comm ID)
      
      ISSUE A: That part of the spec says, a Consumer Reject REJ can be
      sent for a connection abort, but it goes further
      and says: can send a REJ message with a "Consumer Reject"
      Reason code if they are in a CM state (i.e. REP
      Rcvd, MRA(REP) Sent, REQ Rcvd, MRA Sent) that allows
      a REJ to be sent (lines 35-38).
      
      Of the states listed there in that sentence, it would
      seem to limit the active side to using the Consumer Reject
      (for the abort case) in just the REP-Rcvd and MRA-REP-Sent
      states. That is basically only after the active side
      sees a REP (or alternatively goes down the state transitions
      to timeout in which case a Timeout REJ is sent).
      
      As a fix, in cm_destroy_id() move the IB_CM_MRA_REQ_RCVD case
      to be handled the same as IB_CM_REQ_SENT.  Essentially, make a REJ
      sent after getting an MRA on the active side a Timeout REJ rather
      than a Consumer Reject, which is arguably more consistent with the
      CM state diagrams prior to getting a REP.
      Signed-off-by: Ted Kim <ted.h.kim@oracle.com>
      Signed-off-by: Sean Hefty <sean.hefty@intel.com>
    • IB/core: Convert core to use bitfield for caps · f9b22e35
      Ira Weiny committed
      Remove query_protocol callback
      
      Use the new Core Capability bits for:
      
      rdma_protocol_*
      rdma_cap_ib_mad
      rdma_cap_ib_smi
      rdma_cap_ib_cm
      rdma_cap_iw_cm
      rdma_cap_ib_sa
      rdma_cap_ib_mcast
      rdma_cap_af_ib
      rdma_cap_eth_ah
      Signed-off-by: Ira Weiny <ira.weiny@intel.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/core: Add per port immutable struct to ib_device · 7738613e
      Ira Weiny committed
      As of commit 5eb620c8 ("IB/core: Add helpers for uncached GID and P_Key
      searches"), pkey_tbl_len and gid_tbl_len are immutable data which are stored in
      the ib_device.
      
      The per port core capability flags to be added later are also immutable data to
      be stored in the ib_device object.
      
      In preparation for this create a structure for per port immutable data and
      place the pkey and gid table lengths within this structure.
      
      "get_port_immutable" is added as a mandatory device function to allow the
      drivers to fill in this data.
      Signed-off-by: Ira Weiny <ira.weiny@intel.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
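The idea of a per-port immutable structure filled in by a mandatory driver callback can be sketched as follows. The struct layout, the `EX_CAP_*` flag values and all `example_` functions are assumptions for illustration, not the kernel definitions.

```c
#include <stdint.h>

#define EX_CAP_IB_MAD 0x1u   /* hypothetical capability bits */
#define EX_CAP_IB_CM  0x2u

/* Hypothetical per-port data that never changes after registration. */
struct example_port_immutable {
    int pkey_tbl_len;
    int gid_tbl_len;
    uint32_t core_cap_flags;
};

/* Shape of the callback a driver would have to provide. */
typedef int (*example_get_port_immutable_fn)(uint8_t port_num,
                                             struct example_port_immutable *out);

/* Example driver callback for a two-port device (ports 1 and 2). */
static int example_driver_get_port_immutable(uint8_t port_num,
                                             struct example_port_immutable *out)
{
    if (port_num < 1 || port_num > 2)
        return -1;
    out->pkey_tbl_len = 128;
    out->gid_tbl_len = 16;
    out->core_cap_flags = EX_CAP_IB_MAD | EX_CAP_IB_CM;
    return 0;
}

/* Core side: query every port once and cache the result, indexed by
 * port number. The cache must have at least end + 1 entries. */
static int example_cache_ports(example_get_port_immutable_fn fn,
                               uint8_t start, uint8_t end,
                               struct example_port_immutable *cache)
{
    for (unsigned int port = start; port <= end; port++) {
        if (fn((uint8_t)port, &cache[port]) != 0)
            return -1;
    }
    return 0;
}
```

After caching, capability checks reduce to a bit test on `core_cap_flags` instead of a call into the driver.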
    • IB/user_mad: Fix buggy usage of port index · 26c45428
      Ira Weiny committed
      The use of rdma_cap_ib_mad in ib_umad_remove_one is technically broken
      because the loop "i" value is an index, not a port value.
      
      This bug resulted in the ib_umad failing to properly remove its resources when
      the core capability functions were converted to bit fields.
      
      NOTE: e17371d73908 did not result in broken behavior on its own.  It was only
      an issue when the implementation of rdma_cap_ib_mad was changed.
      
      Pass the port value to rdma_cap_ib_mad.
      
      Fixes: e17371d73908 ("IB/Verbs: Use management helper rdma_cap_ib_mad()")
      Signed-off-by: Ira Weiny <ira.weiny@intel.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
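The difference between a loop index and a port number can be shown with a small model. All names here are hypothetical. The sketch assumes capabilities are stored per port number and that the device's first port is not 0.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_CAP_IB_MAD 0x1u   /* hypothetical capability bit */
#define EX_MAX_PORTS  4

/* Hypothetical device: capabilities are indexed by port NUMBER. */
struct example_device {
    uint8_t start_port;
    uint8_t end_port;
    uint32_t port_caps[EX_MAX_PORTS + 1];
};

static bool example_cap_ib_mad(const struct example_device *dev,
                               uint8_t port_num)
{
    return (dev->port_caps[port_num] & EX_CAP_IB_MAD) != 0;
}

/* Correct: convert the zero-based index into a port number. */
static unsigned int count_mad_ports(const struct example_device *dev)
{
    unsigned int n = 0;

    for (unsigned int i = 0; i <= (unsigned int)(dev->end_port - dev->start_port); i++) {
        if (example_cap_ib_mad(dev, (uint8_t)(i + dev->start_port)))
            n++;
    }
    return n;
}

/* Buggy: passes the index itself, so port 0 is checked and the last
 * port is skipped whenever start_port is 1. */
static unsigned int count_mad_ports_buggy(const struct example_device *dev)
{
    unsigned int n = 0;

    for (unsigned int i = 0; i <= (unsigned int)(dev->end_port - dev->start_port); i++) {
        if (example_cap_ib_mad(dev, (uint8_t)i))
            n++;
    }
    return n;
}
```

With ports 1 and 2 both MAD-capable, the correct loop finds two ports while the buggy one finds only one.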