1. 31 8月, 2015 28 次提交
  2. 29 8月, 2015 2 次提交
    • S
      RDMA/cma: fix IPv6 address resolution · 6c26a771
      Spencer Baugh 提交于
      Resolving a link-local IPv6 address with an unspecified source address
      was broken by commit 5462eddd, which prevented the IPv6 stack from
      learning the scope id of the link-local IPv6 address, causing random
      failures as the IP stack chose a random link to resolve the address on.
      
      This commit 5462eddd made us bail out of cma_check_linklocal early if
      the address passed in was not an IPv6 link-local address. On the address
      resolution path, the address passed in is the source address; if the
      source address is the unspecified address, which is not link-local, we
      will bail out early.
      
      This is mostly correct, but if the destination address is a link-local
      address, then we will be following a link-local route, and we'll need to
      tell the IPv6 stack what the scope id of the destination address is.
      This used to be done by last line of cma_check_linklocal, which is
      skipped when bailing out early:
      
      	dev_addr->bound_dev_if = sin6->sin6_scope_id;
      
      (In cma_bind_addr, the sin6_scope_id of the source address is set to the
      sin6_scope_id of the destination address, so this is correct)
      This line is required in turn for the following line, L279 of
      addr6_resolve, to actually inform the IPv6 stack of the scope id:
      
            fl6.flowi6_oif = addr->bound_dev_if;
      
      Since we can only know we are in this failure case when we have access
      to both the source IPv6 address and destination IPv6 address, we have to
      deal with this further up the stack. So detect this failure case in
      cma_bind_addr, and set bound_dev_if to the destination address scope id
      to correct it.
      Signed-off-by: NSpencer Baugh <sbaugh@catern.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      6c26a771
    • J
      IB/ucma: Fix theoretical user triggered use-after-free · 7e967fd0
      Jason Gunthorpe 提交于
      Something like this:
      
      CPU A                         CPU B
      Acked-by: NSean Hefty <sean.hefty@intel.com>
      
      ========================      ================================
      ucma_destroy_id()
       wait_for_completion()
                                    .. anything
                                      ucma_put_ctx()
                                        complete()
       .. continues ...
                                    ucma_leave_multicast()
                                     mutex_lock(mut)
                                       atomic_inc(ctx->ref)
                                     mutex_unlock(mut)
       ucma_free_ctx()
        ucma_cleanup_multicast()
         mutex_lock(mut)
           kfree(mc)
                                     rdma_leave_multicast(mc->ctx->cm_id,..
      
      Fix it by latching the ref at 0. Once it goes to 0 mc and ctx cannot
      leave the mutex(mut) protection.
      
      The other atomic_inc in ucma_get_ctx is OK because mutex(mut) protects
      it from racing with ucma_destroy_id.
      Signed-off-by: NJason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Acked-by: NSean Hefty <sean.hefty@intel.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      7e967fd0
  3. 15 7月, 2015 7 次提交
    • J
      IB/core: Destroy multcast_idr on module exit · 45d25420
      Johannes Thumshirn 提交于
      Destroy multcast_idr on module exit, reclaiming the allocated memory.
      
      This was detected by the following semantic patch (written by Luis Rodriguez
      <mcgrof@suse.com>)
      <SmPL>
      @ defines_module_init @
      declarer name module_init, module_exit;
      declarer name DEFINE_IDR;
      identifier init;
      @@
      
      module_init(init);
      
      @ defines_module_exit @
      identifier exit;
      @@
      
      module_exit(exit);
      
      @ declares_idr depends on defines_module_init && defines_module_exit @
      identifier idr;
      @@
      
      DEFINE_IDR(idr);
      
      @ on_exit_calls_destroy depends on declares_idr && defines_module_exit @
      identifier declares_idr.idr, defines_module_exit.exit;
      @@
      
      exit(void)
      {
       ...
       idr_destroy(&idr);
       ...
      }
      
      @ missing_module_idr_destroy depends on declares_idr && defines_module_exit && !on_exit_calls_destroy @
      identifier declares_idr.idr, defines_module_exit.exit;
      @@
      
      exit(void)
      {
       ...
       +idr_destroy(&idr);
      }
      
      </SmPL>
      Signed-off-by: NJohannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      45d25420
    • C
      IB/ucm: Fix bitmap wrap when devnum > IB_UCM_MAX_DEVICES · 59d40dd9
      Carol L Soto 提交于
      ib_ucm_release_dev clears the wrong bit if devnum is greater
      than IB_UCM_MAX_DEVICES.
      Signed-off-by: NCarol L Soto <clsoto@linux.vnet.ibm.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      59d40dd9
    • H
      IB/ucma: Fix lockdep warning in ucma_lock_files · 31b57b87
      Haggai Eran 提交于
      The ucma_lock_files() locks the mut mutex on two files, e.g. for migrating
      an ID. Use mutex_lock_nested() to prevent the warning below.
      
       =============================================
       [ INFO: possible recursive locking detected ]
       4.1.0-rc6-hmm+ #40 Tainted: G           O
       ---------------------------------------------
       pingpong_rpc_se/10260 is trying to acquire lock:
        (&file->mut){+.+.+.}, at: [<ffffffffa047ac55>] ucma_migrate_id+0xc5/0x248 [rdma_ucm]
      
       but task is already holding lock:
        (&file->mut){+.+.+.}, at: [<ffffffffa047ac4b>] ucma_migrate_id+0xbb/0x248 [rdma_ucm]
      
       other info that might help us debug this:
        Possible unsafe locking scenario:
      
              CPU0
              ----
         lock(&file->mut);
         lock(&file->mut);
      
        *** DEADLOCK ***
      
        May be due to missing lock nesting notation
      
       1 lock held by pingpong_rpc_se/10260:
        #0:  (&file->mut){+.+.+.}, at: [<ffffffffa047ac4b>] ucma_migrate_id+0xbb/0x248 [rdma_ucm]
      
       stack backtrace:
       CPU: 0 PID: 10260 Comm: pingpong_rpc_se Tainted: G           O    4.1.0-rc6-hmm+ #40
       Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
        ffff8801f85b63d0 ffff880195677b58 ffffffff81668f49 0000000000000001
        ffffffff825cbbe0 ffff880195677c38 ffffffff810bb991 ffff880100000000
        ffff880100000000 ffff880100000001 ffff8801f85b7010 ffffffff8121bee9
       Call Trace:
        [<ffffffff81668f49>] dump_stack+0x4f/0x6e
        [<ffffffff810bb991>] __lock_acquire+0x741/0x1820
        [<ffffffff8121bee9>] ? dput+0x29/0x320
        [<ffffffff810bcb38>] lock_acquire+0xc8/0x240
        [<ffffffffa047ac55>] ? ucma_migrate_id+0xc5/0x248 [rdma_ucm]
        [<ffffffff8166b901>] ? mutex_lock_nested+0x291/0x3e0
        [<ffffffff8166b6d5>] mutex_lock_nested+0x65/0x3e0
        [<ffffffffa047ac55>] ? ucma_migrate_id+0xc5/0x248 [rdma_ucm]
        [<ffffffff810baeed>] ? trace_hardirqs_on+0xd/0x10
        [<ffffffff8166b66e>] ? mutex_unlock+0xe/0x10
        [<ffffffffa047ac55>] ucma_migrate_id+0xc5/0x248 [rdma_ucm]
        [<ffffffffa0478474>] ucma_write+0xa4/0xb0 [rdma_ucm]
        [<ffffffff81200674>] __vfs_write+0x34/0x100
        [<ffffffff8112427c>] ? __audit_syscall_entry+0xac/0x110
        [<ffffffff810ec055>] ? current_kernel_time+0xc5/0xe0
        [<ffffffff812aa4d3>] ? security_file_permission+0x23/0x90
        [<ffffffff8120088d>] ? rw_verify_area+0x5d/0xe0
        [<ffffffff812009bb>] vfs_write+0xab/0x120
        [<ffffffff81201519>] SyS_write+0x59/0xd0
        [<ffffffff8112427c>] ? __audit_syscall_entry+0xac/0x110
        [<ffffffff8166ffee>] system_call_fastpath+0x12/0x76
      Signed-off-by: NHaggai Eran <haggaie@mellanox.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      31b57b87
    • T
      RDMA/core: Fixes for port mapper client registration · a7f2f24c
      Tatyana Nikolova 提交于
      Fixes to allow clients to make remove mapping requests, after
      they have provided the user space service with the mapping
      information, they are using when the service is restarted.
      
      1) Adding IWPM_REG_VALID, IWPM_REG_INCOMPL and IWPM_REG_UNDEF
         registration types for the port mapper clients and functions
         to set/check the registration type.
      2) If the port mapper user space service is not available to register
         the client, then its registration stays IWPM_REG_UNDEF and the
         registration isn't checked until the service becomes available
         (no mappings are possible, if the user space service isn't running).
      3) After the service is restarted, the user space port mapper pid is set
         to valid and the client registration is set to IWPM_REG_INCOMPL
         to allow the client to make remove mapping requests.
      Signed-off-by: NTatyana Nikolova <Tatyana.E.Nikolova@intel.com>
      Reviewed-by: NSteve Wise <swise@opengridcomputing.com>
      Tested-by: NSteve Wise <swise@opengridcomputing.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      a7f2f24c
    • E
      IB/cm: Do not queue work to a device that's going away · be4b4993
      Erez Shitrit 提交于
      Whenever ib_cm gets remove_one call, like when there is a hot-unplug
      event, the driver should mark itself as going_down and confirm that no
      new works are going to be queued for that device.
      so, the order of the actions are:
      1. mark the going_down bit.
      2. flush the wq.
      3. [make sure no new works for that device.]
      4. unregister mad agent.
      
      otherwise, works that are already queued can be scheduled after the mad
      agent was freed.
      Signed-off-by: NErez Shitrit <erezsh@mellanox.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      be4b4993
    • I
      IB/mad: Fix compare between big endian and cpu endian · cd4cd565
      Ira Weiny 提交于
      The define OPA_LID_PERMISSIVE is big endian and was compared to the
      cpu endian variable opa_drslid.
      
      Problem caught by 0-day build infrastructure.
      
      Fixes: 8e4349d1 (IB/mad: Add final OPA MAD processing)
      Signed-off-by: NIra Weiny <ira.weiny@intel.com>
      Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
      Reviewed-by: NJohn, Jubin <jubin.john@intel.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      cd4cd565
    • H
      IB: Add rdma_cap_ib_switch helper and use where appropriate · 4139032b
      Hal Rosenstock 提交于
      Persuant to Liran's comments on node_type on linux-rdma
      mailing list:
      
      In an effort to reform the RDMA core and ULPs to minimize use of
      node_type in struct ib_device, an additional bit is added to
      struct ib_device for is_switch (IB switch). This is needed
      to be initialized by any IB switch device driver. This is a
      NEW requirement on such device drivers which are all
      "out of tree".
      
      In addition, an ib_switch helper was added to ib_verbs.h
      based on the is_switch device bit rather than node_type
      (although those should be consistent).
      
      The RDMA core (MAD, SMI, agent, sa_query, multicast, sysfs)
      as well as (IPoIB and SRP) ULPs are updated where
      appropriate to use this new helper. In some cases,
      the helper is now used under the covers of using
      rdma_[start end]_port rather than the open coding
      previously used.
      Reviewed-by: NSean Hefty <sean.hefty@intel.com>
      Reviewed-By: NJason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Reviewed-by: NIra Weiny <ira.weiny@intel.com>
      Tested-by: NIra Weiny <ira.weiny@intel.com>
      Signed-off-by: NHal Rosenstock <hal@mellanox.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      4139032b
  4. 13 6月, 2015 3 次提交
    • I
      IB/mad: Add final OPA MAD processing · 8e4349d1
      Ira Weiny 提交于
      For devices which support OPA MADs
      
         1) Use previously defined SMP support functions.
      
         2) Pass correct base version to ib_create_send_mad when processing OPA MADs.
      
         3) Process out_mad_key_index returned by agents for a response.  This is
            necessary because OPA SMP packets must carry a valid pkey.
      
         4) Carry the correct segment size (OPA vs IBTA) of RMPP messages within
            ib_mad_recv_wc.
      
         5) Handle variable length OPA MADs by:
      
              * Adjusting the 'fake' WC for locally routed SMP's to represent the
                proper incoming byte_len
              * out_mad_size is used from the local HCA agents
                      1) when sending agent responses on the wire
                      2) when passing responses through the local_completions
      		   function
      
      	NOTE: wc.byte_len includes the GRH length and therefore is different
      	      from the in_mad_size specified to the local HCA agents.
      	      out_mad_size should _not_ include the GRH length as it is added
      Signed-off-by: NIra Weiny <ira.weiny@intel.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      8e4349d1
    • I
      IB/mad: Add partial Intel OPA MAD support · f28990bc
      Ira Weiny 提交于
      Add OPA SMP processing functionality.
      
      Define the new OPA SMP format, create support functions for this format using
      the previously defined helper functions as appropriate.
      
      These functions are defined in this patch and used in the final OPA MAD support
      patch.
      Signed-off-by: NIra Weiny <ira.weiny@intel.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      f28990bc
    • I
      IB/mad: Add partial Intel OPA MAD support · 548ead17
      Ira Weiny 提交于
      This patch is the first of 3 which adds processing of OPA MADs
      
      1) Add Intel Omni-Path Architecture defines
      2) Increase max management version to accommodate OPA
      3) update ib_create_send_mad
      	If the device supports OPA MADs and the MAD being sent is the OPA base
      	version alter the MAD size and sg lengths as appropriate
      Signed-off-by: NIra Weiny <ira.weiny@intel.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      548ead17