1. 22 February 2013 (2 commits)
    • IB/uverbs: Implement memory windows support in uverbs · 6b52a12b
      Authored by Shani Michaeli
      The existing user/kernel uverbs API has IB_USER_VERBS_CMD_ALLOC/DEALLOC_MW.
      Implement these calls, along with destroying user memory windows during
      process cleanup.
      Signed-off-by: Haggai Eran <haggaie@mellanox.com>
      Signed-off-by: Shani Michaeli <shanim@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      6b52a12b
    • IB/core: Add "type 2" memory windows support · 7083e42e
      Authored by Shani Michaeli
      This patch enhances the IB core support for Memory Windows (MWs).
      
      MWs allow an application to have better/flexible control over remote
      access to memory.
      
      Two types of MWs are supported, with the second type having two flavors:
      
          Type 1  - associated with PD only
          Type 2A - associated with QPN only
          Type 2B - associated with PD and QPN
      
      Applications can allocate a MW once, and then repeatedly bind the MW
      to different ranges in MRs that are associated to the same PD. Type 1
      windows are bound through a verb, while type 2 windows are bound by
      posting a work request.
      
      The 32-bit memory key is composed of a 24-bit index and an 8-bit
      key. The key is changed with each bind, thus allowing more control
      over the peer's use of the memory key.
      
      The changes introduced are the following:
      
      * add a memory window type enum and a corresponding parameter to ib_alloc_mw.
      * add type 2 memory window bind work request support.
      * merge the common part of the bind verb struct ibv_mw_bind and the bind
        work request into a single struct.
      * add the ib_inc_rkey helper function to advance the tag part of an rkey.
      
      Consumer interface details:
      
      * new device capability flags IB_DEVICE_MEM_WINDOW_TYPE_2A and
        IB_DEVICE_MEM_WINDOW_TYPE_2B are added to indicate device support
        for these features.
      
        A device can set either IB_DEVICE_MEM_WINDOW_TYPE_2A or
        IB_DEVICE_MEM_WINDOW_TYPE_2B if it supports type 2A or type 2B
        memory windows, or set neither to indicate that it does not
        support type 2 windows at all.
      
      * modify existing provider and consumer code to use the new parameter of
        ib_alloc_mw and the ib_mw_bind_info structure.
      Signed-off-by: Haggai Eran <haggaie@mellanox.com>
      Signed-off-by: Shani Michaeli <shanim@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      7083e42e
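      A minimal consumer sketch of the interface described above. This is
      illustrative only: it assumes the verbs API as introduced by this commit
      (ib_alloc_mw() taking an enum ib_mw_type, IB_WR_BIND_MW work requests
      carrying an ib_mw_bind_info, and ib_inc_rkey() to advance the key part of
      the rkey), and pd, qp, mr, addr and len are placeholders supplied by the
      caller.

          #include <linux/err.h>
          #include <rdma/ib_verbs.h>

          /* Hedged sketch: bind a type 2 MW over [addr, addr + len) of an MR in
           * the same PD and report the rkey to hand to the peer.  Error handling
           * and completion polling are abbreviated. */
          static int bind_type2_mw(struct ib_pd *pd, struct ib_qp *qp,
                                   struct ib_mr *mr, u64 addr, u64 len,
                                   u32 *rkey_out)
          {
                  struct ib_mw *mw = ib_alloc_mw(pd, IB_MW_TYPE_2); /* allocate once, rebind many times */
                  struct ib_send_wr wr, *bad_wr;

                  if (IS_ERR(mw))
                          return PTR_ERR(mw);

                  memset(&wr, 0, sizeof(wr));
                  wr.opcode     = IB_WR_BIND_MW;              /* type 2 bind is a posted work request */
                  wr.send_flags = IB_SEND_SIGNALED;
                  wr.wr.bind_mw.mw   = mw;
                  wr.wr.bind_mw.rkey = ib_inc_rkey(mw->rkey); /* advance the 8-bit key part */
                  wr.wr.bind_mw.bind_info.mr              = mr; /* MR must be in the same PD */
                  wr.wr.bind_mw.bind_info.addr            = addr;
                  wr.wr.bind_mw.bind_info.length          = len;
                  wr.wr.bind_mw.bind_info.mw_access_flags = IB_ACCESS_REMOTE_WRITE;

                  *rkey_out = wr.wr.bind_mw.rkey;             /* valid once the bind WR completes */
                  return ib_post_send(qp, &wr, &bad_wr);
          }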
  2. 30 November 2012 (1 commit)
    • RDMA/cm: Change return value from find_gid_port() · 63f05be2
      Authored by shefty
      Problem reported by Dan Carpenter <dan.carpenter@oracle.com>:
      
      The patch 3c86aa70: "RDMA/cm: Add RDMA CM support for IBoE
      devices" from Oct 13, 2010, leads to the following warning:
      net/sunrpc/xprtrdma/svc_rdma_transport.c:722 svc_rdma_create()
      	 error: passing non neg 1 to ERR_PTR
      
      This bug would result in a NULL dereference.  svc_rdma_create() is
      supposed to return ERR_PTRs or valid pointers, but instead it returns
      ERR_PTRs, valid pointers and 1.
      
      The call tree is:
      
      svc_rdma_create()
         => rdma_bind_addr()
            => cma_acquire_dev()
               => find_gid_port()
      
      rdma_bind_addr() should return a valid errno.  Fix this by having
      find_gid_port() also return a valid errno.  If we can't find the
      specified GID on a given port, return -EADDRNOTAVAIL, rather than
      -EAGAIN, to better indicate the error.  We also drop using the
      special return value of '1' and instead pass through the error
      returned by the underlying verbs call.  On such errors, rather
      than aborting the search,  we simply continue to check the next
      device/port.
      Signed-off-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      63f05be2
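      To illustrate the convention the fix restores (a hedged, hypothetical
      sketch rather than the actual svc_rdma_create() code): anything that
      rdma_bind_addr() returns may end up wrapped in ERR_PTR() by its callers,
      so every function beneath it must return 0 or a negative errno such as
      -EADDRNOTAVAIL, never a positive value like 1.

          #include <linux/err.h>
          #include <rdma/rdma_cm.h>

          /* Hypothetical caller-side shape; bind_or_err() is a placeholder name. */
          static void *bind_or_err(struct rdma_cm_id *id, struct sockaddr *sa)
          {
                  int ret = rdma_bind_addr(id, sa);

                  if (ret)
                          return ERR_PTR(ret); /* ERR_PTR(1) would look like a valid pointer */
                  return NULL;                 /* success placeholder */
          }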
  3. 07 October 2012 (1 commit)
  4. 06 October 2012 (1 commit)
  5. 05 October 2012 (1 commit)
  6. 01 October 2012 (4 commits)
  7. 27 September 2012 (2 commits)
  8. 09 September 2012 (1 commit)
  9. 22 August 2012 (2 commits)
    • workqueue: deprecate __cancel_delayed_work() · 136b5721
      Authored by Tejun Heo
      Now that cancel_delayed_work() can be safely called from IRQ handlers,
      there's no reason to use __cancel_delayed_work().  Use
      cancel_delayed_work() instead of __cancel_delayed_work() and mark the
      latter deprecated.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Jens Axboe <axboe@kernel.dk>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Roland Dreier <roland@kernel.org>
      Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
      136b5721
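      A small hedged sketch of what this conversion can look like in a driver
      (my_dev and its timeout_work are made-up names): with cancel_delayed_work()
      now safe in IRQ context, the underscore variant has no remaining use case.

          #include <linux/interrupt.h>
          #include <linux/workqueue.h>

          struct my_dev {
                  struct delayed_work timeout_work;   /* placeholder driver state */
          };

          static irqreturn_t my_dev_irq(int irq, void *data)
          {
                  struct my_dev *dev = data;

                  /* Previously this context required __cancel_delayed_work();
                   * cancel_delayed_work() is now non-sleeping and IRQ-safe. */
                  cancel_delayed_work(&dev->timeout_work);
                  return IRQ_HANDLED;
          }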
    • workqueue: use mod_delayed_work() instead of __cancel + queue · e7c2f967
      Authored by Tejun Heo
      Now that mod_delayed_work() is safe to call from IRQ handlers,
      __cancel_delayed_work() followed by queue_delayed_work() can be
      replaced with mod_delayed_work().
      
      Most conversions are straight-forward except for the following.
      
      * net/core/link_watch.c: linkwatch_schedule_work() was doing quite an
        elaborate dance around its delayed_work.  Collapse it so that
        linkwatch_work is queued for immediate execution if LW_URGENT is set
        and the existing timer is kept otherwise.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Tomi Valkeinen <tomi.valkeinen@ti.com> 
      e7c2f967
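      The shape of the conversion, as a hedged before/after sketch (wq, dwork
      and delay are placeholders, not code from any particular driver):

          #include <linux/workqueue.h>

          /* before: open-coded "cancel if pending, then queue with the new delay" */
          static void reschedule_old(struct workqueue_struct *wq,
                                     struct delayed_work *dwork, unsigned long delay)
          {
                  __cancel_delayed_work(dwork);
                  queue_delayed_work(wq, dwork, delay);
          }

          /* after: one call that adjusts the timer whether or not the work
           * item was already pending */
          static void reschedule_new(struct workqueue_struct *wq,
                                     struct delayed_work *dwork, unsigned long delay)
          {
                  mod_delayed_work(wq, dwork, delay);
          }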
  10. 14 August 2012 (2 commits)
    • workqueue: use mod_delayed_work() instead of cancel + queue · 41f63c53
      Authored by Tejun Heo
      Convert delayed_work users doing cancel_delayed_work() followed by
      queue_delayed_work() to mod_delayed_work().
      
      Most conversions are straight-forward.  Ones worth mentioning are,
      
      * drivers/edac/edac_mc.c: edac_mc_workq_setup() converted to always
        use mod_delayed_work() and cancel loop in
        edac_mc_reset_delay_period() is dropped.
      
      * drivers/platform/x86/thinkpad_acpi.c: No need to remember whether
        watchdog is active or not.  @fan_watchdog_active and related code
        dropped.
      
      * drivers/power/charger-manager.c: Seemingly a lot of
        delayed_work_pending() abuse going on here.
        [delayed_]work_pending() are unsynchronized and racy when used like
        this.  I converted one instance in fullbatt_handler().  Please
        convert the rest so that it invokes workqueue APIs for the intended
        target state rather than trying to game work item pending state
        transitions, e.g. if the timer should be modified, call
        mod_delayed_work(); if it should be canceled, call cancel_delayed_work[_sync]().
      
      * drivers/thermal/thermal_sys.c: thermal_zone_device_set_polling()
        simplified.  Note that the round_jiffies() calls in this function are
        meaningless: round_jiffies() works on absolute jiffies, not the
        relative delay used by delayed_work.
      
      v2: Tomi pointed out that __cancel_delayed_work() users can't be
          safely converted to mod_delayed_work().  They could be calling it
          from irq context and if that happens while delayed_work_timer_fn()
          is running, it could deadlock.  __cancel_delayed_work() users are
          dropped.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
      Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Acked-by: Anton Vorontsov <cbouatmailru@gmail.com>
      Acked-by: David Howells <dhowells@redhat.com>
      Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Doug Thompson <dougthompson@xmission.com>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Roland Dreier <roland@kernel.org>
      Cc: "John W. Linville" <linville@tuxdriver.com>
      Cc: Zhang Rui <rui.zhang@intel.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: Johannes Berg <johannes@sipsolutions.net>
      41f63c53
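      A hedged illustration of the delayed_work_pending() misuse called out
      above and its intent-based replacement (the charger struct and field
      names are placeholders, not the real charger-manager code):

          #include <linux/workqueue.h>

          struct charger {
                  struct delayed_work fullbatt_work;  /* placeholder */
          };

          /* Racy: the pending state can change between the check and the queue. */
          static void poke_racy(struct charger *cm, unsigned long delay)
          {
                  if (!delayed_work_pending(&cm->fullbatt_work))
                          queue_delayed_work(system_wq, &cm->fullbatt_work, delay);
          }

          /* Preferred: state the intended target directly. */
          static void poke_fixed(struct charger *cm, unsigned long delay)
          {
                  mod_delayed_work(system_wq, &cm->fullbatt_work, delay);
          }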
    • RDMA/ucma.c: Fix for events with wrong context on iWARP · 418edaab
      Authored by Tatyana Nikolova
      It is possible for asynchronous RDMA_CM_EVENT_ESTABLISHED events to be
      generated with ctx->uid == 0, because ucma_set_event_context() copies
      ctx->uid to the event structure outside of ctx->file->mut.  This leads
      to a crash in the userspace library, since it gets a bogus event.
      
      Fix this by taking the mutex a bit earlier in ucma_event_handler.
      Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
      Signed-off-by: Sean Hefty <Sean.Hefty@intel.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      418edaab
  11. 28 July 2012 (2 commits)
  12. 12 July 2012 (1 commit)
  13. 09 July 2012 (6 commits)
  14. 30 June 2012 (1 commit)
    • netlink: add netlink_kernel_cfg parameter to netlink_kernel_create · a31f2d17
      Authored by Pablo Neira Ayuso
      This patch adds the following structure:
      
      struct netlink_kernel_cfg {
              unsigned int    groups;
              void            (*input)(struct sk_buff *skb);
              struct mutex    *cb_mutex;
      };
      
      That can be passed to netlink_kernel_create to set optional configurations
      for netlink kernel sockets.
      
      I've populated this structure by looking for NULL and zero parameters in the
      existing code.  The remaining parameters that always need to be set are still
      left in the original interface.
      
      The new structure holds the optional parameters for netlink socket creation,
      which allows easy extensibility of this interface in the future.
      
      This patch also adapts all callers to use this new interface.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a31f2d17
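      A hedged usage sketch of the new interface. It assumes the
      netlink_kernel_create() signature as changed by this patch (net, unit,
      module, cfg); the protocol choice and the input callback are placeholders.

          #include <linux/module.h>
          #include <linux/netlink.h>
          #include <net/sock.h>

          static void my_nl_input(struct sk_buff *skb)
          {
                  /* placeholder: process incoming netlink messages */
          }

          static struct sock *my_nl_sock;

          static int my_nl_init(struct net *net)
          {
                  struct netlink_kernel_cfg cfg = {
                          .groups = 1,            /* optional parameters now live here */
                          .input  = my_nl_input,
                          /* .cb_mutex left NULL, as in most existing callers */
                  };

                  my_nl_sock = netlink_kernel_create(net, NETLINK_USERSOCK,
                                                     THIS_MODULE, &cfg);
                  return my_nl_sock ? 0 : -ENOMEM;
          }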
  15. 27 June 2012 (1 commit)
  16. 20 June 2012 (1 commit)
  17. 12 May 2012 (1 commit)
  18. 09 May 2012 (5 commits)
    • IB/core: Add raw packet QP type · c938a616
      Authored by Or Gerlitz
      IB_QPT_RAW_PACKET allows applications to build a complete packet,
      including L2 headers, when sending; on the receive side, the HW will
      not strip any headers.
      
      This QP type is designed for userspace direct access to Ethernet; for
      example by applications that do TCP/IP themselves.  Only processes
      with the NET_RAW capability are allowed to create raw packet QPs (the
      name "raw packet QP" is supposed to suggest an analogy to AF_PACKET /
      SOL_RAW sockets).
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Reviewed-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      c938a616
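      A hedged sketch of how a kernel consumer might create such a QP (the PD,
      CQs and queue sizes are placeholders; userspace goes through the
      corresponding uverbs path, subject to the NET_RAW capability check
      described above).

          #include <rdma/ib_verbs.h>

          /* Hypothetical: create a raw packet QP on which the application
           * builds complete frames, including L2 headers. */
          static struct ib_qp *create_raw_qp(struct ib_pd *pd, struct ib_cq *scq,
                                             struct ib_cq *rcq)
          {
                  struct ib_qp_init_attr attr = {
                          .qp_type     = IB_QPT_RAW_PACKET, /* new QP type */
                          .send_cq     = scq,
                          .recv_cq     = rcq,
                          .sq_sig_type = IB_SIGNAL_ALL_WR,
                          .cap = {
                                  .max_send_wr  = 64,
                                  .max_recv_wr  = 64,
                                  .max_send_sge = 1,
                                  .max_recv_sge = 1,
                          },
                  };

                  return ib_create_qp(pd, &attr);
          }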
    • RDMA/cma: Fix lockdep false positive recursive locking · b6cec8aa
      Authored by Sean Hefty
      The following lockdep problem was reported by Or Gerlitz <ogerlitz@mellanox.com>:
      
          [ INFO: possible recursive locking detected ]
          3.3.0-32035-g1b2649e-dirty #4 Not tainted
          ---------------------------------------------
          kworker/5:1/418 is trying to acquire lock:
            (&id_priv->handler_mutex){+.+.+.}, at: [<ffffffffa0138a41>] rdma_destroy_id+0x33/0x1f0 [rdma_cm]
      
          but task is already holding lock:
            (&id_priv->handler_mutex){+.+.+.}, at: [<ffffffffa0135130>] cma_disable_callback+0x24/0x45 [rdma_cm]
      
          other info that might help us debug this:
           Possible unsafe locking scenario:
      
                 CPU0
                 ----
            lock(&id_priv->handler_mutex);
            lock(&id_priv->handler_mutex);
      
           *** DEADLOCK ***
      
           May be due to missing lock nesting notation
      
          3 locks held by kworker/5:1/418:
            #0:  (ib_cm){.+.+.+}, at: [<ffffffff81042ac1>] process_one_work+0x210/0x4a6
            #1:  ((&(&work->work)->work)){+.+.+.}, at: [<ffffffff81042ac1>] process_one_work+0x210/0x4a6
            #2:  (&id_priv->handler_mutex){+.+.+.}, at: [<ffffffffa0135130>] cma_disable_callback+0x24/0x45 [rdma_cm]
      
          stack backtrace:
          Pid: 418, comm: kworker/5:1 Not tainted 3.3.0-32035-g1b2649e-dirty #4
          Call Trace:
           [<ffffffff8102b0fb>] ? console_unlock+0x1f4/0x204
           [<ffffffff81068771>] __lock_acquire+0x16b5/0x174e
           [<ffffffff8106461f>] ? save_trace+0x3f/0xb3
           [<ffffffff810688fa>] lock_acquire+0xf0/0x116
           [<ffffffffa0138a41>] ? rdma_destroy_id+0x33/0x1f0 [rdma_cm]
           [<ffffffff81364351>] mutex_lock_nested+0x64/0x2ce
           [<ffffffffa0138a41>] ? rdma_destroy_id+0x33/0x1f0 [rdma_cm]
           [<ffffffff81065a78>] ? trace_hardirqs_on_caller+0x11e/0x155
           [<ffffffff81065abc>] ? trace_hardirqs_on+0xd/0xf
           [<ffffffffa0138a41>] rdma_destroy_id+0x33/0x1f0 [rdma_cm]
           [<ffffffffa0139c02>] cma_req_handler+0x418/0x644 [rdma_cm]
           [<ffffffffa012ee88>] cm_process_work+0x32/0x119 [ib_cm]
           [<ffffffffa0130299>] cm_req_handler+0x928/0x982 [ib_cm]
           [<ffffffffa01302f3>] ? cm_req_handler+0x982/0x982 [ib_cm]
           [<ffffffffa0130326>] cm_work_handler+0x33/0xfe5 [ib_cm]
           [<ffffffff81065a78>] ? trace_hardirqs_on_caller+0x11e/0x155
           [<ffffffffa01302f3>] ? cm_req_handler+0x982/0x982 [ib_cm]
           [<ffffffff81042b6e>] process_one_work+0x2bd/0x4a6
           [<ffffffff81042ac1>] ? process_one_work+0x210/0x4a6
           [<ffffffff813669f3>] ? _raw_spin_unlock_irq+0x2b/0x40
           [<ffffffff8104316e>] worker_thread+0x1d6/0x350
           [<ffffffff81042f98>] ? rescuer_thread+0x241/0x241
           [<ffffffff81046a32>] kthread+0x84/0x8c
           [<ffffffff8136e854>] kernel_thread_helper+0x4/0x10
           [<ffffffff81366d59>] ? retint_restore_args+0xe/0xe
           [<ffffffff810469ae>] ? __init_kthread_worker+0x56/0x56
           [<ffffffff8136e850>] ? gs_change+0xb/0xb
      
      The actual locking is fine, since we're dealing with different locks,
      but from the same lock class.  cma_disable_callback() acquires the
      listening id mutex, whereas rdma_destroy_id() acquires the mutex for
      the new connection id.  To fix this, delay the call to
      rdma_destroy_id() until we've released the listening id mutex.
      Signed-off-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      b6cec8aa
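      The fix in schematic form (a hedged sketch, not the actual cma_req_handler()
      code; dispatch_event() and the surrounding types are placeholders): the new
      connection id is destroyed only after the listening id's handler_mutex has
      been released, so two same-class mutexes are never held nested.

          /* Hedged schematic of the deferred destroy. */
          static void handle_connect_request(struct rdma_id_private *listen_priv,
                                             struct rdma_cm_id *new_id)
          {
                  int ret;

                  mutex_lock(&listen_priv->handler_mutex);
                  ret = dispatch_event(listen_priv, new_id);  /* placeholder for the real work */
                  mutex_unlock(&listen_priv->handler_mutex);

                  if (ret)
                          rdma_destroy_id(new_id);            /* destroy only after the unlock */
          }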
    • IB/uverbs: Lock SRQ / CQ / PD objects in a consistent order · 5909ce54
      Authored by Roland Dreier
      Since XRC support was added, the uverbs code has locked SRQ, CQ and PD
      objects needed during QP and SRQ creation in different orders
      depending on the the code path.  This leads to the (at least
      theoretical) possibility of deadlock, and triggers the lockdep splat
      below.
      
      Fix this by making sure we always lock the SRQ first, then CQs and
      finally the PD.
      
          ======================================================
          [ INFO: possible circular locking dependency detected ]
          3.4.0-rc5+ #34 Not tainted
          -------------------------------------------------------
          ibv_srq_pingpon/2484 is trying to acquire lock:
           (SRQ-uobj){+++++.}, at: [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs]
      
          but task is already holding lock:
           (CQ-uobj){+++++.}, at: [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs]
      
          which lock already depends on the new lock.
      
          the existing dependency chain (in reverse order) is:
      
          -> #2 (CQ-uobj){+++++.}:
                 [<ffffffff81070fd0>] lock_acquire+0xbf/0xfe
                 [<ffffffff81384f28>] down_read+0x34/0x43
                 [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs]
                 [<ffffffffa00af542>] idr_read_obj+0x9/0x19 [ib_uverbs]
                 [<ffffffffa00b16c3>] ib_uverbs_create_qp+0x180/0x684 [ib_uverbs]
                 [<ffffffffa00ae3dd>] ib_uverbs_write+0xb7/0xc2 [ib_uverbs]
                 [<ffffffff810fe47f>] vfs_write+0xa7/0xee
                 [<ffffffff810fe65f>] sys_write+0x45/0x69
                 [<ffffffff8138cdf9>] system_call_fastpath+0x16/0x1b
      
          -> #1 (PD-uobj){++++++}:
                 [<ffffffff81070fd0>] lock_acquire+0xbf/0xfe
                 [<ffffffff81384f28>] down_read+0x34/0x43
                 [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs]
                 [<ffffffffa00af542>] idr_read_obj+0x9/0x19 [ib_uverbs]
                 [<ffffffffa00af8ad>] __uverbs_create_xsrq+0x96/0x386 [ib_uverbs]
                 [<ffffffffa00b31b9>] ib_uverbs_detach_mcast+0x1cd/0x1e6 [ib_uverbs]
                 [<ffffffffa00ae3dd>] ib_uverbs_write+0xb7/0xc2 [ib_uverbs]
                 [<ffffffff810fe47f>] vfs_write+0xa7/0xee
                 [<ffffffff810fe65f>] sys_write+0x45/0x69
                 [<ffffffff8138cdf9>] system_call_fastpath+0x16/0x1b
      
          -> #0 (SRQ-uobj){+++++.}:
                 [<ffffffff81070898>] __lock_acquire+0xa29/0xd06
                 [<ffffffff81070fd0>] lock_acquire+0xbf/0xfe
                 [<ffffffff81384f28>] down_read+0x34/0x43
                 [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs]
                 [<ffffffffa00af542>] idr_read_obj+0x9/0x19 [ib_uverbs]
                 [<ffffffffa00b1728>] ib_uverbs_create_qp+0x1e5/0x684 [ib_uverbs]
                 [<ffffffffa00ae3dd>] ib_uverbs_write+0xb7/0xc2 [ib_uverbs]
                 [<ffffffff810fe47f>] vfs_write+0xa7/0xee
                 [<ffffffff810fe65f>] sys_write+0x45/0x69
                 [<ffffffff8138cdf9>] system_call_fastpath+0x16/0x1b
      
          other info that might help us debug this:
      
          Chain exists of:
            SRQ-uobj --> PD-uobj --> CQ-uobj
      
           Possible unsafe locking scenario:
      
                 CPU0                    CPU1
                 ----                    ----
            lock(CQ-uobj);
                                         lock(PD-uobj);
                                         lock(CQ-uobj);
            lock(SRQ-uobj);
      
           *** DEADLOCK ***
      
          3 locks held by ibv_srq_pingpon/2484:
           #0:  (QP-uobj){+.+...}, at: [<ffffffffa00b162c>] ib_uverbs_create_qp+0xe9/0x684 [ib_uverbs]
           #1:  (PD-uobj){++++++}, at: [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs]
           #2:  (CQ-uobj){+++++.}, at: [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs]
      
          stack backtrace:
          Pid: 2484, comm: ibv_srq_pingpon Not tainted 3.4.0-rc5+ #34
          Call Trace:
           [<ffffffff8137eff0>] print_circular_bug+0x1f8/0x209
           [<ffffffff81070898>] __lock_acquire+0xa29/0xd06
           [<ffffffffa00af37c>] ? __idr_get_uobj+0x20/0x5e [ib_uverbs]
           [<ffffffffa00af51b>] ? idr_read_uobj+0x2f/0x4d [ib_uverbs]
           [<ffffffff81070fd0>] lock_acquire+0xbf/0xfe
           [<ffffffffa00af51b>] ? idr_read_uobj+0x2f/0x4d [ib_uverbs]
           [<ffffffff81070eee>] ? lock_release+0x166/0x189
           [<ffffffff81384f28>] down_read+0x34/0x43
           [<ffffffffa00af51b>] ? idr_read_uobj+0x2f/0x4d [ib_uverbs]
           [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs]
           [<ffffffffa00af542>] idr_read_obj+0x9/0x19 [ib_uverbs]
           [<ffffffffa00b1728>] ib_uverbs_create_qp+0x1e5/0x684 [ib_uverbs]
           [<ffffffff81070fec>] ? lock_acquire+0xdb/0xfe
           [<ffffffff81070c09>] ? lock_release_non_nested+0x94/0x213
           [<ffffffff810d470f>] ? might_fault+0x40/0x90
           [<ffffffff810d470f>] ? might_fault+0x40/0x90
           [<ffffffffa00ae3dd>] ib_uverbs_write+0xb7/0xc2 [ib_uverbs]
           [<ffffffff810fe47f>] vfs_write+0xa7/0xee
           [<ffffffff810ff736>] ? fget_light+0x3b/0x99
           [<ffffffff810fe65f>] sys_write+0x45/0x69
           [<ffffffff8138cdf9>] system_call_fastpath+0x16/0x1b
      Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      5909ce54
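      In schematic form (hedged; the real code takes these semaphores through
      the idr_read_*() helpers rather than directly): every path that needs
      several of these objects now acquires them in the same SRQ, then CQ,
      then PD order, so no circular dependency can form.

          /* Hypothetical ordering sketch - always SRQ first, then CQ, then PD. */
          down_read(&srq_uobj->mutex);
          down_read(&cq_uobj->mutex);
          down_read(&pd_uobj->mutex);
          /* ... create the QP or SRQ ... */
          up_read(&pd_uobj->mutex);
          up_read(&cq_uobj->mutex);
          up_read(&srq_uobj->mutex);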
    • IB/uverbs: Make lockdep output more readable · 3bea57a5
      Authored by Roland Dreier
      Add names for our lockdep classes, so instead of having to decipher
      lockdep output with mysterious names:
      
          Chain exists of:
            key#14 --> key#11 --> key#13
      
      lockdep will give us something nicer:
      
          Chain exists of:
            SRQ-uobj --> PD-uobj --> CQ-uobj
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      3bea57a5
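      The underlying mechanism, as a hedged sketch (lockdep_set_class_and_name()
      is the generic lockdep API; the key and uobject names here are illustrative,
      not the exact uverbs code): giving each lock class a string is what lets
      lockdep print "SRQ-uobj" instead of "key#14".

          static struct lock_class_key srq_lock_class;

          /* Hypothetical init path for an SRQ uobject's semaphore. */
          init_rwsem(&uobj->mutex);
          lockdep_set_class_and_name(&uobj->mutex, &srq_lock_class, "SRQ-uobj");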
    • IB/core: Use qp->usecnt to track multicast attach/detach · c3bccbfb
      Authored by Or Gerlitz
      Just as we don't allow PDs, CQs, etc. to be destroyed if there are QPs
      that are attached to them, don't let a QP be destroyed if there are
      multicast group(s) attached to it.  Use the existing usecnt field of
      struct ib_qp which was added by commit 0e0ec7e0 ("RDMA/core: Export
      ib_open_qp() to share XRC TGT QPs") to track this.
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      c3bccbfb
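      A hedged sketch of the accounting this implies (simplified; the real
      functions also validate the QP and manage the multicast group list itself):

          /* Attach/detach bump qp->usecnt so ib_destroy_qp() can refuse to
           * destroy a QP that still has multicast groups attached. */
          int ib_attach_mcast(struct ib_qp *qp, union ib_gid *gid, u16 lid)
          {
                  int ret = qp->device->attach_mcast(qp, gid, lid);

                  if (!ret)
                          atomic_inc(&qp->usecnt);
                  return ret;
          }

          int ib_detach_mcast(struct ib_qp *qp, union ib_gid *gid, u16 lid)
          {
                  int ret = qp->device->detach_mcast(qp, gid, lid);

                  if (!ret)
                          atomic_dec(&qp->usecnt);
                  return ret;
          }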
  19. 25 April 2012 (2 commits)
  20. 21 April 2012 (2 commits)
  21. 05 April 2012 (1 commit)