1. 05 Jun, 2013 1 commit
    • IB/qib: Fix lockdep splat in qib_alloc_lkey() · f3bdf344
      Mike Marciniszyn committed
      The following backtrace is reported with CONFIG_PROVE_RCU:
      
          drivers/infiniband/hw/qib/qib_keys.c:64 suspicious rcu_dereference_check() usage!
          other info that might help us debug this:
          rcu_scheduler_active = 1, debug_locks = 1
          4 locks held by kworker/0:1/56:
          #0:  (events){.+.+.+}, at: [<ffffffff8107a4f5>] process_one_work+0x165/0x4a0
          #1:  ((&wfc.work)){+.+.+.}, at: [<ffffffff8107a4f5>] process_one_work+0x165/0x4a0
          #2:  (device_mutex){+.+.+.}, at: [<ffffffffa0148dd8>] ib_register_device+0x38/0x220 [ib_core]
          #3:  (&(&dev->lk_table.lock)->rlock){......}, at: [<ffffffffa017e81c>] qib_alloc_lkey+0x3c/0x1b0 [ib_qib]
      
          stack backtrace:
          Pid: 56, comm: kworker/0:1 Not tainted 3.10.0-rc1+ #6
          Call Trace:
          [<ffffffff810c0b85>] lockdep_rcu_suspicious+0xe5/0x130
          [<ffffffffa017e8e1>] qib_alloc_lkey+0x101/0x1b0 [ib_qib]
          [<ffffffffa0184886>] qib_get_dma_mr+0xa6/0xd0 [ib_qib]
          [<ffffffffa01461aa>] ib_get_dma_mr+0x1a/0x50 [ib_core]
          [<ffffffffa01678dc>] ib_mad_port_open+0x12c/0x390 [ib_mad]
          [<ffffffff810c2c55>] ?  trace_hardirqs_on_caller+0x105/0x190
          [<ffffffffa0167b92>] ib_mad_init_device+0x52/0x110 [ib_mad]
          [<ffffffffa01917c0>] ?  sl2vl_attr_show+0x30/0x30 [ib_qib]
          [<ffffffffa0148f49>] ib_register_device+0x1a9/0x220 [ib_core]
          [<ffffffffa01b1685>] qib_register_ib_device+0x735/0xa40 [ib_qib]
          [<ffffffff8106ba98>] ? mod_timer+0x118/0x220
          [<ffffffffa017d425>] qib_init_one+0x1e5/0x400 [ib_qib]
          [<ffffffff812ce86e>] local_pci_probe+0x4e/0x90
          [<ffffffff81078118>] work_for_cpu_fn+0x18/0x30
          [<ffffffff8107a566>] process_one_work+0x1d6/0x4a0
          [<ffffffff8107a4f5>] ?  process_one_work+0x165/0x4a0
          [<ffffffff8107c9c9>] worker_thread+0x119/0x370
          [<ffffffff8107c8b0>] ?  manage_workers+0x180/0x180
          [<ffffffff8108294e>] kthread+0xee/0x100
          [<ffffffff81082860>] ?  __init_kthread_worker+0x70/0x70
          [<ffffffff815c04ac>] ret_from_fork+0x7c/0xb0
          [<ffffffff81082860>] ?  __init_kthread_worker+0x70/0x70
      
      Per Documentation/RCU/lockdep-splat.txt, the code now uses rcu_access_pointer()
      instead of rcu_dereference().
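
The distinction can be illustrated outside the kernel. The following is a minimal userspace sketch using C11 atomics, not the driver's code: `key_table`, `slot_in_use()`, `table_alloc()` and `table_lookup_lkey()` are hypothetical names, and the relaxed, release and acquire operations are only rough analogs of rcu_access_pointer(), rcu_assign_pointer() and rcu_dereference(). It assumes the allocator holds the table lock and only tests a slot for NULL, so it never needs dereference-grade access.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define TABLE_SIZE 4

struct region {
    int lkey;
};

/* Hypothetical stand-in for a key table: a lock plus published pointers. */
struct key_table {
    pthread_mutex_t lock;
    _Atomic(struct region *) slot[TABLE_SIZE];
};

/*
 * Rough analog of rcu_access_pointer(): fetch the pointer value only to
 * test it. The result is never dereferenced, so a relaxed load is enough.
 */
static int slot_in_use(struct key_table *t, size_t i)
{
    return atomic_load_explicit(&t->slot[i], memory_order_relaxed) != NULL;
}

/* Claim the first free slot under the update-side lock; -1 if full. */
static int table_alloc(struct key_table *t, struct region *r)
{
    int found = -1;

    pthread_mutex_lock(&t->lock);
    for (size_t i = 0; i < TABLE_SIZE; i++) {
        if (!slot_in_use(t, i)) {
            r->lkey = (int)i;
            /* Rough analog of rcu_assign_pointer(): publish with release. */
            atomic_store_explicit(&t->slot[i], r, memory_order_release);
            found = (int)i;
            break;
        }
    }
    pthread_mutex_unlock(&t->lock);
    return found;
}

/*
 * Rough analog of rcu_dereference(): the reader follows the pointer, so it
 * needs an ordered (acquire) load before touching r->lkey.
 */
static int table_lookup_lkey(struct key_table *t, size_t i)
{
    struct region *r = atomic_load_explicit(&t->slot[i], memory_order_acquire);

    return r ? r->lkey : -1;
}
```

In this sketch only the lookup path dereferences the pointer; the allocation path merely checks whether a slot is occupied, which is the situation where the lighter accessor applies.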
      Reported-by: Jay Fenlason <fenlason@redhat.com>
      Reviewed-by: Dean Luick <dean.luick@intel.com>
      Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
  2. 02 Oct, 2012 1 commit
  3. 11 Jul, 2012 1 commit
  4. 09 Jul, 2012 2 commits
    • IB/qib: RCU locking for MR validation · 8aac4cc3
      Mike Marciniszyn committed
      Profiling indicates that MR validation locking is expensive.  The MR
      table is largely read-only and is a suitable candidate for RCU locking.
      
      The patch uses RCU locking during validation to eliminate one
      lock/unlock during that validation.
      Reviewed-by: Mike Heinz <michael.william.heinz@intel.com>
      Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • IB/qib: Avoid returning EBUSY from MR deregister · 6a82649f
      Mike Marciniszyn committed
      A timing issue can occur where qib_mr_dereg can return -EBUSY if the
      MR use count is not zero.
      
      This can occur if the MR is de-registered while RDMA read response
      packets are being progressed from the SDMA ring.  The suspicion is
      that the peer sent an RDMA read request, which has already been copied
      across to the peer.  The peer sees the completion of his request and
      then communicates to the responder that the MR is not needed any
      longer.  The responder tries to de-register the MR, catching some
      responses remaining in the SDMA ring holding the MR use count.
      
      The code now uses a get/put paradigm to track MR use counts and
      coordinates with the MR de-registration process using a completion
      when the count has reached zero.  A timeout on the delay is in place
      to catch other EBUSY issues.
      
      The reference count protocol is as follows:
      - The return to the user counts as 1
      - A reference from the lk_table or the qib_ibdev counts as 1.
      - Transient I/O operations increase/decrease as necessary
      
      A lot of code duplication has been folded into the new routines
      init_qib_mregion() and deinit_qib_mregion().  Additionally, explicit
      initialization of fields to zero is now handled by kzalloc().
      
      Also, the duplicated 'while.*num_sge' code that decrements reference
      counts has been consolidated in qib_put_ss().
      Reviewed-by: Ramkrishna Vepa <ramkrishna.vepa@intel.com>
      Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
  5. 11 Jan, 2011 2 commits
  6. 24 May, 2010 1 commit
  7. 06 Dec, 2008 1 commit
  8. 26 Jan, 2008 1 commit
  9. 10 Jul, 2007 1 commit
  10. 19 Apr, 2007 2 commits
  11. 13 Dec, 2006 1 commit
  12. 29 Sep, 2006 1 commit
  13. 23 Sep, 2006 1 commit
  14. 25 Jul, 2006 1 commit
  15. 02 Jul, 2006 2 commits
  16. 24 May, 2006 1 commit
  17. 01 Apr, 2006 1 commit