1. 30 Nov, 2011 1 commit
    • M
      IB/ipoib: Prevent hung task or softlockup processing multicast response · 3874397c
      Mike Marciniszyn committed
      The following can occur with ipoib when processing a multicast response:
      
          BUG: soft lockup - CPU#0 stuck for 67s! [ib_mad1:982]
          Modules linked in: ...
          CPU 0:
          Modules linked in: ...
          Pid: 982, comm: ib_mad1 Not tainted 2.6.32-131.0.15.el6.x86_64 #1 ProLiant DL160 G5
          RIP: 0010:[<ffffffff814ddb27>]  [<ffffffff814ddb27>] _spin_unlock_irqrestore+0x17/0x20
          RSP: 0018:ffff8802119ed860  EFLAGS: 00000246
          0000000000000004 RBX: ffff8802119ed860 RCX: 000000000000a299
          RDX: ffff88021086c700 RSI: 0000000000000246 RDI: 0000000000000246
          RBP: ffffffff8100bc8e R08: ffff880210ac229c R09: 0000000000000000
          R10: ffff88021278aab8 R11: 0000000000000000 R12: ffff8802119ed860
          R13: ffffffff8100be6e R14: 0000000000000001 R15: 0000000000000003
          FS:  0000000000000000(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
          CR2: 00000000006d4840 CR3: 0000000209aa5000 CR4: 00000000000406f0
          DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
          DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
          Call Trace:
          [<ffffffffa032c247>] ? ipoib_mcast_send+0x157/0x480 [ib_ipoib]
          [<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20
          [<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20
          [<ffffffffa03283d4>] ? ipoib_path_lookup+0x124/0x2d0 [ib_ipoib]
          [<ffffffffa03286fc>] ? ipoib_start_xmit+0x17c/0x430 [ib_ipoib]
          [<ffffffff8141e758>] ? dev_hard_start_xmit+0x2c8/0x3f0
          [<ffffffff81439d0a>] ? sch_direct_xmit+0x15a/0x1c0
          [<ffffffff81423098>] ? dev_queue_xmit+0x388/0x4d0
          [<ffffffffa032d6b7>] ? ipoib_mcast_join_finish+0x2c7/0x510 [ib_ipoib]
          [<ffffffffa032dab8>] ? ipoib_mcast_sendonly_join_complete+0x1b8/0x1f0 [ib_ipoib]
          [<ffffffffa02a0946>] ? mcast_work_handler+0x1a6/0x710 [ib_sa]
          [<ffffffffa015f01e>] ? ib_send_mad+0xfe/0x3c0 [ib_mad]
          [<ffffffffa00f6c93>] ? ib_get_cached_lmc+0xa3/0xb0 [ib_core]
          [<ffffffffa02a0f9b>] ? join_handler+0xeb/0x200 [ib_sa]
          [<ffffffffa029e4fc>] ? ib_sa_mcmember_rec_callback+0x5c/0xa0 [ib_sa]
          [<ffffffffa029e79c>] ? recv_handler+0x3c/0x70 [ib_sa]
          [<ffffffffa01603a4>] ? ib_mad_completion_handler+0x844/0x9d0 [ib_mad]
          [<ffffffffa015fb60>] ? ib_mad_completion_handler+0x0/0x9d0 [ib_mad]
          [<ffffffff81088830>] ? worker_thread+0x170/0x2a0
          [<ffffffff8108e160>] ? autoremove_wake_function+0x0/0x40
          [<ffffffff810886c0>] ? worker_thread+0x0/0x2a0
          [<ffffffff8108ddf6>] ? kthread+0x96/0xa0
          [<ffffffff8100c1ca>] ? child_rip+0xa/0x20
      
      Coinciding with the stack trace is the following message:
      
          ib0: ib_address_create failed
      
      The code below in ipoib_mcast_join_finish() will note the above
      failure in the address handle but otherwise continue:
      
                      ah = ipoib_create_ah(dev, priv->pd, &av);
                      if (!ah) {
                              ipoib_warn(priv, "ib_address_create failed\n");
                      } else {
      
      The while loop at the bottom of ipoib_mcast_join_finish() will attempt
      to send queued multicast packets in mcast->pkt_queue and eventually
      end up in ipoib_mcast_send():
      
              if (!mcast->ah) {
                      if (skb_queue_len(&mcast->pkt_queue) < IPOIB_MAX_MCAST_QUEUE)
                              skb_queue_tail(&mcast->pkt_queue, skb);
                      else {
                              ++dev->stats.tx_dropped;
                              dev_kfree_skb_any(skb);
                      }
      
      My reading is that the code requeues the packet and returns to the
      ipoib_mcast_join_finish() while loop. That sets the stage for the
      "hung" task diagnostic: the while loop never sees a non-NULL ah and
      does nothing to resolve the failure.
      
      There are GFP_ATOMIC allocations in the provider routines, so this
      failure is possible and should be dealt with.
      
      The test that induced the failure is associated with a host SM on the
      same server during a shutdown.
      
      This patch causes ipoib_mcast_join_finish() to exit with an error
      which will flush the queued mcast packets.  Nothing is done to unwind
      the QP attached state so that subsequent sends from above will retry
      the join.
      Reviewed-by: Ram Vepa <ram.vepa@qlogic.com>
      Reviewed-by: Gary Leshner <gary.leshner@qlogic.com>
      Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
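The livelock described in this commit can be modelled in user space. The sketch below is only an illustration under stated assumptions: `struct mcast_model`, `model_send()`, `drain_unfixed()`, `join_finish_fixed()`, the queue limit, and the return codes are hypothetical stand-ins, not the ipoib code. It shows a drain loop that never empties its queue while the address handle is missing, and an early error return that flushes the queue instead. Counting flushed packets as dropped is an assumption of the model.

```c
#include <stdbool.h>

#define MODEL_MAX_QUEUE 3   /* stand-in for IPOIB_MAX_MCAST_QUEUE */
#define DRAIN_STUCK    (-1) /* model only: iteration budget exhausted */
#define JOIN_FAILED    (-2) /* model only: address handle creation failed */

struct mcast_model {
    bool has_ah;  /* models mcast->ah != NULL */
    int queued;   /* models skb_queue_len(&mcast->pkt_queue) */
    int dropped;  /* models dev->stats.tx_dropped */
    int sent;     /* packets actually transmitted */
};

/* Models the send path: with no address handle the packet is requeued. */
void model_send(struct mcast_model *m)
{
    if (!m->has_ah) {
        if (m->queued < MODEL_MAX_QUEUE)
            m->queued++;
        else
            m->dropped++;
        return;
    }
    m->sent++;
}

/* Old behaviour: dequeue and resend until the queue is empty. The budget
 * exists only so the model terminates; the real loop has no such limit. */
int drain_unfixed(struct mcast_model *m, int budget)
{
    int iterations = 0;

    while (m->queued > 0) {
        if (iterations++ >= budget)
            return DRAIN_STUCK;
        m->queued--;   /* dequeue one packet */
        model_send(m); /* requeues it when has_ah is false */
    }
    return iterations;
}

/* Fixed behaviour: fail early, flushing the queued packets. */
int join_finish_fixed(struct mcast_model *m, int budget)
{
    if (!m->has_ah) {
        m->dropped += m->queued;
        m->queued = 0;
        return JOIN_FAILED;
    }
    return drain_unfixed(m, budget);
}
```

With `has_ah` false and two queued packets, `drain_unfixed()` exhausts any budget it is given, while `join_finish_fixed()` returns immediately with an empty queue.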
  2. 05 Nov, 2011 2 commits
  3. 01 Nov, 2011 3 commits
  4. 19 Oct, 2011 2 commits
  5. 14 Oct, 2011 1 commit
  6. 12 Oct, 2011 1 commit
  7. 27 Aug, 2011 4 commits
  8. 18 Aug, 2011 3 commits
  9. 17 Aug, 2011 1 commit
  10. 27 Jul, 2011 1 commit
  11. 25 Jul, 2011 1 commit
    • N
      iscsi: Resolve iscsi_proto.h naming conflicts with drivers/target/iscsi · 12352183
      Nicholas Bellinger committed
      This patch renames the following iscsi_proto.h structures to avoid
      namespace issues with drivers/target/iscsi/iscsi_target_core.h:
      
      *) struct iscsi_cmd -> struct iscsi_scsi_req
      *) struct iscsi_cmd_rsp -> struct iscsi_scsi_rsp
      *) struct iscsi_login -> struct iscsi_login_req
      
      This patch includes useful ISCSI_FLAG_LOGIN_[CURRENT,NEXT]_STAGE*,
      and ISCSI_FLAG_SNACK_TYPE_* definitions used by iscsi_target_mod, and
      fixes the incorrect definition of struct iscsi_snack to follow
      RFC-3720 Section 10.16. SNACK Request.
      
      Also, this patch updates libiscsi, iSER, be2iscsi, and bnx2i to
      use the updated structure definitions in a handful of locations.
      Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
      Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
  12. 18 Jul, 2011 1 commit
  13. 14 Jul, 2011 1 commit
    • B
      IB/srp: Avoid duplicate devices from LUN scan · fd1b6c4a
      Bart Van Assche committed
      SCSI scanning of a channel:id:lun triplet in Linux works as follows
      (function scsi_scan_target() in drivers/scsi/scsi_scan.c):
      
      - If lun == SCAN_WILD_CARD, send a REPORT LUNS command to the target
        and process the result.
      
      - If lun != SCAN_WILD_CARD, send an INQUIRY command to the LUN
        corresponding to the specified channel:id:lun triplet to verify
        whether the LUN exists.
      
      So a SCSI driver must either take the channel and target id values
      into account in its queuecommand() function or declare that it only
      supports one channel and one target id.
      
      Currently the ib_srp driver does neither.  As a result scanning the
      SCSI bus via e.g. rescan-scsi-bus.sh causes many duplicate SCSI
      devices to be created. For each 0:0:L device, several duplicates are
      created with the same LUN number and with (C:I) != (0:0). Fix this by
      declaring that the ib_srp driver only supports one channel and one
      target id.
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Cc: <stable@kernel.org>
      Acked-by: David Dillow <dillowda@ornl.gov>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
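The duplicate-device effect can be reproduced with a small model of a bus scan. This is a sketch under assumptions, not the ib_srp or SCSI midlayer code: `struct host_limits`, `lun_responds()`, and `scan_host()` are hypothetical, and the bounds are modelled on the `max_channel` (inclusive), `max_id`, and `max_lun` (exclusive) fields of `struct Scsi_Host`. The modelled driver ignores channel and id, so every (channel, id) pair reports the same LUNs.

```c
/* Scan bounds, modelled on struct Scsi_Host: channel runs 0..max_channel,
 * id runs 0..max_id-1, lun runs 0..max_lun-1. */
struct host_limits {
    unsigned max_channel;
    unsigned max_id;
    unsigned max_lun;
};

/* A driver that ignores channel and id when answering INQUIRY: a LUN
 * "exists" on every channel:id pair as long as lun < luns_on_target. */
int lun_responds(unsigned channel, unsigned id, unsigned lun,
                 unsigned luns_on_target)
{
    (void)channel;
    (void)id;
    return lun < luns_on_target;
}

/* Probe every channel:id:lun triplet and count the devices created. */
unsigned scan_host(const struct host_limits *h, unsigned luns_on_target)
{
    unsigned found = 0;

    for (unsigned c = 0; c <= h->max_channel; c++)
        for (unsigned i = 0; i < h->max_id; i++)
            for (unsigned l = 0; l < h->max_lun; l++)
                if (lun_responds(c, i, l, luns_on_target))
                    found++;
    return found;
}
```

For a target with three LUNs, limits of two channels and two ids yield twelve devices in this model, while limiting the host to one channel and one target id yields the expected three.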
  14. 07 Jun, 2011 1 commit
  15. 26 May, 2011 1 commit
    • S
      RDMA/cma: Pass QP type into rdma_create_id() · b26f9b99
      Sean Hefty committed
      The RDMA CM currently infers the QP type from the port space selected
      by the user.  In the future (e.g. with RDMA_PS_IB or XRC), there may not
      be a 1-1 correspondence between port space and QP type.  For netlink
      export of RDMA CM state, we want to export the QP type to userspace,
      so it is cleaner to explicitly associate a QP type to an ID.
      
      Modify rdma_create_id() to allow the user to specify the QP type, and
      use it to make our selections of datagram versus connected mode.
      Signed-off-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
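The difference between inferring the QP type and storing it explicitly can be sketched with a toy model. Everything here is hypothetical: the `MODEL_*` enumerators stand in for the kernel's `RDMA_PS_*` and `IB_QPT_*` values, the inference mapping (TCP port space to a connected QP, UDP and IPoIB port spaces to a datagram QP) is an assumption about the earlier behaviour, and `model_create_id()` is not the real `rdma_create_id()` signature.

```c
enum model_port_space { MODEL_PS_TCP, MODEL_PS_UDP, MODEL_PS_IPOIB, MODEL_PS_IB };
enum model_qp_type { MODEL_QPT_RC, MODEL_QPT_UD, MODEL_QPT_UNKNOWN };

struct model_cm_id {
    enum model_port_space ps;
    enum model_qp_type qp_type; /* stored explicitly instead of inferred */
};

/* Earlier approach: derive the QP type from the port space. This breaks
 * down once a port space no longer maps to exactly one QP type. */
enum model_qp_type model_infer_qp_type(enum model_port_space ps)
{
    switch (ps) {
    case MODEL_PS_TCP:
        return MODEL_QPT_RC;
    case MODEL_PS_UDP:
    case MODEL_PS_IPOIB:
        return MODEL_QPT_UD;
    default:
        return MODEL_QPT_UNKNOWN; /* no 1-1 correspondence */
    }
}

/* Approach described in the commit: the caller names the QP type. */
struct model_cm_id model_create_id(enum model_port_space ps,
                                   enum model_qp_type qp_type)
{
    struct model_cm_id id = { ps, qp_type };
    return id;
}

/* Datagram versus connected mode is decided from the stored QP type. */
int model_is_datagram(const struct model_cm_id *id)
{
    return id->qp_type == MODEL_QPT_UD;
}
```

Because the QP type lives in the ID, a port space such as the modelled `MODEL_PS_IB` can be paired with either QP type, and the stored value can be reported to other consumers without re-deriving it.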
  16. 24 May, 2011 1 commit
  17. 20 Apr, 2011 1 commit
  18. 31 Mar, 2011 1 commit
  19. 16 Mar, 2011 6 commits
    • D
      IB/srp: try to use larger FMR sizes to cover our mappings · be8b9814
      David Dillow committed
      Now that we can get larger SG lists, we can take advantage of HCAs that
      allow us to use larger FMR sizes. In many cases, we can use up to 512
      entries, so start there and work our way down.
      Signed-off-by: David Dillow <dillowda@ornl.gov>
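"Start at 512 entries and work our way down" can be sketched as a retry loop. This is a hedged illustration: `pick_fmr_size()`, the callback type, and `fake_hca_create()` are hypothetical, and halving the size on each failure is an assumption, since the commit message only says the size is reduced.

```c
/* Hypothetical callback: returns 0 if a pool with this many pages per
 * FMR could be created, nonzero otherwise. */
typedef int (*pool_create_fn)(int pages_per_fmr, void *ctx);

/* Try the largest size first and step down until creation succeeds.
 * Returns the size that worked, or 0 if none did. */
int pick_fmr_size(int start, int min, pool_create_fn try_create, void *ctx)
{
    if (min < 1)
        min = 1; /* guarantees the loop terminates */

    for (int n = start; n >= min; n /= 2)
        if (try_create(n, ctx) == 0)
            return n;
    return 0;
}

/* Stand-in for an HCA that accepts at most *ctx pages per FMR. */
int fake_hca_create(int pages_per_fmr, void *ctx)
{
    int limit = *(int *)ctx;
    return pages_per_fmr <= limit ? 0 : -1;
}
```

With a modelled limit of 128 pages, a search starting at 512 fails twice and settles on 128. An HCA that rejects every size down to the minimum makes the function return 0.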
    • D
      IB/srp: add support for indirect tables that don't fit in SRP_CMD · c07d424d
      David Dillow committed
      This allows us to guarantee the ability to submit up to 8 MB requests
      based on the current value of SCSI_MAX_SG_CHAIN_SEGMENTS. While FMR will
      usually condense the requests into 8 SG entries, it is imperative that
      the target support external tables in case the FMR mapping fails or is
      not supported.
      
      We add a safety valve to allow targets without the needed support to
      reap the benefits of the large tables, but fail in a manner that lets
      the user know that the data didn't make it to the device. The user must
      add "allow_ext_sg=1" to the target parameters to indicate that the
      target has the needed support.
      
      If indirect_sg_entries is not specified in the module options, then
      the sg_tablesize for the target will default to cmd_sg_entries unless
      overridden by the target options.
      Signed-off-by: David Dillow <dillowda@ornl.gov>
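The 8 MB figure follows from simple arithmetic, sketched below. Both constants are assumptions rather than values stated in the commit message: 2048 for SCSI_MAX_SG_CHAIN_SEGMENTS and 4 KiB of data per S/G entry. `max_request_bytes()` is a hypothetical helper.

```c
#include <stdint.h>

#define ASSUMED_SG_CHAIN_SEGMENTS 2048u /* assumed SCSI_MAX_SG_CHAIN_SEGMENTS */
#define ASSUMED_BYTES_PER_ENTRY   4096u /* assumed one 4 KiB page per entry */

/* Upper bound on request size when every S/G entry maps a full page. */
uint64_t max_request_bytes(uint32_t sg_entries, uint32_t bytes_per_entry)
{
    return (uint64_t)sg_entries * bytes_per_entry;
}
```

Under these assumptions, 2048 entries of 4096 bytes each give 8388608 bytes, which is 8 MiB.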
    • D
      IB/srp: rework mapping engine to use multiple FMR entries · 8f26c9ff
      David Dillow committed
      Instead of forcing all of the S/G entries to fit in one FMR, and falling
      back to indirect descriptors if that fails, allow the use of as many
      FMRs as needed to map the request. This lays the groundwork for allowing
      indirect descriptor tables that are larger than can fit in the command
      IU, but should marginally improve performance now by reducing the number
      of indirect descriptors needed.
      
      We increase the minimum page size for the FMR pool to 4K, as larger
      pages help increase the coverage of each FMR, and it is rare that the
      kernel would send down a request with scattered 512 byte fragments.
      
      This patch also moves some of the target initialization code after the
      parsing of options, to keep it together with the new code that needs to
      allocate memory based on the options given.
      Signed-off-by: David Dillow <dillowda@ornl.gov>
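A rough way to see why several FMRs reduce the number of indirect descriptors is to count how many FMRs a request needs. The helper below is a simplified sketch: `fmrs_needed()` is hypothetical and assumes every page of the request is mappable and page-aligned, which real requests do not guarantee.

```c
/* Number of FMRs needed to cover `pages` pages when each FMR maps up to
 * `pages_per_fmr` pages (ceiling division). Returns 0 for an empty
 * request or an invalid FMR size. */
unsigned fmrs_needed(unsigned pages, unsigned pages_per_fmr)
{
    if (pages == 0 || pages_per_fmr == 0)
        return 0;
    return (pages + pages_per_fmr - 1) / pages_per_fmr;
}
```

In this model a 2048-page request with 512 pages per FMR needs four FMRs, and therefore four descriptors rather than one per S/G entry.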
    • D
      IB/srp: allow sg_tablesize to be set for each target · 49248644
      David Dillow committed
      Different configurations of target software allow differing max sizes of
      the command IU. Allowing this to be changed per-target allows all
      targets on an initiator to get an optimal setting.
      
      We deprecate srp_sg_tablesize and replace it with cmd_sg_entries in
      preparation for allowing more indirect descriptors than can fit in the
      IU.
      Signed-off-by: David Dillow <dillowda@ornl.gov>
    • D
      IB/srp: move IB CM setup completion into its own function · 961e0be8
      David Dillow committed
      This is to clean up prior to further changes.
      Signed-off-by: David Dillow <dillowda@ornl.gov>
    • D
      IB/srp: always avoid non-zero offsets into an FMR · 8c4037b5
      David Dillow committed
      It is unclear exactly how this code works around Mellanox SRP targets,
      or if the problem is on the target side or in the HCA itself. In an
      abundance of caution, we should always enable the workaround.
      Signed-off-by: David Dillow <dillowda@ornl.gov>
  20. 25 Feb, 2011 1 commit
  21. 21 Jan, 2011 1 commit
    • D
      kconfig: rename CONFIG_EMBEDDED to CONFIG_EXPERT · 6a108a14
      David Rientjes committed
      The meaning of CONFIG_EMBEDDED has long since been obsoleted; the option
      is used to configure any non-standard kernel with a much larger scope than
      only small devices.
      
      This patch renames the option to CONFIG_EXPERT in init/Kconfig and fixes
      references to the option throughout the kernel.  A new CONFIG_EMBEDDED
      option is added that automatically selects CONFIG_EXPERT when enabled and
      can be used in the future to isolate options that should only be
      considered for embedded systems (RISC architectures, SLOB, etc).
      
      Calling the option "EXPERT" more accurately represents its intention: only
      expert users who understand the impact of the configuration changes they
      are making should enable it.
      Reviewed-by: Ingo Molnar <mingo@elte.hu>
      Acked-by: David Woodhouse <david.woodhouse@intel.com>
      Signed-off-by: David Rientjes <rientjes@google.com>
      Cc: Greg KH <gregkh@suse.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Robin Holt <holt@sgi.com>
      Cc: <linux-arch@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  22. 17 Jan, 2011 1 commit
    • T
      RDMA: Update workqueue usage · f0626710
      Tejun Heo committed
      * ib_wq is added, which is used as the common workqueue for infiniband
        instead of the system workqueue.  All system workqueue usages
        including flush_scheduled_work() callers are converted to use and
        flush ib_wq.
      
      * cancel_delayed_work() + flush_scheduled_work() converted to
        cancel_delayed_work_sync().
      
      * qib_wq is removed and ib_wq is used instead.
      
      This is to prepare for deprecation of flush_scheduled_work().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
  23. 14 Jan, 2011 1 commit
  24. 13 Jan, 2011 1 commit
  25. 11 Jan, 2011 2 commits