1. 17 5月, 2012 1 次提交
    • D
      [SCSI] sd: limit the scope of the async probe domain · a7a20d10
      Dan Williams 提交于
      sd injects and synchronizes probe work on the global kernel-wide domain.
      This runs into conflict with PM that wants to perform resume actions in
      async context:
      
      [  494.237079] INFO: task kworker/u:3:554 blocked for more than 120 seconds.
      [  494.294396] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [  494.360809] kworker/u:3     D 0000000000000000     0   554      2 0x00000000
      [  494.420739]  ffff88012e4d3af0 0000000000000046 ffff88013200c160 ffff88012e4d3fd8
      [  494.484392]  ffff88012e4d3fd8 0000000000012500 ffff8801394ea0b0 ffff88013200c160
      [  494.548038]  ffff88012e4d3ae0 00000000000001e3 ffffffff81a249e0 ffff8801321c5398
      [  494.611685] Call Trace:
      [  494.632649]  [<ffffffff8149dd25>] schedule+0x5a/0x5c
      [  494.674687]  [<ffffffff8104b968>] async_synchronize_cookie_domain+0xb6/0x112
      [  494.734177]  [<ffffffff810461ff>] ? __init_waitqueue_head+0x50/0x50
      [  494.787134]  [<ffffffff8131a224>] ? scsi_remove_target+0x48/0x48
      [  494.837900]  [<ffffffff8104b9d9>] async_synchronize_cookie+0x15/0x17
      [  494.891567]  [<ffffffff8104ba49>] async_synchronize_full+0x54/0x70  <-- here we wait for async contexts to complete
      [  494.943783]  [<ffffffff8104b9f5>] ? async_synchronize_full_domain+0x1a/0x1a
      [  495.002547]  [<ffffffffa00114b1>] sd_remove+0x2c/0xa2 [sd_mod]
      [  495.051861]  [<ffffffff812fe94f>] __device_release_driver+0x86/0xcf
      [  495.104807]  [<ffffffff812fe9bd>] device_release_driver+0x25/0x32  <-- here we take device_lock()
      
      [  853.511341] INFO: task kworker/u:4:549 blocked for more than 120 seconds.
      [  853.568693] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [  853.635119] kworker/u:4     D ffff88013097b5d0     0   549      2 0x00000000
      [  853.695129]  ffff880132773c40 0000000000000046 ffff880130790000 ffff880132773fd8
      [  853.758990]  ffff880132773fd8 0000000000012500 ffff88013288a0b0 ffff880130790000
      [  853.822796]  0000000000000246 0000000000000040 ffff88013097b5c8 ffff880130790000
      [  853.886633] Call Trace:
      [  853.907631]  [<ffffffff8149dd25>] schedule+0x5a/0x5c
      [  853.949670]  [<ffffffff8149cc44>] __mutex_lock_common+0x220/0x351
      [  854.001225]  [<ffffffff81304bd7>] ? device_resume+0x58/0x1c4
      [  854.049082]  [<ffffffff81304bd7>] ? device_resume+0x58/0x1c4
      [  854.097011]  [<ffffffff8149ce48>] mutex_lock_nested+0x2f/0x36   <-- here we wait for device_lock()
      [  854.145591]  [<ffffffff81304bd7>] device_resume+0x58/0x1c4
      [  854.192066]  [<ffffffff81304d61>] async_resume+0x1e/0x45
      [  854.237019]  [<ffffffff8104bc93>] async_run_entry_fn+0xc6/0x173  <-- ...while running in async context
      
      Provide a 'scsi_sd_probe_domain' so that async probe actions actions can
      be flushed without regard for the state of PM, and allow for the resume
      path to handle devices that have transitioned from SDEV_QUIESCE to
      SDEV_DEL prior to resume.
      Acked-by: NAlan Stern <stern@rowland.harvard.edu>
      [alan: uplevel scsi_sd_probe_domain, clarify scsi_device_resume]
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      [jejb: remove unneeded config guards in include file]
      Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
      a7a20d10
  2. 20 2月, 2012 1 次提交
    • M
      [SCSI] Handle disk devices which can not process medium access commands · 18a4d0a2
      Martin K. Petersen 提交于
      We have experienced several devices which fail in a fashion we do not
      currently handle gracefully in SCSI. After a failure these devices will
      respond to the SCSI primary command set (INQUIRY, TEST UNIT READY, etc.)
      but any command accessing the storage medium will time out.
      
      The following patch adds an callback that can be used by upper level
      drivers to inspect the results of an error handling command. This in
      turn has been used to implement additional checking in the SCSI disk
      driver.
      
      If a medium access command fails twice but TEST UNIT READY succeeds both
      times in the subsequent error handling we will offline the device. The
      maximum number of failed commands required to take a device offline can
      be tweaked in sysfs.
      
      Also add a new error flag to scsi_debug which allows this scenario to be
      easily reproduced.
      
      [jejb: fix up integer parsing to use kstrtouint]
      Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
      18a4d0a2
  3. 17 11月, 2010 1 次提交
    • J
      SCSI host lock push-down · f281233d
      Jeff Garzik 提交于
      Move the mid-layer's ->queuecommand() invocation from being locked
      with the host lock to being unlocked to facilitate speeding up the
      critical path for drivers who don't need this lock taken anyway.
      
      The patch below presents a simple SCSI host lock push-down as an
      equivalent transformation.  No locking or other behavior should change
      with this patch.  All existing bugs and locking orders are preserved.
      
      Additionally, add one parameter to queuecommand,
      	struct Scsi_Host *
      and remove one parameter from queuecommand,
      	void (*done)(struct scsi_cmnd *)
      
      Scsi_Host* is a convenient pointer that most host drivers need anyway,
      and 'done' is redundant to struct scsi_cmnd->scsi_done.
      
      Minimal code disturbance was attempted with this change.  Most drivers
      needed only two one-line modifications for their host lock push-down.
      Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
      Acked-by: NJames Bottomley <James.Bottomley@suse.de>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f281233d
  4. 16 9月, 2010 1 次提交
  5. 01 5月, 2010 1 次提交
  6. 02 3月, 2010 1 次提交
  7. 19 1月, 2010 1 次提交
  8. 05 12月, 2009 1 次提交
    • V
      [SCSI] add queue_depth ramp up code · 4a84067d
      Vasu Dev 提交于
      Current FC HBA queue_depth ramp up code depends on last queue
      full time. The sdev already  has last_queue_full_time field to
      track last queue full time but stored value is truncated by
      last four bits.
      
      So this patch updates last_queue_full_time without truncating
      last 4 bits to store full value and then updates its only
      current usages in scsi_track_queue_full to ignore last four bits
      to keep current usages same while also use this field
      in added ramp up code.
      
      Adds scsi_handle_queue_ramp_up to ramp up queue_depth on
      successful completion of IO. The scsi_handle_queue_ramp_up will
      do ramp up on all luns of a target, just same as ramp down done
      on all luns on a target.
      
      The ramp up is skipped in case the change_queue_depth is not
      supported by LLD or already reached to added max_queue_depth.
      
      Updates added max_queue_depth on every new update to default
      queue_depth value.
      
      The ramp up is also skipped if lapsed time since either last
      queue ramp up or down is less than LLD specified
      queue_ramp_up_period.
      
      Adds queue_ramp_up_period to sysfs but only if change_queue_depth
      is supported since ramp up and queue_ramp_up_period is needed only
      in case change_queue_depth is supported first.
      
      Initializes queue_ramp_up_period to 120HZ jiffies as initial
      default value, it is same as used in existing lpfc and qla2xxx.
      
      -v2
       Combined all ramp code into this single patch.
      
      -v3
       Moves max_queue_depth initialization after slave_configure is
      called from after slave_alloc calling done. Also adjusted
      max_queue_depth check to skip ramp up if current queue_depth
      is >= max_queue_depth.
      
      -v4
       Changes sdev->queue_ramp_up_period unit to ms when using sysfs i/f
      to store or show its value.
      Signed-off-by: NVasu Dev <vasu.dev@intel.com>
      Tested-by: NChristof Schmitt <christof.schmitt@de.ibm.com>
      Tested-by: NGiridhar Malavali <giridhar.malavali@qlogic.com>
      Signed-off-by: NJames Bottomley <James.Bottomley@suse.de>
      4a84067d
  9. 02 10月, 2009 1 次提交
  10. 23 8月, 2009 1 次提交
  11. 09 6月, 2009 1 次提交
  12. 03 4月, 2009 1 次提交
  13. 13 3月, 2009 1 次提交
  14. 14 1月, 2009 1 次提交
  15. 03 1月, 2009 1 次提交
  16. 13 10月, 2008 1 次提交
    • M
      [SCSI] Add helper code so transport classes/driver can control queueing (v3) · f0c0a376
      Mike Christie 提交于
      SCSI-ml manages the queueing limits for the device and host, but
      does not do so at the target level. However something something similar
      can come in userful when a driver is transitioning a transport object to
      the the blocked state, becuase at that time we do not want to queue
      io and we do not want the queuecommand to be called again.
      
      The patch adds code similar to the exisiting SCSI_ML_*BUSY handlers.
      You can now return SCSI_MLQUEUE_TARGET_BUSY when we hit
      a transport level queueing issue like the hw cannot allocate some
      resource at the iscsi session/connection level, or the target has temporarily
      closed or shrunk the queueing window, or if we are transitioning
      to the blocked state.
      
      bnx2i, when they rework their firmware according to netdev
      developers requests, will also need to be able to limit queueing at this
      level. bnx2i will hook into libiscsi, but will allocate a scsi host per
      netdevice/hba, so unlike pure software iscsi/iser which is allocating
      a host per session, it cannot set the scsi_host->can_queue and return
      SCSI_MLQUEUE_HOST_BUSY to reflect queueing limits on the transport.
      
      The iscsi class/driver can also set a scsi_target->can_queue value which
      reflects the max commands the driver/class can support. For iscsi this
      reflects the number of commands we can support for each session due to
      session/connection hw limits, driver limits, and to also reflect the
      session/targets's queueing window.
      
      Changes:
      v1 - initial patch.
      v2 - Fix scsi_run_queue handling of multiple blocked targets.
      Previously we would break from the main loop if a device was added back on
      the starved list. We now run over the list and check if any target is
      blocked.
      v3 - Rediff for scsi-misc.
      Signed-off-by: NMike Christie <michaelc@cs.wisc.edu>
      Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
      f0c0a376
  17. 09 10月, 2008 1 次提交
  18. 04 10月, 2008 1 次提交
  19. 27 7月, 2008 2 次提交
  20. 05 6月, 2008 1 次提交
  21. 03 5月, 2008 1 次提交
  22. 30 4月, 2008 1 次提交
  23. 21 4月, 2008 1 次提交
    • F
      block: move the padding adjustment to blk_rq_map_sg · f18573ab
      FUJITA Tomonori 提交于
      blk_rq_map_user adjusts bi_size of the last bio. It breaks the rule
      that req->data_len (the true data length) is equal to sum(bio). It
      broke the scsi command completion code.
      
      commit e97a294e was introduced to fix
      the above issue. However, the partial completion code doesn't work
      with it. The commit is also a layer violation (scsi mid-layer should
      not know about the block layer's padding).
      
      This patch moves the padding adjustment to blk_rq_map_sg (suggested by
      James). The padding works like the drain buffer. This patch breaks the
      rule that req->data_len is equal to sum(sg), however, the drain buffer
      already broke it. So this patch just restores the rule that
      req->data_len is equal to sub(bio) without breaking anything new.
      
      Now when a low level driver needs padding, blk_rq_map_user and
      blk_rq_map_user_iov guarantee there's enough room for padding.
      blk_rq_map_sg can safely extend the last entry of a scatter list.
      
      blk_rq_map_sg must extend the last entry of a scatter list only for a
      request that got through bio_copy_user_iov. This patches introduces
      new REQ_COPY_USER flag.
      Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Tejun Heo <htejun@gmail.com>
      Cc: Mike Christie <michaelc@cs.wisc.edu>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
      f18573ab
  24. 08 4月, 2008 2 次提交
  25. 07 4月, 2008 1 次提交
    • H
      scsi: fix sense_slab/bio swapping livelock · 164fc5dc
      Hugh Dickins 提交于
      Since 2.6.25-rc7, I've been seeing an occasional livelock on one x86_64
      machine, copying kernel trees to tmpfs, paging out to swap.
      
      Signature: 6000 pages under writeback but never getting written; most
      tasks of interest trying to reclaim, but each get_swap_bio waiting for a
      bio in mempool_alloc's io_schedule_timeout(5*HZ); every five seconds an
      atomic page allocation failure report from kblockd failing to allocate a
      sense_buffer in __scsi_get_command.
      
      __scsi_get_command has a (one item) free_list to protect against this,
      but rc1's [SCSI] use dynamically allocated sense buffer
      de25deb1 upset that slightly.  When it
      fails to allocate from the separate sense_slab, instead of giving up, it
      must fall back to the command free_list, which is sure to have a
      sense_buffer attached.
      
      Either my earlier -rc testing missed this, or there's some recent
      contributory factor.  One very significant factor is SLUB, which merges
      slab caches when it can, and on 64-bit happens to merge both bio cache
      and sense_slab cache into kmalloc's 128-byte cache: so that under this
      swapping load, bios above are liable to gobble up all the slots needed
      for scsi_cmnd sense_buffers below.
      
      That's disturbing behaviour, and I tried a few things to fix it.  Adding
      a no-op constructor to the sense_slab inhibits SLUB from merging it, and
      stops all the allocation failures I was seeing; but it's rather a hack,
      and perhaps in different configurations we have other caches on the
      swapout path which are ill-merged.
      
      Another alternative is to revert the separate sense_slab, using
      cache-line-aligned sense_buffer allocated beyond scsi_cmnd from the one
      kmem_cache; but that might waste more memory, and is only a way of
      diverting around the known problem.
      
      While I don't like seeing the allocation failures, and hate the idea of
      all those bios piled up above a scsi host working one by one, it does
      seem to emerge fairly soon with the livelock fix.  So lacking better
      ideas, stick with that one clear fix for now.
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Peter Zijlstra <a.p.ziljstra@chello.nl>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      164fc5dc
  26. 05 3月, 2008 1 次提交
  27. 08 2月, 2008 1 次提交
  28. 31 1月, 2008 1 次提交
    • B
      [SCSI] implement scsi_data_buffer · 30b0c37b
      Boaz Harrosh 提交于
      In preparation for bidi we abstract all IO members of scsi_cmnd,
      that will need to duplicate, into a substructure.
      
      - Group all IO members of scsi_cmnd into a scsi_data_buffer
        structure.
      - Adjust accessors to new members.
      - scsi_{alloc,free}_sgtable receive a scsi_data_buffer instead of
        scsi_cmnd. And work on it.
      - Adjust scsi_init_io() and  scsi_release_buffers() for above
        change.
      - Fix other parts of scsi_lib/scsi.c to members migration. Use
        accessors where appropriate.
      
      - fix Documentation about scsi_cmnd in scsi_host.h
      
      - scsi_error.c
        * Changed needed members of struct scsi_eh_save.
        * Careful considerations in scsi_eh_prep/restore_cmnd.
      
      - sd.c and sr.c
        * sd and sr would adjust IO size to align on device's block
          size so code needs to change once we move to scsi_data_buff
          implementation.
        * Convert code to use scsi_for_each_sg
        * Use data accessors where appropriate.
      
      - tgt: convert libsrp to use scsi_data_buffer
      
      - isd200: This driver still bangs on scsi_cmnd IO members,
        so need changing
      
      [jejb: rebased on top of sg_table patches fixed up conflicts
      and used the synergy to eliminate use_sg and sg_count]
      Signed-off-by: NBoaz Harrosh <bharrosh@panasas.com>
      Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
      30b0c37b
  29. 24 1月, 2008 2 次提交
  30. 12 1月, 2008 3 次提交
  31. 07 1月, 2008 1 次提交
    • L
      Revert "scsi: revert "[SCSI] Get rid of scsi_cmnd->done"" · 7b3d9545
      Linus Torvalds 提交于
      This reverts commit ac40532e, which gets
      us back the original cleanup of 6f5391c2.
      
      It turns out that the bug that was triggered by that commit was
      apparently not actually triggered by that commit at all, and just the
      testing conditions had changed enough to make it appear to be due to it.
      
      The real problem seems to have been found by Peter Osterlund:
      
        "pktcdvd sets it [block device size] when opening the /dev/pktcdvd
         device, but when the drive is later opened as /dev/scd0, there is
         nothing that sets it back.  (Btw, 40944 is possible if the disk is a
         CDRW that was formatted with "cdrwtool -m 10236".)
      
         The problem is that pktcdvd opens the cd device in non-blocking mode
         when pktsetup is run, and doesn't close it again until pktsetup -d is
         run.  The effect is that if you meanwhile open the cd device,
         blkdev.c:do_open() doesn't call bd_set_size() because
         bdev->bd_openers is non-zero."
      
      In particular, to repeat the bug (regardless of whether commit
      6f5391c2 is applied or not):
      
        " 1. Start with an empty drive.
          2. pktsetup 0 /dev/scd0
          3. Insert a CD containing an isofs filesystem.
          4. mount /dev/pktcdvd/0 /mnt/tmp
          5. umount /mnt/tmp
          6. Press the eject button.
          7. Insert a DVD containing a non-writable filesystem.
          8. mount /dev/scd0 /mnt/tmp
          9. find /mnt/tmp -type f -print0 | xargs -0 sha1sum >/dev/null
          10. If the DVD contains data beyond the physical size of a CD, you
              get I/O errors in the terminal, and dmesg reports lots of
              "attempt to access beyond end of device" errors."
      
      which in turn is because the nested open after the media change won't
      cause the size to be set properly (because the original open still holds
      the block device, and we only do the bd_set_size() when we don't have
      other people holding the device open).
      
      The proper fix for that is probably to just do something like
      
      	bdev->bd_inode->i_size = (loff_t)get_capacity(disk)<<9;
      
      in fs/block_dev.c:do_open() even for the cases where we're not the
      original opener (but *not* call bd_set_size(), since that will also
      change the block size of the device).
      
      Cc: Peter Osterlund <petero2@telia.com>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Matthew Wilcox <matthew@wil.cx>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7b3d9545
  32. 03 1月, 2008 1 次提交
  33. 11 12月, 2007 1 次提交
  34. 13 10月, 2007 2 次提交