1. 25 10月, 2013 4 次提交
  2. 23 10月, 2013 1 次提交
    • A
      [SCSI] sd: call blk_pm_runtime_init before add_disk · 10c580e4
      Aaron Lu 提交于
      Sujit has found a race condition that would make q->nr_pending
      unbalanced, it occurs as Sujit explained:
      
      "
      sd_probe_async() ->
      	add_disk() ->
      		disk_add_event() ->
      			schedule(disk_events_workfn)
      	sd_revalidate_disk()
      	blk_pm_runtime_init()
      return;
      
      Let's say the disk_events_workfn() calls sd_check_events() which tries
      to send test_unit_ready() and because of sd_revalidate_disk() trying to
      send another commands the test_unit_ready() might be re-queued as the
      tagged command queuing is disabled.
      
      So the race condition is -
      
      Thread 1 			  |		Thread 2
      sd_revalidate_disk()		  |	sd_check_events()
      ...nr_pending = 0 as q->dev = NULL|	scsi_queue_insert()
      blk_runtime_pm_init()		  | 	blk_pm_requeue_request() ->
      				  |	nr_pending = -1 since
      				  |	q->dev != NULL
      "
      
      The problem is, the test_unit_ready request doesn't get counted the
      first time it is queued, so the later decrement of q->nr_pending in
      blk_pm_requeue_request makes it unbalanced.
      
      Fix this by calling blk_pm_runtime_init before add_disk so that all
      requests initiated there will all be counted.
      Signed-off-by: NAaron Lu <aaron.lu@intel.com>
      Reported-and-tested-by: NSujit Reddy Thumma <sthumma@codeaurora.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
      10c580e4
  3. 12 9月, 2013 1 次提交
  4. 22 8月, 2013 1 次提交
  5. 23 7月, 2013 1 次提交
    • E
      [SCSI] sd: fix crash when UA received on DIF enabled device · 085b513f
      Ewan D. Milne 提交于
      sd_prep_fn will allocate a larger CDB for the command via mempool_alloc
      for devices using DIF type 2 protection.  This CDB was being freed
      in sd_done, which results in a kernel crash if the command is retried
      due to a UNIT ATTENTION.  This change moves the code to free the larger
      CDB into sd_unprep_fn instead, which is invoked after the request is
      complete.
      
      It is no longer necessary to call scsi_print_command separately for
      this case as the ->cmnd will no longer be NULL in the normal code path.
      
      Also removed conditional test for DIF type 2 when freeing the larger
      CDB because the protection_type could have been changed via sysfs while
      the command was executing.
      Signed-off-by: NEwan D. Milne <emilne@redhat.com>
      Acked-by: NMartin K. Petersen <martin.petersen@oracle.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
      085b513f
  6. 04 7月, 2013 1 次提交
  7. 27 6月, 2013 1 次提交
    • M
      [SCSI] sd: Update WRITE SAME heuristics · 66c28f97
      Martin K. Petersen 提交于
      SATA drives located behind a SAS controller would incorrectly receive
      WRITE SAME commands. Tweak the heuristics so that:
      
       - If REPORT SUPPORTED OPERATION CODES is provided we will use that to
         choose between WRITE SAME(16), WRITE SAME(10) and disabled. This also
         fixes an issue with the old code which would issue WRITE SAME(10)
         despite the command not being whitelisted in REPORT SUPPORTED
         OPERATION CODES.
      
       - If REPORT SUPPORTED OPERATION CODES is not provided we will fall back
         to WRITE SAME(10) unless the device has an ATA Information VPD page.
         The assumption is that a SATL which is smart enough to implement
         WRITE SAME would also provide REPORT SUPPORTED OPERATION CODES.
      
      To facilitate the new heuristics scsi_report_opcode() has been modified
      to so we can distinguish between "operation not supported" and "RSOC not
      supported".
      Reported-by: NH. Peter Anvin <hpa@zytor.com>
      Tested-by: NBernd Schubert <bernd.schubert@itwm.fraunhofer.de>
      Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
      66c28f97
  8. 26 6月, 2013 1 次提交
  9. 05 6月, 2013 1 次提交
    • H
      [SCSI] sd: avoid deadlocks when running under multipath · 0761df9c
      Hannes Reinecke 提交于
      When multipathed systems run into an all-paths-down scenario
      all devices might be dropped, too. This causes 'del_gendisk'
      to be called, which will unregister the kobj_map->probe()
      function for all disk device numbers.
      When the device comes back the default ->probe() function
      is run which will call __request_module(), which will
      deadlock.
      As 'del_gendisk' typically does _not_ trigger a module unload
      the default ->probe() function is pointless anyway.
      This patch implements a dummy ->probe() function, which will
      just return NULL if the disk is not registered.
      This will avoid the deadlock. Plus it'll speed up device
      scanning.
      Signed-off-by: NHannes Reinecke <hare@suse.de>
      Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
      0761df9c
  10. 07 5月, 2013 3 次提交
  11. 03 5月, 2013 1 次提交
    • J
      [SCSI] sd: fix array cache flushing bug causing performance problems · 39c60a09
      James Bottomley 提交于
      Some arrays synchronize their full non volatile cache when the sd driver sends
      a SYNCHRONIZE CACHE command.  Unfortunately, they can have Terrabytes of this
      and we send a SYNCHRONIZE CACHE for every barrier if an array reports it has a
      writeback cache.  This leads to massive slowdowns on journalled filesystems.
      
      The fix is to allow userspace to turn off the writeback cache setting as a
      temporary measure (i.e. without doing the MODE SELECT to write it back to the
      device), so even though the device reported it has a writeback cache, the
      user, knowing that the cache is non volatile and all they care about is
      filesystem correctness, can turn that bit off in the kernel and avoid the
      performance ruinous (and safety irrelevant) SYNCHRONIZE CACHE commands.
      
      The way you do this is add a 'temporary' prefix when performing the usual
      cache setting operations, so
      
      echo temporary write through > /sys/class/scsi_disk/<disk>/cache_type
      Reported-by: NRic Wheeler <rwheeler@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
      39c60a09
  12. 30 11月, 2012 2 次提交
    • A
      [SCSI] sd: update sd to use the new pm callbacks · 691e3d31
      Aaron Lu 提交于
      Update sd driver to use the callbacks defined in dev_pm_ops.
      
      sd_freeze is NULL, the bus level callback has taken care of quiescing
      the device so there should be nothing needs to be done here.
      Consequently, sd_thaw is not needed here either.
      
      suspend, poweroff and runtime suspend share the same routine sd_suspend,
      which will sync flush and then stop the drive, this is the same as before.
      
      resume, restore and runtime resume share the same routine sd_resume,
      which will start the drive by putting it into active power state, this
      is also the same as before.
      Signed-off-by: NAaron Lu <aaron.lu@intel.com>
      Acked-by: NAlan Stern <stern@rowland.harvard.edu>
      Acked-by: NRafael J. Wysocki <rjw@sisk.pl>
      Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
      691e3d31
    • A
      [SCSI] sd: put to stopped power state when runtime suspend · a0147563
      Aaron Lu 提交于
      When device is runtime suspended, put it to stopped power state to save
      some power.
      
      This will also make the behaviour consistent with what the scsi_pm.c
      thinks about sd as the comment says:
      sd treats runtime suspend, system suspend and system hibernate identical.
      With this patch, it is now identical.
      And sd_shutdown will also do nothing when it finds the device has been
      runtime suspended, if we do not spin down the disk in runtime suspend
      by putting it into stopped power state, the disk will be shut down
      incorrectly.
      And the the same problem can be solved for runtime power off after
      runtime suspended case by this change.
      
      With the current runtime scheme for disk, it will only be runtime
      suspended when no process opens the disk, so this shouldn't happen a
      lot, which makes it acceptable to spin down the disk when runtime
      suspended. If some day a more aggressive runtime scheme is used, like
      the 'request based runtime pm for disk' that Alan Stern and Lin Ming
      has been working, we can introduce some policy to control this. But for
      now, make it simple and correct by spinning down the disk.
      Signed-off-by: NAaron Lu <aaron.lu@intel.com>
      Acked-by: NAlan Stern <stern@rowland.harvard.edu>
      Acked-by: NRafael J. Wysocki <rjw@sisk.pl>
      Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
      a0147563
  13. 27 11月, 2012 2 次提交
  14. 14 11月, 2012 2 次提交
    • M
      [SCSI] sd: Implement support for WRITE SAME · 5db44863
      Martin K. Petersen 提交于
      Implement support for WRITE SAME(10) and WRITE SAME(16) in the SCSI disk
      driver.
      
       - We set the default maximum to 0xFFFF because there are several
         devices out there that only support two-byte block counts even with
         WRITE SAME(16). We only enable transfers bigger than 0xFFFF if the
         device explicitly reports MAXIMUM WRITE SAME LENGTH in the BLOCK
         LIMITS VPD.
      
       - max_write_same_blocks can be overriden per-device basis in sysfs.
      
       - The UNMAP discovery heuristics remain unchanged but the discard
         limits are tweaked to match the "real" WRITE SAME commands.
      
       - In the error handling logic we now distinguish between WRITE SAME
         with and without UNMAP set.
      
      The discovery process heuristics are:
      
       - If the device reports a SCSI level of SPC-3 or greater we'll issue
         READ SUPPORTED OPERATION CODES to find out whether WRITE SAME(16) is
         supported. If that's the case we will use it.
      
       - If the device supports the block limits VPD and reports a MAXIMUM
         WRITE SAME LENGTH bigger than 0xFFFF we will use WRITE SAME(16).
      
       - Otherwise we will use WRITE SAME(10) unless the target LBA is beyond
         0xFFFFFFFF or the block count exceeds 0xFFFF.
      
       - no_write_same is set for ATA, FireWire and USB.
      Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
      Reviewed-by: NMike Snitzer <snitzer@redhat.com>
      Reviewed-by: NJeff Garzik <jgarzik@redhat.com>
      Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
      5db44863
    • M
      [SCSI] sd: Permit merged discard requests · 26e85fcd
      Martin K. Petersen 提交于
      Support requests with more than one bio payload for discards. The total
      number of bytes to be discarded is stored in req->__data_len and used in
      sd_done() to complete the I/O.
      Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
      Reviewed-by: NMike Snitzer <snitzer@redhat.com>
      Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
      26e85fcd
  15. 24 9月, 2012 3 次提交
  16. 20 7月, 2012 2 次提交
  17. 17 7月, 2012 1 次提交
  18. 23 6月, 2012 1 次提交
    • A
      SCSI & usb-storage: add try_rc_10_first flag · 6a0bdffa
      Alan Stern 提交于
      Several bug reports have been received recently for USB mass-storage
      devices that don't handle READ CAPACITY(16) commands properly.  They
      report bogus sizes, in some cases becoming unusable as a result.
      
      The bugs were triggered by commit
      09b6b51b (SCSI & usb-storage: add
      flags for VPD pages and REPORT LUNS), which caused usb-storage to stop
      overriding the SCSI level reported by devices.  By default, the sd
      driver will try READ CAPACITY(16) first for any device whose level is
      above SCSI_SPC_2.
      
      It seems likely that any device large enough to require the use of
      READ CAPACITY(16) (i.e., 2 TB or more) would be able to handle READ
      CAPACITY(10) commands properly.  Indeed, I don't know of any devices
      that don't handle READ CAPACITY(10) properly.
      
      Therefore this patch (as1559) adds a new flag telling the sd driver
      to try READ CAPACITY(10) before READ CAPACITY(16), and sets this flag
      for every USB mass-storage device.  If a device really is larger than
      2 TB, sd will fall back to READ CAPACITY(16) just as it used to.
      
      This fixes Bugzilla #43391.
      Signed-off-by: NAlan Stern <stern@rowland.harvard.edu>
      Acked-by: NHans de Goede <hdegoede@redhat.com>
      CC: "James E.J. Bottomley" <JBottomley@parallels.com>
      CC: Matthew Dharm <mdharm-usb@one-eyed-alien.net>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6a0bdffa
  19. 17 5月, 2012 1 次提交
    • D
      [SCSI] sd: limit the scope of the async probe domain · a7a20d10
      Dan Williams 提交于
      sd injects and synchronizes probe work on the global kernel-wide domain.
      This runs into conflict with PM that wants to perform resume actions in
      async context:
      
      [  494.237079] INFO: task kworker/u:3:554 blocked for more than 120 seconds.
      [  494.294396] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [  494.360809] kworker/u:3     D 0000000000000000     0   554      2 0x00000000
      [  494.420739]  ffff88012e4d3af0 0000000000000046 ffff88013200c160 ffff88012e4d3fd8
      [  494.484392]  ffff88012e4d3fd8 0000000000012500 ffff8801394ea0b0 ffff88013200c160
      [  494.548038]  ffff88012e4d3ae0 00000000000001e3 ffffffff81a249e0 ffff8801321c5398
      [  494.611685] Call Trace:
      [  494.632649]  [<ffffffff8149dd25>] schedule+0x5a/0x5c
      [  494.674687]  [<ffffffff8104b968>] async_synchronize_cookie_domain+0xb6/0x112
      [  494.734177]  [<ffffffff810461ff>] ? __init_waitqueue_head+0x50/0x50
      [  494.787134]  [<ffffffff8131a224>] ? scsi_remove_target+0x48/0x48
      [  494.837900]  [<ffffffff8104b9d9>] async_synchronize_cookie+0x15/0x17
      [  494.891567]  [<ffffffff8104ba49>] async_synchronize_full+0x54/0x70  <-- here we wait for async contexts to complete
      [  494.943783]  [<ffffffff8104b9f5>] ? async_synchronize_full_domain+0x1a/0x1a
      [  495.002547]  [<ffffffffa00114b1>] sd_remove+0x2c/0xa2 [sd_mod]
      [  495.051861]  [<ffffffff812fe94f>] __device_release_driver+0x86/0xcf
      [  495.104807]  [<ffffffff812fe9bd>] device_release_driver+0x25/0x32  <-- here we take device_lock()
      
      [  853.511341] INFO: task kworker/u:4:549 blocked for more than 120 seconds.
      [  853.568693] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [  853.635119] kworker/u:4     D ffff88013097b5d0     0   549      2 0x00000000
      [  853.695129]  ffff880132773c40 0000000000000046 ffff880130790000 ffff880132773fd8
      [  853.758990]  ffff880132773fd8 0000000000012500 ffff88013288a0b0 ffff880130790000
      [  853.822796]  0000000000000246 0000000000000040 ffff88013097b5c8 ffff880130790000
      [  853.886633] Call Trace:
      [  853.907631]  [<ffffffff8149dd25>] schedule+0x5a/0x5c
      [  853.949670]  [<ffffffff8149cc44>] __mutex_lock_common+0x220/0x351
      [  854.001225]  [<ffffffff81304bd7>] ? device_resume+0x58/0x1c4
      [  854.049082]  [<ffffffff81304bd7>] ? device_resume+0x58/0x1c4
      [  854.097011]  [<ffffffff8149ce48>] mutex_lock_nested+0x2f/0x36   <-- here we wait for device_lock()
      [  854.145591]  [<ffffffff81304bd7>] device_resume+0x58/0x1c4
      [  854.192066]  [<ffffffff81304d61>] async_resume+0x1e/0x45
      [  854.237019]  [<ffffffff8104bc93>] async_run_entry_fn+0xc6/0x173  <-- ...while running in async context
      
      Provide a 'scsi_sd_probe_domain' so that async probe actions actions can
      be flushed without regard for the state of PM, and allow for the resume
      path to handle devices that have transitioned from SDEV_QUIESCE to
      SDEV_DEL prior to resume.
      Acked-by: NAlan Stern <stern@rowland.harvard.edu>
      [alan: uplevel scsi_sd_probe_domain, clarify scsi_device_resume]
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      [jejb: remove unneeded config guards in include file]
      Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
      a7a20d10
  20. 27 3月, 2012 1 次提交
  21. 19 3月, 2012 1 次提交
    • L
      [SCSI] sd: Add runtime pm in the sd_check_events() · 4e2247b2
      Lan Tianyu 提交于
      The sd_check_event() will be called periodly even when the device is in the
      suspended status to check media event. The scsi_test_unit_ready() in the
      sd_check_event() will issue scsi cmd request. Issuing scsi request when the
      device is in the suspeneded status will cause problem. For example, when a usb
      flash disk in the suspended status, scsi_test_unit_ready() issues a scsi
      request. The request will be returned as failed because the usb device is not
      active. The patch adds scsi_autopm_get_device() and scsi_autopm_put_device()
      around scsi_test_unit_ready() in the sd_check_event() to resolve such problem.
      Signed-off-by: NLan Tianyu <tianyu.lan@intel.com>
      Acked-by: NAlan Stern <stern@rowland.harvard.edu>
      Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
      4e2247b2
  22. 20 2月, 2012 1 次提交
    • M
      [SCSI] Handle disk devices which can not process medium access commands · 18a4d0a2
      Martin K. Petersen 提交于
      We have experienced several devices which fail in a fashion we do not
      currently handle gracefully in SCSI. After a failure these devices will
      respond to the SCSI primary command set (INQUIRY, TEST UNIT READY, etc.)
      but any command accessing the storage medium will time out.
      
      The following patch adds an callback that can be used by upper level
      drivers to inspect the results of an error handling command. This in
      turn has been used to implement additional checking in the SCSI disk
      driver.
      
      If a medium access command fails twice but TEST UNIT READY succeeds both
      times in the subsequent error handling we will offline the device. The
      maximum number of failed commands required to take a device offline can
      be tweaked in sysfs.
      
      Also add a new error flag to scsi_debug which allows this scenario to be
      easily reproduced.
      
      [jejb: fix up integer parsing to use kstrtouint]
      Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
      18a4d0a2
  23. 19 2月, 2012 1 次提交
  24. 09 2月, 2012 1 次提交
    • A
      SCSI & usb-storage: add flags for VPD pages and REPORT LUNS · 09b6b51b
      Alan Stern 提交于
      This patch (as1507) adds a skip_vpd_pages flag to struct scsi_device
      and a no_report_luns flag to struct scsi_target.  The first is used to
      control whether sd will look at VPD pages for information on block
      provisioning, limits, and characteristics.  The second prevents
      scsi_report_lun_scan() from issuing a REPORT LUNS command.
      
      The patch also modifies usb-storage to set the new flag bits for all
      USB devices and targets, and to stop adjusting the scsi_level value.
      
      Historically we have seen that USB mass-storage devices often don't
      support VPD pages or REPORT LUNS properly.  Until now we have avoided
      these things by setting the scsi_level to SCSI_2 for all USB devices.
      But this has the side effect of storing the LUN bits into the second
      byte of each CDB, and now we have a report of a device which doesn't
      like that.  The best solution is to stop abusing scsi_level and
      instead have separate flags for VPD pages and REPORT LUNS.
      Signed-off-by: NAlan Stern <stern@rowland.harvard.edu>
      Reported-by: NPerry Wagle <wagle@mac.com>
      CC: Matthew Dharm <mdharm-usb@one-eyed-alien.net>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      09b6b51b
  25. 15 1月, 2012 2 次提交
    • P
      block: fail SCSI passthrough ioctls on partition devices · 0bfc96cb
      Paolo Bonzini 提交于
      Linux allows executing the SG_IO ioctl on a partition or LVM volume, and
      will pass the command to the underlying block device.  This is
      well-known, but it is also a large security problem when (via Unix
      permissions, ACLs, SELinux or a combination thereof) a program or user
      needs to be granted access only to part of the disk.
      
      This patch lets partitions forward a small set of harmless ioctls;
      others are logged with printk so that we can see which ioctls are
      actually sent.  In my tests only CDROM_GET_CAPABILITY actually occurred.
      Of course it was being sent to a (partition on a) hard disk, so it would
      have failed with ENOTTY and the patch isn't changing anything in
      practice.  Still, I'm treating it specially to avoid spamming the logs.
      
      In principle, this restriction should include programs running with
      CAP_SYS_RAWIO.  If for example I let a program access /dev/sda2 and
      /dev/sdb, it still should not be able to read/write outside the
      boundaries of /dev/sda2 independent of the capabilities.  However, for
      now programs with CAP_SYS_RAWIO will still be allowed to send the
      ioctls.  Their actions will still be logged.
      
      This patch does not affect the non-libata IDE driver.  That driver
      however already tests for bd != bd->bd_contains before issuing some
      ioctl; it could be restricted further to forbid these ioctls even for
      programs running with CAP_SYS_ADMIN/CAP_SYS_RAWIO.
      
      Cc: linux-scsi@vger.kernel.org
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: James Bottomley <JBottomley@parallels.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      [ Make it also print the command name when warning - Linus ]
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0bfc96cb
    • P
      block: add and use scsi_blk_cmd_ioctl · 577ebb37
      Paolo Bonzini 提交于
      Introduce a wrapper around scsi_cmd_ioctl that takes a block device.
      
      The function will then be enhanced to detect partition block devices
      and, in that case, subject the ioctls to whitelisting.
      
      Cc: linux-scsi@vger.kernel.org
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: James Bottomley <JBottomley@parallels.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      577ebb37
  26. 09 1月, 2012 1 次提交
  27. 30 10月, 2011 1 次提交
  28. 29 8月, 2011 1 次提交