1. 19 12月, 2017 1 次提交
  2. 15 12月, 2017 1 次提交
  3. 08 12月, 2017 1 次提交
  4. 17 10月, 2017 1 次提交
  5. 07 10月, 2017 1 次提交
  6. 26 9月, 2017 1 次提交
  7. 30 8月, 2017 1 次提交
    • B
      scsi: Rework handling of scsi_device.vpd_pg8[03] · ccf1e004
      Bart Van Assche 提交于
      Introduce struct scsi_vpd for the VPD page length, data and the RCU head
      that will be used to free the VPD data. Use kfree_rcu() instead of
      kfree() to free VPD data. Move the VPD buffer pointer check inside the
      RCU read lock in the sysfs code. Only annotate pointers that are shared
      across threads with __rcu. Use rcu_dereference() when dereferencing an
      RCU pointer. This patch suppresses about twenty sparse complaints about
      the vpd_pg8[03] pointers. This patch also fixes a race condition, namely
      that updating of the VPD pointers and length variables in struct
      scsi_device was not atomic with reference to the code reading these
      variables. See also "Does the update code tolerate concurrent accesses?"
      in Documentation/RCU/checklist.txt.
      
      Fixes: commit 09e2b0b1 ("scsi: rescan VPD attributes")
      Signed-off-by: NBart Van Assche <bart.vanassche@wdc.com>
      Acked-by: NHannes Reinecke <hare@suse.de>
      Reviewed-by: NShane Seymour <shane.seymour@hpe.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Johannes Thumshirn <jthumshirn@suse.de>
      Cc: Shane Seymour <shane.seymour@hpe.com>
      Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
      ccf1e004
  8. 26 8月, 2017 1 次提交
  9. 25 8月, 2017 1 次提交
  10. 02 7月, 2017 1 次提交
    • E
      scsi: Add STARGET_CREATED_REMOVE state to scsi_target_state · f9279c96
      Ewan D. Milne 提交于
      The addition of the STARGET_REMOVE state had the side effect of
      introducing a race condition that can cause a crash.
      
      scsi_target_reap_ref_release() checks the starget->state to
      see if it still in STARGET_CREATED, and if so, skips calling
      transport_remove_device() and device_del(), because the starget->state
      is only set to STARGET_RUNNING after scsi_target_add() has called
      device_add() and transport_add_device().
      
      However, if an rport loss occurs while a target is being scanned,
      it can happen that scsi_remove_target() will be called while the
      starget is still in the STARGET_CREATED state.  In this case, the
      starget->state will be set to STARGET_REMOVE, and as a result,
      scsi_target_reap_ref_release() will take the wrong path.  The end
      result is a panic:
      
      [ 1255.356653] Oops: 0000 [#1] SMP
      [ 1255.360154] Modules linked in: x86_pkg_temp_thermal kvm_intel kvm irqbypass crc32c_intel ghash_clmulni_i
      [ 1255.393234] CPU: 5 PID: 149 Comm: kworker/u96:4 Tainted: G        W       4.11.0+ #8
      [ 1255.401879] Hardware name: Dell Inc. PowerEdge R320/08VT7V, BIOS 2.0.22 11/19/2013
      [ 1255.410327] Workqueue: scsi_wq_6 fc_scsi_scan_rport [scsi_transport_fc]
      [ 1255.417720] task: ffff88060ca8c8c0 task.stack: ffffc900048a8000
      [ 1255.424331] RIP: 0010:kernfs_find_ns+0x13/0xc0
      [ 1255.429287] RSP: 0018:ffffc900048abbf0 EFLAGS: 00010246
      [ 1255.435123] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
      [ 1255.443083] RDX: 0000000000000000 RSI: ffffffff8188d659 RDI: 0000000000000000
      [ 1255.451043] RBP: ffffc900048abc10 R08: 0000000000000000 R09: 0000012433fe0025
      [ 1255.459005] R10: 0000000025e5a4b5 R11: 0000000025e5a4b5 R12: ffffffff8188d659
      [ 1255.466972] R13: 0000000000000000 R14: ffff8805f55e5088 R15: 0000000000000000
      [ 1255.474931] FS:  0000000000000000(0000) GS:ffff880616b40000(0000) knlGS:0000000000000000
      [ 1255.483959] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 1255.490370] CR2: 0000000000000068 CR3: 0000000001c09000 CR4: 00000000000406e0
      [ 1255.498332] Call Trace:
      [ 1255.501058]  kernfs_find_and_get_ns+0x31/0x60
      [ 1255.505916]  sysfs_unmerge_group+0x1d/0x60
      [ 1255.510498]  dpm_sysfs_remove+0x22/0x60
      [ 1255.514783]  device_del+0xf4/0x2e0
      [ 1255.518577]  ? device_remove_file+0x19/0x20
      [ 1255.523241]  attribute_container_class_device_del+0x1a/0x20
      [ 1255.529457]  transport_remove_classdev+0x4e/0x60
      [ 1255.534607]  ? transport_add_class_device+0x40/0x40
      [ 1255.540046]  attribute_container_device_trigger+0xb0/0xc0
      [ 1255.546069]  transport_remove_device+0x15/0x20
      [ 1255.551025]  scsi_target_reap_ref_release+0x25/0x40
      [ 1255.556467]  scsi_target_reap+0x2e/0x40
      [ 1255.560744]  __scsi_scan_target+0xaa/0x5b0
      [ 1255.565312]  scsi_scan_target+0xec/0x100
      [ 1255.569689]  fc_scsi_scan_rport+0xb1/0xc0 [scsi_transport_fc]
      [ 1255.576099]  process_one_work+0x14b/0x390
      [ 1255.580569]  worker_thread+0x4b/0x390
      [ 1255.584651]  kthread+0x109/0x140
      [ 1255.588251]  ? rescuer_thread+0x330/0x330
      [ 1255.592730]  ? kthread_park+0x60/0x60
      [ 1255.596815]  ret_from_fork+0x29/0x40
      [ 1255.600801] Code: 24 08 48 83 42 40 01 5b 41 5c 5d c3 66 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90
      [ 1255.621876] RIP: kernfs_find_ns+0x13/0xc0 RSP: ffffc900048abbf0
      [ 1255.628479] CR2: 0000000000000068
      [ 1255.632756] ---[ end trace 34a69ba0477d036f ]---
      
      Fix this by adding another scsi_target state STARGET_CREATED_REMOVE
      to distinguish this case.
      
      Fixes: f05795d3 ("scsi: Add intermediate STARGET_REMOVE state to scsi_target_state")
      Reported-by: NDavid Jeffery <djeffery@redhat.com>
      Signed-off-by: NEwan D. Milne <emilne@redhat.com>
      Cc: <stable@vger.kernel.org>
      Reviewed-by: NLaurence Oberman <loberman@redhat.com>
      Tested-by: NLaurence Oberman <loberman@redhat.com>
      Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
      f9279c96
  11. 13 6月, 2017 2 次提交
    • B
      scsi: Make __scsi_remove_device go straight from BLOCKED to DEL · 255ee932
      Bart Van Assche 提交于
      If a device is blocked, make __scsi_remove_device() cause it to
      transition to the DEL state. This means that all the commands issued in
      .shutdown() will error in the mid-layer, thus making the removal proceed
      without being stopped.
      
      This patch is a slightly modified version of a patch from James
      Bottomley. This patch avoids that the following lockup occurs:
      
      Call Trace:
       schedule+0x35/0x80
       schedule_timeout+0x237/0x2d0
       io_schedule_timeout+0xa6/0x110
       wait_for_completion_io+0xa3/0x110
       blk_execute_rq+0xdf/0x120
       scsi_execute+0xce/0x150 [scsi_mod]
       scsi_execute_req_flags+0x8f/0xf0 [scsi_mod]
       sd_sync_cache+0xa9/0x190 [sd_mod]
       sd_shutdown+0x6a/0x100 [sd_mod]
       sd_remove+0x64/0xc0 [sd_mod]
       __device_release_driver+0x8d/0x120
       device_release_driver+0x1e/0x30
       bus_remove_device+0xf9/0x170
       device_del+0x127/0x240
       __scsi_remove_device+0xc1/0xd0 [scsi_mod]
       scsi_forget_host+0x57/0x60 [scsi_mod]
       scsi_remove_host+0x72/0x110 [scsi_mod]
       srp_remove_work+0x8b/0x200 [ib_srp]
      Reported-by: NIsrael Rukshin <israelr@mellanox.com>
      Signed-off-by: NBart Van Assche <bart.vanassche@sandisk.com>
      Reviewed-by: NHannes Reinecke <hare@suse.de>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Israel Rukshin <israelr@mellanox.com>
      Cc: Max Gurtovoy <maxg@mellanox.com>
      Cc: Benjamin Block <bblock@linux.vnet.ibm.com>
      Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
      255ee932
    • B
      scsi: Protect SCSI device state changes with a mutex · 0db6ca8a
      Bart Van Assche 提交于
      Serializing SCSI device state changes avoids that two state changes can
      occur concurrently, e.g. the state changes in scsi_target_block() and
      __scsi_remove_device(). This serialization is essential to make patch
      "Make __scsi_remove_device go straight from BLOCKED to DEL" work
      reliably.
      
      Enable this mechanism for all scsi_target_*block() callers but not for
      the scsi_internal_device_unblock() calls from the mpt3sas driver because
      that driver can call scsi_internal_device_unblock() from atomic context.
      Signed-off-by: NBart Van Assche <bart.vanassche@sandisk.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Hannes Reinecke <hare@suse.com>
      Cc: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
      0db6ca8a
  12. 15 12月, 2016 1 次提交
    • W
      scsi: avoid a permanent stop of the scsi device's request queue · d2a14525
      Wei Fang 提交于
      A race between scanning and fc_remote_port_delete() may result in a
      permanent stop if the device gets blocked before scsi_sysfs_add_sdev()
      and unblocked after.  The reason is that blocking a device sets both the
      SDEV_BLOCKED state and the QUEUE_FLAG_STOPPED.  However,
      scsi_sysfs_add_sdev() unconditionally sets SDEV_RUNNING which causes the
      device to be ignored by scsi_target_unblock() and thus never have its
      QUEUE_FLAG_STOPPED cleared leading to a device which is apparently
      running but has a stopped queue.
      
      We actually have two places where SDEV_RUNNING is set: once in
      scsi_add_lun() which respects the blocked flag and once in
      scsi_sysfs_add_sdev() which doesn't.  Since the second set is entirely
      spurious, simply remove it to fix the problem.
      
      Cc: <stable@vger.kernel.org>
      Reported-by: NZengxi Chen <chenzengxi@huawei.com>
      Signed-off-by: NWei Fang <fangwei1@huawei.com>
      Reviewed-by: NEwan D. Milne <emilne@redhat.com>
      Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
      d2a14525
  13. 16 4月, 2016 2 次提交
  14. 12 4月, 2016 1 次提交
  15. 30 3月, 2016 1 次提交
  16. 11 3月, 2016 1 次提交
  17. 06 3月, 2016 3 次提交
  18. 12 2月, 2016 1 次提交
    • J
      scsi: fix soft lockup in scsi_remove_target() on module removal · 90a88d6e
      James Bottomley 提交于
      This softlockup is currently happening:
      
      [  444.088002] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kworker/1:1:29]
      [  444.088002] Modules linked in: lpfc(-) qla2x00tgt(O) qla2xxx_scst(O) scst_vdisk(O) scsi_transport_fc libcrc32c scst(O) dlm configfs nfsd lockd grace nfs_acl auth_rpcgss sunrpc ed
      d snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device dm_mod iTCO_wdt snd_hda_codec_realtek snd_hda_codec_generic gpio_ich iTCO_vendor_support ppdev snd_hda_intel snd_hda_codec snd_hda
      _core snd_hwdep tg3 snd_pcm snd_timer libphy lpc_ich parport_pc ptp acpi_cpufreq snd pps_core fjes parport i2c_i801 ehci_pci tpm_tis tpm sr_mod cdrom soundcore floppy hwmon sg 8250_
      fintek pcspkr i915 drm_kms_helper uhci_hcd ehci_hcd drm fb_sys_fops sysimgblt sysfillrect syscopyarea i2c_algo_bit usbcore button video usb_common fan ata_generic ata_piix libata th
      ermal
      [  444.088002] CPU: 1 PID: 29 Comm: kworker/1:1 Tainted: G           O    4.4.0-rc5-2.g1e923a3-default #1
      [  444.088002] Hardware name: FUJITSU SIEMENS ESPRIMO E           /D2164-A1, BIOS 5.00 R1.10.2164.A1               05/08/2006
      [  444.088002] Workqueue: fc_wq_4 fc_rport_final_delete [scsi_transport_fc]
      [  444.088002] task: f6266ec0 ti: f6268000 task.ti: f6268000
      [  444.088002] EIP: 0060:[<c07e7044>] EFLAGS: 00000286 CPU: 1
      [  444.088002] EIP is at _raw_spin_unlock_irqrestore+0x14/0x20
      [  444.088002] EAX: 00000286 EBX: f20d3800 ECX: 00000002 EDX: 00000286
      [  444.088002] ESI: f50ba800 EDI: f2146848 EBP: f6269ec8 ESP: f6269ec8
      [  444.088002]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
      [  444.088002] CR0: 8005003b CR2: 08f96600 CR3: 363ae000 CR4: 000006d0
      [  444.088002] Stack:
      [  444.088002]  f6269eec c066b0f7 00000286 f2146848 f50ba808 f50ba800 f50ba800 f2146a90
      [  444.088002]  f2146848 f6269f08 f8f0a4ed f3141000 f2146800 f2146a90 f619fa00 00000040
      [  444.088002]  f6269f40 c026cb25 00000001 166c6392 00000061 f6757140 f6136340 00000004
      [  444.088002] Call Trace:
      [  444.088002]  [<c066b0f7>] scsi_remove_target+0x167/0x1c0
      [  444.088002]  [<f8f0a4ed>] fc_rport_final_delete+0x9d/0x1e0 [scsi_transport_fc]
      [  444.088002]  [<c026cb25>] process_one_work+0x155/0x3e0
      [  444.088002]  [<c026cde7>] worker_thread+0x37/0x490
      [  444.088002]  [<c027214b>] kthread+0x9b/0xb0
      [  444.088002]  [<c07e72c1>] ret_from_kernel_thread+0x21/0x40
      
      What appears to be happening is that something has pinned the target
      so it can't go into STARGET_DEL via final release and the loop in
      scsi_remove_target spins endlessly until that happens.
      
      The fix for this soft lockup is to not keep looping over a device that
      we've called remove on but which hasn't gone into DEL state.  This
      patch will retain a simplistic memory of the last target and not keep
      looping over it.
      Reported-by: NSebastian Herbszt <herbszt@gmx.de>
      Tested-by: NSebastian Herbszt <herbszt@gmx.de>
      Fixes: 40998193
      Cc: stable@vger.kernel.org
      Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
      90a88d6e
  19. 03 12月, 2015 4 次提交
  20. 01 12月, 2015 1 次提交
  21. 20 11月, 2015 1 次提交
    • V
      scsi_sysfs: protect against double execution of __scsi_remove_device() · be821fd8
      Vitaly Kuznetsov 提交于
      On some host errors storvsc module tries to remove sdev by scheduling a job
      which does the following:
      
         sdev = scsi_device_lookup(wrk->host, 0, 0, wrk->lun);
         if (sdev) {
             scsi_remove_device(sdev);
             scsi_device_put(sdev);
         }
      
      While this code seems correct the following crash is observed:
      
       general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
       RIP: 0010:[<ffffffff81169979>]  [<ffffffff81169979>] bdi_destroy+0x39/0x220
       ...
       [<ffffffff814aecdc>] ? _raw_spin_unlock_irq+0x2c/0x40
       [<ffffffff8127b7db>] blk_cleanup_queue+0x17b/0x270
       [<ffffffffa00b54c4>] __scsi_remove_device+0x54/0xd0 [scsi_mod]
       [<ffffffffa00b556b>] scsi_remove_device+0x2b/0x40 [scsi_mod]
       [<ffffffffa00ec47d>] storvsc_remove_lun+0x3d/0x60 [hv_storvsc]
       [<ffffffff81080791>] process_one_work+0x1b1/0x530
       ...
      
      The problem comes with the fact that many such jobs (for the same device)
      are being scheduled simultaneously. While scsi_remove_device() uses
      shost->scan_mutex and scsi_device_lookup() will fail for a device in
      SDEV_DEL state there is no protection against someone who did
      scsi_device_lookup() before we actually entered __scsi_remove_device(). So
      the whole scenario looks like that: two callers do simultaneous (or
      preemption happens) calls to scsi_device_lookup() ant these calls succeed
      for both of them, after that they try doing scsi_remove_device().
      shost->scan_mutex only serializes their calls to __scsi_remove_device()
      and we end up doing the cleanup path twice.
      Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
      be821fd8
  22. 10 11月, 2015 4 次提交
  23. 06 11月, 2015 1 次提交
  24. 27 10月, 2015 1 次提交
  25. 29 8月, 2015 1 次提交
  26. 16 7月, 2015 1 次提交
  27. 04 12月, 2014 1 次提交
  28. 24 11月, 2014 2 次提交
  29. 12 11月, 2014 1 次提交