1. 15 Dec, 2016 1 commit
    • scsi: zfcp: fix use-after-"free" in FC ingress path after TMF · dac37e15
      Benjamin Block committed
      When SCSI EH invokes zFCP's callbacks for eh_device_reset_handler() and
      eh_target_reset_handler(), it expects us to relinquish ownership of
      the given scsi_cmnd and all other scsi_cmnds within the same scope - LUN
      or target - when returning with SUCCESS from the callback ('release'
      them).  SCSI EH can then reuse those commands.
      
      We did not follow this rule to release commands upon SUCCESS; and if
      later a reply arrived for one of the commands that should have been released,
      we would still make use of the scsi_cmnd in our ingress tasklet. This
      will at least result in undefined behavior or a kernel panic because of
      a wrong kernel pointer dereference.
      
      To fix this, we NULLify all pointers to scsi_cmnds (struct zfcp_fsf_req
      *)->data in the matching scope if a TMF was successful. This is done
      under the locks (struct zfcp_adapter *)->abort_lock and (struct
      zfcp_reqlist *)->lock to prevent the requests from being removed from
      the request-hashtable, and the ingress tasklet from making use of the
      scsi_cmnd-pointer in zfcp_fsf_fcp_cmnd_handler().
      
      For cases where a reply arrives during SCSI EH before we get a
      chance to NULLify the pointer, but also before we return from the
      callback, we assume that the code is protected from races via the CAS
      operation in blk_complete_request() that is called in scsi_done().
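
The scope-wide clearing described above can be sketched in user-space C. This is only an illustration under stated assumptions: every `mock_*` name is hypothetical, the spinlock is a plain C11 `atomic_flag` standing in for (struct zfcp_reqlist *)->lock, and the request table is a flat array rather than the driver's request hashtable. It is not the code of the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the real zfcp or SCSI structures. */
struct mock_cmnd {
	unsigned int target;
	unsigned int lun;
};

struct mock_req {
	struct mock_cmnd *data;	/* models (struct zfcp_fsf_req *)->data */
};

struct mock_reqlist {
	atomic_flag lock;	/* models (struct zfcp_reqlist *)->lock */
	struct mock_req *reqs;
	size_t nr;
};

static void mock_lock(atomic_flag *l)
{
	while (atomic_flag_test_and_set(l))
		;	/* spin until the flag was previously clear */
}

static void mock_unlock(atomic_flag *l)
{
	atomic_flag_clear(l);
}

/*
 * After a successful TMF, forget every command in the reset scope.
 * lun_scope != 0: LUN reset, match target and LUN.
 * lun_scope == 0: target reset, match the target only.
 * Returns the number of requests whose pointer was cleared.
 */
static size_t mock_forget_cmnds(struct mock_reqlist *rl, unsigned int target,
				unsigned int lun, int lun_scope)
{
	size_t i, cleared = 0;

	assert(rl != NULL);
	mock_lock(&rl->lock);
	for (i = 0; i < rl->nr; i++) {
		struct mock_cmnd *sc = rl->reqs[i].data;

		if (!sc || sc->target != target)
			continue;
		if (lun_scope && sc->lun != lun)
			continue;
		rl->reqs[i].data = NULL;
		cleared++;
	}
	mock_unlock(&rl->lock);
	return cleared;
}

/* Ingress side: a reply may touch the command only if it is still owned. */
static int mock_ingress_may_complete(struct mock_reqlist *rl, size_t idx)
{
	int ok;

	assert(rl != NULL && idx < rl->nr);
	mock_lock(&rl->lock);
	ok = rl->reqs[idx].data != NULL;
	mock_unlock(&rl->lock);
	return ok;
}
```

Because both sides take the same lock, a late reply either sees the pointer before it is cleared or sees NULL and skips the command; it never sees a half-updated state.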
      
      The following stacktrace shows an example for a crash resulting from the
      previous behavior:
      
      Unable to handle kernel pointer dereference at virtual kernel address fffffee17a672000
      Oops: 0038 [#1] SMP
      CPU: 2 PID: 0 Comm: swapper/2 Not tainted
      task: 00000003f7ff5be0 ti: 00000003f3d38000 task.ti: 00000003f3d38000
      Krnl PSW : 0404d00180000000 00000000001156b0 (smp_vcpu_scheduled+0x18/0x40)
                 R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 EA:3
      Krnl GPRS: 000000200000007e 0000000000000000 fffffee17a671fd8 0000000300000015
                 ffffffff80000000 00000000005dfde8 07000003f7f80e00 000000004fa4e800
                 000000036ce8d8f8 000000036ce8d9c0 00000003ece8fe00 ffffffff969c9e93
                 00000003fffffffd 000000036ce8da10 00000000003bf134 00000003f3b07918
      Krnl Code: 00000000001156a2: a7190000        lghi    %r1,0
                 00000000001156a6: a7380015        lhi    %r3,21
                #00000000001156aa: e32050000008    ag    %r2,0(%r5)
                >00000000001156b0: 482022b0        lh    %r2,688(%r2)
                 00000000001156b4: ae123000        sigp    %r1,%r2,0(%r3)
                 00000000001156b8: b2220020        ipm    %r2
                 00000000001156bc: 8820001c        srl    %r2,28
                 00000000001156c0: c02700000001    xilf    %r2,1
      Call Trace:
      ([<0000000000000000>] 0x0)
       [<000003ff807bdb8e>] zfcp_fsf_fcp_cmnd_handler+0x3de/0x490 [zfcp]
       [<000003ff807be30a>] zfcp_fsf_req_complete+0x252/0x800 [zfcp]
       [<000003ff807c0a48>] zfcp_fsf_reqid_check+0xe8/0x190 [zfcp]
       [<000003ff807c194e>] zfcp_qdio_int_resp+0x66/0x188 [zfcp]
       [<000003ff80440c64>] qdio_kick_handler+0xdc/0x310 [qdio]
       [<000003ff804463d0>] __tiqdio_inbound_processing+0xf8/0xcd8 [qdio]
       [<0000000000141fd4>] tasklet_action+0x9c/0x170
       [<0000000000141550>] __do_softirq+0xe8/0x258
       [<000000000010ce0a>] do_softirq+0xba/0xc0
       [<000000000014187c>] irq_exit+0xc4/0xe8
       [<000000000046b526>] do_IRQ+0x146/0x1d8
       [<00000000005d6a3c>] io_return+0x0/0x8
       [<00000000005d6422>] vtime_stop_cpu+0x4a/0xa0
      ([<0000000000000000>] 0x0)
       [<0000000000103d8a>] arch_cpu_idle+0xa2/0xb0
       [<0000000000197f94>] cpu_startup_entry+0x13c/0x1f8
       [<0000000000114782>] smp_start_secondary+0xda/0xe8
       [<00000000005d6efe>] restart_int_handler+0x56/0x6c
       [<0000000000000000>] 0x0
      Last Breaking-Event-Address:
       [<00000000003bf12e>] arch_spin_lock_wait+0x56/0xb0
      Suggested-by: Steffen Maier <maier@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
      Fixes: ea127f9754 ("[PATCH] s390 (7/7): zfcp host adapter.") (tglx/history.git)
      Cc: <stable@vger.kernel.org> #2.6.32+
      Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  2. 13 Aug, 2016 1 commit
    • zfcp: close window with unblocked rport during rport gone · 4eeaa4f3
      Steffen Maier committed
      On a successful end of reopen port forced,
      zfcp_erp_strategy_followup_success() re-uses the port erp_action
      and the subsequent zfcp_erp_action_cleanup() now
      sees ZFCP_ERP_SUCCEEDED with
      erp_action->action==ZFCP_ERP_ACTION_REOPEN_PORT
      instead of ZFCP_ERP_ACTION_REOPEN_PORT_FORCED
      but must not perform zfcp_scsi_schedule_rport_register().
      
      We can detect this because the fresh port reopen erp_action
      is in its very first step ZFCP_ERP_STEP_UNINITIALIZED.
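
The detection rule above can be written as a small predicate. The following is a sketch only: the `mock_*` enums and struct are hypothetical mirrors of the identifiers named in the message, their values are illustrative, and the real check lives inside the driver's cleanup path rather than in a standalone helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirrors of the zfcp identifiers; values are illustrative. */
enum mock_erp_result { MOCK_ERP_FAILED, MOCK_ERP_SUCCEEDED };
enum mock_erp_type { MOCK_ACTION_REOPEN_PORT, MOCK_ACTION_REOPEN_PORT_FORCED };
enum mock_erp_step { MOCK_STEP_UNINITIALIZED, MOCK_STEP_PORT_OPENING };

struct mock_erp_action {
	enum mock_erp_type action;
	enum mock_erp_step step;
};

/*
 * Decide whether cleanup may schedule the rport registration (which
 * unblocks the rport). A reused follow-up action still sits in its very
 * first step, so the port reopen has not actually run yet.
 */
static bool mock_may_register_rport(const struct mock_erp_action *act,
				    enum mock_erp_result result)
{
	assert(act != NULL);
	if (result != MOCK_ERP_SUCCEEDED)
		return false;
	if (act->action != MOCK_ACTION_REOPEN_PORT)
		return false;
	if (act->step == MOCK_STEP_UNINITIALIZED)
		return false;	/* follow-up of a forced reopen, keep blocked */
	return true;
}
```

The step check is what separates a genuinely completed port reopen from the reused action that merely inherited ZFCP_ERP_SUCCEEDED from the forced reopen.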
      
      Otherwise this opens a time window with unblocked rport
      (until the followup port reopen recovery would block it again).
      If a scsi_cmnd timeout occurs during this time window
      fc_timed_out() cannot work as desired and such command
      would indeed time out and trigger scsi_eh. This prevents
      a clean and timely path failover.
      This should not happen if the path issue can be recovered
      on FC transport layer such as path issues involving RSCNs.
      
      Also, unnecessary and repeated DID_IMM_RETRY for pending and
      undesired new requests occur because internally zfcp still
      has its zfcp_port blocked.
      
      As follow-on errors with scsi_eh, it can cause,
      in the worst case, permanently lost paths due to one of:
      sd <scsidev>: [<scsidisk>] Medium access timeout failure. Offlining disk!
      sd <scsidev>: Device offlined - not ready after error recovery
      
      For fix validation and to aid future debugging with other recoveries
      we now also trace (un)blocking of rports.
      Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
      Fixes: 5767620c ("[SCSI] zfcp: Do not unblock rport from REOPEN_PORT_FORCED")
      Fixes: a2fa0aed ("[SCSI] zfcp: Block FC transport rports early on errors")
      Fixes: 5f852be9 ("[SCSI] zfcp: Fix deadlock between zfcp ERP and SCSI")
      Fixes: 338151e0 ("[SCSI] zfcp: make use of fc_remote_port_delete when target port is unavailable")
      Fixes: 3859f6a2 ("[PATCH] zfcp: add rports to enable scsi_add_device to work again")
      Cc: <stable@vger.kernel.org> #2.6.32+
      Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  3. 01 Jun, 2015 1 commit
  4. 24 Nov, 2014 2 commits
  5. 12 Nov, 2014 1 commit
    • scsi: don't set tagging state from scsi_adjust_queue_depth · c8b09f6f
      Christoph Hellwig committed
      Remove the tagged argument from scsi_adjust_queue_depth, and just let it
      handle the queue depth.  For most drivers those two are fairly separate,
      given that most modern drivers don't care about the SCSI "tagged" status
      of a command at all, and many old drivers allow queuing of multiple
      untagged commands in the driver.
      
      Instead we start out with the ->simple_tags flag set before calling
      ->slave_configure, which is how all drivers actually looking at
      ->simple_tags except for one worked anyway.  The one other case looks
      broken, but I've kept the behavior as-is for now.
      
      Except for that we only change ->simple_tags from the ->change_queue_type,
      and when rejecting a tag message in a single driver, so keeping this
      churn out of scsi_adjust_queue_depth is a clear win.
      
      Now that the usage of scsi_adjust_queue_depth is more obvious we can
      also remove all the trivial instances in ->slave_alloc or ->slave_configure
      that just set it to the cmd_per_lun default.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
  6. 01 Jun, 2013 1 commit
    • [SCSI] zfcp: block queue limits with data router · 5fea4291
      Steffen Maier committed
      Commit 86a9668a
      "[SCSI] zfcp: support for hardware data router"
      reduced the initial block queue limits in the scsi_host_template to the
      absolute minimum and adjusted them later on. However, the adjustment was
      too late for the BSG devices of Scsi_Host and fc_host.
      
      Therefore, ioctl(..., SG_IO, ...) with request or response size > 4kB to a
      BSG device of an fc_host or a Scsi_Host fails with EINVAL. As a result,
      users of such ioctl such as HBA_SendCTPassThru() in libzfcphbaapi return
      with error HBA_STATUS_ERROR.
      
      Initialize the block queue limits in zfcp_scsi_host_template to the
      greatest common denominator (GCD).
      
      While we cannot exploit the slightly enlarged maximum request size with
      data router, this should be negligible. Doing so also avoids running into
      trouble after live guest relocation (LGR) / migration from a data router
      FCP device to an FCP device that does not support data router. In that
      case, zfcp would figure out the new limits on adapter recovery, but the
      fc_host and Scsi_Host (plus in fact all sdevs) still exist with the old and
      now too large queue limits.
      
      It should also be OK not to use half the size as in the DIX case, because
      fc_host and Scsi_Host do not transport FCP requests including SCSI commands
      using protection data.
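
One way to read the "greatest common denominator" of the limits is: for each limit, take the value that both kinds of FCP device can honor, which is the smaller one. The sketch below illustrates that reading only; `mock_queue_limits`, its two fields, and the numbers used with it are assumptions for illustration, not the values in zfcp_scsi_host_template.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of host template limits; not the real structure. */
struct mock_queue_limits {
	unsigned int sg_tablesize;	/* max scatter-gather entries */
	unsigned int max_sectors;	/* max request size in sectors */
};

static unsigned int mock_min(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/*
 * Limits that stay valid on both device kinds: take the smaller value
 * of each field, so a later move to the less capable device cannot
 * leave already-created queues with limits that are too large.
 */
static struct mock_queue_limits
mock_common_limits(const struct mock_queue_limits *with_data_router,
		   const struct mock_queue_limits *without_data_router)
{
	struct mock_queue_limits common;

	assert(with_data_router != NULL && without_data_router != NULL);
	common.sg_tablesize = mock_min(with_data_router->sg_tablesize,
				       without_data_router->sg_tablesize);
	common.max_sectors = mock_min(with_data_router->max_sectors,
				      without_data_router->max_sectors);
	return common;
}
```

Initializing the template with such common values trades a slightly smaller maximum request size for limits that never need to shrink after the host objects exist.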
      Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
      Reviewed-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
      Cc: <stable@vger.kernel.org> #3.2+
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
  7. 20 Jul, 2012 1 commit
    • s390/comments: unify copyright messages and remove file names · a53c8fab
      Heiko Carstens committed
      Remove the file name from the comment at top of many files. In most
      cases the file name was wrong anyway, so it's rather pointless.
      
      Also unify the IBM copyright statement. We did have a lot of slightly
      different statements and wanted to change them one after another
      whenever a file gets touched. However that never happened. Instead
      people start to take the old/"wrong" statements to use as a template
      for new files.
      So unify all of them in one go.
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
  8. 14 Dec, 2011 1 commit
  9. 01 Nov, 2011 1 commit
  10. 27 Aug, 2011 2 commits
  11. 27 Jul, 2011 1 commit
  12. 26 Feb, 2011 2 commits
  13. 22 Dec, 2010 3 commits
  14. 09 Dec, 2010 1 commit
    • [SCSI] zfcp: Issue FCP command without holding SCSI host_lock · e55f8753
      Christof Schmitt committed
      Interrupting the connection to the FCP channel while I/O requests are
      being issued can lead to this deadlock. scsi_dispatch_cmd already
      holds the host_lock while the recovery trigger tries to acquire the
      host_lock again when iterating through the scsi_devices.
      
       INFO: lockdep is turned off.
       BUG: spinlock lockup on CPU#1, blast/9660, 0000000078f38878
       CPU: 1 Not tainted 2.6.35.7SWEN2 #2
       Process blast (pid: 9660, task: 0000000071f75940, ksp: 0000000074393ac0)
              0000000074393640 00000000743935c0 0000000000000002 0000000000000000
              0000000074393660 00000000743935d8 00000000743935d8 00000000005590c2
              0000000000000000 0000000078f38878 0000000026ede800 0000000078f38878
              000000000000000d 040000000000000c 0000000074393628 0000000000000000
              0000000000000000 0000000000100b2a 00000000743935c0 0000000074393600
       Call Trace:
       ([<0000000000100a32>] show_trace+0xee/0x144)
        [<00000000003be202>] do_raw_spin_lock+0x112/0x178
        [<000000000055d408>] _raw_spin_lock_irqsave+0x90/0xb0
        [<00000000003f1514>] __scsi_iterate_devices+0x38/0xbc
        [<00000000004849b0>] zfcp_erp_clear_adapter_status+0xd0/0x16c
        [<000000000048587a>] zfcp_erp_adapter_reopen+0x3a/0xb4
        [<0000000000489812>] zfcp_fsf_req_send+0x166/0x180
        [<000000000048c8d6>] zfcp_fsf_fcp_cmnd+0x272/0x408
        [<000000000048f864>] zfcp_scsi_queuecommand+0x11c/0x1e0
        [<00000000003f1f2a>] scsi_dispatch_cmd+0x1d6/0x324
        [<00000000003f9910>] scsi_request_fn+0x42c/0x56c
        [<00000000003828ae>] __blk_run_queue+0x86/0x140
        [<000000000037f742>] elv_insert+0x11a/0x208
        [<000000000038104c>] blk_insert_cloned_request+0x84/0xe4
        [<000003c0032b7c64>] dm_dispatch_request+0x6c/0x94 [dm_mod]
        [<000003c0032b7d5c>] map_request+0xd0/0x100 [dm_mod]
        [<000003c0032b9a78>] dm_request_fn+0xec/0x1bc [dm_mod]
        [<0000000000382c0e>] generic_unplug_device+0x5a/0x6c
        [<000003c0032b7f98>] dm_unplug_all+0x74/0x9c [dm_mod]
        [<00000000001d1272>] sync_page+0x76/0x9c
        [<00000000001d12ba>] sync_page_killable+0x22/0x60
        [<000000000055a768>] __wait_on_bit_lock+0xc0/0x124
        [<00000000001d1140>] __lock_page_killable+0x78/0x84
        [<00000000001d351c>] generic_file_aio_read+0x5a4/0x7e8
        [<0000000000228ec0>] do_sync_read+0xc8/0x12c
        [<0000000000229edc>] vfs_read+0xac/0x1ac
        [<000000000022a0d8>] SyS_read+0x58/0xa8
        [<00000000001146de>] sysc_noemu+0x10/0x16
        [<00000200000493c4>] 0x200000493c4
       INFO: lockdep is turned off.
      
      Call zfcp_fsf_fcp_cmnd without the host_lock and disable the
      interrupts when acquiring the req_q_lock. According to the patch
      description in "[PATCH] Eliminate error handler overload of the SCSI
      serial number", the serial_number is not used, so simply drop the
      queuecommand wrapper function and run zfcp_scsi_queuecommand without
      holding the host_lock.
      Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
      Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
      Signed-off-by: James Bottomley <James.Bottomley@suse.de>
  15. 17 Nov, 2010 1 commit
    • SCSI host lock push-down · f281233d
      Jeff Garzik committed
      Move the mid-layer's ->queuecommand() invocation from being locked
      with the host lock to being unlocked to facilitate speeding up the
      critical path for drivers that don't need this lock taken anyway.
      
      The patch below presents a simple SCSI host lock push-down as an
      equivalent transformation.  No locking or other behavior should change
      with this patch.  All existing bugs and locking orders are preserved.
      
      Additionally, add one parameter to queuecommand,
      	struct Scsi_Host *
      and remove one parameter from queuecommand,
      	void (*done)(struct scsi_cmnd *)
      
      Scsi_Host* is a convenient pointer that most host drivers need anyway,
      and 'done' is redundant to struct scsi_cmnd->scsi_done.
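
The "equivalent transformation" can be pictured as a thin wrapper: the new-style entry point receives the host, takes the host lock itself, and calls the unchanged old-style body. The sketch below models this in user space. All `mock_*` names are hypothetical, the lock is reduced to a boolean flag, and nothing here is the kernel's actual definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mock_scsi_cmnd;
typedef void (*mock_done_fn)(struct mock_scsi_cmnd *);

/* Hypothetical stand-ins for struct Scsi_Host and struct scsi_cmnd. */
struct mock_scsi_host {
	bool host_lock_held;	/* models the host lock as a flag */
};

struct mock_scsi_cmnd {
	struct mock_scsi_host *host;
	mock_done_fn scsi_done;	/* what the old 'done' argument duplicated */
	bool queued_with_lock;
	int completions;
};

static void mock_done(struct mock_scsi_cmnd *cmd)
{
	cmd->completions++;
}

/* Old-style body: expects the host lock held, takes 'done' explicitly. */
static int mock_queuecommand_lck(struct mock_scsi_cmnd *cmd, mock_done_fn done)
{
	cmd->scsi_done = done;
	cmd->queued_with_lock = cmd->host->host_lock_held;
	return 0;
}

/*
 * New-style entry point: called without the lock, receives the host,
 * and pushes the locking down around the unchanged old-style body.
 */
static int mock_queuecommand(struct mock_scsi_host *shost,
			     struct mock_scsi_cmnd *cmd)
{
	int rc;

	assert(shost != NULL && cmd != NULL);
	shost->host_lock_held = true;	/* lock taken in the driver now */
	rc = mock_queuecommand_lck(cmd, cmd->scsi_done);
	shost->host_lock_held = false;
	return rc;
}
```

A driver that does not need the host lock can later drop the wrapper and do its work directly in the new-style function, which is what the zfcp change listed above does.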
      
      Minimal code disturbance was attempted with this change.  Most drivers
      needed only two one-line modifications for their host lock push-down.
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
      Acked-by: James Bottomley <James.Bottomley@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  16. 17 Sep, 2010 4 commits
  17. 11 Sep, 2010 1 commit
  18. 28 Jul, 2010 5 commits
  19. 03 May, 2010 2 commits
  20. 11 Apr, 2010 1 commit
  21. 30 Mar, 2010 1 commit
    • include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo committed
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities to include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the following.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and tries to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         widely available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build tests were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable on most builds of
      the specific arch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
  22. 18 Feb, 2010 4 commits
  23. 18 Jan, 2010 1 commit
  24. 05 Dec, 2009 1 commit