1. 15 December 2016 (1 commit)
    • scsi: zfcp: fix rport unblock race with LUN recovery · 6f2ce1c6
      Steffen Maier authored
      It is unavoidable that zfcp_scsi_queuecommand() has to finish requests
      with DID_IMM_RETRY (like fc_remote_port_chkready()) during the time
      window when zfcp detected an unavailable rport but
      fc_remote_port_delete(), which is asynchronous via
      zfcp_scsi_schedule_rport_block(), has not yet blocked the rport.
      
      However, for the case when the rport becomes available again, we should
      prevent unblocking the rport too early.  In contrast to other FCP LLDDs,
      zfcp has to open each LUN with the FCP channel hardware before it can
      send I/O to a LUN.  So if a port already has LUNs attached and we
      unblock the rport just after port recovery, recoveries of LUNs behind
      this port can still be pending, which in turn forces
      zfcp_scsi_queuecommand() to unnecessarily finish requests with
      DID_IMM_RETRY.
      
      This also opens a time window with an unblocked rport (until the
      follow-up LUN reopen recovery has finished).  If a scsi_cmnd timeout
      occurs during this time window, fc_timed_out() cannot work as desired,
      and such a command would indeed time out and trigger scsi_eh.  This
      prevents a clean and timely path failover.  This should not happen if
      the path issue can be recovered on the FC transport layer, such as path
      issues involving RSCNs.
      
      Fix this by only calling zfcp_scsi_schedule_rport_register(), to
      asynchronously trigger fc_remote_port_add(), after all LUN recoveries as
      children of the rport have finished and no new recoveries of equal or
      higher order were triggered meanwhile.  "Finished" intentionally
      includes any recovery result, successful or failed (we still unblock
      the rport so the other, successfully recovered LUNs keep working).  For
      simplicity, after each finished LUN recovery we check whether another
      LUN recovery is pending on the same port, and if so do nothing.  We
      handle the special case of a successful recovery of a port without LUN
      children the same way, without changing that case's semantics.
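      
      A condensed sketch of that per-LUN check (illustrative only: the flag
      and helper names are the ones from zfcp_def.h and the SCSI midlayer,
      the real patch may differ in detail; assume shost = adapter->scsi_host
      and that the surrounding function already holds adapter->erp_lock,
      taken as discussed further below):
      
          spin_lock(shost->host_lock);
          __shost_for_each_device(sdev, shost) {
                  struct zfcp_scsi_dev *zsdev = sdev_to_zfcp(sdev);
      
                  if (zsdev->port != port)
                          continue; /* LUN belongs to another port */
                  if (atomic_read(&zsdev->status) &
                      ZFCP_STATUS_COMMON_ERP_FAILED)
                          continue; /* unblock rport despite failed LUNs */
                  if (atomic_read(&zsdev->status) &
                      ZFCP_STATUS_COMMON_ERP_INUSE) {
                          /* another LUN recovery pending on the same port:
                           * do nothing; its own cleanup re-runs this check
                           */
                          spin_unlock(shost->host_lock);
                          write_unlock_irqrestore(&adapter->erp_lock, flags);
                          return;
                  }
          }
          spin_unlock(shost->host_lock);
          zfcp_scsi_schedule_rport_register(port);
          write_unlock_irqrestore(&adapter->erp_lock, flags);
      
      Note how a failed LUN is deliberately skipped rather than treated as
      pending, matching the rule above that a failed recovery must not keep
      the rport blocked for the other, successful LUNs.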
      
      For debugging, we introduce two new trace records, written if the rport
      unblock attempt was aborted due to still-unfinished or freshly
      triggered recovery.  The records are only written above the default
      trace level.
      
      Benjamin noticed the important special case of a new recovery that can
      be triggered between giving up the erp_lock and calling
      zfcp_erp_action_cleanup() within zfcp_erp_strategy().  We must avoid the
      following sequence:
      
      ERP thread                 rport_work      other context
      -------------------------  --------------  --------------------------------
      port is unblocked, rport still blocked,
       due to pending/running ERP action,
       so ((port->status & ...UNBLOCK) != 0)
       and (port->rport == NULL)
      unlock ERP
      zfcp_erp_action_cleanup()
      case ZFCP_ERP_ACTION_REOPEN_LUN:
      zfcp_erp_try_rport_unblock()
      ((status & ...UNBLOCK) != 0) [OLD!]
                                                 zfcp_erp_port_reopen()
                                                 lock ERP
                                                 zfcp_erp_port_block()
                                                 port->status clear ...UNBLOCK
                                                 unlock ERP
                                                 zfcp_scsi_schedule_rport_block()
                                                 port->rport_task = RPORT_DEL
                                                 queue_work(rport_work)
                                 zfcp_scsi_rport_work()
                                 (port->rport_task != RPORT_ADD)
                                 port->rport_task = RPORT_NONE
                                 zfcp_scsi_rport_block()
                                 if (!port->rport) return
      zfcp_scsi_schedule_rport_register()
      port->rport_task = RPORT_ADD
      queue_work(rport_work)
                                 zfcp_scsi_rport_work()
                                 (port->rport_task == RPORT_ADD)
                                 port->rport_task = RPORT_NONE
                                 zfcp_scsi_rport_register()
                                 (port->rport == NULL)
                                 rport = fc_remote_port_add()
                                 port->rport = rport;
      
      Now the rport was erroneously unblocked while the zfcp_port is blocked.
      This is another situation we want to avoid because of its scsi_eh
      potential.  This state would at least remain until the new recovery from
      the other context finished successfully, or potentially forever if it
      failed.  In order to close this race, we take the erp_lock inside
      zfcp_erp_try_rport_unblock() when checking the status of zfcp_port or
      LUN.  With that, the possible corresponding rport state sequences are
      either (unblock[ERP thread], block[other context]), if the ERP thread
      gets the erp_lock first and still sees ((port->status & ...UNBLOCK) != 0),
      or (block[other context], NOP[ERP thread]), if the ERP thread gets the
      erp_lock after the other context has already cleared ...UNBLOCK from
      port->status.
      
      Since checking fields of struct erp_action is unsafe (they could have
      been overwritten, i.e. re-used for a new recovery, meanwhile), we only
      check the status of the zfcp_port and LUN, since these are only ever
      changed under erp_lock elsewhere.  Regarding the choice of the proper
      status flags (port or port_forced recovery is similar to the adapter
      recovery shown):
      
      [zfcp_erp_adapter_shutdown()]
      zfcp_erp_adapter_reopen()
       zfcp_erp_adapter_block()
        * clear UNBLOCK ---------------------------------------+
       zfcp_scsi_schedule_rports_block()                       |
       write_lock_irqsave(&adapter->erp_lock, flags);-------+  |
       zfcp_erp_action_enqueue()                            |  |
        zfcp_erp_setup_act()                                |  |
         * set ERP_INUSE -----------------------------------|--|--+
       write_unlock_irqrestore(&adapter->erp_lock, flags);--+  |  |
      .context-switch.                                         |  |
      zfcp_erp_thread()                                        |  |
       zfcp_erp_strategy()                                     |  |
        write_lock_irqsave(&adapter->erp_lock, flags);------+  |  |
        ...                                                 |  |  |
        zfcp_erp_strategy_check_target()                    |  |  |
         zfcp_erp_strategy_check_adapter()                  |  |  |
          zfcp_erp_adapter_unblock()                        |  |  |
           * set UNBLOCK -----------------------------------|--+  |
        zfcp_erp_action_dequeue()                           |     |
         * clear ERP_INUSE ---------------------------------|-----+
        ...                                                 |
        write_unlock_irqrestore(&adapter->erp_lock, flags);-+
      
      Hence, we should check for both UNBLOCK and ERP_INUSE, because they are
      interleaved.  Also, we need to explicitly check ERP_FAILED for the
      link-down case, which currently does not clear the UNBLOCK flag in
      zfcp_fsf_link_down_info_eval().
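      
      Putting the pieces together, the entry check of the unblock attempt
      could look roughly like this sketch (same naming assumptions as in the
      earlier sketch; the new trace records mentioned above would be written
      in the abort branch):
      
          static void zfcp_erp_try_rport_unblock(struct zfcp_port *port)
          {
                  struct zfcp_adapter *adapter = port->adapter;
                  unsigned long flags;
                  int port_status;
      
                  /* take erp_lock so this status check cannot race with
                   * zfcp_erp_port_block()/zfcp_erp_action_enqueue() running
                   * in another context (see the sequence above)
                   */
                  write_lock_irqsave(&adapter->erp_lock, flags);
                  port_status = atomic_read(&port->status);
                  if (!(port_status & ZFCP_STATUS_COMMON_UNBLOCKED) ||
                      (port_status & (ZFCP_STATUS_COMMON_ERP_INUSE |
                                      ZFCP_STATUS_COMMON_ERP_FAILED))) {
                          /* a new ERP of equal or higher order was triggered
                           * meanwhile, or link down left UNBLOCK set together
                           * with ERP_FAILED: abort the unblock attempt
                           */
                          write_unlock_irqrestore(&adapter->erp_lock, flags);
                          return;
                  }
                  /* ... followed by the per-LUN check sketched earlier ... */
                  write_unlock_irqrestore(&adapter->erp_lock, flags);
          }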
      Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
      Fixes: 8830271c ("[SCSI] zfcp: Dont fail SCSI commands when transitioning to blocked fc_rport")
      Fixes: a2fa0aed ("[SCSI] zfcp: Block FC transport rports early on errors")
      Fixes: 5f852be9 ("[SCSI] zfcp: Fix deadlock between zfcp ERP and SCSI")
      Fixes: 338151e0 ("[SCSI] zfcp: make use of fc_remote_port_delete when target port is unavailable")
      Fixes: 3859f6a2 ("[PATCH] zfcp: add rports to enable scsi_add_device to work again")
      Cc: <stable@vger.kernel.org> #2.6.32+
      Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  2. 13 August 2016 (1 commit)
    • zfcp: close window with unblocked rport during rport gone · 4eeaa4f3
      Steffen Maier authored
      On successful completion of a forced port reopen,
      zfcp_erp_strategy_followup_success() re-uses the port erp_action,
      and the subsequent zfcp_erp_action_cleanup() then
      sees ZFCP_ERP_SUCCEEDED with
      erp_action->action == ZFCP_ERP_ACTION_REOPEN_PORT
      instead of ZFCP_ERP_ACTION_REOPEN_PORT_FORCED,
      but must not perform zfcp_scsi_schedule_rport_register().
      
      We can detect this because the fresh port reopen erp_action
      is in its very first step ZFCP_ERP_STEP_UNINITIALIZED.
      
      Otherwise this opens a time window with an unblocked rport
      (until the follow-up port reopen recovery blocks it again).
      If a scsi_cmnd timeout occurs during this time window,
      fc_timed_out() cannot work as desired, and such a command
      would indeed time out and trigger scsi_eh.  This prevents
      a clean and timely path failover.
      This should not happen if the path issue can be recovered
      on the FC transport layer, such as path issues involving RSCNs.
      
      Also, unnecessary and repeated DID_IMM_RETRY results occur for
      pending and undesired new requests, because internally zfcp
      still has its zfcp_port blocked.
      
      As a follow-on error with scsi_eh, this can cause,
      in the worst case, permanently lost paths due to one of:
      sd <scsidev>: [<scsidisk>] Medium access timeout failure. Offlining disk!
      sd <scsidev>: Device offlined - not ready after error recovery
      
      For fix validation and to aid future debugging with other recoveries
      we now also trace (un)blocking of rports.
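      
      A minimal sketch of the detection in zfcp_erp_action_cleanup(), under
      the assumption that the re-used erp_action and the step constant carry
      the names used elsewhere in zfcp_erp.c:
      
          case ZFCP_ERP_ACTION_REOPEN_PORT:
                  /* This case may also be reached after a successful forced
                   * reopen whose erp_action was re-used for a follow-up port
                   * reopen; that fresh action is still in its first step.
                   */
                  if (act->step != ZFCP_ERP_STEP_UNINITIALIZED &&
                      result == ZFCP_ERP_SUCCEEDED)
                          zfcp_scsi_schedule_rport_register(port);
                  break;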
      Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
      Fixes: 5767620c ("[SCSI] zfcp: Do not unblock rport from REOPEN_PORT_FORCED")
      Fixes: a2fa0aed ("[SCSI] zfcp: Block FC transport rports early on errors")
      Fixes: 5f852be9 ("[SCSI] zfcp: Fix deadlock between zfcp ERP and SCSI")
      Fixes: 338151e0 ("[SCSI] zfcp: make use of fc_remote_port_delete when target port is unavailable")
      Fixes: 3859f6a2 ("[PATCH] zfcp: add rports to enable scsi_add_device to work again")
      Cc: <stable@vger.kernel.org> #2.6.32+
      Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  3. 27 July 2015 (1 commit)
  4. 15 April 2015 (1 commit)
  5. 20 November 2014 (1 commit)
  6. 23 August 2013 (1 commit)
    • [SCSI] zfcp: fix schedule-inside-lock in scsi_device list loops · 924dd584
      Martin Peschke authored
      BUG: sleeping function called from invalid context at kernel/workqueue.c:2752
      in_atomic(): 1, irqs_disabled(): 1, pid: 360, name: zfcperp0.0.1700
      CPU: 1 Not tainted 3.9.3+ #69
      Process zfcperp0.0.1700 (pid: 360, task: 0000000075b7e080, ksp: 000000007476bc30)
      <snip>
      Call Trace:
      ([<00000000001165de>] show_trace+0x106/0x154)
       [<00000000001166a0>] show_stack+0x74/0xf4
       [<00000000006ff646>] dump_stack+0xc6/0xd4
       [<000000000017f3a0>] __might_sleep+0x128/0x148
       [<000000000015ece8>] flush_work+0x54/0x1f8
       [<00000000001630de>] __cancel_work_timer+0xc6/0x128
       [<00000000005067ac>] scsi_device_dev_release_usercontext+0x164/0x23c
       [<0000000000161816>] execute_in_process_context+0x96/0xa8
       [<00000000004d33d8>] device_release+0x60/0xc0
       [<000000000048af48>] kobject_release+0xa8/0x1c4
       [<00000000004f4bf2>] __scsi_iterate_devices+0xfa/0x130
       [<000003ff801b307a>] zfcp_erp_strategy+0x4da/0x1014 [zfcp]
       [<000003ff801b3caa>] zfcp_erp_thread+0xf6/0x2b0 [zfcp]
       [<000000000016b75a>] kthread+0xf2/0xfc
       [<000000000070c9de>] kernel_thread_starter+0x6/0xc
       [<000000000070c9d8>] kernel_thread_starter+0x0/0xc
      
      Apparently, the ref_count for some scsi_device drops down to zero,
      triggering device removal through execute_in_process_context(), while
      the lldd error recovery thread iterates through a scsi device list.
      Unfortunately, execute_in_process_context() decides to immediately
      execute that device removal function, instead of scheduling asynchronous
      execution, since it detects process context and thinks it is safe to do
      so. But almost all calls to shost_for_each_device() in our lldd are
      inside spin_lock_irq, even in thread context. Obviously, schedule()
      inside spin_lock_irq sections is a bad idea.
      
      Change the lldd to use the proper iterator function,
      __shost_for_each_device(), in combination with required locking.
      
      Occurrences that need to be changed include all calls in zfcp_erp.c,
      since those might be executed in zfcp error recovery thread context
      with a lock held.
      
      Other occurrences of shost_for_each_device() in zfcp_fsf.c do not
      need to be changed (no process context, no surrounding locking).
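      
      The change pattern, sketched below: shost_for_each_device() takes and
      drops scsi_device references as it iterates, and dropping the last
      reference may sleep in device release, whereas __shost_for_each_device()
      is a plain list walk that leaves locking and lifetime to the caller.
      The callee here is a placeholder for the per-device work done in
      zfcp_erp.c, and in the real patch the loops already run under the
      erp_lock with interrupts disabled:
      
          /* before: the implicit scsi_device_put() may schedule inside a
           * spin_lock_irq section when a device's last reference drops
           */
          shost_for_each_device(sdev, shost)
                  dismiss_lun_erp_action(sdev);   /* placeholder */
      
          /* after: plain list walk, caller provides the host lock */
          spin_lock_irqsave(shost->host_lock, flags);
          __shost_for_each_device(sdev, shost)
                  dismiss_lun_erp_action(sdev);   /* placeholder */
          spin_unlock_irqrestore(shost->host_lock, flags);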
      
      The problem was introduced in Linux 2.6.37 by commit
      b62a8d9b
      "[SCSI] zfcp: Use SCSI device data zfcp_scsi_dev instead of zfcp_unit".
      Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
      Cc: stable@vger.kernel.org #2.6.37+
      Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
  7. 01 June 2013 (1 commit)
  8. 24 September 2012 (1 commit)
    • [SCSI] zfcp: No automatic port_rescan on events · 43f60cbd
      Steffen Maier authored
      In FC fabrics with large zones, the automatic port_rescan on incoming ELS
      and any adapter recovery can cause considerable traffic at the very same
      time, especially if many Linux images share an HBA, which is common on
      s390.  This can cause trouble and failures.  Fix this by making such port
      rescans dependent on a user-configurable module parameter.
      
      The following unconditional automatic port rescans remain as-is:
      on setting an adapter online, and
      on manual user-triggered writes to the sysfs attribute port_rescan.
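      
      A sketch of the mechanism, assuming a boolean module parameter named
      no_auto_port_rescan, consulted by a small wrapper that the event paths
      call instead of queueing the scan work directly (names are assumptions,
      not necessarily the exact symbols of the patch):
      
          static bool no_auto_port_rescan;
          module_param_named(no_auto_port_rescan, no_auto_port_rescan,
                             bool, 0600);
          MODULE_PARM_DESC(no_auto_port_rescan,
                           "no automatic port_rescan (default off)");
      
          /* called from incoming-ELS handling and adapter recovery */
          void zfcp_fc_conditional_port_scan(struct zfcp_adapter *adapter)
          {
                  if (no_auto_port_rescan)
                          return;
                  queue_work(adapter->work_queue, &adapter->scan_work);
          }
      
      Setting an adapter online and writing to the sysfs attribute
      port_rescan would keep queueing the scan work unconditionally.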
      Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
  9. 20 July 2012 (1 commit)
    • s390/comments: unify copyright messages and remove file names · a53c8fab
      Heiko Carstens authored
      Remove the file name from the comment at top of many files. In most
      cases the file name was wrong anyway, so it's rather pointless.
      
      Also unify the IBM copyright statement.  We had a lot of slightly
      different statements and wanted to change them one after another
      whenever a file got touched.  However, that never happened.  Instead,
      people started to take the old/"wrong" statements as templates
      for new files.
      So unify all of them in one go.
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
  10. 26 February 2011 (2 commits)
  11. 22 December 2010 (3 commits)
  12. 09 December 2010 (2 commits)
  13. 17 September 2010 (4 commits)
  14. 28 July 2010 (5 commits)
  15. 22 July 2010 (1 commit)
  16. 03 May 2010 (1 commit)
  17. 18 February 2010 (3 commits)
  18. 05 December 2009 (4 commits)
  19. 22 October 2009 (1 commit)
    • [SCSI] zfcp: Handle WWPN mismatch in PLOGI payload · 934aeb58
      Christof Schmitt authored
      For ports, zfcp gets the DID from the FC nameserver and tries to open
      the port. If the open succeeds, zfcp compares the WWPN from the
      nameserver with the WWPN in the PLOGI payload. In case of a mismatch,
      zfcp assumes that the DID of the port just changed and we opened the
      wrong port. This means that zfcp has to forget the DID, lookup the DID
      again and retry.
      
      This error case had a problem: zfcp forgot the DID but never
      looked up a new one, stalling the ERP.  Fix this by triggering
      the DID lookup and properly exiting from the ERP.  The completed DID
      lookup will then trigger a new ERP action.
      
      Also ensure that, when trying to open the port again with the new DID,
      the open port is first closed, even in the NOESC case.
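      
      In pseudocode, the intended handling of the mismatch reads roughly as
      follows (illustrative only; trigger_did_lookup() stands in for the
      actual nameserver lookup helper, which is not named here):
      
          /* open port response: WWPN in PLOGI payload != expected WWPN */
          if (plogi_wwpn != port->wwpn) {
                  port->d_id = 0;             /* forget the stale DID */
                  trigger_did_lookup(port);   /* placeholder helper */
                  /* fail this ERP action cleanly; the completed lookup
                   * enqueues a fresh port reopen with the new DID
                   */
                  return;
          }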
      Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
      Signed-off-by: James Bottomley <James.Bottomley@suse.de>
  20. 05 September 2009 (5 commits)