1. 08 11月, 2018 1 次提交
  2. 12 9月, 2018 2 次提交
  3. 30 8月, 2018 1 次提交
  4. 03 8月, 2018 2 次提交
  5. 11 7月, 2018 3 次提交
  6. 08 5月, 2018 1 次提交
  7. 19 4月, 2018 2 次提交
  8. 13 3月, 2018 1 次提交
  9. 23 2月, 2018 1 次提交
  10. 13 2月, 2018 3 次提交
  11. 02 11月, 2017 1 次提交
  12. 08 8月, 2017 1 次提交
  13. 04 7月, 2017 1 次提交
  14. 13 6月, 2017 1 次提交
  15. 07 3月, 2017 2 次提交
    • J
      scsi: lpfc: add transport eh_timed_out reference · 856984b7
      James Smart 提交于
      Christoph's prior patch missed the template for the sli3 adapters,
      which is now the "no host reset" template. Add the transport
      eh_timed_out handler to the no host reset template
      Signed-off-by: NDick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: NJames Smart <james.smart@broadcom.com>
      Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
      856984b7
    • J
      scsi: lpfc: Fix eh_deadline setting for sli3 adapters. · 96418b5e
      James Smart 提交于
      A previous change unilaterally removed the hba reset entry point
      from the sli3 host template. This was done to allow tape devices
      being used for back up from being removed. Why was this done ?
      When there was non-responding device on the fabric, the error
      escalation policy would escalate to the reset handler. When the
      reset handler was called, it would reset the adapter, dropping
      link, thus logging out and terminating all i/o's - on any target.
      If there was a tape device on the same adapter that wasn't in
      error, it would kill the tape i/o's, effectively killing the
      tape device state.  With the reset point removed, the adapter
      reset avoided the fabric logout, allowing the other devices to
      continue to operate unaffected. A hack - yes. Hint: we really
      need a transport I_T nexus reset callback added to the eh process
      (in between the SCSI target reset and hba reset points), so a
      fc logout could occur to the one bad target only and stop the error
      escalation process.
      
      This patch commonizes the approach so it can be used for sli3 and sli4
      adapters, but mandates the admin, via module parameter, specifically
      identify which adapters the resets are to be removed for. Additionally,
      bus_reset, which sends Target Reset TMFs to all targets, is also removed
      from the template as it too has the same effect as the adapter reset.
      Signed-off-by: NDick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: NJames Smart <james.smart@broadcom.com>
      Reviewed-by: NLaurence Oberman <loberman@redhat.com>
      Tested-by: NLaurence Oberman <loberman@redhat.com>
      Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
      96418b5e
  16. 23 2月, 2017 4 次提交
  17. 07 2月, 2017 1 次提交
  18. 05 1月, 2017 1 次提交
  19. 01 12月, 2016 1 次提交
  20. 09 11月, 2016 3 次提交
  21. 27 7月, 2016 1 次提交
    • M
      lpfc: fix oops in lpfc_sli4_scmd_to_wqidx_distr() from lpfc_send_taskmgmt() · 05a05872
      Mauricio Faria de Oliveira 提交于
      The lpfc_sli4_scmd_to_wqidx_distr() function expects the scsi_cmnd
      'lpfc_cmd->pCmd' not to be null, and point to the midlayer command.
      
      That's not true in the .eh_(device|target|bus)_reset_handler path,
      because lpfc_send_taskmgmt() sends commands not from the midlayer, so
      does not set 'lpfc_cmd->pCmd'.
      
      That is true in the .queuecommand path because lpfc_queuecommand()
      stores the scsi_cmnd from midlayer in lpfc_cmd->pCmd; and lpfc_cmd is
      stored by lpfc_scsi_prep_cmnd() in piocbq->context1 -- which is passed
      to lpfc_sli4_scmd_to_wqidx_distr() as lpfc_cmd parameter.
      
      This problem can be hit on SCSI EH, and immediately with sg_reset.
      These 2 test-cases demonstrate the problem/fix with next-20160601.
      
      Test-case 1) sg_reset
      
          # strace sg_reset --device /dev/sdm
          <...>
          open("/dev/sdm", O_RDWR|O_NONBLOCK)     = 3
          ioctl(3, SG_SCSI_RESET, 0x3fffde6d0994 <unfinished ...>
          +++ killed by SIGSEGV +++
          Segmentation fault
      
          # dmesg
          Unable to handle kernel paging request for data at address 0x00000000
          Faulting instruction address: 0xd00000001c88442c
          Oops: Kernel access of bad area, sig: 11 [#1]
          <...>
          CPU: 104 PID: 16333 Comm: sg_reset Tainted: G        W       4.7.0-rc1-next-20160601-00004-g95b89dc #6
          <...>
          NIP [d00000001c88442c] lpfc_sli4_scmd_to_wqidx_distr+0xc/0xd0 [lpfc]
          LR [d00000001c826fe8] lpfc_sli_calc_ring.part.27+0x98/0xd0 [lpfc]
          Call Trace:
          [c000003c9ec876f0] [c000003c9ec87770] 0xc000003c9ec87770 (unreliable)
          [c000003c9ec87720] [d00000001c82e004] lpfc_sli_issue_iocb+0xd4/0x260 [lpfc]
          [c000003c9ec87780] [d00000001c831a3c] lpfc_sli_issue_iocb_wait+0x15c/0x5b0 [lpfc]
          [c000003c9ec87880] [d00000001c87f27c] lpfc_send_taskmgmt+0x24c/0x650 [lpfc]
          [c000003c9ec87950] [d00000001c87fd7c] lpfc_device_reset_handler+0x10c/0x200 [lpfc]
          [c000003c9ec87a10] [c000000000610694] scsi_try_bus_device_reset+0x44/0xc0
          [c000003c9ec87a40] [c0000000006113e8] scsi_ioctl_reset+0x198/0x2c0
          [c000003c9ec87bf0] [c00000000060fe5c] scsi_ioctl+0x13c/0x4b0
          [c000003c9ec87c80] [c0000000006629b0] sd_ioctl+0xf0/0x120
          [c000003c9ec87cd0] [c00000000046e4f8] blkdev_ioctl+0x248/0xb70
          [c000003c9ec87d30] [c0000000002a1f60] block_ioctl+0x70/0x90
          [c000003c9ec87d50] [c00000000026d334] do_vfs_ioctl+0xc4/0x890
          [c000003c9ec87de0] [c00000000026db60] SyS_ioctl+0x60/0xc0
          [c000003c9ec87e30] [c000000000009120] system_call+0x38/0x108
          Instruction dump:
          <...>
      
          With fix:
      
          # strace sg_reset --device /dev/sdm
          <...>
          open("/dev/sdm", O_RDWR|O_NONBLOCK)     = 3
          ioctl(3, SG_SCSI_RESET, 0x3fffe103c554) = 0
          close(3)                                = 0
          exit_group(0)                           = ?
          +++ exited with 0 +++
      
          # dmesg
          [  424.658649] lpfc 0006:01:00.4: 4:(0):0713 SCSI layer issued Device Reset (1, 0) return x2002
      
      Test-case 2) SCSI EH
      
          Using this debug patch to wire an SCSI EH trigger, for lpfc_scsi_cmd_iocb_cmpl():
          -       cmd->scsi_done(cmd);
          +       if ((phba->pport ? phba->pport->cfg_log_verbose : phba->cfg_log_verbose) == 0x32100000)
          +               printk(KERN_ALERT "lpfc: skip scsi_done()\n");
          +       else
          +               cmd->scsi_done(cmd);
      
          # echo 0x32100000 > /sys/class/scsi_host/host11/lpfc_log_verbose
      
          # dd if=/dev/sdm of=/dev/null iflag=direct &
          <...>
      
          After a while:
      
          # dmesg
          lpfc 0006:01:00.4: 4:(0):3053 lpfc_log_verbose changed from 0 (x0) to 839909376 (x32100000)
          lpfc: skip scsi_done()
          <...>
          Unable to handle kernel paging request for data at address 0x00000000
          Faulting instruction address: 0xd0000000199e448c
          Oops: Kernel access of bad area, sig: 11 [#1]
          <...>
          CPU: 96 PID: 28556 Comm: scsi_eh_11 Tainted: G        W       4.7.0-rc1-next-20160601-00004-g95b89dc #6
          <...>
          NIP [d0000000199e448c] lpfc_sli4_scmd_to_wqidx_distr+0xc/0xd0 [lpfc]
          LR [d000000019986fe8] lpfc_sli_calc_ring.part.27+0x98/0xd0 [lpfc]
          Call Trace:
          [c000000ff0d0b890] [c000000ff0d0b900] 0xc000000ff0d0b900 (unreliable)
          [c000000ff0d0b8c0] [d00000001998e004] lpfc_sli_issue_iocb+0xd4/0x260 [lpfc]
          [c000000ff0d0b920] [d000000019991a3c] lpfc_sli_issue_iocb_wait+0x15c/0x5b0 [lpfc]
          [c000000ff0d0ba20] [d0000000199df27c] lpfc_send_taskmgmt+0x24c/0x650 [lpfc]
          [c000000ff0d0baf0] [d0000000199dfd7c] lpfc_device_reset_handler+0x10c/0x200 [lpfc]
          [c000000ff0d0bbb0] [c000000000610694] scsi_try_bus_device_reset+0x44/0xc0
          [c000000ff0d0bbe0] [c0000000006126cc] scsi_eh_ready_devs+0x49c/0x9c0
          [c000000ff0d0bcb0] [c000000000614160] scsi_error_handler+0x580/0x680
          [c000000ff0d0bd80] [c0000000000ae848] kthread+0x108/0x130
          [c000000ff0d0be30] [c0000000000094a8] ret_from_kernel_thread+0x5c/0xb4
          Instruction dump:
          <...>
      
          With fix:
      
          # dmesg
          lpfc 0006:01:00.4: 4:(0):3053 lpfc_log_verbose changed from 0 (x0) to 839909376 (x32100000)
          lpfc: skip scsi_done()
          <...>
          lpfc 0006:01:00.4: 4:(0):0713 SCSI layer issued Device Reset (0, 0) return x2002
          <...>
          lpfc 0006:01:00.4: 4:(0):0723 SCSI layer issued Target Reset (1, 0) return x2002
          <...>
          lpfc 0006:01:00.4: 4:(0):0714 SCSI layer issued Bus Reset Data: x2002
          <...>
          lpfc 0006:01:00.4: 4:(0):3172 SCSI layer issued Host Reset Data:
          <...>
      
      Fixes: 8b0dff14 ("lpfc: Add support for using block multi-queue")
      Cc: <stable@vger.kernel.org> # v4.2+
      Signed-off-by: NMauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
      Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
      Acked-by: NJames Smart <james.smart@broadcom.com>
      Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
      05a05872
  22. 16 7月, 2016 3 次提交
  23. 24 2月, 2016 1 次提交
  24. 22 12月, 2015 2 次提交