1. 09 Nov, 2016 2 commits
  2. 02 Nov, 2016 1 commit
    •
      scsi: qla2xxx: Fix scsi scan hang triggered if adapter fails during init · a5dd506e
      Bill Kuzeja committed
      A system can get hung task timeouts if a qlogic board fails during
      initialization (if the board breaks again or fails the init). The hang
      involves the scsi scan.
      
      In a nutshell, since commit beb9e315 ("qla2xxx: Prevent removal and
      board_disable race"):
      
      ...it is possible to have freed ha (base_vha->hw) early by a call to
      qla2x00_remove_one when pdev->enable_cnt equals zero:
      
             if (!atomic_read(&pdev->enable_cnt)) {
                     scsi_host_put(base_vha->host);
                     kfree(ha);
                     pci_set_drvdata(pdev, NULL);
                     return;
      
      Almost always, the scsi_host_put above frees the vha structure
      (attached to the end of the Scsi_Host we're putting) since it's the last
      put, and life is good.  However, if we are entering this routine because
      the adapter has broken sometime during initialization AND a scsi scan is
      already in progress (and has done its own scsi_host_get), vha will not
      be freed. What's worse, the scsi scan will access the freed ha structure
      through qla2xxx_scan_finished:
      
              if (time > vha->hw->loop_reset_delay * HZ)
                      return 1;
      
      The scsi scan keeps checking to see if a scan is complete by calling
      qla2xxx_scan_finished. There is a timeout value that limits the length
      of time a scan can take (hw->loop_reset_delay, usually set to 5
      seconds), but this definition is in the data structure (hw) that can get
      freed early.
      
      This can yield unpredictable results, the worst of which is that the
      scsi scan can hang indefinitely. This happens when the freed structure
      gets reused and loop_reset_delay gets overwritten with garbage, which
      the scan obliviously uses as its timeout value.
      
      The fix for this is simple: at the top of qla2xxx_scan_finished, check
      for the UNLOADING bit in the vha structure (_vha is not freed at this
      point).  If UNLOADING is set, we exit the scan for this adapter
      immediately. After this last reference to the ha structure, we'll exit
      the scan for this adapter, and continue on.
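The early exit described above can be modeled in userspace as follows. This is a minimal sketch, not the driver code: `model_hw`, `model_vha`, `model_scan_finished`, the `unloading` boolean (standing in for the UNLOADING bit), the `HZ` value and the final loop-state check are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define HZ 100UL        /* assumed tick rate for this model */
#define LOOP_READY 1    /* assumed "link is ready" state value */

/* Hypothetical stand-in for the hw structure that can be freed early. */
struct model_hw {
	unsigned long loop_reset_delay;   /* scan timeout, in seconds */
};

/* Hypothetical stand-in for the vha structure, which outlives hw. */
struct model_vha {
	bool unloading;                   /* models the UNLOADING bit */
	int loop_state;
	struct model_hw *hw;              /* may be stale once unloading is set */
};

/* Returns 1 when the scan should stop polling, 0 when it should continue. */
static int model_scan_finished(const struct model_vha *vha, unsigned long time)
{
	assert(vha != NULL);

	/* Test the flag first so vha->hw is never dereferenced during teardown. */
	if (vha->unloading)
		return 1;

	/* Only reached while hw is still valid. */
	if (time > vha->hw->loop_reset_delay * HZ)
		return 1;

	return vha->loop_state == LOOP_READY;
}
```

The ordering is the point of the sketch: the flag lives in the structure that is still allocated, so checking it before the timeout comparison keeps the possibly freed structure out of the scan path.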
      
      This problem is hard to hit, but I have run into it doing negative
      testing many times now (with a test specifically designed to bring it
      out), so I can verify that this fix works. My testing has been against a
      RHEL7 driver variant, but the bug and patch are equally relevant to
      the upstream driver.
      
      Fixes: beb9e315 ("qla2xxx: Prevent removal and board_disable race")
      Cc: <stable@vger.kernel.org> # v3.18+
      Signed-off-by: Bill Kuzeja <william.kuzeja@stratus.com>
      Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  3. 15 Sep, 2016 1 commit
  4. 31 Aug, 2016 1 commit
  5. 09 Aug, 2016 1 commit
  6. 16 Jul, 2016 20 commits
  7. 06 Jul, 2016 1 commit
    •
      qla2xxx: Fix NULL pointer deref in QLA interrupt · 262e2bfd
      Bruno Prémont committed
      In qla24xx_process_response_queue() rsp->msix->cpuid may trigger NULL
      pointer dereference when rsp->msix is NULL:
      
      [    5.622457] NULL pointer dereference at 0000000000000050
      [    5.622457] IP: [<ffffffff8155e614>] qla24xx_process_response_queue+0x44/0x4b0
      [    5.622457] PGD 0
      [    5.622457] Oops: 0000 [#1] SMP
      [    5.622457] Modules linked in:
      [    5.622457] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.6.3-x86_64 #1
      [    5.622457] Hardware name: HP ProLiant DL360 G5, BIOS P58 05/02/2011
      [    5.622457] task: ffff8801a88f3740 ti: ffff8801a8954000 task.ti: ffff8801a8954000
      [    5.622457] RIP: 0010:[<ffffffff8155e614>]  [<ffffffff8155e614>] qla24xx_process_response_queue+0x44/0x4b0
      [    5.622457] RSP: 0000:ffff8801afb03de8  EFLAGS: 00010002
      [    5.622457] RAX: 0000000000000000 RBX: 0000000000000032 RCX: 00000000ffffffff
      [    5.622457] RDX: 0000000000000002 RSI: ffff8801a79bf8c8 RDI: ffff8800c8f7e7c0
      [    5.622457] RBP: ffff8801afb03e68 R08: 0000000000000000 R09: 0000000000000000
      [    5.622457] R10: 00000000ffff8c47 R11: 0000000000000002 R12: ffff8801a79bf8c8
      [    5.622457] R13: ffff8800c8f7e7c0 R14: ffff8800c8f60000 R15: 0000000000018013
      [    5.622457] FS:  0000000000000000(0000) GS:ffff8801afb00000(0000) knlGS:0000000000000000
      [    5.622457] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [    5.622457] CR2: 0000000000000050 CR3: 0000000001e07000 CR4: 00000000000006e0
      [    5.622457] Stack:
      [    5.622457]  ffff8801afb03e30 ffffffff810c0f2d 0000000000000086 0000000000000002
      [    5.622457]  ffff8801afb03e28 ffffffff816570e1 ffff8800c8994628 0000000000000002
      [    5.622457]  ffff8801afb03e60 ffffffff816772d4 b47c472ad6955e68 0000000000000032
      [    5.622457] Call Trace:
      [    5.622457]  <IRQ>
      [    5.622457]  [<ffffffff810c0f2d>] ? __wake_up_common+0x4d/0x80
      [    5.622457]  [<ffffffff816570e1>] ? usb_hcd_resume_root_hub+0x51/0x60
      [    5.622457]  [<ffffffff816772d4>] ? uhci_hub_status_data+0x64/0x240
      [    5.622457]  [<ffffffff81560d00>] qla24xx_intr_handler+0xf0/0x2e0
      [    5.622457]  [<ffffffff810d569e>] ? get_next_timer_interrupt+0xce/0x200
      [    5.622457]  [<ffffffff810c89b4>] handle_irq_event_percpu+0x64/0x100
      [    5.622457]  [<ffffffff810c8a77>] handle_irq_event+0x27/0x50
      [    5.622457]  [<ffffffff810cb965>] handle_edge_irq+0x65/0x140
      [    5.622457]  [<ffffffff8101a498>] handle_irq+0x18/0x30
      [    5.622457]  [<ffffffff8101a276>] do_IRQ+0x46/0xd0
      [    5.622457]  [<ffffffff817f8fff>] common_interrupt+0x7f/0x7f
      [    5.622457]  <EOI>
      [    5.622457]  [<ffffffff81020d38>] ? mwait_idle+0x68/0x80
      [    5.622457]  [<ffffffff8102114a>] arch_cpu_idle+0xa/0x10
      [    5.622457]  [<ffffffff810c1b97>] default_idle_call+0x27/0x30
      [    5.622457]  [<ffffffff810c1d3b>] cpu_startup_entry+0x19b/0x230
      [    5.622457]  [<ffffffff810324c6>] start_secondary+0x136/0x140
      [    5.622457] Code: 00 00 65 48 8b 04 25 28 00 00 00 48 89 45 d0 31 c0 48 8b 47 58 a8 02 0f 84 c5 00 00 00 48 8b 46 50 49 89 f4 65 8b 15 34 bb aa 7e <39> 50 50 74 11 89 50 50 48 8b 46 50 8b 40 50 41 89 86 60 8b 00
      [    5.622457] RIP  [<ffffffff8155e614>] qla24xx_process_response_queue+0x44/0x4b0
      [    5.622457]  RSP <ffff8801afb03de8>
      [    5.622457] CR2: 0000000000000050
      [    5.622457] ---[ end trace fa2b19c25106d42b ]---
      [    5.622457] Kernel panic - not syncing: Fatal exception in interrupt
      
      The affected code was introduced by commit cdb898c5
      (qla2xxx: Add irq affinity notification).
      
      Only dereference rsp->msix when it has been set so the machine can boot
      fine. Possibly rsp->msix is unset because:
      [    3.479679] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 8.07.00.33-k.
      [    3.481839] qla2xxx [0000:13:00.0]-001d: : Found an ISP2432 irq 17 iobase 0xffffc90000038000.
      [    3.484081] qla2xxx [0000:13:00.0]-0035:0: MSI-X; Unsupported ISP2432 (0x2, 0x3).
      [    3.485804] qla2xxx [0000:13:00.0]-0037:0: Falling back-to MSI mode -258.
      [    3.890145] scsi host0: qla2xxx
      [    3.891956] qla2xxx [0000:13:00.0]-00fb:0: QLogic QLE2460 - PCI-Express Single Channel 4Gb Fibre Channel HBA.
      [    3.894207] qla2xxx [0000:13:00.0]-00fc:0: ISP2432: PCIe (2.5GT/s x4) @ 0000:13:00.0 hdma+ host#=0 fw=7.03.00 (9496).
      [    5.714774] qla2xxx [0000:13:00.0]-500a:0: LOOP UP detected (4 Gbps).
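The guard described above ("only dereference rsp->msix when it has been set") can be sketched in userspace like this. The type and function names (`model_msix_entry`, `model_rsp_que`, `model_update_cpuid`) and the idea of passing the current CPU as a parameter are assumptions for illustration; the real handler obtains the CPU number from the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-vector MSI-X bookkeeping entry. */
struct model_msix_entry {
	int cpuid;                        /* CPU that last serviced the queue */
};

/* Hypothetical stand-in for the response queue. */
struct model_rsp_que {
	struct model_msix_entry *msix;    /* NULL when MSI-X is not in use */
};

/*
 * Records the CPU currently servicing the queue.
 * Returns true if the recorded CPU changed, false otherwise.
 */
static bool model_update_cpuid(struct model_rsp_que *rsp, int current_cpu)
{
	assert(rsp != NULL);

	/* The NULL test must come first; && short-circuits before the deref. */
	if (rsp->msix && rsp->msix->cpuid != current_cpu) {
		rsp->msix->cpuid = current_cpu;
		return true;
	}
	return false;
}
```

With an adapter that fell back to MSI, as in the log above, `msix` would stay NULL in this model and the function simply returns without touching it.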
      Signed-off-by: Bruno Prémont <bonbons@linux-vserver.org>
      Acked-by: Quinn Tran <quinn.tran@qlogic.com>
      CC: <stable@vger.kernel.org>  # 4.5+
      Fixes: cdb898c5
      Signed-off-by: James Bottomley <jejb@linux.vnet.ibm.com>
  8. 21 Jun, 2016 1 commit
  9. 10 May, 2016 2 commits
  10. 28 Apr, 2016 1 commit
  11. 16 Apr, 2016 1 commit
  12. 12 Apr, 2016 1 commit
  13. 19 Mar, 2016 1 commit
    •
      qla2xxx: avoid maybe_uninitialized warning · bc7095a9
      Arnd Bergmann committed
      The qlt_check_reserve_free_req() function produces an incorrect warning
      when CONFIG_PROFILE_ANNOTATED_BRANCHES is set:
      
      drivers/scsi/qla2xxx/qla_target.c: In function 'qlt_check_reserve_free_req':
      drivers/scsi/qla2xxx/qla_target.c:1887:3: error: 'cnt_in' may be used uninitialized in this function [-Werror=maybe-uninitialized]
         ql_dbg(ql_dbg_io, vha, 0x305a,
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
             "qla_target(%d): There is no room in the request ring: vha->req->ring_index=%d, vha->req->cnt=%d, req_cnt=%d Req-out=%d Req-in=%d Req-Length=%d\n",
             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
             vha->vp_idx, vha->req->ring_index,
             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
             vha->req->cnt, req_cnt, cnt, cnt_in, vha->req->length);
             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      drivers/scsi/qla2xxx/qla_target.c:1887:3: error: 'cnt' may be used uninitialized in this function [-Werror=maybe-uninitialized]
      
      The problem is that gcc fails to track the state of the condition across
      an annotated branch.
      
      This slightly rearranges the code to move the second if() block
      into the first one, to avoid the warning while retaining the
      behavior of the code.
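The rearrangement can be illustrated with a userspace model. This is a sketch under assumptions, not the driver function: `model_req`, `model_reserve`, `hw_out` (a stand-in for reading the hardware out-pointer) and `MODEL_EAGAIN` are hypothetical names, and the `fprintf` call stands in for the debug message that reads `cnt`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define MODEL_EAGAIN 11   /* assumed error value for "ring is full" */

/* Hypothetical stand-in for the request ring state. */
struct model_req {
	uint32_t cnt;          /* free entries the driver believes it has */
	uint32_t ring_index;   /* driver's producer index */
	uint32_t length;       /* total ring entries */
	uint32_t hw_out;       /* stand-in for the hardware's consumer index */
};

/* Reserves req_cnt entries; returns 0 on success, -MODEL_EAGAIN if full. */
static int model_reserve(struct model_req *req, uint32_t req_cnt)
{
	assert(req != NULL);

	if (req->cnt < req_cnt + 2) {
		uint32_t cnt = req->hw_out;   /* assigned only on this path */

		if (req->ring_index < cnt)
			req->cnt = cnt - req->ring_index;
		else
			req->cnt = req->length - (req->ring_index - cnt);

		/*
		 * The re-check is nested inside the branch that assigns cnt,
		 * so the compiler sees cnt initialized wherever it is read.
		 */
		if (req->cnt < req_cnt + 2) {
			fprintf(stderr, "no room: cnt=%u req_cnt=%u out=%u\n",
				(unsigned)req->cnt, (unsigned)req_cnt,
				(unsigned)cnt);
			return -MODEL_EAGAIN;
		}
	}

	req->cnt -= req_cnt;
	return 0;
}
```

Nesting does not change behavior in this model: if the outer condition is false, `req->cnt` is left unmodified, so a second, separate test of the same condition could never have been true either.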
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Acked-By: Himanshu Madhani <himanshu.madhani@qlogic.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  14. 11 Mar, 2016 2 commits
  15. 24 Feb, 2016 4 commits