1. 17 11月, 2007 2 次提交
    • M
      [SCSI] zfcp: fix cleanup of dismissed error recovery actions · 86e8dfc5
      Martin Peschke 提交于
      Calling zfcp_erp_strategy_check_action() after zfcp_erp_action_to_running()
      in zfcp_erp_strategy() might cause an unbalanced up() for erp_ready_sem,
      which makes the zfcp recovery fail somewhere along the way:
      
      erp thread processing erp_action:
      |
      |	someone waking up erp thread for erp_action
      |	|
      |	|		someone else dismissing erp_action:
      |	|		|
      V	V		V
      
      	write_lock_irqsave(&adapter->erp_lock, flags);
      	...
      	if (zfcp_erp_action_exists(erp_action) == ZFCP_ERP_ACTION_RUNNING) {
      		zfcp_erp_action_to_ready(erp_action);
      		up(&adapter->erp_ready_sem);	/* first up() for erp_action */
      	}
      	write_unlock_irqrestore(&adapter->erp_lock, flags);
      
      write_lock_irqsave(&adapter->erp_lock, flags);
      ...
      zfcp_erp_action_to_running(erp_action);
      write_unlock_restore(&adapter->erp_lock, flags);
      /* processing erp_action */
      
      			write_lock_irqsave(&adapter->erp_lock, flags);
      			...
      			erp_action->status |= ZFCP_STATUS_ERP_DISMISSED;
      			if (zfcp_erp_action_exists(erp_action) ==
      						ZFCP_ERP_ACTION_RUNNING) {
      				zfcp_erp_action_to_ready(erp_action);
      				up(&adapter->erp_ready_sem);
      				/* second, unbalanced up() for erp_action */
      			}
      			...
      			write_unlock_restore(&adapter->erp_lock, flags);
      
      write_lock_irqsave(&adapter->erp_lock, flags);
      if (erp_action->status & ZFCP_STATUS_ERP_DISMISSED) {
      	zfcp_erp_action_dequeue(erp_action);
      	retval = ZFCP_ERP_DISMISSED;
      }
      ...
      write_unlock_restore(&adapter->erp_lock, flags);
      down(&adapter->erp_ready_sem);
      /* this down() is meant to balance the first up() */
      
      The erp thread must not dismiss an erp_action after moving that action to
      erp_running_head. Instead it should just go through the down() operation,
      which balances the first up(), and run through zfcp_erp_strategy one more
      time for the second up(), which eventually cleans up erp_action. Which
      is similar to the normal processing of an event for erp_action doing
      something asynchronously (e.g. waiting for the completion of an fsf_req).
      
      This only works if we make sure that a dismissed erp_action is passed to
      zfcp_erp_strategy() prior to the other action, which caused actions to be
      dismissed. Therefore the patch implements this rule: running actions go to
      the head of the ready list; new actions go to the tail of the ready list;
      the erp thread picks actions to be processed from the ready list's head.
      Signed-off-by: NMartin Peschke <mp3@de.ibm.com>
      Acked-by: NSwen Schillig <swen@vnet.ibm.com>
      Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
      86e8dfc5
    • M
      [SCSI] zfcp: fix dismissal of error recovery actions · d0076f77
      Martin Peschke 提交于
      zfcp_erp_action_dismiss() used to ignore any actions in the ready list. This
      is a bug. Any action superseded by a stronger action needs to be dismissed.
      This patch changes zfcp_erp_action_dismiss() so that it dismisses actions
      regardless of their list affiliation. The ERP thread is able to handle this.
      It is important to kick the erp thread only for actions in the running list,
      though, as an imbalance of wakeup signals would confuse the erp thread
      otherwise.
      Signed-off-by: NMartin Peschke <mp3@de.ibm.com>
      Acked-by: NSwen Schillig <swen@vnet.ibm.com>
      Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
      d0076f77
  2. 24 10月, 2007 1 次提交
  3. 23 10月, 2007 1 次提交
  4. 13 10月, 2007 3 次提交
  5. 12 10月, 2007 1 次提交
  6. 19 7月, 2007 1 次提交
  7. 20 6月, 2007 1 次提交
  8. 09 5月, 2007 4 次提交
  9. 01 4月, 2007 1 次提交
  10. 11 2月, 2007 2 次提交
  11. 06 2月, 2007 1 次提交
  12. 11 10月, 2006 1 次提交
  13. 24 9月, 2006 2 次提交
  14. 07 8月, 2006 2 次提交
  15. 04 7月, 2006 1 次提交
    • H
      [PATCH] zfcp: fix incorrect usage of erp_lock · 9f09c548
      Heiko Carstens 提交于
        =================================
        [ INFO: inconsistent lock state ]
        ---------------------------------
        inconsistent {hardirq-on-W} -> {in-hardirq-W} usage.
        swapper/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
         (&adapter->erp_lock){+-..}, at: [<000000000026c7f8>] zfcp_erp_async_handler+0x3c/0x70
        {hardirq-on-W} state was registered at:
          [<000000000005f33e>] __lock_acquire+0x30a/0xed0
          [<00000000000604ae>] lock_acquire+0x9a/0xc8
          [<000000000035a7ae>] _write_lock+0x4e/0x68
          [<000000000026d822>] zfcp_erp_adapter_strategy_generic+0x286/0xd94
          [<000000000026fd72>] zfcp_erp_strategy_do_action+0x91e/0x1a94
          [<0000000000271a3a>] zfcp_erp_thread+0x21a/0x1568
          [<0000000000019096>] kernel_thread_starter+0x6/0xc
          [<0000000000019090>] kernel_thread_starter+0x0/0xc
        irq event stamp: 12078
        hardirqs last  enabled at (12077): [<0000000000019416>] cpu_idle+0x206/0x250
        hardirqs last disabled at (12078): [<0000000000020458>] io_no_vtime+0xc/0x1c
        softirqs last  enabled at (12072): [<0000000000040b62>] __do_softirq+0x13a/0x180
        softirqs last disabled at (12059): [<000000000001fd58>] do_softirq+0xec/0xf0
      
        other info that might help us debug this:
        no locks held by swapper/0.
      
        stack backtrace:
        00000000012bb648 0000000000000002 0000000000000000 00000000012bb758
               00000000012bb6c0 0000000000399122 0000000000399122 0000000000016b0a
               0000000000000000 0000000000000001 0000000000000000 00000000004660e8
               0000000000000000 000000000000000d 00000000012bb6b8 00000000012bb730
               0000000000368b90 0000000000016b0a 00000000012bb6b8 00000000012bb708
        Call Trace:
        ([<0000000000016a26>] show_trace+0x76/0xdc)
         [<0000000000016b2c>] show_stack+0xa0/0xd0
         [<0000000000016b8a>] dump_stack+0x2e/0x3c
         [<000000000005e3da>] print_usage_bug+0x27e/0x290
         [<000000000005e934>] mark_lock+0x548/0x6c0
         [<000000000005fb0c>] __lock_acquire+0xad8/0xed0
         [<00000000000604ae>] lock_acquire+0x9a/0xc8
         [<000000000035a662>] _write_lock_irqsave+0x62/0x80
         [<000000000026c7f8>] zfcp_erp_async_handler+0x3c/0x70
         [<0000000000279178>] zfcp_fsf_req_dispatch+0xd8/0x1fa8
         [<000000000027e538>] zfcp_fsf_req_complete+0x104/0xe4c
         [<0000000000274534>] zfcp_qdio_reqid_check+0xf4/0x178
         [<000000000027469e>] zfcp_qdio_response_handler+0xe6/0x430
         [<0000000000219dd4>] tiqdio_thinint_handler+0xd20/0x213c
         [<000000000020229a>] do_adapter_IO+0xb2/0xc0
         [<0000000000206f32>] do_IRQ+0x136/0x16c
         [<0000000000020462>] io_no_vtime+0x16/0x1c
         [<0000000000019432>] cpu_idle+0x222/0x250
        ([<0000000000019416>] cpu_idle+0x206/0x250)
         [<000000000001405a>] rest_init+0x5a/0x68
         [<0000000000536998>] start_kernel+0x39c/0x3dc
         [<0000000000013046>] _stext+0x46/0x1000
      
      Fix incorrect usage of erp_lock. Using the write_lock() variant is wrong,
      since this might lead to deadlocks.
      Acked-by: NAndreas Herrmann <aherrman@de.ibm.com>
      Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      9f09c548
  16. 27 6月, 2006 1 次提交
  17. 29 5月, 2006 6 次提交
  18. 10 3月, 2006 1 次提交
    • A
      [SCSI] zfcp: fix device registration issues · ad58f7db
      Andreas Herrmann 提交于
      The patch fixes following issues:
      
      (1) Replace scsi_add_device with scsi_scan_target.
      (Thus the rport instead of the scsi_host becomes parent of a
      scsi_target again.)
      
      (2) Avoid scsi_device allocation during registration of an remote port.
      (Would be done during fc_scsi_scan_rport.)
      
      (3) Fix queuecommand behaviour when an zfcp unit is blocked.
      (Call scsi_done with DID_NO_CONNECT instead of returning
      SCSI_MLQUEUE_DEVICE_BUSY otherwise we might end up waiting
      for completion in blk_execute_rq for ever.)
      Signed-off-by: NAndreas Herrmann <aherrman@de.ibm.com>
      Signed-off-by: NJames Bottomley <James.Bottomley@SteelEye.com>
      ad58f7db
  19. 13 2月, 2006 2 次提交
  20. 02 2月, 2006 1 次提交
  21. 15 1月, 2006 2 次提交
  22. 02 12月, 2005 2 次提交
  23. 20 9月, 2005 1 次提交