- 30 7月, 2009 5 次提交
-
-
由 Christof Schmitt 提交于
Depending on interruptions on some storage systems, the complete channel can stall which looks like an outbound queue stall to Linux. When trying to acquire a free SBAL for a non-SCSI command, zfcp waits for 5 seconds for a free slot to appear. This is the right place to detect a queue stall: If the wait times out, we assume a stalled queue and try to recover this. The overall strategy should be to trigger the erp from specific events, and not try an overall escalation from one failed port to a full-blown queue recovery. If we manage to send a command, the status codes for this command or a timeout will trigger the right follow-on actions. Reviewed-by: NSwen Schillig <swen@vnet.ibm.com> Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
由 Christof Schmitt 提交于
-ENOMEM is for memory allocation problems, -EIO for queue/SBAL allocation problems. Reviewed-by: NSwen Schillig <swen@vnet.ibm.com> Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
由 Christof Schmitt 提交于
The ELS ADISC and the GID_PN requests sent from zfcp fit into unchained FSF requests. Change the FSF allocation logic to use unchained requests whenever possible where everything fits in one SBAL. This avoids acquiring more SBALs than necessary, especially during zfcp recovery when things might be stalled. Reviewed-by: NSwen Schillig <swen@vnet.ibm.com> Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
由 Christof Schmitt 提交于
When a fsf_req or a qtcb cannot be allocated return -ENOMEM instead of -EIO. Reviewed-by: NSwen Schillig <swen@vnet.ibm.com> Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
由 Swen Schillig 提交于
We should not modify the port status after triggering an ERP action for the port. It is not guaranteed which status is finally active when the ERP action is performed. This can lead to situations which are unwanted and hard to debug in case of a failure. Signed-off-by: NSwen Schillig <swen@vnet.ibm.com> Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
- 13 6月, 2009 1 次提交
-
-
由 Christof Schmitt 提交于
Don't access the block layer request, get the payload length instead from the FC job. Simplify access to the zfcp_port, only the d_id is required, if the port is no longer accessed later. This is possible when the els_handler does not access the port pointer from the ELS request. Reviewed-by: NSwen Schillig <swen@vnet.ibm.com> Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
- 24 5月, 2009 4 次提交
-
-
由 Christof Schmitt 提交于
Keep the information about the device and model id in zfcp_ccw. This requires an additional helper function to check for the privileged cfdc subchannel, but it allows the removal of the redundant defines from the zfcp_def header file. Reviewed-by: NSwen Schillig <swen@vnet.ibm.com> Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
由 Martin Petermann 提交于
In rare cases, open port request might timeout, erp calls zfcp_port_put, port gets dequeued. Now, the late returning (or dismissed) fsf-port-open calls the fsf_port_open_handler that tries to reference the port data structure leading to a kernel oops. Signed-off-by: NMartin Petermann <martin.petermann@de.ibm.com> Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
由 Christof Schmitt 提交于
Add comments where there is a deliberate fall through in switch/case statements. This makes some code checkers happy and makes it clear that there is no missing break statement. Reviewed-by: NSwen Schillig <swen@vnet.ibm.com> Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
由 Christof Schmitt 提交于
enum dma_data_direction only has the 4 values DMA_BIDIRECTIONAL, DMA_TO_DEVICE, DMA_FROM_DEVICE and DMA_NONE. No need to have the default case. While changing this, setup sbtype in one place to make sparse happy. The default value of retval is already -EIO, so remove the additional assignment for these two cases. Reviewed-by: NSwen Schillig <swen@vnet.ibm.com> Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
- 27 4月, 2009 5 次提交
-
-
由 Christof Schmitt 提交于
The zfcp_port might have been removed, while the FC fast_io_fail timer is still running and could trigger the terminate_rport_io callback. Set the pointer to the zfcp_port to NULL and check accordingly before using it. Reviewed-by: NMartin Petermann <martin@linux.vnet.ibm.com> Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
由 Martin Petermann 提交于
The current sbal counting can be wrong if a fsf request is waiting for free sbals and at the same time qdio request queue is shutdown and re-opened. Revering a previous patch fixes this issue. Signed-off-by: NMartin Petermann <martin.petermann@de.ibm.com> Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
由 Christof Schmitt 提交于
Error codes specific to the control file requests are evaluated by the actcli tool, so don't report -ENXIO for those. Generic problems are still checked for outside the command specific handler. Reviewed-by: NMartin Petermann <martin@linux.vnet.ibm.com> Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
由 Christof Schmitt 提交于
Fix problem that zfcp_fsf_exchange_config_data_sync and zfcp_fsf_exchange_config_data_sync could try to call zfcp_fsf_req_free with a NULL pointer. Reviewed-by: NMartin Petermann <martin@linux.vnet.ibm.com> Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
由 Martin Petermann 提交于
Avoid referencing a fsf request after sending it in fcp_fsf_req_send, it might have already completed and deallocated. Signed-off-by: NMartin Petermann <martin@linux.vnet.ibm.com> Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
- 13 3月, 2009 12 次提交
-
-
由 Christof Schmitt 提交于
Report the fc_host_port_type as FC_PORTTYPE_NPIV when the subchannel is running in NPIV mode. This allows to see the correct type with lsscsi -H -t --list Acked-by: NSwen Schillig <swen@vnet.ibm.com> Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
由 Christof Schmitt 提交于
Use the I/O blocking mechanism in the FC transport class to allow faster failovers for multipathing: - Call fc_remote_port_delete early to set the rport to BLOCKED. - Check the rport status in queuecommand with fc_remote_portchkready to no longer accept new I/O for this port and fail the I/O with the appropriate scsi_cmnd result. - Implement the terminate_rport_io handler to abort all pending I/O requests - Return SCSI commands with DID_TRANSPORT_DISRUPTED while erp is running. - When updating the remote port status, check for late changes and update the remote ports status accordingly. Acked-by: NSwen Schillig <swen@vnet.ibm.com> Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
由 Swen Schillig 提交于
After an error condition resolved a remote storage port was never re-opened. The incoming RSCN was not processed accordingly due to a misinterpreted status flag / return value combination. Signed-off-by: NSwen Schillig <swen@vnet.ibm.com> Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
由 Swen Schillig 提交于
The current number based id ERP logging is replaced by a string based tag version. The benefit is an easier location of the code in question and the removal of the lengthy array referencing the individual messages. The string (7 bytes) based version does not use more space since those bytes were "used" anyway due to the alignment of the structure. The encoding of the 7 byte string is as follows [0-1] = filename [2-5] = task/function [6] = section Due to the character of this string (fixed length) a string termination is not required here. Signed-off-by: NSwen Schillig <swen@vnet.ibm.com> Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
由 Swen Schillig 提交于
The status read response FSF_STATUS_READ_SUB_ERROR_PORT is not defined in the specs and therefore not valid. All occurrences are removed from the code. Signed-off-by: NSwen Schillig <swen@vnet.ibm.com> Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
由 Christof Schmitt 提交于
Issue ELS ADISC requests from workqueue. This allows the link test request to be sent when the request queue is full due to I/O load for other remote ports. It also simplifies request queue locking, zfcp_fsf_send_fcp_command_task is now the only function that has interrupts disabled from the caller. This is also a prereq for the FC passthrough support that issues ELS requests from userspace. Acked-by: NSwen Schillig <swen@vnet.ibm.com> Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
由 Christof Schmitt 提交于
When the SCSI midlayer is running error recovery, the low-level error recovery in zfcp could be running and preventing the SCSI midlayer to issue error recovery requests. To avoid unnecessary error recovery escalation, wait for the zfcp erp to finish and retry if necessary. While reworking the SCSI eh handlers, alsa cleanup the code and simplify the interface from zfcp_scsi to the fsf layer. Acked-by: NSwen Schillig <swen@vnet.ibm.com> Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
由 Christof Schmitt 提交于
For calls from zfcp erp, scsi_eh and sysfs switch the calls issuing FSF requests to zfcp_fsf_req_sbal_get to wait for free SBALs. Acked-by: NSwen Schillig <swen@vnet.ibm.com> Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
由 Christof Schmitt 提交于
Only increment the req_id for successfully issued requests. This avoids some confusion when debugging issued fsf requests. Acked-by: NSwen Schillig <swen@vnet.ibm.com> Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
由 Christof Schmitt 提交于
The lock only needs to protect the softirq context called from qdio against the userspace context called from sysfs. spin_lock and spin_lock_bh is enough. Acked-by: NSwen Schillig <swen@vnet.ibm.com> Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
由 Christof Schmitt 提交于
PORT_PHYS_CLOSING is only set and cleared, but not actually used for status checking. PORT_INVALID_WWPN is set when the GID_PN request does not return a d_id for a remote port, e.g. when a remote port has been unplugged. For this case, the d_id is zero. In the erp we can check the d_id and use the normal escalation procedure that gives up after three retries and remove the special case. PORT_NO_WWPN is unused: Each port in the remote port list has a valid wwpn. The WKA ports are now tracked outside the port list. Remove the PORT_NO_WWPN flag, since this is no longer set for any port. Acked-by: NSwen Schillig <swen@vnet.ibm.com> Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
由 Martin K. Petersen 提交于
The SUGGEST_* flags in the SCSI command result have been out of fashion for a while and we don't actually use them in the error handling. Remove the remaining occurrences. Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
- 30 12月, 2008 5 次提交
-
-
由 Christof Schmitt 提交于
Remove a message that was emitted for a port that could not initially be opened. This is a rare case when the port discovery hits an initiator port and only confuses the user with an initator port logged in the message. Remove the whole special case: The failed "open port" request triggers required follow-up actions anyway. Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Acked-by: NFelix Beck <felix@linux.vnet.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
由 Christof Schmitt 提交于
Add the support to send CT and ELS requests as unchained FSF requests. This is required for older hardware and was somehow omitted during the cleanup of the FSF layer. The req_count and resp_count attributes are unused, so remove them instead of adding a special case for setting them. Also add debug data and a warning, when the ct request hits a limit. Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Acked-by: NMartin Petermann <martin@linux.vnet.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
由 Christof Schmitt 提交于
The port flag DID_DID indicates whether we know the current id of the port. This is always set in parallel. Since the id 0 is invalid (because the port id 0 is invalid) we can remove the DID_DID flag: d_id of 0 indicates an invalid d_id != 0 is a valid one. Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Acked-by: NFelix Beck <felix@linux.vnet.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
由 Christof Schmitt 提交于
When waiting for a request claim the SBAL before waiting. This way, locking before each check of the free counter is not required and sparse does not emit warnings for the complicated locking scheme. Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Acked-by: NFelix Beck <felix@linux.vnet.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
由 Christof Schmitt 提交于
Move the closing parenthesis before the line break. Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Acked-by: NFelix Beck <felix@linux.vnet.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
- 25 12月, 2008 1 次提交
-
-
由 Christof Schmitt 提交于
Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 02 12月, 2008 3 次提交
-
-
由 Swen Schillig 提交于
The check of having a valid pointer was performed before the processing was secured by the lock. Between those two steps the pointer can turn invalid. During further processing another value is used (referenced by the pointer described above) as a function pointer which is never verified to be valid either, resulting under some circumstances in an invalid function call. This patch is fixing both issues. Signed-off-by: NSwen Schillig <swen@vnet.ibm.com> Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
由 Swen Schillig 提交于
Aborting a SCSI cmnd might requrie to send a abort_fsf_cmnd. If the creation of this fsf_req fails an ERR_PTR is returned where a NULL value would be expected as an error indicator. This ERR_PTR is dereferenced as valid fsf_req in succeeding processing leading to an error. Signed-off-by: NSwen Schillig <swen@vnet.ibm.com> Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
由 Christof Schmitt 提交于
Running two wka_port_get calls in parallel could issue two open_port requests, overwriting the port handle. Don't issue an open_port for the state PORT_OPENING, and only read the data from GOOD responses. Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Acked-by: NSwen Schillig <swen@vnet.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
- 06 11月, 2008 3 次提交
-
-
由 Christof Schmitt 提交于
Fix the handling of the request list in the error path: - Use irqsave for the lock as in the good path. - Before removing the request, check if it is still in the list, a call to dismiss_all might have changed the list in between. - zfcp_qdio_send does not change the queue counters on failure, trying revert something is wrong, so remove this. Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: NSwen Schillig <swen@vnet.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
由 Christof Schmitt 提交于
When allocating fsf requests without qtcb, store the pointer to the mempool in the fsf requests for later call to mempool_free. This codepath is only used by the status_read requests. Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: NSwen Schillig <swen@vnet.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
由 Heiko Carstens 提交于
The per adapter req_list_lock must be held with interrupts disabled, otherwise we might end up with nice deadlocks as lockdep tells us (see below). zfcp 0.0.1804: QDIO problem occurred. ========================================================= [ INFO: possible irq lock inversion dependency detected ] 2.6.27-rc8-00035-g4a770358-dirty #86 --------------------------------------------------------- swapper/0 just changed the state of lock: (&adapter->erp_lock){++..}, at: [<00000000002c82ae>] zfcp_erp_adapter_reopen+0x4e/0x8c but this lock took another, hard-irq-unsafe lock in the past: (&adapter->req_list_lock){-+..} and interrupts could create inverse lock ordering between them. [tons of backtraces, but only the interesting part follows] the second lock's dependencies: -> (&adapter->req_list_lock){-+..} ops: 2280627634176 { initial-use at: [<0000000000071f10>] __lock_acquire+0x504/0x18bc [<000000000007335c>] lock_acquire+0x94/0xbc [<00000000003d7224>] _spin_lock_irqsave+0x6c/0xb0 [<00000000002cf684>] zfcp_fsf_req_dismiss_all+0x50/0x140 [<00000000002c87ee>] zfcp_erp_adapter_strategy_generic+0x66/0x3d0 [<00000000002c9498>] zfcp_erp_thread+0x88c/0x1318 [<000000000001b0d2>] kernel_thread_starter+0x6/0xc [<000000000001b0cc>] kernel_thread_starter+0x0/0xc in-softirq-W at: [<0000000000072172>] __lock_acquire+0x766/0x18bc [<000000000007335c>] lock_acquire+0x94/0xbc [<00000000003d7224>] _spin_lock_irqsave+0x6c/0xb0 [<00000000002ca73e>] zfcp_qdio_int_resp+0xbe/0x2ac [<000000000027a1d6>] qdio_kick_inbound_handler+0x82/0xa0 [<000000000027daba>] tiqdio_inbound_processing+0x62/0xf8 [<0000000000047ba4>] tasklet_action+0x100/0x1f4 [<0000000000048b5a>] __do_softirq+0xae/0x154 [<0000000000021e4a>] do_softirq+0xea/0xf0 [<00000000000485de>] irq_exit+0xde/0xe8 [<0000000000268c64>] do_IRQ+0x160/0x1fc [<00000000000261a2>] io_return+0x0/0x8 [<000000000001b8f8>] cpu_idle+0x17c/0x224 hardirq-on-W at: [<0000000000072190>] __lock_acquire+0x784/0x18bc [<000000000007335c>] lock_acquire+0x94/0xbc [<00000000003d702c>] _spin_lock+0x5c/0x9c [<00000000002caff6>] zfcp_fsf_req_send+0x3e/0x158 [<00000000002ce7fe>] zfcp_fsf_exchange_config_data+0x106/0x124 [<00000000002c8948>] zfcp_erp_adapter_strategy_generic+0x1c0/0x3d0 [<00000000002c98ea>] zfcp_erp_thread+0xcde/0x1318 [<000000000001b0d2>] kernel_thread_starter+0x6/0xc [<000000000001b0cc>] kernel_thread_starter+0x0/0xc } ... key at: [<0000000000e356c8>] __key.26629+0x0/0x8 Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NChristof Schmitt <christof.schmit@de.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>
-
- 17 10月, 2008 1 次提交
-
-
由 Stefan Raspl 提交于
This patch writes the channel and fabric latencies in nanoseconds per request via blktrace for later analysis. The utilization of the inbound and outbound adapter queue is also reported. Signed-off-by: NStefan Raspl <raspl@linux.vnet.ibm.com> Signed-off-by: NMartin Peschke <mp3@de.ibm.com> Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-