- 07 12月, 2009 31 次提交
-
-
由 Stefan Weinhuber 提交于
Most of the error conditions reported by a FICON storage server indicate situations which can be recovered. Sometimes the host just needs to retry an I/O request, but sometimes the recovery is more complex and requires the device driver to wait, choose a different path, etc. The DASD device driver has a fully featured error recovery for normal block layer I/O, but not for internal I/O request which are for example used during the device bring up. This can lead to situations where the IPL of a system fails because DASD devices are not properly recognized. This patch will extend the internal I/O handling to use the existing error recovery procedures. Signed-off-by: NStefan Weinhuber <wein@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Stefan Weinhuber 提交于
The DASD device driver needs to explicitly enable the prefix command on the storage server, before it can be used. Originally we enabled this command along with others only if we wanted to support PAV. However, today we require this command for other features like High Performance FICON as well, so we need to always enable prefix. Signed-off-by: NStefan Weinhuber <wein@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Christian Borntraeger 提交于
the todclk.h header file is dead code. Remove it. Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Stefan Weinhuber 提交于
When a DASD device is used with the DIAG discipline, the DIAG initialization will indicate success or error with a respective return code. So far we have interpreted a return code of 4 as error, but it actually means that the initialization was successful, but the device is read-only. To allow read-only devices to be used with DIAG we need to accept a return code of 4 as success. Re-initialization of the DIAG access is also part of the DIAG error recovery. If we find that the access mode of a device has been changed from writable to read-only while the device was in use, we print an error message. Signed-off-by: NStefan Weinhuber <wein@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Sebastian Ott 提交于
If we detect a busy subchannel after the driver's set_offline callback returned in ccw_device_set_offline, the current behavior is to unregister the device, which may lead to undesired consequences. Change this to just quiesce the subchannel and go on with the offline processing. Note: This is no excuse for not fixing these drivers - after the set_offline callback they should have no running IO! Signed-off-by: NSebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Peter Oberparleiter 提交于
Improve error recovery for internal I/Os by repeating each I/O 256 times per path to cope with long-running non-permanent error conditions. Also retry each path twice to cope with link flapping, i.e. single paths becoming unavailable in the order in which they are tried. Signed-off-by: NPeter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Sebastian Ott 提交于
IO subchannels are always unregistered in process context, so use spin_lock_irq in the corresponding remove callback. Signed-off-by: NSebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Sebastian Ott 提交于
Ensure that there will be no more interrupts for an unregistered device by using the same quiesce and disable loop as in io_subchannel_shutdown. Signed-off-by: NSebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Sebastian Ott 提交于
Try to disable the old subchannel before we ask the driver core to move the attached device to a new parent. This way we can use the QUIESCE state during shutdown which prevents a possible use after free situation in some error cases. Signed-off-by: NSebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Sebastian Ott 提交于
Handle a failing cio_disable_subchannel at the end of our device recognition as if the recognition itself failed. This way subsequent registration steps do not need to handle enabled subchannels. Signed-off-by: NSebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Sebastian Ott 提交于
DEV_STATE_QUIESCE is used to stop all IO on a busy subchannel. This patch fixes the following problems related to the QUIESCE state: * Fix a potential race condition which could occur when the resulting state was DEV_STATE_OFFLINE. * Add missing locking around cio_disable_subchannel, ccw_device_cancel_halt_clear and the cdev's handler. * Loop until we know for sure that the subchannel is disabled. Signed-off-by: NSebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Sebastian Ott 提交于
The function ccw_device_unregister has to ensure to remove all references obtained by device_add and device_initialize. Unfortunately it gets called for devices which are 1) uninitialized, 2) initialized but unregistered, and 3) registered devices. To distinguish 1) and 2) this patch introduces a new flag "initialized", which is 1 as long as we hold the initial device reference. Signed-off-by: NSebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Sebastian Ott 提交于
We used to maintain a "registered" flag in our ccw_device_private structure. This patch removes the "registered" flag and converts all users of it to device_is_registered which has the exact same meaning. Note: The usage the atomic operation test_and_clear_bit is replaced by the non-atomic if (device_is_registered()) device_del(). This will not do harm, since we serialize calls to ccw_device_unregister with a single-threaded workqueue. Signed-off-by: NSebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Peter Oberparleiter 提交于
An Unconditional Reserve + Release operation (steal lock) for a boxed device may fail when encountering special error cases (e.g. unit checks or path errors). Fix this by using the more robust ccw_request infrastructure for performing the steal lock CCW program. Signed-off-by: NPeter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Peter Oberparleiter 提交于
Set-pgid operations fail for some device types under z/VM for which the hypervisor has already set the pgid. Also reserved devices or changed pgids are not correctly recognized. Fix these problems by using a combination of sense-pgid and set-pgid and by also accepting pre-defined pgid settings. Signed-off-by: NPeter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Peter Oberparleiter 提交于
Split setting (driver wants feature enabled) and status (feature setup was successful) for PGID related ccw device features so that setup errors can be detected. Previously, incorrectly handled setup errors could in rare cases lead to erratic I/O behavior and permanently unusuable devices. Signed-off-by: NPeter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Peter Oberparleiter 提交于
After changing all internal I/O functions to use the newly introduced ccw request infrastructure, retries are handled automatically after a clear operation. Therefore remove the internal retry flag and associated code. Signed-off-by: NPeter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Peter Oberparleiter 提交于
Accept a request for setting a not-operational device offline. This way, users can remove devices from Linux which would otherwise remain unusable until reboot. Signed-off-by: NPeter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Peter Oberparleiter 提交于
Use the newly introduced ccw request infrastructure to implement pgid related operations: sense pgid, set pgid and disband pg. Signed-off-by: NPeter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Peter Oberparleiter 提交于
Use the newly introduced ccw request infrastructure to implement the sense id operation. Signed-off-by: NPeter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Peter Oberparleiter 提交于
Reduce code duplication by introducing a central infrastructure to perform an internal I/O operation on a CCW device. Signed-off-by: NPeter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Peter Oberparleiter 提交于
Remove the call to BUG() for situations which are unexpected but do not cause actual problems. Signed-off-by: NPeter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Peter Oberparleiter 提交于
Device recognition needs to be started with the ccw device lock held to prevent race conditions between I/O starting and interrupt reception. Signed-off-by: NPeter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Peter Oberparleiter 提交于
Handle verification errors consistently through the existing callback ccw_device_done to reduce cleanup code duplication. Signed-off-by: NPeter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Peter Oberparleiter 提交于
Remove the return code from ccw_device_recognition and handle recognition errors through the existing callback ccw_device_recog_done to reduce cleanup code duplication. Signed-off-by: NPeter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Peter Oberparleiter 提交于
Print a warning message in case a ccw device enters boxed or not operational state during online/offline processing. Signed-off-by: NPeter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Peter Oberparleiter 提交于
Introduce a central mechanism for performing delayed ccw device work to ensure that different types of work do not overwrite each other. Prioritization ensures that the most important work is always performed while less important tasks are either obsoleted or repeated later. Signed-off-by: NPeter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Peter Oberparleiter 提交于
Ensure that current and future users of sch->work do not overwrite each other by introducing a single mechanism for delayed subchannel work. Signed-off-by: NPeter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Peter Oberparleiter 提交于
Change the initiative to update subchannel-ccw device associations to the subchannel: when there is an indication that the internal association no longer reflects the current hardware state, mark each affected subchannel as requiring attention. Once processing reaches a subchannel, determine the correct association for that subchannel at that time and perform the necessary device_move operations. This change fixes problems with the previous approach which would leave devices in an inconsistent state when a new hardware change occurred while a device_move was already scheduled. Signed-off-by: NPeter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Peter Oberparleiter 提交于
sch_create_and_recog_new_device() associates a parent subchannel with its ccw device child even though this is already done by the subsequently called io_subchannel_recog(). Also make sure io_subchannel_recog() sets the association under lock. Signed-off-by: NPeter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Peter Oberparleiter 提交于
io_subchannel_probe() frees memory for sch->private which is later freed again when io_subchannel_remove() is called. Fix this problem by removing the cleanup in io_subchannel_probe(). Signed-off-by: NPeter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 13 11月, 2009 2 次提交
-
-
由 Martin Schwidefsky 提交于
In a system where the ctrl-alt-del init action initiated by signal quiesce suspends the machine the quiesce handler override for _machine_restart, _machine_halt and _machine_power_off needs to be undone, otherwise the override is still present in the resumed system. The next shutdown would then load the quiesce state psw instead of performing the correct shutdown action. Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Gerald Schaefer 提交于
The monreader device driver doesn't set dev->driver_data to NULL after freeing the corresponding data structure. This leads to a use after free bug in the freeze/thaw suspend/resume functions after the device has been opened and closed once. Fix this by clearing dev->driver_data in the close() function. Signed-off-by: NGerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 29 10月, 2009 4 次提交
-
-
由 Heiko Carstens 提交于
After copying uts->nodename to the static nodename array the static version isn't necessarily zero termininated, since the size of the array is one byte too short. Afterwards doing strncat(data, nodename, strlen(nodename)); may copy an arbitrary large amount of bytes. Fix this by getting rid of the static array and using strncat with proper length limit. Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Heiko Carstens 提交于
Fix missing unregister_sysctl_table in case the SCLP doesn't provide the requested feature. Also simplify the whole error handling while at it. Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Martin Schwidefsky 提交于
If a suspended z/VM guest has been logged off before the resume the 'SET SMSG IUCV' CP command need to be repeated to reenable sending message via SMSG. This fixes the following error: HCPMFS057I H4214002 not receiving; SMSG off Error: non-zero CP response for command 'SMSG H4214002 CMM SHRINK 5010': #57 Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Sebastian Ott 提交于
Fix the size of the local buffer and use snprintf to prevent further miscalculations. Also fix the usage of bitwise vs logic operations. Signed-off-by: NSebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 22 10月, 2009 3 次提交
-
-
由 Christof Schmitt 提交于
When configuring a LUN for use in zfcp, flush the SCSI work to ensure the SCSI device has been created before returning. This means that a configuration procedure can run these commands in a script and the SCSI device is available immediately after the unit_add: echo 1 > /sys/bus/ccw/drivers/zfcp/0.0.181d/online echo 0x401040C300000000 > \ /sys/bus/ccw/drivers/zfcp/0.0.181d/0x500507630313c562/unit_add lsscsi Reviewed-by: NSwen Schillig <swen@vnet.ibm.com> Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@suse.de>
-
由 Christof Schmitt 提交于
Add HZ since the start_timer function expects jiffies, not seconds. Reviewed-by: NSwen Schillig <swen@vnet.ibm.com> Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@suse.de>
-
由 Christof Schmitt 提交于
After opening a remote port zfcp checks if the WWPN returned in the PLOGI maches the WWPN of the port that should have been opened. On a mismatch zfcp assumes that the DID just changed, queries the FC nameserver and tries again. If the situation persists the erp will give up. With this strategy, if the remote port always returns the wrong PLOGI data, the remote port will not be opened. Introduce a warning, so that the system administrator knows why the remote port is not being opened and to have a pointer to investigate the problem on the storage system. Reviewed-by: NSwen Schillig <swen@vnet.ibm.com> Signed-off-by: NChristof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: NJames Bottomley <James.Bottomley@suse.de>
-