- 13 9月, 2017 1 次提交
-
-
由 Stefan Haberland 提交于
Fix a panic in blk_mq_hctx_has_pending() that is caused by a racy call to blk_mq_run_hw_queues in a dasd function that might get called with the request queue not yet initialized during initialization. Signed-off-by: NStefan Haberland <sth@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 08 9月, 2017 1 次提交
-
-
由 Stefan Haberland 提交于
Use new blk-mq interfaces. Use multiple queues and also use the block layer complete helper that finish the IO on the CPU that initiated it. Reviewed-by: NJan Hoeppner <hoeppner@linux.vnet.ibm.com> Signed-off-by: NStefan Haberland <sth@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 29 8月, 2017 1 次提交
-
-
由 Jan Höppner 提交于
The z/VM hypervisor provides virtual disks (VDISK) which are backed by main memory of the hypervisor. Those devices are seen as DASD FBA disks within the Linux guest. Whenever data is written to such a device, memory is allocated on-the-fly by z/VM accordingly. This memory, however, is not being freed if data on the device is deleted by the guest OS. In order to make memory usable after deletion again, add discard support to the FBA discipline. While at it, update comments regarding the DASD_FEATURE_* flags. Reviewed-by: NStefan Haberland <sth@linux.vnet.ibm.com> Signed-off-by: NJan Höppner <hoeppner@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 23 8月, 2017 2 次提交
-
-
由 Jan Höppner 提交于
Unsigned long long and unsigned long were different in size for 31-bit. For 64-bit the size for both datatypes is 8 Bytes and since the support for 31-bit is long gone we can clean up a little and change everything to unsigned long. Change get_phys_clock() along the way to accept unsigned long as well so that the DASD code can be consistent. Acked-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NJan Höppner <hoeppner@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Stefan Haberland 提交于
Add average times to the DASD statistics interface. Signed-off-by: NStefan Haberland <sth@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 05 7月, 2017 1 次提交
-
-
由 Sebastian Ott 提交于
Fix these set but not used warnings: drivers/s390/block/dasd.c:3933:6: warning: variable 'rc' set but not used [-Wunused-but-set-variable] drivers/s390/block/dasd_alias.c:757:6: warning: variable 'rc' set but not used [-Wunused-but-set-variable] In addition to that remove the test if an unsigned is < 0: drivers/s390/block/dasd_devmap.c:153:11: warning: comparison of unsigned expression < 0 is always false [-Wtype-limits] Signed-off-by: NSebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 12 6月, 2017 2 次提交
-
-
由 Stefan Haberland 提交于
The safe offline processing may hang forever because it waits for I/O which can not be started because of the offline flag that prevents new I/O from being started. Allow I/O to be started during safe offline processing because in this special case we take care that the queues are empty before throwing away the device. Signed-off-by: NStefan Haberland <sth@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Stefan Haberland 提交于
The safe offline processing needs, as well as the normal offline processing, to be locked against multiple parallel executions. But it should be able to be overtaken by a normal offline processing to make sure that the device does not wait forever for outstanding I/O if the user wants to. Unfortunately the parallel processing of safe offline and normal offline might lead to a race situation where both threads report successful execution to the CIO layer which in turn tries to deregister the kobject of the device twice. This leads to a refcount_t: underflow; use-after-free. error and the device is not able to be set online again afterwards without a reboot. Correct the locking of the safe offline processing by doing the following: - Use the cdev lock to secure all set and test operations to the device flags. - Two safe offline processes are locked against each other using the DASD_FLAG_SAFE_OFFLINE and DASD_FLAG_SAFE_OFFLINE_RUNNING device flags. The differentiation between offline triggered and offline running is needed since the normal offline attribute is owned by CIO and we have to pass over control in between. - The dasd_generic_set_offline process handles the offline processing. It is locked against parallel execution using the DASD_FLAG_OFFLINE. - Only a running safe offline should be able to be overtaken by a single normal offline. This is ensured by clearing the DASD_FLAG_SAFE_OFFLINE_RUNNING flag when a normal offline overtakes. So this can only happen ones. - The safe offline just aborts in this case doing nothing and the normal offline processing finishes as usual. Signed-off-by: NStefan Haberland <sth@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 09 6月, 2017 1 次提交
-
-
由 Christoph Hellwig 提交于
Currently we use nornal Linux errno values in the block layer, and while we accept any error a few have overloaded magic meanings. This patch instead introduces a new blk_status_t value that holds block layer specific status codes and explicitly explains their meaning. Helpers to convert from and to the previous special meanings are provided for now, but I suspect we want to get rid of them in the long run - those drivers that have a errno input (e.g. networking) usually get errnos that don't know about the special block layer overloads, and similarly returning them to userspace will usually return somethings that strictly speaking isn't correct for file system operations, but that's left as an exercise for later. For now the set of errors is a very limited set that closely corresponds to the previous overloaded errno values, but there is some low hanging fruite to improve it. blk_status_t (ab)uses the sparse __bitwise annotations to allow for sparse typechecking, so that we can easily catch places passing the wrong values. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 31 1月, 2017 2 次提交
-
-
由 Stefan Haberland 提交于
If safe offline is called for a DASD alias device a null pointer is passed to fsync_bdev. So check for existence of the blockdevice before calling fsync_bdev. Should not be a real world problem since safe offline for an alias device does not make sense and fsync_bdev can deal with a NULL pointer which it gets after successful NULL pointer dereferencing on s390. Signed-off-by: NStefan Haberland <sth@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Stefan Haberland 提交于
Check if the device pointer is valid. Just a sanity check since we already are in the int handler of the device. Signed-off-by: NStefan Haberland <sth@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 12 12月, 2016 2 次提交
-
-
由 Stefan Haberland 提交于
With this feature, the DASD device driver more robustly handles DASDs that are attached via multiple channel paths and are subject to constant Interface-Control-Checks (IFCCs) and Channel-Control-Checks (CCCs) or loss of High-Performance-FICON (HPF) functionality on one or more of these paths. If a channel path does not work correctly, it is removed from normal operation as long as other channel paths are available. All extended error recovery states can be queried and reset via user space interfaces. Signed-off-by: NStefan Haberland <sth@linux.vnet.ibm.com> Reviewed-by: NSebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: NJan Hoeppner <hoeppner@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Stefan Haberland 提交于
Store flags and path_data per channel path. Implement get/set functions for various path masks. The patch does not add functional changes. Signed-off-by: NStefan Haberland <sth@linux.vnet.ibm.com> Reviewed-by: NSebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: NJan Hoeppner <hoeppner@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 28 10月, 2016 3 次提交
-
-
由 Jan Höppner 提交于
Before we set a device offline, the open_count for the block device is checked and certain flags are checked and set as well. However, this is all done without holding any lock. Potentially, if the open_count was checked but the DASD_FLAG_OFFLINE wasn't set yet, a different process might want to increase the open_count depending on whether DASD_FLAG_OFFLINE is set or not in the meanwhile. This is quite racy and can lead to the loss of the device for that process and subsequently lead to a panic. Fix this by checking the open_count and setting the offline flags while holding the ccwdev lock. Reviewed-by: NStefan Haberland <sth@linux.vnet.ibm.com> Signed-off-by: NJan Höppner <hoeppner@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Jan Höppner 提交于
block->request_queue is used many times in dasd_setup_queue. Define a separate variable to increase readability a bit and to make it better reusable. Reviewed-by: NStefan Haberland <sth@linux.vnet.ibm.com> Signed-off-by: NJan Höppner <hoeppner@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Jan Höppner 提交于
Currently the block queue value max_segments is set to -1L, which is then implicitly casted to unsigned short in blk_queue_max_segments. This results in 65535 (64k) max_segments. Even though the resulting value is correct, setting it implicitly using -1L is rather confusing. Set the value explicitly using the USHRT_MAX macro instead. Reviewed-by: NStefan Haberland <sth@linux.vnet.ibm.com> Signed-off-by: NJan Höppner <hoeppner@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 26 9月, 2016 2 次提交
-
-
由 Stefan Haberland 提交于
A DASD device consists of the device itself and a discipline with a corresponding private structure. These fields are set up during online processing right after the device is created and before it is processed by the state machine and made available for I/O. During offline processing the discipline pointer and the private data gets freed within the state machine and without protection of the existing reference count. This might lead to a kernel panic because a function might have taken a device reference and accesses the discipline pointer and/or private data of the device while this is already freed. Fix by freeing the discipline pointer and the private data after ensuring that there is no reference to the device left. Reviewed-by: NPeter Oberparleiter <oberpar@linux.vnet.ibm.com> Signed-off-by: NStefan Haberland <sth@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Stefan Haberland 提交于
Internal I/O is processed by the _sleep_on_function which might wait for a device to get operational. During offline processing this will never happen and therefore the refcount of the device will not drop to zero and the offline processing blocks as well. Fix by letting requests fail in the _sleep_on function during offline processing. No further handling of the requests is necessary since this is internal I/O and the device is thrown away afterwards. Reviewed-by: NPeter Oberparleiter <oberpar@linux.vnet.ibm.com> Signed-off-by: NStefan Haberland <sth@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 20 9月, 2016 1 次提交
-
-
由 Stefan Haberland 提交于
The DASD device driver throws change events for the DASD blockdevice after the online processing is done so that udev rules can take actions after it. The change event was missing for unformatted devices. Signed-off-by: NStefan Haberland <sth@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 10 8月, 2016 1 次提交
-
-
由 Stefan Haberland 提交于
When a device is in a status where CIO has killed all I/O by itself the interrupt for a clear request may not contain an irb to determine the clear function. Instead it contains an error pointer -EIO. This was ignored by the DASD int_handler leading to a hanging device waiting for a clear interrupt. Handle -EIO error pointer correctly for requests that are clear pending and treat the clear as successful. Signed-off-by: NStefan Haberland <sth@linux.vnet.ibm.com> Reviewed-by: NSebastian Ott <sebott@linux.vnet.ibm.com> Cc: stable@vger.kernel.org Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 16 4月, 2016 2 次提交
-
-
由 Jan Höppner 提交于
Implement new DASD IOCTL BIODASDCHECKFMT to check a range of tracks on a DASD volume for correct formatting. The following characteristics are checked: - Block size - ECKD key length - ECKD record ID - Number of records per track Signed-off-by: NJan Höppner <hoeppner@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Stefan Haberland 提交于
With this feature, applications can query if a DASD volume is online to another operating system instances by checking the online status of all attached hosts from the storage server. Reviewed-by: NSebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NStefan Haberland <stefan.haberland@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 17 2月, 2016 1 次提交
-
-
由 Stefan Haberland 提交于
Commit ca369d51 ("sd: Fix device-imposed transfer length limits") introduced a new queue limit max_dev_sectors which limits the maximum sectors for requests. The default value leads to small dasd requests and therefor to a performance drop. Set the max_dev_sectors value to the same value as the max_hw_sectors to use the maximum available request size for DASD devices. Signed-off-by: NStefan Haberland <sth@linux.vnet.ibm.com> Cc: stable@vger.kernel.org # 4.4+ Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 30 12月, 2015 1 次提交
-
-
由 Stefan Haberland 提交于
Enabling failfast should let request fail immediately if either an error occurred or the device gets disconnected. For disconnected devices new requests are not fetches from the block queue and therefore failfast is not triggered. Fix by letting the DASD driver fetch requests for disconnected devices with failfast active. Signed-off-by: NStefan Haberland <stefan.haberland@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 14 10月, 2015 1 次提交
-
-
由 Christian Borntraeger 提交于
We were able to reduce the CPU overhead of big paging scenarios when announcing our paging disks as non-rotational. Almost all dasd devices are implemented in storage servers with cache, raid, striping and lots of magic. There is no point in optimizing the disk schedulers and swap code for a single platter moving arm rotational disks. Given the complexity of the setup and the fact that this change is mostly to disable the additional overhead in swap code, lets keep the other functionality unchanged and do not disable the this device as entropy source - unlike other non-rotational devices. Suggested-by: NChristian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Reviewed-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 13 7月, 2015 1 次提交
-
-
由 Stefan Haberland 提交于
The dasd device driver selects which (alias or base) device is used for a given requests when the request is build. If the chosen alias device is set offline before the request gets queued to the device queue the starting function may use device structures that are already freed. This might lead to a hanging offline process or a kernel panic. Add a check to the starting function that returns the request to the upper layer if the device is already in offline processing. In addition to that prevent that an alias device that's already in offline processing gets chosen as start device. Reviewed-by: NSebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: NPeter Oberparleiter <peter.oberparleiter@linux.vnet.ibm.com> Signed-off-by: NStefan Haberland <stefan.haberland@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 13 5月, 2015 1 次提交
-
-
由 Peter Oberparleiter 提交于
Enabling a DASD that was configured to use the DIAG250 access method while the corresponding kernel module dasd_diag_mod has not been loaded fails with an error message. To fix this, users need to manually load the dasd_diag_mod module. This procedure can be simplified by automatically loading the dasd_diag_mod from within the kernel when a DASD configured for DIAG250 is set online. Signed-off-by: NPeter Oberparleiter <oberpar@linux.vnet.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 15 4月, 2015 3 次提交
-
-
由 Stefan Haberland 提交于
The DASD device driver prevents I/O from being started on stopped devices. This also prevented channel paths to be verified and so the device was unable to be resumed. Fix by allowing path verification requests on stopped devices. Signed-off-by: NStefan Haberland <stefan.haberland@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Stefan Haberland 提交于
The DASD device driver only has a limited amount of memory to build I/O requests. This memory was used by blocklayer requests leading to an inability to build needed internal requests to resume the device. Fix by preventing the DASD driver to fetch requests for a stopped device. Signed-off-by: NStefan Haberland <stefan.haberland@de.ibm.com> Reference-ID: RQM 2520 Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Stefan Haberland 提交于
Fix ref counting for DASD devices leading to an inability to set a DASD device offline. Before a worker is scheduled the DASD device driver takes a reference to the device. If the worker was already scheduled this reference was never freed. Fix by giving the reference to the DASD device free when schedule_work() returns false. Signed-off-by: NStefan Haberland <stefan.haberland@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 25 3月, 2015 2 次提交
-
-
由 Stefan Haberland 提交于
Remove the hard coded scheduler for the DASD device driver to enable change of the scheduler during runtime. Set recommended deadline scheduler via additional udev rule. Signed-off-by: NStefan Haberland <stefan.haberland@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Heiko Carstens 提交于
Remove the 31 bit support in order to reduce maintenance cost and effectively remove dead code. Since a couple of years there is no distribution left that comes with a 31 bit kernel. The 31 bit kernel also has been broken since more than a year before anybody noticed. In addition I added a removal warning to the kernel shown at ipl for 5 minutes: a960062e ("s390: add 31 bit warning message") which let everybody know about the plan to remove 31 bit code. We didn't get any response. Given that the last 31 bit only machine was introduced in 1999 let's remove the code. Anybody with 31 bit user space code can still use the compat mode. Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 29 1月, 2015 2 次提交
-
-
由 Sebastian Ott 提交于
The dasd driver has a lot of duplicated code to handle dasd_global_profile. With this patch we use the same code for the global and the per device profiling data. Note that dasd_stats_write had to change slightly to maintain some odd differences between A) per device and global profile and B) proc and sysfs interface usage. Signed-off-by: NSebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: NStefan Haberland <stefan.haberland@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Sebastian Ott 提交于
Access to DASDs global statistics is done without locking which can lead to inconsistent data. Add locking to fix this. Also move the relevant structs in a global dasd_profile struct. Signed-off-by: NSebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: NStefan Haberland <stefan.haberland@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 28 11月, 2014 3 次提交
-
-
由 Stefan Haberland 提交于
Fix race for sleep_on requests leading to list corruption. The SLEEP_ON_END_TAG is set during CQR clean up. Remove it from interrupt handler to avoid the CQR from being cleared when it is still in the device_queue. Signed-off-by: NStefan Haberland <stefan.haberland@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Stefan Haberland 提交于
During device activation all paths could be lost and since the device is not active it has no indication of this fact - hence the CQR will time-out. The following cancelation might fail with -EINVAL because CIO took over control and started path verification. In this case mark the CQR as being CLEARED since it could not be running any more. Signed-off-by: NStefan Haberland <stefan.haberland@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Stefan Haberland 提交于
Signed-off-by: NStefan Haberland <stefan.haberland@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 09 10月, 2014 2 次提交
-
-
由 Stefan Haberland 提交于
Add support for Control Unit Initiated Reconfiguration (CUIR) to Linux, a storage server interface to reconcile concurrent hardware changes between storage and host. Reviewed-by: NStefan Weinhuber <wein@de.ibm.com> Signed-off-by: NStefan Haberland <stefan.haberland@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Stefan Haberland 提交于
Error recovery requests may not be cleaned up correctly so that other needed erp requests can not be build because of insufficient memory. This would lead to an infinite loop trying to build erp requests. Signed-off-by: NStefan Haberland <stefan.haberland@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 22 7月, 2014 1 次提交
-
-
由 Stefan Haberland 提交于
Kernel panic or a hanging device during format if an alias device is set offline or I/O errors occur. Omit the error recovery procedure for alias devices and do retries on the base device with full erp. Signed-off-by: NStefan Haberland <stefan.haberland@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-