- 26 7月, 2014 5 次提交
-
-
由 Christoph Hellwig 提交于
This patch adds support for an alternate I/O path in the scsi midlayer which uses the blk-mq infrastructure instead of the legacy request code. Use of blk-mq is fully transparent to drivers, although for now a host template field is provided to opt out of blk-mq usage in case any unforseen incompatibilities arise. In general replacing the legacy request code with blk-mq is a simple and mostly mechanical transformation. The biggest exception is the new code that deals with the fact the I/O submissions in blk-mq must happen from process context, which slightly complicates the I/O completion handler. The second biggest differences is that blk-mq is build around the concept of preallocated requests that also include driver specific data, which in SCSI context means the scsi_cmnd structure. This completely avoids dynamic memory allocations for the fast path through I/O submission. Due the preallocated requests the MQ code path exclusively uses the host-wide shared tag allocator instead of a per-LUN one. This only affects drivers actually using the block layer provided tag allocator instead of their own. Unlike the old path blk-mq always provides a tag, although drivers don't have to use it. For now the blk-mq path is disable by defauly and must be enabled using the "use_blk_mq" module parameter. Once the remaining work in the block layer to make blk-mq more suitable for slow devices is complete I hope to make it the default and eventually even remove the old code path. Based on the earlier scsi-mq prototype by Nicholas Bellinger. Thanks to Bart Van Assche and Robert Elliot for testing, benchmarking and various sugestions and code contributions. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NWebb Scales <webbnh@hp.com> Acked-by: NJens Axboe <axboe@kernel.dk> Tested-by: NBart Van Assche <bvanassche@acm.org> Tested-by: NRobert Elliott <elliott@hp.com>
-
由 Christoph Hellwig 提交于
Blk-mq drivers usually preallocate their S/G list as part of the request, but if we want to support the very large S/G lists currently supported by the SCSI code that would tie up a lot of memory in the preallocated request pool. Add support to the scatterlist code so that it can initialize a S/G list that uses a preallocated first chunks and dynamically allocated additional chunks. That way the scsi-mq code can preallocate a first page worth of S/G entries as part of the request, and dynamically extend the S/G list when needed. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NWebb Scales <webbnh@hp.com> Acked-by: NJens Axboe <axboe@kernel.dk> Tested-by: NBart Van Assche <bvanassche@acm.org> Tested-by: NRobert Elliott <elliott@hp.com>
-
由 Christoph Hellwig 提交于
Replace the calls to the various blk_end_request variants with opencode equivalents. Blk-mq is using a model that gives the driver control between the bio updates and the actual completion, and making the old code follow that same model allows us to keep the code more similar for both paths. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NWebb Scales <webbnh@hp.com> Acked-by: NJens Axboe <axboe@kernel.dk> Tested-by: NBart Van Assche <bvanassche@acm.org> Tested-by: NRobert Elliott <elliott@hp.com>
-
由 Christoph Hellwig 提交于
This saves us an atomic operation for each I/O submission and completion for the usual case where the driver doesn't set a per-target can_queue value. Only a few iscsi hardware offload drivers set the per-target can_queue value at the moment. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NWebb Scales <webbnh@hp.com> Acked-by: NJens Axboe <axboe@kernel.dk> Tested-by: NBart Van Assche <bvanassche@acm.org> Tested-by: NRobert Elliott <elliott@hp.com>
-
由 Christoph Hellwig 提交于
Seems like these counters are missing any sort of synchronization for updates, as a over 10 year old comment from me noted. Fix this by using atomic counters, and while we're at it also make sure they are in the same cacheline as the _busy counters and not needlessly stored to in every I/O completion. With the new model the _busy counters can temporarily go negative, so all the readers are updated to check for > 0 values. Longer term every successful I/O completion will reset the counters to zero, so the temporarily negative values will not cause any harm. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NWebb Scales <webbnh@hp.com> Acked-by: NJens Axboe <axboe@kernel.dk> Tested-by: NBart Van Assche <bvanassche@acm.org> Tested-by: NRobert Elliott <elliott@hp.com>
-
- 25 7月, 2014 8 次提交
-
-
由 Christoph Hellwig 提交于
Avoid taking the queue_lock to check the per-device queue limit. Instead we do an atomic_inc_return early on to grab our slot in the queue, and if necessary decrement it after finishing all checks. Unlike the host and target busy counters this doesn't allow us to avoid the queue_lock in the request_fn due to the way the interface works, but it'll allow us to prepare for using the blk-mq code, which doesn't use the queue_lock at all, and it at least avoids a queue_lock round trip in scsi_device_unbusy, which is still important given how busy the queue_lock is. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NWebb Scales <webbnh@hp.com> Acked-by: NJens Axboe <axboe@kernel.dk> Tested-by: NBart Van Assche <bvanassche@acm.org> Tested-by: NRobert Elliott <elliott@hp.com>
-
由 Christoph Hellwig 提交于
Avoid taking the host-wide host_lock to check the per-host queue limit. Instead we do an atomic_inc_return early on to grab our slot in the queue, and if necessary decrement it after finishing all checks. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NWebb Scales <webbnh@hp.com> Acked-by: NJens Axboe <axboe@kernel.dk> Tested-by: NBart Van Assche <bvanassche@acm.org> Tested-by: NRobert Elliott <elliott@hp.com>
-
由 Christoph Hellwig 提交于
Avoid taking the host-wide host_lock to check the per-target queue limit. Instead we do an atomic_inc_return early on to grab our slot in the queue, and if necessary decrement it after finishing all checks. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NWebb Scales <webbnh@hp.com> Acked-by: NJens Axboe <axboe@kernel.dk> Tested-by: NBart Van Assche <bvanassche@acm.org> Tested-by: NRobert Elliott <elliott@hp.com>
-
由 Christoph Hellwig 提交于
Prepare for not taking a host-wide lock in the dispatch path by pushing the lock down into the places that actually need it. Note that this patch is just a preparation step, as it will actually increase lock roundtrips and thus decrease performance on its own. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NWebb Scales <webbnh@hp.com> Acked-by: NJens Axboe <axboe@kernel.dk> Tested-by: NBart Van Assche <bvanassche@acm.org> Tested-by: NRobert Elliott <elliott@hp.com>
-
由 Christoph Hellwig 提交于
The blk-mq code path will set this to a different function, so make the code simpler by setting it up in a legacy-request specific place. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NWebb Scales <webbnh@hp.com> Acked-by: NJens Axboe <axboe@kernel.dk> Tested-by: NBart Van Assche <bvanassche@acm.org> Tested-by: NRobert Elliott <elliott@hp.com>
-
由 Christoph Hellwig 提交于
Make sure we only have the logic for requeing commands in one place. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NWebb Scales <webbnh@hp.com> Acked-by: NJens Axboe <axboe@kernel.dk> Tested-by: NBart Van Assche <bvanassche@acm.org> Tested-by: NRobert Elliott <elliott@hp.com>
-
由 Christoph Hellwig 提交于
Factor out a helper to set the _blocked values, which we'll reuse for the blk-mq code path. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NWebb Scales <webbnh@hp.com> Acked-by: NJens Axboe <axboe@kernel.dk> Tested-by: NBart Van Assche <bvanassche@acm.org> Tested-by: NRobert Elliott <elliott@hp.com>
-
由 Christoph Hellwig 提交于
Factor out command setup code that will be shared with the blk-mq code path. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NWebb Scales <webbnh@hp.com>
-
- 18 7月, 2014 8 次提交
-
-
由 Christoph Hellwig 提交于
Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NHannes Reinecke <hare@suse.de>
-
由 Christoph Hellwig 提交于
The data direction fiel in the SCSI command is derived only from the block request structure. Move setting it up into common code instead of duplicating it in the ULDs. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NHannes Reinecke <hare@suse.de>
-
由 Christoph Hellwig 提交于
We should call the device handler prep_fn for all TYPE_FS requests, not just simple read/write calls that are handled by the disk driver. Restructure the common I/O code to call the prep_fn handler and zero out the CDB, and just leave the call to scsi_init_io to the ULDs. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NHannes Reinecke <hare@suse.de>
-
由 Christoph Hellwig 提交于
scsi_init_io should only be called for requests that transfer data, so move the assert that a request has segments from the callers into scsi_init_io. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NHannes Reinecke <hare@suse.de>
-
由 Maurizio Lombardi 提交于
During IO with fabric faults, one generally sees several "Unhandled error code" messages in the syslog as shown below: sd 4:0:6:2: [sdbw] Unhandled error code sd 4:0:6:2: [sdbw] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK sd 4:0:6:2: [sdbw] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00 end_request: I/O error, dev sdbw, sector 0 This comes from scsi_io_completion (in scsi_lib.c) while handling error codes other than DID_RESET or not deferred sense keys i.e. this is actually handled by the SCSI mid layer. But what gets displayed here is "Unhandled error code" which is quite misleading as it indicates something that is not addressed by the mid layer. The description string is based on the sense key and sometimes on the additional sense code; since the ACTION_FAIL case always prints the sense key and the additional sense code, this patch removes the description string completely because it does not add useful information. Signed-off-by: NMaurizio Lombardi <mlombard@redhat.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Christoph Hellwig 提交于
Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Reviewed-by: NHannes Reinecke <hare@suse.de>
-
由 Hannes Reinecke 提交于
Using dev_printk variants prefixes the logging message with the originating device, which makes debugging easier. Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 James Bottomley 提交于
Flush commands don't transfer data and thus need to be special cased in the I/O completion handler so that we can propagate errors to the block layer and filesystem. Signed-off-by: NJames Bottomley <JBottomley@Parallels.com> Reported-by: NSteven Haber <steven@qumulo.com> Tested-by: NSteven Haber <steven@qumulo.com> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 06 6月, 2014 1 次提交
-
-
由 Jens Axboe 提交于
With the optimizations around not clearing the full request at alloc time, we are leaving some of the needed init for REQ_TYPE_BLOCK_PC up to the user allocating the request. Add a blk_rq_set_block_pc() that sets the command type to REQ_TYPE_BLOCK_PC, and properly initializes the members associated with this type of request. Update callers to use this function instead of manipulating rq->cmd_type directly. Includes fixes from Christoph Hellwig <hch@lst.de> for my half-assed attempt. Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 19 5月, 2014 3 次提交
-
-
由 Christoph Hellwig 提交于
Instead of letting the ULD play games with the prep_fn move back to the model of a central prep_fn with a callback to the ULD. This already cleans up and shortens the code by itself, and will be required to properly support blk-mq in the SCSI midlayer. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NNicholas Bellinger <nab@linux-iscsi.org> Reviewed-by: NMike Christie <michaelc@cs.wisc.edu> Reviewed-by: NHannes Reinecke <hare@suse.de>
-
由 Christoph Hellwig 提交于
By folding scsi_end_request into its only caller we can significantly clean up the completion logic. We can use simple goto labels now to only have a single place to finish or requeue command there instead of the previous convoluted logic. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NNicholas Bellinger <nab@linux-iscsi.org> Reviewed-by: NMike Christie <michaelc@cs.wisc.edu> Reviewed-by: NHannes Reinecke <hare@suse.de>
-
由 Christoph Hellwig 提交于
Instead of trying to guess when we have a BIDI buffer in scsi_release_buffers add a function to explicitly free the BIDI ressoures in the one place that handles them. This avoids needing a special __scsi_release_buffers for the case where we already have freed the request as well. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMike Christie <michaelc@cs.wisc.edu>
-
- 22 4月, 2014 1 次提交
-
-
由 Alan Stern 提交于
We're seeing a case where the contents of scmd->result isn't being reset after a SCSI command encounters an error, is resubmitted, times out and then gets handled. The error handler acts on the stale result of the previous error instead of the timeout. Fix this by properly zeroing the scmd->status before the command is resubmitted. Signed-off-by: NAlan Stern <stern@rowland.harvard.edu> Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
-
- 21 4月, 2014 2 次提交
-
-
由 Christoph Hellwig 提交于
Patch commit 04796336 Author: Christoph Hellwig <hch@infradead.org> Date: Thu Feb 20 14:20:55 2014 -0800 [SCSI] do not manipulate device reference counts in scsi_get/put_command Introduced a use after free:I in the kill case of scsi_prep_return we have to release our device reference, but we do this trying to reference the just freed command. Use the local sdev pointer instead. Fixes: 04796336Reported-by: NJoe Lawrence <joe.lawrence@stratus.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
-
由 Christoph Hellwig 提交于
Patch commit 04796336 Author: Christoph Hellwig <hch@infradead.org> Date: Thu Feb 20 14:20:55 2014 -0800 [SCSI] do not manipulate device reference counts in scsi_get/put_command Introduced a use after free: when scsi_init_io fails we have to release our device reference, but we do this trying to reference the just freed command. Add a local scsi_device pointer to fix this. Fixes: 04796336Reported-by: NSander Eikelenboom <linux@eikelenboom.it> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
-
- 16 4月, 2014 1 次提交
-
-
由 Jens Axboe 提交于
This was used in the olden days, back when onions were proper yellow. Basically it mapped to the current buffer to be transferred. With highmem being added more than a decade ago, most drivers map pages out of a bio, and rq->buffer isn't pointing at anything valid. Convert old style drivers to just use bio_data(). For the discard payload use case, just reference the page in the bio. Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 10 4月, 2014 2 次提交
-
-
由 Martin K. Petersen 提交于
cmd_flags in struct request is now 64 bits wide but the scsi_execute functions truncated arguments passed to int leading to errors. Make sure the flags parameters are u64. Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Cc: Jens Axboe <axboe@fb.com> CC: Jan Kara <jack@suse.cz> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Jens Axboe 提交于
The queue parameter is never used, just get rid of it. Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 16 3月, 2014 6 次提交
-
-
由 Christoph Hellwig 提交于
Avoid a spurious device get/put pair by cleaning up scsi_requeue_command and folding scsi_unprep_request into it. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NHannes Reinecke <hare@suse.de> Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
-
由 Bart Van Assche 提交于
Eliminate a get_device() / put_device() pair from scsi_next_command(). Both are atomic operations hence removing these slightly improves performance. [hch: slight changes due to different context] Signed-off-by: NBart Van Assche <bvanassche@acm.org> Reviewed-by: NHannes Reinecke <hare@suse.de> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
-
由 Bart Van Assche 提交于
SCSI devices may only be removed by calling scsi_remove_device(). That function must invoke blk_cleanup_queue() before the final put of sdev->sdev_gendev. Since blk_cleanup_queue() waits for the block queue to drain and then tears it down, scsi_request_fn cannot be active anymore after blk_cleanup_queue() has returned and hence the get_device()/put_device() pair in scsi_request_fn is unnecessary. Signed-off-by: NBart Van Assche <bvanassche@acm.org> Reviewed-by: NTejun Heo <tj@kernel.org> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NMike Christie <michaelc@cs.wisc.edu> Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
-
由 Christoph Hellwig 提交于
Many callers won't need this and we can optimize them away. In addition the handling in the __-prefixed variants was inconsistant to start with. Based on an earlier patch from Bart Van Assche. [jejb: fix kerneldoc probelm picked up by Fengguang Wu] Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NHannes Reinecke <hare@suse.de> Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
-
由 Christoph Hellwig 提交于
If we don't have starved devices we don't need to take the host lock to iterate over them. Also split the function up to be more clear. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NHannes Reinecke <hare@suse.de> Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
-
由 Eiichi Tsukata 提交于
Currently, scsi error handling in scsi_io_completion() tries to unconditionally requeue scsi command when device keeps some error state. For example, UNIT_ATTENTION causes infinite retry with action == ACTION_RETRY. This is because retryable errors are thought to be temporary and the scsi device will soon recover from those errors. Normally, such retry policy is appropriate because the device will soon recover from temporary error state. But there is no guarantee that device is able to recover from error state immediately. Some hardware error can prevent device from recovering. This patch adds timeout in scsi_io_completion() to avoid infinite command retry in scsi_io_completion(). Once scsi command retry time is longer than this timeout, the command is treated as failure. Signed-off-by: NEiichi Tsukata <eiichi.tsukata.xh@hitachi.com> Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
-
- 18 2月, 2014 1 次提交
-
-
由 Russell King 提交于
We must use a 64-bit for this, otherwise overflowed bits get lost, and that can result in a lower than intended value set. Fixes: 8e0cb8a1 ("ARM: 7797/1: mmc: Use dma_max_pfn(dev) helper for bounce_limit calculations") Fixes: 7d35496d ("ARM: 7796/1: scsi: Use dma_max_pfn(dev) helper for bounce_limit calculations") Tested-Acked-by: NSantosh Shilimkar <santosh.shilimkar@ti.com> Reviewed-by: NUlf Hansson <ulf.hansson@linaro.org> Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
-
- 31 10月, 2013 1 次提交
-
-
由 Santosh Shilimkar 提交于
DMA bounce limit is the maximum direct DMA'able memory beyond which bounce buffers has to be used to perform dma operations. SCSI driver relies on dma_mask but its calculation is based on max_*pfn which don't have uniform meaning across architectures. So make use of dma_max_pfn() which is expected to return the DMAable maximum pfn value across architectures. Cc: linux-scsi@vger.kernel.org Signed-off-by: NSantosh Shilimkar <santosh.shilimkar@ti.com> Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
-
- 26 8月, 2013 1 次提交
-
-
由 Ewan D. Milne 提交于
Generate a uevent when the following Unit Attention ASC/ASCQ codes are received: 2A/01 MODE PARAMETERS CHANGED 2A/09 CAPACITY DATA HAS CHANGED 38/07 THIN PROVISIONING SOFT THRESHOLD REACHED 3F/03 INQUIRY DATA HAS CHANGED 3F/0E REPORTED LUNS DATA HAS CHANGED Log kernel messages when the following Unit Attention ASC/ASCQ codes are received that are not as specific as those above: 2A/xx PARAMETERS CHANGED 3F/xx TARGET OPERATING CONDITIONS HAVE CHANGED Added logic to set expecting_lun_change for other LUNs on the target after REPORTED LUNS DATA HAS CHANGED is received, so that duplicate uevents are not generated, and clear expecting_lun_change when a REPORT LUNS command completes, in accordance with the SPC-3 specification regarding reporting of the 3F 0E ASC/ASCQ UA. [jejb: remove SPC3 test in scsi_report_lun_change and some docbook fixes and unused variable fix, both reported by Fengguang Wu] Signed-off-by: NEwan D. Milne <emilne@redhat.com> Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
-