- 11 6月, 2015 3 次提交
-
-
由 Geoff Levand 提交于
Add myself as co-maintainer of the ps3vram block driver, and add linuxppc-dev as a relevant mailing list. I have been acting as maintainer of this driver for the last several years, and if there is some inquiry regarding it I would like to be notified. Signed-off-by: NGeoff Levand <geoff@infradead.org> Acked-by: NJim Paris <jim@jtan.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Geert Uytterhoeven 提交于
The ps3vram driver is a plain block device driver since commit f507cd22 ("ps3/block: Replace mtd/ps3vram by block/ps3vram"). Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: NGeoff Levand <geoff@infradead.org> Acked-by: NJim Paris <jim@jtan.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Geoff Levand 提交于
Fix sparse warnings like these: drivers/block/ps3vram.c: warning: incorrect type in assignment (different address spaces) drivers/block/ps3vram.c: expected unsigned int [usertype] *ctrl drivers/block/ps3vram.c: got void [noderef] <asn:2>* Cc: Jim Paris <jim@jtan.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: NGeoff Levand <geoff@infradead.org> Acked-by: NJim Paris <jim@jtan.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 06 6月, 2015 5 次提交
-
-
由 Keith Busch 提交于
Namespaces may be dynamically allocated and deleted or attached and detached. This has the driver rescan the device for namespace changes after each device reset or namespace change asynchronous event. There could potentially be many detached namespaces that we don't want polluting /dev/ with unusable block handles, so this will delete disks if the namespace is not active as indicated by the response from identify namespace. This also skips adding the disk if no capacity is provisioned to the namespace in the first place. Signed-off-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Jens Axboe 提交于
-
由 Jens Axboe 提交于
We export this function and NVMe wants to use it, but for some reason it was never added to the block header. Do that. Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Jon Derrick 提交于
Protects against reordering and/or preempting which would allow the kthread to access the queue descriptor before it is set up Signed-off-by: NJon Derrick <jonathan.derrick@intel.com> Acked-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Keith Busch 提交于
We need the ability to perform an nvme controller reset as discussed on the mailing list thread: http://lists.infradead.org/pipermail/linux-nvme/2015-March/001585.html This adds a sysfs entry that when written to will reset perform an NVMe controller reset if the controller was successfully initialized in the first place. This also adds locking around resetting the device in the async probe method so the driver can't schedule two resets. Signed-off-by: NKeith Busch <keith.busch@intel.com> Cc: Brandon Schultz <brandon.schulz@hgst.com> Cc: David Sariel <david.sariel@pmcs.com> Updated by Jens to: 1) Merge this with the ioctl reset patch from David Sariel. The ioctl path now shares the reset code from the sysfs path. 2) Don't flush work if we fail issuing the reset. Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 02 6月, 2015 5 次提交
-
-
由 Akinobu Mita 提交于
When irqmode=2 (IRQ completion handler is timer) and queue_mode=1 (Block interface to use is rq), the completion handler should restart request handling for any pending requests on a queue because request processing stops when the number of commands are queued more than hw_queue_depth (null_rq_prep_fn returns BLKPREP_DEFER). Without this change, the following command cannot finish. # modprobe null_blk irqmode=2 queue_mode=1 hw_queue_depth=1 # fio --name=t --rw=read --size=1g --direct=1 \ --ioengine=libaio --iodepth=64 --filename=/dev/nullb0 Signed-off-by: NAkinobu Mita <akinobu.mita@gmail.com> Cc: Jens Axboe <axboe@fb.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Akinobu Mita 提交于
When irqmode=2 (IRQ completion handler is timer), timer handler should be called on the same CPU where the timer has been started. Since completion_queues are per-cpu and the completion handler only touches completion_queue for local CPU, we need to prevent the handler from running on a different CPU where the timer has been started. Otherwise, the IO cannot be completed until another completion handler is executed on that CPU. Signed-off-by: NAkinobu Mita <akinobu.mita@gmail.com> Cc: Jens Axboe <axboe@fb.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Keith Busch 提交于
The driver needs to track shared tags to support multiple namespaces that may be dynamically allocated or deleted. Relying on the first request_queue's hctx's is not appropriate as we cannot clear outstanding tags for all namespaces using this handle, nor can the driver easily track all request_queue's hctx as namespaces are attached/detached. Instead, this patch uses the nvme_dev's tagset to get the shared tag resources instead of through a request_queue hctx. Signed-off-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Jens Axboe 提交于
-
由 Keith Busch 提交于
Storage controllers may expose multiple block devices that share hardware resources managed by blk-mq. This patch enhances the shared tags so a low-level driver can access the shared resources not tied to the unshared h/w contexts. This way the LLD can dynamically add and delete disks and request queues without having to track all the request_queue hctx's to iterate outstanding tags. Signed-off-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 30 5月, 2015 4 次提交
-
-
由 Jens Axboe 提交于
We don't need to honor chunk sizes for IO that doesn't carry any data. Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Jens Axboe 提交于
We can safely merge anything that wont generate an SG list entry, so if the bio is data-less (discard), don't look at potential SG gaps. Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Keith Busch 提交于
Do not retry failed sync commands so the original status may be seen without issuing unnecessary retries. Signed-off-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Keith Busch 提交于
Signed-off-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 27 5月, 2015 1 次提交
-
-
由 Julia Lawall 提交于
Remove unneeded variable used to store return value. Generated by: scripts/coccinelle/misc/returnvar.cocci Signed-off-by: NFengguang Wu <fengguang.wu@intel.com> Signed-off-by: NJulia Lawall <julia.lawall@lip6.fr> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 23 5月, 2015 1 次提交
-
-
由 Keith Busch 提交于
Replaces req->sense_len usage, which is not owned by the LLD, to req->special to contain the command result for driver created commands, and sets the result unconditionally on completion. Signed-off-by: NKeith Busch <keith.busch@intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jens Axboe <axboe@fb.com> Fixes: d29ec824 ("nvme: submit internal commands through the block layer") Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 22 5月, 2015 11 次提交
-
-
由 Christoph Hellwig 提交于
Currently dm-multipath has to clone the bios for every request sent to the lower devices, which wastes cpu cycles and ties down memory. This patch instead adds a new REQ_CLONE flag that instructs req_bio_endio to not complete bios attached to a request, which we set on clone requests similar to bios in a flush sequence. With this change I/O errors on a path failure only get propagated to dm-multipath, which can then either resubmit the I/O or complete the bios on the original request. I've done some basic testing of this on a Linux target with ALUA support, and it survives path failures during I/O nicely. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NMike Snitzer <snitzer@redhat.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Mike Snitzer 提交于
Commit c4cf5261 ("bio: skip atomic inc/dec of ->bi_remaining for non-chains") regressed all existing callers that followed this pattern: 1) saving a bio's original bi_end_io 2) wiring up an intermediate bi_end_io 3) restoring the original bi_end_io from intermediate bi_end_io 4) calling bio_endio() to execute the restored original bi_end_io The regression was due to BIO_CHAIN only ever getting set if bio_inc_remaining() is called. For the above pattern it isn't set until step 3 above (step 2 would've needed to establish BIO_CHAIN). As such the first bio_endio(), in step 2 above, never decremented __bi_remaining before calling the intermediate bi_end_io -- leaving __bi_remaining with the value 1 instead of 0. When bio_inc_remaining() occurred during step 3 it brought it to a value of 2. When the second bio_endio() was called, in step 4 above, it should've called the original bi_end_io but it didn't because there was an extra reference that wasn't dropped (due to atomic operations being optimized away since BIO_CHAIN wasn't set upfront). Fix this issue by removing the __bi_remaining management complexity for all callers that use the above pattern -- bio_chain() is the only interface that _needs_ to be concerned with __bi_remaining. For the above pattern callers just expect the bi_end_io they set to get called! Remove bio_endio_nodec() and also remove all bio_inc_remaining() calls that aren't associated with the bio_chain() interface. Also, the bio_inc_remaining() interface has been moved local to bio.c. Fixes: c4cf5261 ("bio: skip atomic inc/dec of ->bi_remaining for non-chains") Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Signed-off-by: NMike Snitzer <snitzer@redhat.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
Use block layer queues with an internal cmd_type to submit internally generated NVMe commands. This both simplifies the code a lot and allow for a better structure. For example now the LighNVM code can construct commands without knowing the details of the underlying I/O descriptors. Or a future NVMe over network target could inject commands, as well as could the SCSI translation and ioctl code be reused for such a beast. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
NVMe device always support the FUA bit, and the SCSI translations accepts the DPO bit, which doesn't have much of a meaning for us. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
Erorr handling for the scsi translation was completely broken, as there were two different positive error number spaces overlapping. Fix this up by removing one of them, and centralizing the generation of the other positive values in a single place. Also fix up a few places that didn't handle the NVMe error codes properly. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
This function handles two totally different opcodes, so split it. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
Most users want the generic device, so store that in struct nvme_dev instead of the pci_dev. This also happens to be a nice step towards making some code reusable for non-PCI transports. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
Note that we keep the unused timeout argument, but allow callers to pass 0 instead of a timeout if they want the default. This will allow adding a timeout to the pass through path later on. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 20 5月, 2015 9 次提交
-
-
由 Jens Axboe 提交于
gcc, righfully, complains: drivers/block/loop.c:1369:1: warning: label 'out' defined but not used [-Wunused-label] Kill it. Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Jarod Wilson 提交于
With the mutex_trylock bit gone from blkdev_reread_part(), the retry logic in dasd_scan_partitions() shouldn't be necessary. CC: Christoph Hellwig <hch@infradead.org> CC: Jens Axboe <axboe@kernel.dk> CC: Tejun Heo <tj@kernel.org> CC: Alexander Viro <viro@zeniv.linux.org.uk> CC: Markus Pargmann <mpa@pengutronix.de> CC: Stefan Weinhuber <wein@de.ibm.com> CC: Stefan Haberland <stefan.haberland@de.ibm.com> CC: Sebastian Ott <sebott@linux.vnet.ibm.com> CC: Fabian Frederick <fabf@skynet.be> CC: Ming Lei <ming.lei@canonical.com> CC: David Herrmann <dh.herrmann@gmail.com> CC: Andrew Morton <akpm@linux-foundation.org> CC: Peter Zijlstra <peterz@infradead.org> CC: nbd-general@lists.sourceforge.net CC: linux-s390@vger.kernel.org Reviewed-by: NChristoph Hellwig <hch@lst.de> Acked-by: NSebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: NMing Lei <ming.lei@canonical.com> Signed-off-by: NJarod Wilson <jarod@redhat.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Ming Lei 提交于
Also remove the obsolete comment. Reviewed-by: NChristoph Hellwig <hch@lst.de> Tested-by: NJarod Wilson <jarod@redhat.com> Acked-by: NJarod Wilson <jarod@redhat.com> Acked-by: NSebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: NMing Lei <ming.lei@canonical.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Ming Lei 提交于
Reviewed-by: NChristoph Hellwig <hch@lst.de> Tested-by: NJarod Wilson <jarod@redhat.com> Acked-by: NJarod Wilson <jarod@redhat.com> Signed-off-by: NMing Lei <ming.lei@canonical.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Ming Lei 提交于
loop_clr_fd() can be run piggyback with lo_release(), and under this situation, reread partition may always fail because bd_mutex has been held already. This patch detects the situation by the reference count, and call __blkdev_reread_part() to avoid acquiring the lock again. In the meantime, this patch switches to new kernel APIs of blkdev_reread_part() and __blkdev_reread_part(). Reviewed-by: NChristoph Hellwig <hch@lst.de> Tested-by: NJarod Wilson <jarod@redhat.com> Acked-by: NJarod Wilson <jarod@redhat.com> Signed-off-by: NJarod Wilson <jarod@redhat.com> Signed-off-by: NMing Lei <ming.lei@canonical.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Ming Lei 提交于
The lo_ctl_mutex is held for running all ioctl handlers, and in some ioctl handlers, ioctl_by_bdev(BLKRRPART) is called for rereading partitions, which requires bd_mutex. So it is easy to cause failure because trylock(bd_mutex) may fail inside blkdev_reread_part(), and follows the lock context: blkid or other application: ->open() ->mutex_lock(bd_mutex) ->lo_open() ->mutex_lock(lo_ctl_mutex) losetup(set fd ioctl): ->mutex_lock(lo_ctl_mutex) ->ioctl_by_bdev(BLKRRPART) ->trylock(bd_mutex) This patch trys to eliminate the ABBA lock dependency by removing lo_ctl_mutext in lo_open() with the following approach: 1) make lo_refcnt as atomic_t and avoid acquiring lo_ctl_mutex in lo_open(): - for open vs. add/del loop, no any problem because of loop_index_mutex - freeze request queue during clr_fd, so I/O can't come until clearing fd is completed, like the effect of holding lo_ctl_mutex in lo_open - both open() and release() have been serialized by bd_mutex already 2) don't hold lo_ctl_mutex for decreasing/checking lo_refcnt in lo_release(), then lo_ctl_mutex is only required for the last release. Reviewed-by: NChristoph Hellwig <hch@lst.de> Tested-by: NJarod Wilson <jarod@redhat.com> Acked-by: NJarod Wilson <jarod@redhat.com> Signed-off-by: NMing Lei <ming.lei@canonical.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Jens Axboe 提交于
We need the blkdev_reread_part() changes for drivers to adapt.
-
由 Ming Lei 提交于
The only possible problem of using mutex_lock() instead of trylock is about deadlock. If there aren't any locks held before calling blkdev_reread_part(), deadlock can't be caused by this conversion. If there are locks held before calling blkdev_reread_part(), and if these locks arn't required in open, close handler and I/O path, deadlock shouldn't be caused too. Both user space's ioctl(BLKRRPART) and md_setup_drive() from init/do_mounts_md.c belongs to the 1st case, so the conversion is safe for the two cases. For loop, the previous patches in this pathset has fixed the ABBA lock dependency, so the conversion is OK. For nbd, tx_lock is held when calling the function: - both open and release won't hold the lock - when blkdev_reread_part() is run, I/O thread has been stopped already, so tx_lock won't be acquired in I/O path at that time. - so the conversion won't cause deadlock for nbd For dasd, both dasd_open(), dasd_release() and request function don't acquire any mutex/semphone, so the conversion should be safe. Reviewed-by: NChristoph Hellwig <hch@lst.de> Tested-by: NJarod Wilson <jarod@redhat.com> Acked-by: NJarod Wilson <jarod@redhat.com> Signed-off-by: NMing Lei <ming.lei@canonical.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Jarod Wilson 提交于
This patch exports blkdev_reread_part() for block drivers, also introduce __blkdev_reread_part(). For some drivers, such as loop, reread of partitions can be run from the release path, and bd_mutex may already be held prior to calling ioctl_by_bdev(bdev, BLKRRPART, 0), so introduce __blkdev_reread_part for use in such cases. CC: Christoph Hellwig <hch@lst.de> CC: Jens Axboe <axboe@kernel.dk> CC: Tejun Heo <tj@kernel.org> CC: Alexander Viro <viro@zeniv.linux.org.uk> CC: Markus Pargmann <mpa@pengutronix.de> CC: Stefan Weinhuber <wein@de.ibm.com> CC: Stefan Haberland <stefan.haberland@de.ibm.com> CC: Sebastian Ott <sebott@linux.vnet.ibm.com> CC: Fabian Frederick <fabf@skynet.be> CC: Ming Lei <ming.lei@canonical.com> CC: David Herrmann <dh.herrmann@gmail.com> CC: Andrew Morton <akpm@linux-foundation.org> CC: Peter Zijlstra <peterz@infradead.org> CC: nbd-general@lists.sourceforge.net CC: linux-s390@vger.kernel.org Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJarod Wilson <jarod@redhat.com> Signed-off-by: NMing Lei <ming.lei@canonical.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 19 5月, 2015 1 次提交
-
-
由 Christoph Hellwig 提交于
Stop abusing struct page functionality and the swap end_io handler, and instead add a modified version of the blk-lib.c bio_batch helpers. Also move the block I/O code into swap.c as they are directly tied into each other. Signed-off-by: NChristoph Hellwig <hch@lst.de> Tested-by: NPavel Machek <pavel@ucw.cz> Tested-by: NMing Lin <mlin@kernel.org> Acked-by: NPavel Machek <pavel@ucw.cz> Acked-by: NRafael J. Wysocki <rjw@rjwysocki.net> Signed-off-by: NJens Axboe <axboe@fb.com>
-