- 15 4月, 2017 1 次提交
-
-
由 Omar Sandoval 提交于
The Kyber I/O scheduler is an I/O scheduler for fast devices designed to scale to multiple queues. Users configure only two knobs, the target read and synchronous write latencies, and the scheduler tunes itself to achieve that latency goal. The implementation is based on "tokens", built on top of the scalable bitmap library. Tokens serve as a mechanism for limiting requests. There are two tiers of tokens: queueing tokens and dispatch tokens. A queueing token is required to allocate a request. In fact, these tokens are actually the blk-mq internal scheduler tags, but the scheduler manages the allocation directly in order to implement its policy. Dispatch tokens are device-wide and split up into two scheduling domains: reads vs. writes. Each hardware queue dispatches batches round-robin between the scheduling domains as long as tokens are available for that domain. These tokens can be used as the mechanism to enable various policies. The policy Kyber uses is inspired by active queue management techniques for network routing, similar to blk-wbt. The scheduler monitors latencies and scales the number of dispatch tokens accordingly. Queueing tokens are used to prevent starvation of synchronous requests by asynchronous requests. Various extensions are possible, including better heuristics and ionice support. The new scheduler isn't set as the default yet. Signed-off-by: NOmar Sandoval <osandov@fb.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 28 2月, 2017 1 次提交
-
-
由 Christoph Hellwig 提交于
Similar to the PCI version, just calling into virtio instead. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
- 07 2月, 2017 1 次提交
-
-
由 Scott Bauer 提交于
This patch implements the necessary logic to bring an Opal enabled drive out of a factory-enabled into a working Opal state. This patch set also enables logic to save a password to be replayed during a resume from suspend. Signed-off-by: NScott Bauer <scott.bauer@intel.com> Signed-off-by: NRafael Antognolli <Rafael.Antognolli@intel.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 01 2月, 2017 1 次提交
-
-
由 Christoph Hellwig 提交于
We only need this code to support scsi, ide, cciss and virtio. And at least for virtio it's a deprecated feature to start with. This should shrink the kernel size for embedded device that only use, say eMMC a bit. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 28 1月, 2017 1 次提交
-
-
由 Omar Sandoval 提交于
This fixes a couple of problems: 1. In the !CONFIG_DEBUG_FS case, the stub definitions were bogus. 2. In the !CONFIG_BLOCK case, blk-mq-debugfs.c shouldn't be compiled at all. Fix the stub definitions and add a CONFIG_BLK_DEBUG_FS Kconfig option. Fixes: 07e4fead ("blk-mq: create debugfs directory tree") Signed-off-by: NOmar Sandoval <osandov@fb.com> Augment Kconfig description. Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 27 1月, 2017 1 次提交
-
-
由 Omar Sandoval 提交于
In preparation for putting blk-mq debugging information in debugfs, create a directory tree mirroring the one in sysfs: # tree -d /sys/kernel/debug/block /sys/kernel/debug/block |-- nvme0n1 | `-- mq | |-- 0 | | `-- cpu0 | |-- 1 | | `-- cpu1 | |-- 2 | | `-- cpu2 | `-- 3 | `-- cpu3 `-- vda `-- mq `-- 0 |-- cpu0 |-- cpu1 |-- cpu2 `-- cpu3 Also add the scaffolding for the actual files that will go in here, either under the hardware queue or software queue directories. Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NOmar Sandoval <osandov@fb.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 18 1月, 2017 2 次提交
-
-
由 Jens Axboe 提交于
This is basically identical to deadline-iosched, except it registers as a MQ capable scheduler. This is still a single queue design. Signed-off-by: NJens Axboe <axboe@fb.com> Reviewed-by: NBart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: NOmar Sandoval <osandov@fb.com>
-
由 Jens Axboe 提交于
This adds a set of hooks that intercepts the blk-mq path of allocating/inserting/issuing/completing requests, allowing us to develop a scheduler within that framework. We reuse the existing elevator scheduler API on the registration side, but augment that with the scheduler flagging support for the blk-mq interfce, and with a separate set of ops hooks for MQ devices. We split driver and scheduler tags, so we can run the scheduling independently of device queue depth. Signed-off-by: NJens Axboe <axboe@fb.com> Reviewed-by: NBart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: NOmar Sandoval <osandov@fb.com>
-
- 11 11月, 2016 2 次提交
-
-
由 Jens Axboe 提交于
We can hook this up to the block layer, to help throttle buffered writes. wbt registers a few trace points that can be used to track what is happening in the system: wbt_lat: 259:0: latency 2446318 wbt_stat: 259:0: rmean=2446318, rmin=2446318, rmax=2446318, rsamples=1, wmean=518866, wmin=15522, wmax=5330353, wsamples=57 wbt_step: 259:0: step down: step=1, window=72727272, background=8, normal=16, max=32 This shows a sync issue event (wbt_lat) that exceeded it's time. wbt_stat dumps the current read/write stats for that window, and wbt_step shows a step down event where we now scale back writes. Each trace includes the device, 259:0 in this case. Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Jens Axboe 提交于
For legacy block, we simply track them in the request queue. For blk-mq, we track them on a per-sw queue basis, which we can then sum up through the hardware queues and finally to a per device state. The stats are tracked in, roughly, 0.1s interval windows. Add sysfs files to display the stats. The feature is off by default, to avoid any extra overhead. In-kernel users of it can turn it on by setting QUEUE_FLAG_STATS in the queue flags. We currently don't turn it on if someone just reads any of the stats files, that is something we could add as well. Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 19 10月, 2016 1 次提交
-
-
由 Hannes Reinecke 提交于
Implement zoned block device zone information reporting and reset. Zone information are reported as struct blk_zone. This implementation does not differentiate between host-aware and host-managed device models and is valid for both. Two functions are provided: blkdev_report_zones for discovering the zone configuration of a zoned block device, and blkdev_reset_zones for resetting the write pointer of sequential zones. The helper function blk_queue_zone_size and bdev_zone_size are also provided for, as the name suggest, obtaining the zone size (in 512B sectors) of the zones of the device. Signed-off-by: NHannes Reinecke <hare@suse.de> [Damien: * Removed the zone cache * Implement report zones operation based on earlier proposal by Shaun Tancheff <shaun.tancheff@seagate.com>] Signed-off-by: NDamien Le Moal <damien.lemoal@hgst.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NShaun Tancheff <shaun.tancheff@seagate.com> Tested-by: NShaun Tancheff <shaun.tancheff@seagate.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 22 9月, 2016 1 次提交
-
-
由 Thomas Gleixner 提交于
Replace the block-mq notifier list management with the multi instance facility in the cpu hotplug state machine. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-block@vger.kernel.org Cc: rt@linutronix.de Cc: Christoph Hellwing <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 19 9月, 2016 1 次提交
-
-
由 Stephen Rothwell 提交于
and building block/blk-mq-pci.o should depend on CONFIG_BLOCK Fixes: 973c4e37 ("blk-mq: provide a default queue mapping for PCI device") Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 15 9月, 2016 1 次提交
-
-
由 Christoph Hellwig 提交于
Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 09 1月, 2016 1 次提交
-
-
由 Vishal Verma 提交于
Take the core badblocks implementation from md, and make it generally available. This follows the same style as kernel implementations of linked lists, rb-trees etc, where you can have a structure that can be embedded anywhere, and accessor functions to manipulate the data. The only changes in this copy of the code are ones to generalize function/variable names from md-specific ones. Also add init and free functions. Signed-off-by: NVishal Verma <vishal.l.verma@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
- 12 12月, 2015 1 次提交
-
-
由 Christoph Hellwig 提交于
The new name is irq_poll as iopoll is already taken. Better suggestions welcome. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBart Van Assche <bart.vanassche@sandisk.com>
-
- 27 9月, 2014 1 次提交
-
-
由 Martin K. Petersen 提交于
The T10 Protection Information format is also used by some devices that do not go through the SCSI layer (virtual block devices, NVMe). Relocate the relevant functions to a block layer library that can be used without involving SCSI. Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 20 5月, 2014 2 次提交
-
-
由 Jens Axboe 提交于
Continue moving some of the block files that are scattered around. bounce.c contains only code for bouncing the contents of a bio. It's block proper code, not mm code. Suggested-by: NMing Lei <tom.leiming@gmail.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Jens Axboe 提交于
Like commit f9c78b2b, move this block related file outside of fs/ and into the core block directory, block/. Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 19 5月, 2014 1 次提交
-
-
由 Jens Axboe 提交于
They really belong in block/, especially now since it's not in drivers/block/ anymore. Additionally, the get_maintainer script gets it wrong when in fs/. Suggested-by: NChristoph Hellwig <hch@infradead.org> Acked-by: NAl Viro <viro@ZenIV.linux.org.uk> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 25 10月, 2013 1 次提交
-
-
由 Jens Axboe 提交于
Linux currently has two models for block devices: - The classic request_fn based approach, where drivers use struct request units for IO. The block layer provides various helper functionalities to let drivers share code, things like tag management, timeout handling, queueing, etc. - The "stacked" approach, where a driver squeezes in between the block layer and IO submitter. Since this bypasses the IO stack, driver generally have to manage everything themselves. With drivers being written for new high IOPS devices, the classic request_fn based driver doesn't work well enough. The design dates back to when both SMP and high IOPS was rare. It has problems with scaling to bigger machines, and runs into scaling issues even on smaller machines when you have IOPS in the hundreds of thousands per device. The stacked approach is then most often selected as the model for the driver. But this means that everybody has to re-invent everything, and along with that we get all the problems again that the shared approach solved. This commit introduces blk-mq, block multi queue support. The design is centered around per-cpu queues for queueing IO, which then funnel down into x number of hardware submission queues. We might have a 1:1 mapping between the two, or it might be an N:M mapping. That all depends on what the hardware supports. blk-mq provides various helper functions, which include: - Scalable support for request tagging. Most devices need to be able to uniquely identify a request both in the driver and to the hardware. The tagging uses per-cpu caches for freed tags, to enable cache hot reuse. - Timeout handling without tracking request on a per-device basis. Basically the driver should be able to get a notification, if a request happens to fail. - Optional support for non 1:1 mappings between issue and submission queues. blk-mq can redirect IO completions to the desired location. - Support for per-request payloads. Drivers almost always need to associate a request structure with some driver private command structure. Drivers can tell blk-mq this at init time, and then any request handed to the driver will have the required size of memory associated with it. - Support for merging of IO, and plugging. The stacked model gets neither of these. Even for high IOPS devices, merging sequential IO reduces per-command overhead and thus increases bandwidth. For now, this is provided as a potential 3rd queueing model, with the hope being that, as it matures, it can replace both the classic and stacked model. That would get us back to having just 1 real model for block devices, leaving the stacked approach to dm/md devices (as it was originally intended). Contributions in this patch from the following people: Shaohua Li <shli@fusionio.com> Alexander Gordeev <agordeev@redhat.com> Christoph Hellwig <hch@infradead.org> Mike Christie <michaelc@cs.wisc.edu> Matias Bjorling <m@bjorling.me> Jeff Moyer <jmoyer@redhat.com> Acked-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 01 10月, 2013 1 次提交
-
-
由 Paul Gortmaker 提交于
Recently commit bab55417 ("block: support embedded device command line partition") introduced CONFIG_CMDLINE_PARSER. However, that name is too generic and sounds like it enables/disables generic kernel boot arg processing, when it really is block specific. Before this option becomes a part of a full/final release, add the BLK_ prefix to it so that it is clear in absence of any other context that it is block specific. In addition, fix up the following less critical items: - help text was not really at all helpful. - index file for Documentation was not updated - add the new arg to Documentation/kernel-parameters.txt - clarify wording in source comments Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Cai Zhiyong <caizhiyong@huawei.com> Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 12 9月, 2013 1 次提交
-
-
由 Cai Zhiyong 提交于
Read block device partition table from command line. The partition used for fixed block device (eMMC) embedded device. It is no MBR, save storage space. Bootloader can be easily accessed by absolute address of data on the block device. Users can easily change the partition. This code reference MTD partition, source "drivers/mtd/cmdlinepart.c" About the partition verbose reference "Documentation/block/cmdline-partition.txt" [akpm@linux-foundation.org: fix printk text] [yongjun_wei@trendmicro.com.cn: fix error return code in parse_parts()] Signed-off-by: NCai Zhiyong <caizhiyong@huawei.com> Cc: Karel Zak <kzak@redhat.com> Cc: "Wanglin (Albert)" <albert.wanglin@huawei.com> Cc: Marius Groeger <mag@sysgo.de> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Brian Norris <computersforpeace@gmail.com> Cc: Artem Bityutskiy <dedekind@infradead.org> Signed-off-by: NWei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 04 1月, 2012 2 次提交
-
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 01 8月, 2011 1 次提交
-
-
由 Mike Christie 提交于
This moves the FC classes bsg code to the block layer and makes it a lib so that other classes like iscsi and SAS can use it. It is helpful because working with the request queue, bios, creating scatterlists, etc are a pain that the LLD does not have to worry about with normal IOs and should not have to worry about for bsg requests. Signed-off-by: NMike Christie <michaelc@cs.wisc.edu> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 16 9月, 2010 1 次提交
-
-
由 Vivek Goyal 提交于
o Actual implementation of throttling policy in block layer. Currently it implements READ and WRITE bytes per second throttling logic. IOPS throttling comes in later patches. Signed-off-by: NVivek Goyal <vgoyal@redhat.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 10 9月, 2010 1 次提交
-
-
由 Tejun Heo 提交于
Without ordering requirements, barrier and ordering are minomers. Rename block/blk-barrier.c to block/blk-flush.c. Rename of symbols will follow. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 29 4月, 2010 1 次提交
-
-
由 Dmitry Monakhov 提交于
Move blkdev_issue_discard from blk-barrier.c because it is not barrier related. Later the file will be populated by other helpers. Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
- 04 12月, 2009 1 次提交
-
-
由 Vivek Goyal 提交于
o This is basic implementation of blkio controller cgroup interface. This is the common interface visible to user space and should be used by different IO control policies as we implement those. Signed-off-by: NVivek Goyal <vgoyal@redhat.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
- 03 10月, 2009 1 次提交
-
-
由 Jens Axboe 提交于
AS is mostly a subset of CFQ, so there's little point in still providing this separate IO scheduler. Hopefully at some point we can get down to one single IO scheduler again, at least this brings us closer by having only one intelligent IO scheduler. Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
- 11 9月, 2009 1 次提交
-
-
由 Jens Axboe 提交于
This borrows some code from NAPI and implements a polled completion mode for block devices. The idea is the same as NAPI - instead of doing the command completion when the irq occurs, schedule a dedicated softirq in the hopes that we will complete more IO when the iopoll handler is invoked. Devices have a budget of commands assigned, and will stay in polled mode as long as they continue to consume their budget from the iopoll softirq handler. If they do not, the device is set back to interrupt completion mode. This patch holds the core bits for blk-iopoll, device driver support sold separately. Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
- 01 7月, 2009 1 次提交
-
-
由 Jens Axboe 提交于
The initial patches to support this through sysfs export were broken and have been if 0'ed out in any release. So lets just kill the code and reclaim some space in struct request_queue, if anyone would later like to fixup the sysfs bits, the git history can easily restore the removed bits. Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
- 09 2月, 2009 1 次提交
-
-
由 Frederic Weisbecker 提交于
Impact: cleanup Move blktrace.c to kernel/trace, also move its config entry. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Acked-by: NArnaldo Carvalho de Melo <acme@redhat.com> Acked-by: NJens Axboe <jens.axboe@oracle.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 09 10月, 2008 2 次提交
-
-
由 Jens Axboe 提交于
Right now SCSI and others do their own command timeout handling. Move those bits to the block layer. Instead of having a timer per command, we try to be a bit more clever and simply have one per-queue. This avoids the overhead of having to tear down and setup a timer for each command, so it will result in a lot less timer fiddling. Signed-off-by: NMike Anderson <andmike@linux.vnet.ibm.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Jens Axboe 提交于
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
- 03 7月, 2008 2 次提交
-
-
由 Adel Gadllah 提交于
This patch exports the per-gendisk command filter to user space through sysfs, so it can be changed by the system administrator. All users of the old cmd filter have been converted to use the new one. Original patch from Peter Jones. Signed-off-by: NAdel Gadllah <adel.gadllah@gmail.com> Signed-off-by: NPeter Jones <pjones@redhat.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Martin K. Petersen 提交于
Some block devices support verifying the integrity of requests by way of checksums or other protection information that is submitted along with the I/O. This patch implements support for generating and verifying integrity metadata, as well as correctly merging, splitting and cloning bios and requests that have this extra information attached. See Documentation/block/data-integrity.txt for more information. Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
- 30 1月, 2008 2 次提交
-
-
由 Jens Axboe 提交于
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Jens Axboe 提交于
Adds files for barrier handling, rq execution, io context handling, mapping data to requests, and queue settings. Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-