- 06 2月, 2015 7 次提交
-
-
由 Dongsu Park 提交于
Rewrite __bio_copy_iov using the copy_page_{from,to}_iter helpers, and split it into two simpler functions. This commit should contain only literal replacements, without functional changes. Cc: Kent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: NDongsu Park <dongsu.park@profitbricks.com> [hch: removed the __bio_copy_iov wrapper] Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMing Lei <tom.leiming@gmail.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
And also remove the unused bdev argument. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMing Lei <tom.leiming@gmail.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
This saves a little code, and allow to simplify the error handling. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMing Lei <tom.leiming@gmail.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Kent Overstreet 提交于
Make use of a new interface provided by iov_iter, backed by scatter-gather list of iovec, instead of the old interface based on sg_iovec. Also use iov_iter_advance() instead of manual iteration. This commit should contain only literal replacements, without functional changes. Cc: Christoph Hellwig <hch@infradead.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Doug Gilbert <dgilbert@interlog.com> Cc: "James E.J. Bottomley" <JBottomley@parallels.com> Signed-off-by: NKent Overstreet <kmo@daterainc.com> [dpark: add more description in commit message] Signed-off-by: NDongsu Park <dongsu.park@profitbricks.com> [hch: fixed to do a deep clone of the iov_iter, and to properly use the iov_iter direction] Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMing Lei <tom.leiming@gmail.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
The code sniplet to walk all bio_vecs and free their pages is opencoded in way to many places, so factor it into a helper. Also convert the slightly more complex cases in bio_kern_endio and __bio_copy_iov where we break the freeing from an existing loop into a separate one. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMing Lei <tom.leiming@gmail.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMing Lei <tom.leiming@gmail.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
Just open code the trivial mapping from a kernel virtual address to a bio instead of going through the complex user address mapping machinery. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMing Lei <tom.leiming@gmail.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 12 12月, 2014 1 次提交
-
-
由 Maurizio Lombardi 提交于
The original behaviour is to refuse to add a new page if the maximum number of segments has been reached, regardless of the fact the page we are going to add can be merged into the last segment or not. Unfortunately, when the system runs under heavy memory fragmentation conditions, a driver may try to add multiple pages to the last segment. The original code won't accept them and EBUSY will be reported to userspace. This patch modifies the function so it refuses to add a page only in case the latter starts a new segment and the maximum number of segments has already been reached. The bug can be easily reproduced with the st driver: 1) set CONFIG_SCSI_MPT2SAS_MAX_SGE or CONFIG_SCSI_MPT3SAS_MAX_SGE to 16 2) modprobe st buffer_kbs=1024 3) #dd if=/dev/zero of=/dev/st0 bs=1M count=10 dd: error writing `/dev/st0': Device or resource busy Signed-off-by: NMaurizio Lombardi <mlombard@redhat.com> Signed-off-by: NMing Lei <ming.lei@canonical.com> Cc: Jet Chen <jet.chen@intel.com> Cc: Tomas Henzl <thenzl@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 24 11月, 2014 1 次提交
-
-
由 Gu Zheng 提交于
Many block drivers accounting io stat based on bio (e.g. NVMe...), the blk_account_io_start/end() which is based on request does not make sense to them, so here we introduce the similar help function named generic_start/end_io_acct base on raw sectors, and it can simplify some driver's open io accounting code. Signed-off-by: NGu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 04 10月, 2014 1 次提交
-
-
由 Junichi Nomura 提交于
Users of bio_clone_fast() do not want bios with their own bvecs. Allocating a bvec mempool as part of the bioset intended for such users is a waste of memory. bioset_create_nobvec() creates a bioset that doesn't have the bvec mempool. Signed-off-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 02 8月, 2014 1 次提交
-
-
由 Mikulas Patocka 提交于
Various subsystems can ask the bio subsystem to create a bio slab cache with some free space before the bio. This free space can be used for any purpose. Device mapper uses this per-bio-data feature to place some target-specific and device-mapper specific data before the bio, so that the target-specific data doesn't have to be allocated separately. This per-bio-data mechanism is used in place of kmalloc, so we need the allocated slab to have the same memory alignment as memory allocated with kmalloc. Change bio_find_or_create_slab() so that it uses ARCH_KMALLOC_MINALIGN alignment when creating the slab cache. This is needed so that dm-crypt can use per-bio-data for encryption - the crypto subsystem assumes this data will have the same alignment as kmalloc'ed memory. Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com> Acked-by: NJens Axboe <axboe@fb.com>
-
- 15 7月, 2014 1 次提交
-
-
由 Jens Axboe 提交于
This reverts commit 254c4407. It causes crashes with cryptsetup, even after a few iterations and updates. Drop it for now.
-
- 02 7月, 2014 1 次提交
-
-
由 Maurizio Lombardi 提交于
The original behaviour is to refuse to add a new page if the maximum number of segments has been reached, regardless of the fact the page we are going to add can be merged into the last segment or not. Unfortunately, when the system runs under heavy memory fragmentation conditions, a driver may try to add multiple pages to the last segment. The original code won't accept them and EBUSY will be reported to userspace. This patch modifies the function so it refuses to add a page only in case the latter starts a new segment and the maximum number of segments has already been reached. The bug can be easily reproduced with the st driver: 1) set CONFIG_SCSI_MPT2SAS_MAX_SGE or CONFIG_SCSI_MPT3SAS_MAX_SGE to 16 2) modprobe st buffer_kbs=1024 3) #dd if=/dev/zero of=/dev/st0 bs=1M count=10 dd: error writing `/dev/st0': Device or resource busy [ming.lei@canonical.com: update bi_iter.bi_size before recounting segments] Signed-off-by: NMaurizio Lombardi <mlombard@redhat.com> Signed-off-by: NMing Lei <ming.lei@canonical.com> Tested-by: NDongsu Park <dongsu.park@profitbricks.com> Tested-by: NJet Chen <jet.chen@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Kent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 25 6月, 2014 1 次提交
-
-
由 Jens Axboe 提交于
Another restriction inherited for NVMe - those devices don't support SG lists that have "gaps" in them. Gaps refers to cases where the previous SG entry doesn't end on a page boundary. For NVMe, all SG entries must start at offset 0 (except the first) and end on a page boundary (except the last). Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 11 6月, 2014 1 次提交
-
-
由 Jens Axboe 提交于
With commit 762380ad added support for chunk sizes and no merging across them, it broke the rule of always allowing adding of a single page to an empty bio. So relax the restriction a bit to allow for that, similarly to what we have always done. This fixes a crash with mkfs.xfs and 512b sector sizes on NVMe. Reported-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 06 6月, 2014 1 次提交
-
-
由 Jens Axboe 提交于
Some drivers have different limits on what size a request should optimally be, depending on the offset of the request. Similar to dividing a device into chunks. Add a setting that allows the driver to inform the block layer of such a chunk size. The block layer will then prevent merging across the chunks. This is needed to optimally support NVMe with a non-zero stripe size. Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 19 5月, 2014 1 次提交
-
-
由 Jens Axboe 提交于
They really belong in block/, especially now since it's not in drivers/block/ anymore. Additionally, the get_maintainer script gets it wrong when in fs/. Suggested-by: NChristoph Hellwig <hch@infradead.org> Acked-by: NAl Viro <viro@ZenIV.linux.org.uk> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 14 5月, 2014 1 次提交
-
-
由 Tejun Heo 提交于
Unlike the more usual refcnting, what css_tryget() provides is the distinction between online and offline csses instead of protection against upping a refcnt which already reached zero. cgroup is planning to provide actual tryget which fails if the refcnt already reached zero. Let's rename the existing trygets so that they clearly indicate that they're onliness. I thought about keeping the existing names as-are and introducing new names for the planned actual tryget; however, given that each controller participates in the synchronization of the online state, it seems worthwhile to make it explicit that these functions are about on/offline state. Rename css_tryget() to css_tryget_online() and css_tryget_from_dir() to css_tryget_online_from_dir(). This is pure rename. v2: cgroup_freezer grew new usages of css_tryget(). Update accordingly. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Acked-by: NMichal Hocko <mhocko@suse.cz> Acked-by: NLi Zefan <lizefan@huawei.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
-
- 23 4月, 2014 2 次提交
-
-
由 Fabian Frederick 提交于
nr_segs is no longer used in bio_alloc_map_data since c8db4448 ("block: Don't save/copy bvec array anymore") Signed-off-by: NFabian Frederick <fabf@skynet.be> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Fabian Frederick 提交于
bs is no longer used in biovec_create_pool since 9f060e22 ("block: Convert integrity to bvec_alloc_bs()") Signed-off-by: NFabian Frederick <fabf@skynet.be> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 22 4月, 2014 1 次提交
-
-
由 Randy Dunlap 提交于
Fix new kernel-doc warnings in fs/bio.c: Warning(fs/bio.c:316): No description found for parameter 'bio' Warning(fs/bio.c:316): No description found for parameter 'parent' Signed-off-by: NRandy Dunlap <rdunlap@infradead.org> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 02 4月, 2014 1 次提交
-
-
由 Al Viro 提交于
sg_iovec array passed to it can be const Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 19 2月, 2014 1 次提交
-
-
由 Mikulas Patocka 提交于
When using device mapper, there are many "bio: create slab" messages in the log. Device mapper targets have different front_pad, so each time when we load a target that wasn't loaded before, we allocate a slab with the appropriate front_pad and there is associated "bio: create slab" message. This patch removes these messages, there is no need for them. Signed-off-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 11 2月, 2014 1 次提交
-
-
由 Kent Overstreet 提交于
Immutable biovecs changed the way bio segments are treated in such a way that bio_for_each_segment() cannot now do what we want for discard/write same bios, since bi_size means something completely different for them. Fortunately discard and write same bios never have more than a single biovec, so bio_for_each_segment() is unnecessary and not terribly meaningful for them, but we still have to special case them in a few places. Signed-off-by: NKent Overstreet <kmo@daterainc.com> Tested-by: NRichard W.M. Jones <rjones@redhat.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 08 2月, 2014 1 次提交
-
-
由 Tejun Heo 提交于
cgroup_subsys is a bit messier than it needs to be. * The name of a subsys can be different from its internal identifier defined in cgroup_subsys.h. Most subsystems use the matching name but three - cpu, memory and perf_event - use different ones. * cgroup_subsys_id enums are postfixed with _subsys_id and each cgroup_subsys is postfixed with _subsys. cgroup.h is widely included throughout various subsystems, it doesn't and shouldn't have claim on such generic names which don't have any qualifier indicating that they belong to cgroup. * cgroup_subsys->subsys_id should always equal the matching cgroup_subsys_id enum; however, we require each controller to initialize it and then BUG if they don't match, which is a bit silly. This patch cleans up cgroup_subsys names and initialization by doing the followings. * cgroup_subsys_id enums are now postfixed with _cgrp_id, and each cgroup_subsys with _cgrp_subsys. * With the above, renaming subsys identifiers to match the userland visible names doesn't cause any naming conflicts. All non-matching identifiers are renamed to match the official names. cpu_cgroup -> cpu mem_cgroup -> memory perf -> perf_event * controllers no longer need to initialize ->subsys_id and ->name. They're generated in cgroup core and set automatically during boot. * Redundant cgroup_subsys declarations removed. * While updating BUG_ON()s in cgroup_init_early(), convert them to WARN()s. BUGging that early during boot is stupid - the kernel can't print anything, even through serial console and the trap handler doesn't even link stack frame properly for back-tracing. This patch doesn't introduce any behavior changes. v2: Rebased on top of fe1217c4 ("net: net_cls: move cgroupfs classid handling into core"). Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NNeil Horman <nhorman@tuxdriver.com> Acked-by: N"David S. Miller" <davem@davemloft.net> Acked-by: N"Rafael J. Wysocki" <rjw@rjwysocki.net> Acked-by: NMichal Hocko <mhocko@suse.cz> Acked-by: NPeter Zijlstra <peterz@infradead.org> Acked-by: NAristeu Rozanski <aris@redhat.com> Acked-by: NIngo Molnar <mingo@redhat.com> Acked-by: NLi Zefan <lizefan@huawei.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Serge E. Hallyn <serue@us.ibm.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Thomas Graf <tgraf@suug.ch>
-
- 09 1月, 2014 2 次提交
-
-
由 Jens Axboe 提交于
This reverts commit 95d44038. The patch is broken for on-stack bios, amongst other things.
-
由 Muthukumar Ratty 提交于
In bio_endio if bio doesn't have bi_end_io (should be an error case), we set bio to NULL and continue silently without freeing the bio. It would be good to have a WARN and free the bio to avoid memory leak. Signed-off-by: NMuthukumar Ratty <muthur@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 24 11月, 2013 13 次提交
-
-
由 Kent Overstreet 提交于
Signed-off-by: NKent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk>
-
由 Kent Overstreet 提交于
The new bio_split() can split arbitrary bios - it's not restricted to single page bios, like the old bio_split() (previously renamed to bio_pair_split()). It also has different semantics - it doesn't allocate a struct bio_pair, leaving it up to the caller to handle completions. Then convert the existing bio_pair_split() users to the new bio_split() - and also nvme, which was open coding bio splitting. (We have to take that BUG_ON() out of bio_integrity_trim() because this bio_split() needs to use it, and there's no reason it has to be used on bios marked as cloned; BIO_CLONED doesn't seem to have clearly documented semantics anyways.) Signed-off-by: NKent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: Keith Busch <keith.busch@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Neil Brown <neilb@suse.de>
-
由 Kent Overstreet 提交于
This is prep work for introducing a more general bio_split(). Signed-off-by: NKent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: NeilBrown <neilb@suse.de> Cc: Alasdair Kergon <agk@redhat.com> Cc: Lars Ellenberg <lars.ellenberg@linbit.com> Cc: Peter Osterlund <petero2@telia.com> Cc: Sage Weil <sage@inktank.com>
-
由 Kent Overstreet 提交于
This adds a generic mechanism for chaining bio completions. This is going to be used for a bio_split() replacement, and it turns out to be very useful in a fair amount of driver code - a fair number of drivers were implementing this in their own roundabout ways, often painfully. Note that this means it's no longer to call bio_endio() more than once on the same bio! This can cause problems for drivers that save/restore bi_end_io. Arguably they shouldn't be saving/restoring bi_end_io at all - in all but the simplest cases they'd be better off just cloning the bio, and immutable biovecs is making bio cloning cheaper. But for now, we add a bio_endio_nodec() for these cases. Signed-off-by: NKent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk>
-
由 Kent Overstreet 提交于
Now that drivers have been converted to the new bvec_iter primitives, there's no need to trim the bvec before we submit it; and we can't trim it once we start sharing bvecs. It used to be that passing a partially completed bio (i.e. one with nonzero bi_idx) to generic_make_request() was a dangerous thing - various drivers would choke on such things. But with immutable biovecs and our new bio splitting that shares the biovecs, submitting partially completed bios has to work (and should work, now that all the drivers have been completed to the new primitives) Signed-off-by: NKent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk>
-
由 Kent Overstreet 提交于
Now that drivers have been converted to the bvec_iter primitives, they shouldn't be modifying the biovec anymore and thus saving it is unnecessary - code that was previously making a backup of the bvec array can now just save bio->bi_iter. Signed-off-by: NKent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk>
-
由 Kent Overstreet 提交于
We need to convert the dm code to the new bvec_iter primitives which respect bi_bvec_done; they also allow us to drastically simplify dm's bio splitting code. Also, it's no longer necessary to save/restore the bvec array anymore - driver conversions for immutable bvecs are done, so drivers should never be modifying it. Also kill bio_sector_offset(), dm was the only user and it doesn't make much sense anymore. Signed-off-by: NKent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Alasdair Kergon <agk@redhat.com> Cc: dm-devel@redhat.com Reviewed-by: NMike Snitzer <snitzer@redhat.com>
-
由 Kent Overstreet 提交于
bio_clone() just got more expensive - however, most users of bio_clone() don't actually need to modify the biovec. If they aren't modifying the biovec, and they can guarantee that the original bio isn't freed before the clone (also true in most cases), we can just point the clone at the original bio's biovec. Signed-off-by: NKent Overstreet <kmo@daterainc.com>
-
由 Kent Overstreet 提交于
bio_clone() needs to produce a bio that's suitable for the caller to munge with the biovec. Part of the immutable biovec patch series is fixing stuff up so that submitting partially completed bios is safe and works: thus, we now need bio_clone() on a partially completed bio to produce a bio for which bi_idx and bi_bvec done are 0 - like they would be if the caller had just allocated a new bio. Signed-off-by: NKent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk>
-
由 Kent Overstreet 提交于
Now that we've got a mechanism for immutable biovecs - bi_iter.bi_bvec_done - we need to convert drivers to use primitives that respect it instead of using the bvec array directly. Signed-off-by: NKent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: NeilBrown <neilb@suse.de> Cc: Alasdair Kergon <agk@redhat.com> Cc: dm-devel@redhat.com
-
由 Kent Overstreet 提交于
When we start sharing biovecs, keeping bi_vcnt accurate for splits is going to be error prone - and unnecessary, if we refactor some code. So bio_segments() has to go - but most of the existing users just needed to know if the bio had multiple segments, which is easier - add a bio_multiple_segments() for them. (Two of the current uses of bio_segments() are going to go away in a couple patches, but the current implementation of bio_segments() is unsafe as soon as we start doing driver conversions for immutable biovecs - so implement a dumb version for bisectability, it'll go away in a couple patches) Signed-off-by: NKent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Neil Brown <neilb@suse.de> Cc: Nagalakshmi Nandigama <Nagalakshmi.Nandigama@lsi.com> Cc: Sreekanth Reddy <Sreekanth.Reddy@lsi.com> Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
-
由 Kent Overstreet 提交于
Our fancy new bvec iterator makes code like this much easier to write. Signed-off-by: NKent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk>
-
由 Kent Overstreet 提交于
This adds a mechanism by which we can advance a bio by an arbitrary number of bytes without modifying the biovec: bio->bi_iter.bi_bvec_done indicates the number of bytes completed in the current bvec. Various driver code still needs to be updated to not refer to the bvec directly before we can use this for interesting things, like efficient bio splitting. Signed-off-by: NKent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Lars Ellenberg <drbd-dev@lists.linbit.com> Cc: Paul Clements <Paul.Clements@steeleye.com> Cc: drbd-user@lists.linbit.com Cc: nbd-general@lists.sourceforge.net
-