Commits · 28e89fd914a22e8a64f05ae2f0048b06165f371b · openeuler / raspberrypi-kernel

08 Jun, 2018 1 commit

block: add bioset_init_from_src() helper · 28e89fd9

Authored by Jens Axboe on Jun 07, 2018

Add a helper that allows a caller to initialize a new bio_set,
using the settings from an existing bio_set.
Reported-by: Venkat R.B <vrbagal1@linux.vnet.ibm.com>
Tested-by: Venkat R.B <vrbagal1@linux.vnet.ibm.com>
Tested-by: Li Wang <liwang@redhat.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

28e89fd9

31 May, 2018 1 commit

block: Drop bioset_create() · dad08527

Authored by Kent Overstreet on May 20, 2018

All users have been converted to bioset_init(), kill off the
old API.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

dad08527

15 May, 2018 8 commits

block: Export bio check/set pages_dirty · 1900fcc4

Authored by Kent Overstreet on May 08, 2018

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

1900fcc4

block: Add warning for bi_next not NULL in bio_endio() · 0ba99ca4

Authored by Kent Overstreet on May 08, 2018

Recently found a bug where a driver left bi_next not NULL and then
called bio_endio(), and then the submitter of the bio used
bio_copy_data() which was treating src and dst as lists of bios.

Fixed that bug by splitting out bio_list_copy_data(), but in case other
things are depending on bi_next in weird ways, add a warning to help
avoid more bugs like that in the future.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

0ba99ca4

block: Add missing flush_dcache_page() call · 6e6e811d

Authored by Kent Overstreet on May 08, 2018

Since a bio can point to userspace pages (e.g. direct IO), this is
generally necessary.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

6e6e811d

block: Split out bio_list_copy_data() · 45db54d5

Authored by Kent Overstreet on May 08, 2018

Found a bug (with ASAN) where we were passing a bio to bio_copy_data()
with bi_next not NULL, when it should have been - a driver had left
bi_next set to something after calling bio_endio().

Since the normal case is only copying single bios, split out
bio_list_copy_data() to avoid more bugs like this in the future.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

45db54d5
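The difference between a single-bio copy and a list copy can be sketched in userspace C. This is only an illustrative model: `struct demo_bio`, `demo_copy_data()` and `demo_list_copy_data()` are hypothetical stand-ins, not the kernel's structures or functions. The single-bio variant never follows `next`, so a stale `next` pointer cannot pull in unrelated data.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified bio: one flat buffer plus a chain pointer. */
struct demo_bio {
    struct demo_bio *next;
    char *data;
    size_t len;
};

/* Copy a single bio's payload; deliberately ignores ->next. */
size_t demo_copy_data(struct demo_bio *dst, const struct demo_bio *src)
{
    assert(dst != NULL && src != NULL);
    size_t n = src->len < dst->len ? src->len : dst->len;
    memcpy(dst->data, src->data, n);
    return n;
}

/* Copy across two chains, advancing along ->next on either side. */
size_t demo_list_copy_data(struct demo_bio *dst, const struct demo_bio *src)
{
    size_t total = 0, doff = 0, soff = 0;

    while (src != NULL && dst != NULL) {
        size_t n = src->len - soff;
        if (dst->len - doff < n)
            n = dst->len - doff;
        memcpy(dst->data + doff, src->data + soff, n);
        total += n;
        soff += n;
        doff += n;
        if (soff == src->len) { /* source bio exhausted */
            src = src->next;
            soff = 0;
        }
        if (doff == dst->len) { /* destination bio full */
            dst = dst->next;
            doff = 0;
        }
    }
    return total;
}
```

With a two-element source chain, the single-bio copy transfers only the first element, while the list copy walks the whole chain.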

block: Add bio_copy_data_iter(), zero_fill_bio_iter() · 38a72dac

Authored by Kent Overstreet on May 08, 2018

Add versions that take bvec_iter args instead of using bio->bi_iter - to
be used by bcachefs.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

38a72dac

block: Use bioset_init() for fs_bio_set · f4f8154a

Authored by Kent Overstreet on May 08, 2018

Minor optimization - remove a pointer indirection when using fs_bio_set.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

f4f8154a

block: Add bioset_init()/bioset_exit() · 917a38c7

Authored by Kent Overstreet on May 08, 2018

Similarly to mempool_init()/mempool_exit(), take a pointer indirection
out of allocation/freeing by allowing biosets to be embedded in other
structs.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

917a38c7
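The pointer indirection that an init/exit pair removes can be illustrated with a small userspace sketch. Everything here is hypothetical (`struct demo_set`, `demo_set_init()`, `demo_set_exit()` and the two owner structs); it models only the embedding pattern, not the kernel's `struct bio_set`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a pool-like object; not the kernel's bio_set. */
struct demo_set {
    unsigned int pool_size;
    unsigned int front_pad;
};

/* Initialize caller-provided storage instead of allocating it. */
int demo_set_init(struct demo_set *s, unsigned int pool_size,
                  unsigned int front_pad)
{
    assert(s != NULL);
    s->pool_size = pool_size;
    s->front_pad = front_pad;
    return 0;
}

/* Tear down the contents; the storage itself belongs to the owner. */
void demo_set_exit(struct demo_set *s)
{
    assert(s != NULL);
    s->pool_size = 0;
    s->front_pad = 0;
}

/* Old pattern: the set is allocated separately and reached via a pointer. */
struct demo_dev_old { struct demo_set *set; };

/* New pattern: the set is embedded, so no extra allocation or dereference. */
struct demo_dev_new { struct demo_set set; };

int demo_dev_old_setup(struct demo_dev_old *d, unsigned int pool_size)
{
    d->set = malloc(sizeof(*d->set));
    if (d->set == NULL)
        return -1;
    return demo_set_init(d->set, pool_size, 0);
}

void demo_dev_old_teardown(struct demo_dev_old *d)
{
    demo_set_exit(d->set);
    free(d->set);
    d->set = NULL;
}

int demo_dev_new_setup(struct demo_dev_new *d, unsigned int pool_size)
{
    return demo_set_init(&d->set, pool_size, 0);
}

void demo_dev_new_teardown(struct demo_dev_new *d)
{
    demo_set_exit(&d->set);
}
```

In the embedded variant, setup has one fewer failure path (no allocation) and every access to the set skips one pointer load.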

block: Convert bio_set to mempool_init() · 8aa6ba2f

Authored by Kent Overstreet on May 08, 2018

Minor performance improvement by getting rid of pointer indirections
from allocation/freeing fastpaths.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

8aa6ba2f

22 Mar, 2018 1 commit

Fix slab name "biovec-(1<<(21-12))" · bd5c4fac

Authored by Mikulas Patocka on Mar 21, 2018

I'm getting a slab named "biovec-(1<<(21-12))". It is caused by unintended
expansion of the macro BIO_MAX_PAGES. This patch renames it to biovec-max.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org # v4.14+
Signed-off-by: Jens Axboe <axboe@kernel.dk>

bd5c4fac
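The odd slab name is a direct consequence of how the C preprocessor expands a macro argument before stringifying it. The sketch below reproduces the effect in userspace; `STRINGIFY`, `BIO_MAX_PAGES_DEMO`, `BV_OLD` and `BV_NEW` are hypothetical names chosen for illustration and are not the kernel's exact definitions.

```c
#include <assert.h>
#include <string.h>

/* Two-level stringification: the argument is macro-expanded first. */
#define STRINGIFY_1(x) #x
#define STRINGIFY(x) STRINGIFY_1(x)

/* Hypothetical stand-in for a size macro defined as an expression. */
#define BIO_MAX_PAGES_DEMO (1<<(21-12))

/* Old style: the name is derived from the (expanded) size argument. */
#define BV_OLD(x) "biovec-" STRINGIFY(x)

/* New style: the name suffix is passed separately and stringified as-is. */
#define BV_NEW(x, n) "biovec-" #n

const char *demo_old_name(void) { return BV_OLD(BIO_MAX_PAGES_DEMO); }
const char *demo_new_name(void) { return BV_NEW(BIO_MAX_PAGES_DEMO, max); }

/* A name is "clean" if no expression text leaked into it. */
int demo_name_is_clean(const char *name)
{
    assert(name != NULL);
    return strchr(name, '(') == NULL;
}
```

Passing the suffix as its own argument keeps the name stable even if the size macro later changes from a literal to an expression.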

24 Jan, 2018 1 commit

block: Set BIO_TRACE_COMPLETION on new bio during split · 20d59023

Authored by Goldwyn Rodrigues on Jan 23, 2018

We inadvertently set it again on the source bio, but we need
to set it on the new split bio instead.

Fixes: fbbaf700 ("block: trace completion of all bios.")
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

20d59023

07 Jan, 2018 1 commit

block: move bio_alloc_pages() to bcache · 25d8be77

Authored by Ming Lei on Dec 18, 2017

bcache is the only user of bio_alloc_pages(), so move this function into
bcache, and avoid it being misused in the future.

Also rename it to bch_bio_alloc_pages() since it is bcache only.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

25d8be77

21 Dec, 2017 1 commit

block-throttle: avoid double charge · 111be883

Authored by Shaohua Li on Dec 20, 2017

If a bio is throttled and split after throttling, the bio could be
resubmitted and enter the throttling again. This will cause part of the
bio to be charged multiple times. If the cgroup has an IO limit, the
double charge will significantly harm the performance. Bio splitting
has become quite common since the arbitrary bio size change.

To fix this, we always set the BIO_THROTTLED flag if a bio is throttled.
If the bio is cloned/split, we copy the flag to the new bio too, to avoid
a double charge. However, a cloned bio could be directed to a new disk,
where keeping the flag would be a problem. The observation is that we
always set a new disk for the bio in this case, so we can clear the flag
in bio_set_dev().

This issue has existed for a long time; the arbitrary bio size change
just makes it worse, so this should go into stable at least since v4.2.

V1 -> V2: Do not add an extra field in bio, based on discussion with Tejun

Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: stable@vger.kernel.org
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

111be883
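The flag-based scheme described in this commit message can be modelled in a few lines of userspace C. This is a sketch under stated assumptions: `struct demo_bio`, `struct demo_cgroup`, `DEMO_BIO_THROTTLED` and the three helpers are hypothetical, and clearing the flag only when the disk actually changes is an assumption of this model rather than something the message spells out.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_BIO_THROTTLED (1u << 0) /* hypothetical flag bit */

struct demo_bio {
    unsigned int flags;
    unsigned int sectors;
    int disk;
};

struct demo_cgroup {
    unsigned long charged; /* sectors accounted so far */
};

/* Charge a bio once; bios already marked as throttled are skipped. */
bool demo_throttle(struct demo_cgroup *cg, struct demo_bio *bio)
{
    if (bio->flags & DEMO_BIO_THROTTLED)
        return false;
    cg->charged += bio->sectors;
    bio->flags |= DEMO_BIO_THROTTLED;
    return true;
}

/* Split off the first `sectors` sectors; the new bio inherits the flag. */
struct demo_bio demo_split(struct demo_bio *bio, unsigned int sectors)
{
    assert(sectors <= bio->sectors);
    struct demo_bio split = *bio; /* copies flags, including THROTTLED */
    split.sectors = sectors;
    bio->sectors -= sectors;
    return split;
}

/* Retargeting to a different disk clears the flag (model assumption). */
void demo_set_dev(struct demo_bio *bio, int disk)
{
    if (bio->disk != disk)
        bio->flags &= ~DEMO_BIO_THROTTLED;
    bio->disk = disk;
}
```

In this model a split bio that is resubmitted is not charged again, while a bio redirected to another disk becomes chargeable by that disk's accounting.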

23 Nov, 2017 1 commit

block: remove useless assignment in bio_split · f341a4d3

Authored by Mikulas Patocka on Nov 22, 2017

Remove useless assignment to the variable "split" because the variable is
unconditionally assigned later.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

f341a4d3

17 Nov, 2017 1 commit

bio: ensure __bio_clone_fast copies bi_partno · 62530ed8

Authored by Michael Lyle on Nov 16, 2017

A new field was introduced in 74d46992, bi_partno, instead of using
bdev->bd_contains and encoding the partition information in the bi_bdev
field.  __bio_clone_fast was changed to copy the disk information, but
not the partition information.  At minimum, this regressed bcache and
caused data corruption.
Signed-off-by: Michael Lyle <mlyle@lyle.org>
Fixes: 74d46992 ("block: replace bi_bdev with a gendisk pointer and partitions index")
Reported-by: Pavel Goran <via-bcache@pvgoran.name>
Reported-by: Campbell Steven <casteven@gmail.com>
Reviewed-by: Coly Li <colyli@suse.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Cc: <stable@vger.kernel.org> # 4.14
Signed-off-by: Jens Axboe <axboe@kernel.dk>

62530ed8
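Why a missing partition index leads to corruption can be seen in a small model of partition remapping. The types and helpers below (`struct demo_disk`, `struct demo_bio`, `demo_clone_fast()`, `demo_remap()`) are hypothetical simplifications, not the kernel's definitions: a clone that kept the disk but lost the partition index would be remapped as if it addressed the whole disk.

```c
#include <assert.h>

#define DEMO_MAX_PARTS 4

/* Hypothetical disk model: part_start[0] is the whole disk (offset 0). */
struct demo_disk {
    unsigned long long part_start[DEMO_MAX_PARTS];
};

/* Hypothetical bio model with disk pointer and partition index split. */
struct demo_bio {
    const struct demo_disk *disk;
    unsigned char partno;
    unsigned long long sector; /* relative to the partition */
};

/* Clone the addressing fields; omitting partno would target partition 0. */
void demo_clone_fast(struct demo_bio *dst, const struct demo_bio *src)
{
    dst->disk = src->disk;
    dst->partno = src->partno; /* the copy that must not be forgotten */
    dst->sector = src->sector;
}

/* Translate a partition-relative sector into a whole-disk sector. */
unsigned long long demo_remap(const struct demo_bio *bio)
{
    assert(bio->partno < DEMO_MAX_PARTS);
    return bio->sector + bio->disk->part_start[bio->partno];
}
```

If `partno` were left at zero in the clone, `demo_remap()` would return the partition-relative sector unchanged and the I/O would land at the wrong place on the disk.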

26 Oct, 2017 1 commit

block, locking/lockdep: Assign a lock_class per gendisk used for wait_for_completion() · e319e1fb

Authored by Byungchul Park on Oct 25, 2017

Darrick posted the following warning and Dave Chinner analyzed it:

> ======================================================
> WARNING: possible circular locking dependency detected
> 4.14.0-rc1-fixes #1 Tainted: G        W
> ------------------------------------------------------
> loop0/31693 is trying to acquire lock:
>  (&(&ip->i_mmaplock)->mr_lock){++++}, at: [<ffffffffa00f1b0c>] xfs_ilock+0x23c/0x330 [xfs]
>
> but now in release context of a crosslock acquired at the following:
>  ((complete)&ret.event){+.+.}, at: [<ffffffff81326c1f>] submit_bio_wait+0x7f/0xb0
>
> which lock already depends on the new lock.
>
> the existing dependency chain (in reverse order) is:
>
> -> #2 ((complete)&ret.event){+.+.}:
>        lock_acquire+0xab/0x200
>        wait_for_completion_io+0x4e/0x1a0
>        submit_bio_wait+0x7f/0xb0
>        blkdev_issue_zeroout+0x71/0xa0
>        xfs_bmapi_convert_unwritten+0x11f/0x1d0 [xfs]
>        xfs_bmapi_write+0x374/0x11f0 [xfs]
>        xfs_iomap_write_direct+0x2ac/0x430 [xfs]
>        xfs_file_iomap_begin+0x20d/0xd50 [xfs]
>        iomap_apply+0x43/0xe0
>        dax_iomap_rw+0x89/0xf0
>        xfs_file_dax_write+0xcc/0x220 [xfs]
>        xfs_file_write_iter+0xf0/0x130 [xfs]
>        __vfs_write+0xd9/0x150
>        vfs_write+0xc8/0x1c0
>        SyS_write+0x45/0xa0
>        entry_SYSCALL_64_fastpath+0x1f/0xbe
>
> -> #1 (&xfs_nondir_ilock_class){++++}:
>        lock_acquire+0xab/0x200
>        down_write_nested+0x4a/0xb0
>        xfs_ilock+0x263/0x330 [xfs]
>        xfs_setattr_size+0x152/0x370 [xfs]
>        xfs_vn_setattr+0x6b/0x90 [xfs]
>        notify_change+0x27d/0x3f0
>        do_truncate+0x5b/0x90
>        path_openat+0x237/0xa90
>        do_filp_open+0x8a/0xf0
>        do_sys_open+0x11c/0x1f0
>        entry_SYSCALL_64_fastpath+0x1f/0xbe
>
> -> #0 (&(&ip->i_mmaplock)->mr_lock){++++}:
>        up_write+0x1c/0x40
>        xfs_iunlock+0x1d0/0x310 [xfs]
>        xfs_file_fallocate+0x8a/0x310 [xfs]
>        loop_queue_work+0xb7/0x8d0
>        kthread_worker_fn+0xb9/0x1f0
>
> Chain exists of:
>   &(&ip->i_mmaplock)->mr_lock --> &xfs_nondir_ilock_class --> (complete)&ret.event
>
>  Possible unsafe locking scenario by crosslock:
>
>        CPU0                    CPU1
>        ----                    ----
>   lock(&xfs_nondir_ilock_class);
>   lock((complete)&ret.event);
>                                lock(&(&ip->i_mmaplock)->mr_lock);
>                                unlock((complete)&ret.event);
>
>                *** DEADLOCK ***

The warning is a false positive, caused by the fact that all
wait_for_completion()s in submit_bio_wait() are waiting with the same
lock class.

However, some bios have nothing to do with others. For example, in the
case of loop devices, there's no direct connection between the bios of an
upper device and the bios of a lower device (the loop device).

The safest way to assign different lock classes to different devices is
to do it for each gendisk. In other words, this patch assigns a
lockdep_map per gendisk and uses it when initializing completion in
submit_bio_wait().
Analyzed-by: Dave Chinner <david@fromorbit.com>
Reported-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Byungchul Park <byungchul.park@lge.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: amir73il@gmail.com
Cc: axboe@kernel.dk
Cc: david@fromorbit.com
Cc: hch@infradead.org
Cc: idryomov@gmail.com
Cc: johan@kernel.org
Cc: johannes.berg@intel.com
Cc: kernel-team@lge.com
Cc: linux-block@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: linux-xfs@vger.kernel.org
Cc: oleg@redhat.com
Cc: tj@kernel.org
Link: http://lkml.kernel.org/r/1508921765-15396-10-git-send-email-byungchul.park@lge.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

e319e1fb

25 Oct, 2017 1 commit

block: Use DECLARE_COMPLETION_ONSTACK() in submit_bio_wait() · 65e53aab

Authored by Christoph Hellwig on Oct 25, 2017

Simplify the code by getting rid of the submit_bio_ret structure.

(This also helps address a lockdep false positive.)
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: amir73il@gmail.com
Cc: axboe@kernel.dk
Cc: darrick.wong@oracle.com
Cc: david@fromorbit.com
Cc: hch@infradead.org
Cc: idryomov@gmail.com
Cc: johan@kernel.org
Cc: johannes.berg@intel.com
Cc: kernel-team@lge.com
Cc: linux-block@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: linux-xfs@vger.kernel.org
Cc: oleg@redhat.com
Cc: tj@kernel.org
Link: http://lkml.kernel.org/r/1508921765-15396-2-git-send-email-byungchul.park@lge.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

65e53aab

17 Oct, 2017 1 commit

block: fix Sphinx kernel-doc warning · 519c8e9f

Authored by Randy Dunlap on Oct 16, 2017

Sphinx treats symbols that end with '_' as a kind of special
documentation indicator, so fix that by adding an ending '*'
to it.

../block/bio.c:404: ERROR: Unknown target name: "gfp".
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

519c8e9f

12 Oct, 2017 11 commits

bio_alloc_map_data(): do bmd->iter setup right there · 0e5b935d

Authored by Al Viro on Sep 24, 2017

just need to copy it iter instead of iter->nr_segs
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

0e5b935d

bio_copy_user_iov(): saner bio size calculation · d16d44eb

Authored by Al Viro on Sep 24, 2017

it's a bounce buffer; we don't *care* how badly the real
source/destination is fragmented, all that matters is the total
size.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

d16d44eb

bio_map_user_iov(): get rid of copying iov_iter · 0a0f1513

Authored by Al Viro on Sep 24, 2017

we do want *iter advanced
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

0a0f1513

bio_copy_from_iter(): get rid of copying iov_iter · 98a09d61

Authored by Al Viro on Sep 24, 2017

we want the one passed to it advanced, anyway
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

98a09d61

move more stuff down into bio_copy_user_iov() · 2884d0be

Authored by Al Viro on Sep 24, 2017

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

2884d0be

blk_rq_map_user_iov(): move iov_iter_advance() down · e81cef5d

Authored by Al Viro on Sep 24, 2017

... into bio_{map,copy}_user_iov()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

e81cef5d

bio_map_user_iov(): get rid of the iov_for_each() · b282cc76

Authored by Al Viro on Sep 23, 2017

Use iov_iter_npages()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

b282cc76

bio_map_user_iov(): move alignment check into the main loop · 98f0bc99

Authored by Al Viro on Sep 23, 2017

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

98f0bc99

don't rely upon subsequent bio_add_pc_page() calls failing · e2e115d1

Authored by Al Viro on Sep 23, 2017

... they might actually succeed in some cases (when we are at the
queue-imposed segments limit, the next page is not mergeable with
the last one we'd got in, but the first page covered by the next
iovec *is* mergeable).  Make sure that once it's failed, we are
done with that bio.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

e2e115d1

... and with iov_iter_get_pages_alloc() it becomes even simpler · 629e42bc

Authored by Al Viro on Sep 23, 2017

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

629e42bc

bio_map_user_iov(): switch to iov_iter_get_pages()/iov_iter_advance() · 076098e5

Authored by Al Viro on Sep 23, 2017

... and to hell with iov_for_each() nonsense
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

076098e5

11 Oct, 2017 3 commits

bio_copy_user_iov(): don't ignore ->iov_offset · 1cfd0ddd

Authored by Al Viro on Sep 24, 2017

Since "block: support large requests in blk_rq_map_user_iov" we
started to call it with partially drained iter; that works fine
on the write side, but reads create a copy of iter for completion
time.  And that needs to take the possibility of ->iov_offset != 0
into account...

Cc: stable@vger.kernel.org #v4.5+
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

1cfd0ddd

more bio_map_user_iov() leak fixes · 2b04e8f6

Authored by Al Viro on Sep 23, 2017

we need to take care of failure exit as well - pages already
in bio should be dropped by analogue of bio_unmap_pages(),
since their refcounts had been bumped only once per reference
in bio.

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

2b04e8f6

fix unbalanced page refcounting in bio_map_user_iov · 95d78c28

Authored by Vitaly Mayatskikh on Sep 22, 2017

bio_map_user_iov and bio_unmap_user do unbalanced page refcounting if
the IO vector has small consecutive buffers belonging to the same page.
bio_add_pc_page merges them into one, but the page reference is never
dropped.

Cc: stable@vger.kernel.org
Signed-off-by: Vitaly Mayatskikh <v.mayatskih@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

95d78c28
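The imbalance can be reproduced with a toy reference-counting model. The sketch below is an assumption-laden illustration, not the kernel code: `struct demo_page`, `struct demo_bio` and the helpers are hypothetical. Mapping takes one reference per user buffer, unmapping drops one per vector, so a buffer that is merged into an existing vector must give its reference back immediately.

```c
#include <assert.h>

#define DEMO_MAX_VECS 8

struct demo_page { int refcount; };

struct demo_vec {
    struct demo_page *page;
    unsigned int offset;
    unsigned int len;
};

struct demo_bio {
    struct demo_vec vecs[DEMO_MAX_VECS];
    unsigned int vcnt;
};

/* Append a buffer, merging it into the last vec when contiguous. */
int demo_add_page(struct demo_bio *bio, struct demo_page *page,
                  unsigned int offset, unsigned int len)
{
    if (bio->vcnt > 0) {
        struct demo_vec *last = &bio->vecs[bio->vcnt - 1];
        if (last->page == page && last->offset + last->len == offset) {
            last->len += len; /* merged: vcnt stays the same */
            return 0;
        }
    }
    if (bio->vcnt == DEMO_MAX_VECS)
        return -1;
    bio->vecs[bio->vcnt++] = (struct demo_vec){ page, offset, len };
    return 0;
}

/* Map one user buffer: pin the page, add it, unpin if it was merged. */
int demo_map_buffer(struct demo_bio *bio, struct demo_page *page,
                    unsigned int offset, unsigned int len)
{
    unsigned int prev_vcnt = bio->vcnt;

    page->refcount++; /* stands in for pinning the user page */
    if (demo_add_page(bio, page, offset, len) < 0) {
        page->refcount--;
        return -1;
    }
    if (bio->vcnt == prev_vcnt)
        page->refcount--; /* merged: the existing vec holds the reference */
    return 0;
}

/* Unmap drops exactly one reference per vec. */
void demo_unmap(struct demo_bio *bio)
{
    for (unsigned int i = 0; i < bio->vcnt; i++) {
        assert(bio->vecs[i].page->refcount > 0);
        bio->vecs[i].page->refcount--;
    }
    bio->vcnt = 0;
}
```

Without the `bio->vcnt == prev_vcnt` check, two adjacent buffers in one page would leave the page with an extra reference after `demo_unmap()`.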

07 Oct, 2017 1 commit

block/bio: Remove null checks before mempool_destroy in bioset_free · 4078def8

Authored by Tim Hansen on Oct 06, 2017

This patch removes redundant checks for null values on bio_pool and
bvec_pool.

Found using make coccicheck M=block/ on linux-net tree on the
next-20170929 tag.
Signed-off-by: Tim Hansen <devtimhansen@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

4078def8

26 Sep, 2017 1 commit

blkcg: delete unused APIs · af551fb3

Authored by Shaohua Li on Sep 14, 2017

Nobody uses the APIs right now.
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

af551fb3

26 Aug, 2017 1 commit

md/raid0: attach correct cgroup info in bio · 8a8e6f84

Authored by Shaohua Li on Aug 18, 2017

The discard bio doesn't carry the original bio's cgroup info. Normal
bios are cloned, so they are fine.
Signed-off-by: Shaohua Li <shli@fb.com>

8a8e6f84

24 Aug, 2017 1 commit

block: replace bi_bdev with a gendisk pointer and partitions index · 74d46992

Authored by Christoph Hellwig on Aug 23, 2017

This way we don't need a block_device structure to submit I/O.  The
block_device has different life time rules from the gendisk and
request_queue and is usually only available when the block device node
is open.  Other callers need to explicitly create one (e.g. the lightnvm
passthrough code, or the new nvme multipathing code).

For the actual I/O path all that we need is the gendisk, which exists
once per block device.  But given that the block layer also does
partition remapping we additionally need a partition index, which is
used for said remapping in generic_make_request.

Note that all the block drivers generally want request_queue or
sometimes the gendisk, so this removes a layer of indirection all
over the stack.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

74d46992

10 Aug, 2017 1 commit

block: pass in queue to inflight accounting · d62e26b3

Authored by Jens Axboe on Jun 30, 2017

No functional change in this patch, just in preparation for
basing the inflight mechanism on the queue in question.
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

d62e26b3

02 Aug, 2017 1 commit

block: Add comment to submit_bio_wait() · 3d289d68

Authored by Jan Kara on Aug 02, 2017

submit_bio_wait() does not consume the bio reference. Add a comment
about that.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

3d289d68

11 Jul, 2017 1 commit

block: call bio_uninit in bio_endio · b222dd2f

Authored by Shaohua Li on Jul 10, 2017

bio_free isn't a good place to free cgroup info. There are a lot of
cases where a bio is allocated in a special way (for example, on the
stack) and bio_put, and hence bio_free, is never called for it, so we
are leaking memory. This patch moves the free to bio_endio, which should
be called anyway. The bio_uninit call in bio_free is kept, in case
bio_endio is never called for the bio.

This assumes ->bi_end_io() doesn't access cgroup info, which seems true
in my audit.

This along with Christoph's integrity patch should fix the memory leak
issue.

Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

b222dd2f