- 15 May 2018, 3 commits
-
-
Committed by Kent Overstreet
Minor performance improvement by getting rid of pointer indirections from allocation/freeing fastpaths.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Kent Overstreet
Allows mempools to be embedded in other structs, getting rid of a pointer indirection from allocation fastpaths. mempool_exit() is safe to call on an uninitialized but zeroed mempool.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
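A minimal sketch of the embedded-mempool pattern this describes, assuming a made-up driver struct and slab cache (my_dev, my_rq); only mempool_init()/mempool_exit() and the slab alloc/free helpers are the real interface:

```c
/*
 * Illustrative only: my_dev/my_rq and the cache name are made up;
 * mempool_init()/mempool_exit() are the interface this commit adds.
 */
#include <linux/mempool.h>
#include <linux/slab.h>

struct my_rq { int data; };

struct my_dev {
	mempool_t rq_pool;		/* embedded: no extra pointer hop */
	struct kmem_cache *rq_cache;
};

static int my_dev_init(struct my_dev *dev)
{
	dev->rq_cache = kmem_cache_create("my_rq", sizeof(struct my_rq),
					  0, 0, NULL);
	if (!dev->rq_cache)
		return -ENOMEM;

	/* Initialize the embedded pool in place, no mempool_create() */
	return mempool_init(&dev->rq_pool, 16,
			    mempool_alloc_slab, mempool_free_slab,
			    dev->rq_cache);
}

static void my_dev_exit(struct my_dev *dev)
{
	/* Safe even if init never ran, as long as *dev was zeroed */
	mempool_exit(&dev->rq_pool);
	kmem_cache_destroy(dev->rq_cache);
}
```

Because mempool_exit() tolerates a zeroed, never-initialized pool, a teardown path like this does not need to track how far initialization got.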
-
Committed by Jens Axboe
If we have multiple callers of sbq_wake_up(), we can end up in a situation where the wait_cnt will continually go more and more negative. Consider the case where our wake batch is 1, hence wait_cnt will start out as 1.

wait_cnt == 1

CPU0                                CPU1
atomic_dec_return(), cnt == 0
                                    atomic_dec_return(), cnt == -1
                                    cmpxchg(-1, 0) (succeeds)
                                    [wait_cnt now 0]
cmpxchg(0, 1) (fails)

This ends up with wait_cnt being 0, so we'll wake up immediately next time. Going through the same loop as above again, we'll have wait_cnt -1. For the case where we have a larger wake batch, the only difference is that the starting point will be higher. We'll still end up with continually smaller batch wakeups, which defeats the purpose of the rolling wakeups.

Always reset the wait_cnt to the batch value. Then it doesn't matter who wins the race. But ensure that whoever does win the race is the one that increments the ws index and wakes up our batch count; the loser gets to call __sbq_wake_up() again to account its wakeups towards the next active wait state index.

Fixes: 6c0ca7ae ("sbitmap: fix wakeup hang after sbq resize")
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
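To make the "reset to the batch value" idea concrete, here is a hedged sketch of the logic; it is not the actual lib/sbitmap.c code, and the helper name and parameters are invented for illustration:

```c
/*
 * Not the real lib/sbitmap.c code; a sketch of why resetting the counter
 * to the full batch value makes the cmpxchg race harmless.
 */
#include <linux/atomic.h>
#include <linux/wait.h>

static void my_sbq_wake_up(atomic_t *wait_cnt, unsigned int wake_batch,
			   struct wait_queue_head *wq)
{
	int cnt = atomic_dec_return(wait_cnt);

	if (cnt > 0)
		return;

	/*
	 * Reset to the batch value rather than to cnt + wake_batch, so it no
	 * longer matters how far negative cnt raced.  Only the winner of the
	 * cmpxchg wakes the batch; in the real code the loser retries and
	 * accounts its wakeup towards the next wait state index.
	 */
	if (atomic_cmpxchg(wait_cnt, cnt, wake_batch) == cnt)
		wake_up_nr(wq, wake_batch);
}
```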
-
- 14 May 2018, 7 commits
-
-
Committed by Christoph Hellwig
Same numerical value (for now at least), but a much better documentation of intent.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Christoph Hellwig
We just can't do I/O when doing block layer request allocations, so use GFP_NOIO instead of the even more limited __GFP_DIRECT_RECLAIM.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Christoph Hellwig
blk_old_get_request already has it at hand, and in blk_queue_bio, which is the fast path, it is constant.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Christoph Hellwig
Switch everyone to blk_get_request_flags, and then rename blk_get_request_flags to blk_get_request.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Christoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Christoph Hellwig
This is always GFP_KERNEL, and keeping it would cause serious complications for the next change.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Christoph Hellwig
Fixes: 7c2d748e ("memstick: don't call blk_queue_bounce_limit")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 12 May 2018, 7 commits
-
-
Committed by Christoph Hellwig
The ps3disk driver already kmaps all pages when copying from/to the internal bounce buffer, so it can accept highmem pages just fine.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Christoph Hellwig
Just kmap the bio single page payload before processing it. (And yes, there is no highmem on sparc32 anyway, but kmap_atomic/kunmap_atomic are no-ops there, so this gives the right example.)

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Christoph Hellwig
Use kmap_atomic when copying out of a bio_vec.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
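A small sketch of the kmap_atomic pattern being referred to, with an invented helper name; bv_page/bv_offset/bv_len and the kmap_atomic/kunmap_atomic calls are the real interfaces:

```c
/*
 * Sketch: copy a bio_vec payload out via a temporary kernel mapping so the
 * code also works when bv_page is a highmem page.  The helper name is made up.
 */
#include <linux/bio.h>
#include <linux/highmem.h>
#include <linux/string.h>

static void copy_out_of_bvec(struct bio_vec *bvec, void *dst)
{
	void *src = kmap_atomic(bvec->bv_page);

	memcpy(dst, src + bvec->bv_offset, bvec->bv_len);
	kunmap_atomic(src);
}
```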
-
Committed by Christoph Hellwig
Just kmap the single payload page before passing it on to the FTL.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Christoph Hellwig
All in-tree host drivers set up a proper dma mask and use the dma-mapping helpers. This means they will be able to deal with any address we throw at them.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Christoph Hellwig
DAC960 just sets the block bounce limit to the dma mask, which means that the iommu or swiotlb already take care of the bounce buffering, and the block bouncing can be removed.

Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Christoph Hellwig
mtip32xx just sets the block bounce limit to the dma mask, which means that the iommu or swiotlb already take care of the bounce buffering, and the block bouncing can be removed.

Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
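Roughly, the pattern these two commits rely on looks like the sketch below; the probe function is hypothetical and the 32-bit mask is just an example:

```c
/*
 * Hypothetical probe: once the DMA mask is set, the IOMMU or swiotlb does
 * any bouncing, so no blk_queue_bounce_limit() call is needed.
 */
#include <linux/dma-mapping.h>
#include <linux/pci.h>

static int my_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
	/* Tell the DMA layer what the hardware can address ... */
	if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32)))
		return -ENODEV;

	/*
	 * ... and let the dma-mapping helpers deal with any page the block
	 * layer hands us, instead of bouncing in the block layer itself.
	 */
	return 0;
}
```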
-
- 11 May 2018, 9 commits
-
-
Committed by Omar Sandoval
Make sure the user passed the right value to sbitmap_queue_min_shallow_depth().

Acked-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Jens Axboe
We don't expect the async depth to be smaller than the wake batch count for sbitmap, but just in case, inform sbitmap of what shallow depth kyber may use.

Acked-by: Paolo Valente <paolo.valente@linaro.org>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Jens Axboe
If our shallow depth is smaller than the wake batching of sbitmap, we can introduce hangs. Ensure that sbitmap knows how low we'll go.

Acked-by: Paolo Valente <paolo.valente@linaro.org>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Omar Sandoval
The sbitmap queue wake batch is calculated such that once allocations start blocking, all of the bits which are already allocated must be enough to fulfill the batch counters of all of the waitqueues. However, the shallow allocation depth can break this invariant, since we block before our full depth is utilized. Add sbitmap_queue_min_shallow_depth(), which saves the minimum shallow depth the sbq will use, and update sbq_calc_wake_batch() to take it into account.

Acked-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
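A hedged sketch of how a scheduler would report its shallow depth with the new hook; the wrapper and its caller are invented, and only sbitmap_queue_min_shallow_depth() is the interface added here:

```c
/*
 * Invented wrapper; sbitmap_queue_min_shallow_depth() is the new interface.
 */
#include <linux/sbitmap.h>

static void my_sched_set_async_depth(struct sbitmap_queue *sbq,
				     unsigned int async_depth)
{
	/*
	 * Shallow (async) allocations never use more than async_depth bits,
	 * so the wake batch must be sized from that depth rather than from
	 * the full queue depth, or waiters can hang on bits that will never
	 * be allocated.
	 */
	sbitmap_queue_min_shallow_depth(sbq, async_depth);
}
```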
-
Committed by Jens Axboe
bfqd->sb_shift was intended as a cache for the sbitmap queue shift, but we don't need it, as it never changes. Kill it with fire.

Acked-by: Paolo Valente <paolo.valente@linaro.org>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Jens Axboe
It doesn't change, so don't put it in the per-IO hot path.

Acked-by: Paolo Valente <paolo.valente@linaro.org>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Jens Axboe
Reserved tags are used for error handling, so we don't need to care about them for regular IO. The core won't call us for these anyway.

Acked-by: Paolo Valente <paolo.valente@linaro.org>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Jens Axboe
It's not useful; they are internal and/or error-handling recovery commands.

Acked-by: Paolo Valente <paolo.valente@linaro.org>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Paolo Valente
When invoked for an I/O request rq, the prepare_request hook of bfq increments reference counters in the destination bfq_queue for rq. In this respect, after this hook has been invoked, rq may still be transformed into a request with no icq attached, i.e., for bfq, a request not associated with any bfq_queue. No further hook is invoked to signal this transformation to bfq (in general, to the destination elevator for rq). This leads bfq into an inconsistent state, because bfq has no chance to correctly lower these counters back. This inconsistency may in turn cause incorrect scheduling and hangs. It certainly causes memory leaks, by making it impossible for bfq to free the involved bfq_queue.

On the bright side, no such transformation can happen any longer once rq has been inserted into bfq, or merged with another, already inserted, request. Exploiting this fact, this commit addresses the above issue by delaying the preparation of an I/O request to when the request is inserted or merged.

This change also gives a performance bonus: a lock-contention point gets removed. To prepare a request, bfq needs to hold its scheduler lock. After postponing request preparation to insertion or merging, no lock needs to be grabbed any longer in the prepare_request hook, while the lock already taken to perform insertion or merging is used to prepare the request as well.

Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 10 May 2018, 1 commit
-
-
Committed by Christophe JAILLET
Branch to the right label in the error handling path in order to keep it logical.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 09 May 2018, 12 commits
-
-
Committed by SeongJae Park
This commit sets QUEUE_FLAG_NONROT and clears QUEUE_FLAG_ADD_RANDOM to mark the ramdisks as non-rotational devices.

Signed-off-by: SeongJae Park <sj38.park@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
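In code, the change described amounts to something like the following sketch; the setup function is hypothetical, while the queue-flag helpers and flags are the block layer's own:

```c
/*
 * Hypothetical setup helper; the flag helpers and flags are the real ones.
 */
#include <linux/blkdev.h>

static void my_ramdisk_setup_queue(struct request_queue *q)
{
	/* A ramdisk has no seek penalty ... */
	blk_queue_flag_set(QUEUE_FLAG_NONROT, q);
	/* ... and its completions contribute no useful entropy */
	blk_queue_flag_clear(QUEUE_FLAG_ADD_RANDOM, q);
}
```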
-
Committed by Omar Sandoval
Currently, struct request has four timestamp fields:

- A start time, set at get_request time, in jiffies, used for iostats
- An I/O start time, set at start_request time, in ktime nanoseconds, used for blk-stats (i.e., wbt, kyber, hybrid polling)
- Another start time and another I/O start time, used for cfq and bfq

These can all be consolidated into one start time and one I/O start time, both in ktime nanoseconds, shaving off up to 16 bytes from struct request depending on the kernel config.

Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
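A hedged sketch of what the consolidated layout boils down to, as described above; the struct and field names are illustrative rather than the exact struct request members:

```c
/*
 * Illustrative field names, not the exact struct request members.
 */
#include <linux/ktime.h>
#include <linux/types.h>

struct my_rq_times {
	u64 start_time_ns;	/* set when the request is allocated (iostats) */
	u64 io_start_time_ns;	/* set when the driver starts the I/O (blk-stats) */
};

static void my_rq_alloc_stamp(struct my_rq_times *t)
{
	t->start_time_ns = ktime_get_ns();
	t->io_start_time_ns = 0;
}

static u64 my_rq_device_time_ns(const struct my_rq_times *t)
{
	/* Time spent below the scheduler, as wbt/kyber-style stats want it */
	return t->io_start_time_ns ? ktime_get_ns() - t->io_start_time_ns : 0;
}
```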
-
Committed by Omar Sandoval
We want this next to blk_account_io_done() for the next change so that we can call ktime_get() only once for both.

Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Omar Sandoval
cfq and bfq have some internal fields that use sched_clock() which can trivially use ktime_get_ns() instead. Their timestamp fields in struct request can also use ktime_get_ns(), which resolves the 8-year-old comment added by commit 28f4197e ("block: disable preemption before using sched_clock()").

Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Omar Sandoval
struct blk_issue_stat squashes three things into one u64:

- The time the driver started working on a request
- The original size of the request (for the io.low controller)
- Flags for writeback throttling

It turns out that on x86_64, we have a 4 byte hole in struct request which we can fill with the non-timestamp fields from blk_issue_stat, simplifying things quite a bit.

Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
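For illustration only, packing a timestamp, a size, and a few flags into one u64 generally looks like the sketch below; the bit split here is made up and is not the kernel's actual blk_issue_stat layout:

```c
/*
 * The 48/12/4 bit split below is invented for the example and is not the
 * kernel's actual layout.
 */
#include <linux/types.h>

#define MY_TIME_BITS	48
#define MY_SIZE_BITS	12
#define MY_TIME_MASK	((1ULL << MY_TIME_BITS) - 1)
#define MY_SIZE_MASK	((1ULL << MY_SIZE_BITS) - 1)

static inline u64 my_issue_pack(u64 time_ns, u32 size, u8 flags)
{
	return (time_ns & MY_TIME_MASK) |
	       ((u64)(size & MY_SIZE_MASK) << MY_TIME_BITS) |
	       ((u64)(flags & 0xf) << (MY_TIME_BITS + MY_SIZE_BITS));
}

static inline u64 my_issue_time_ns(u64 packed)
{
	return packed & MY_TIME_MASK;
}
```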
-
Committed by Omar Sandoval
struct blk_issue_stat is going away, and bio->bi_issue_stat doesn't even use the blk-stats interface, so we can provide a separate implementation specific for bios. The helpers work the same way as the blk-stats helpers.

Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Omar Sandoval
issue_stat is going to go away, so first make writeback throttling take the containing request, update the internal wbt helpers accordingly, and change rwb->sync_cookie to be the request pointer instead of the issue_stat pointer. No functional change.

Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Omar Sandoval
A few helpers are only used from blk-wbt.c, so move them there, and put wbt_track() behind the CONFIG_BLK_WBT ifdef. This is in preparation for changing how the wbt flags are tracked.

Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Jens Axboe
Throttle discards like we would any background write. Discards should be background activity, so if they are impacting foreground IO, then we will throttle them down.

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Jens Axboe
This is in preparation for having more write queues, in which case we would need to pass in more information than just a simple 'is_kswapd' boolean.

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Jens Axboe
We currently special-case WRITE and FLUSH, but we should really just include any command with the write bit set. This ensures that we account for DISCARD.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Jens Axboe
Don't build discards bigger than what the user asked for, if the user decided to limit the size by writing to 'discard_max_bytes'.

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
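As a sketch, the capping rule amounts to something like this; the helper is invented, while max_discard_sectors is the queue limit that the discard_max_bytes sysfs attribute maps onto:

```c
/*
 * Invented helper; max_discard_sectors is the queue limit behind the
 * discard_max_bytes sysfs attribute.
 */
#include <linux/blkdev.h>
#include <linux/kernel.h>

static sector_t my_discard_chunk_sectors(struct request_queue *q,
					 sector_t nr_sects)
{
	/* Never build one discard bigger than the configured limit */
	return min_t(sector_t, nr_sects, q->limits.max_discard_sectors);
}
```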
-
- 08 May 2018, 1 commit
-
-
Committed by Tetsuo Handa
syzbot is hitting a WARN() triggered by memory allocation fault injection [1] because the loop module is calling sysfs_remove_group() when sysfs_create_group() failed. Fix this by remembering whether sysfs_create_group() succeeded.

[1] https://syzkaller.appspot.com/bug?id=3f86c0edf75c86d2633aeb9dd69eccc70bc7e90b

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: syzbot <syzbot+9f03168400f56df89dbc6f1751f4458fe739ff29@syzkaller.appspotmail.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Renamed sysfs_ready -> sysfs_inited.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
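A minimal sketch of the fix's shape, with invented struct and function names; only sysfs_create_group()/sysfs_remove_group() are the real calls:

```c
/*
 * Invented struct/function names; only the sysfs calls are real.
 */
#include <linux/device.h>
#include <linux/sysfs.h>

struct my_loop_dev {
	struct device *dev;
	bool sysfs_inited;	/* did sysfs_create_group() succeed? */
};

static void my_loop_sysfs_init(struct my_loop_dev *lo,
			       const struct attribute_group *grp)
{
	lo->sysfs_inited = !sysfs_create_group(&lo->dev->kobj, grp);
}

static void my_loop_sysfs_exit(struct my_loop_dev *lo,
			       const struct attribute_group *grp)
{
	/* Removing a group that was never added triggers a WARN() */
	if (lo->sysfs_inited)
		sysfs_remove_group(&lo->dev->kobj, grp);
}
```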
-