Commits · af67c31fba3b879b241536a48df703a2eee18ebf · openanolis / cloud-kernel

19 Jun, 2017 1 commit

blk: remove bio_set arg from blk_queue_split() · af67c31f

Committed by NeilBrown on Jun 18, 2017

blk_queue_split() is always called with the last arg being q->bio_split,
where 'q' is the first arg.

Also blk_queue_split() sometimes uses the passed-in 'bs' and sometimes uses
q->bio_split.

This is inconsistent and unnecessary.  Remove the last arg and always use
q->bio_split inside blk_queue_split()
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Credit-to: Javier González <jg@lightnvm.io> (Noticed that lightnvm was missed)
Reviewed-by: Javier González <javier@cnexlabs.com>
Tested-by: Javier González <javier@cnexlabs.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

09 Apr, 2017 1 commit

block: implement splitting of REQ_OP_WRITE_ZEROES bios · 885fa13f

Committed by Christoph Hellwig on Apr 05, 2017

Copy and paste the REQ_OP_WRITE_SAME code to prepare for implementations
that limit the write zeroes size.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

09 Feb, 2017 3 commits

block: optionally merge discontiguous discard bios into a single request · 1e739730

Committed by Christoph Hellwig on Feb 08, 2017

Add a new merge strategy that merges discard bios into a request until the
maximum number of discard ranges (or the maximum discard size) is reached
from the plug merging code.  I/O scheduler merging is not wired up yet
but might also be useful, although not for fast devices like NVMe which
are the only user for now.

Note that for now we don't support limiting the size of each discard range,
but if needed that can be added later.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

block: enumify ELEVATOR_*_MERGE · 34fe7c05

Committed by Christoph Hellwig on Feb 08, 2017

Switch these constants to an enum, and let the compiler ensure that
all callers of blk_try_merge and elv_merge handle all potential values.
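
A minimal sketch of the compiler check this relies on, assuming a simplified enum whose value names mirror the ELEVATOR_*_MERGE constants. `describe_merge` is a hypothetical helper, not kernel code. Because the switch has no `default:` label, GCC's and Clang's `-Wswitch` (part of `-Wall`) warns if an enumerator is left unhandled:

```c
#include <stddef.h>

/* Simplified stand-in for the merge-type constants (illustrative only). */
enum elv_merge {
	ELEVATOR_NO_MERGE,
	ELEVATOR_FRONT_MERGE,
	ELEVATOR_BACK_MERGE,
};

/*
 * Hypothetical helper. There is deliberately no default label, so
 * adding a new enumerator without a case here triggers -Wswitch.
 */
static const char *describe_merge(enum elv_merge type)
{
	switch (type) {
	case ELEVATOR_NO_MERGE:
		return "none";
	case ELEVATOR_FRONT_MERGE:
		return "front";
	case ELEVATOR_BACK_MERGE:
		return "back";
	}
	return NULL; /* only reached for values outside the enum */
}
```

With plain integer constants the same switch would compile silently when a case is missing, which is the gap the enum conversion closes.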
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

block: move req_set_nomerge to blk.h · 6cf7677f

Committed by Christoph Hellwig on Feb 08, 2017

This makes it available outside of blk-merge.c, and inlining such a trivial
helper seems pretty useful to start with.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

04 Feb, 2017 2 commits

block: free merged request in the caller · e4d750c9

Committed by Jens Axboe on Feb 03, 2017

If we end up doing a request-to-request merge when we have completed
a bio-to-request merge, we free the request from deep down in that
path. For blk-mq-sched, the merge path has to hold the appropriate
lock, but we don't need it for freeing the request. And in fact
holding the lock is problematic, since we are now calling the
mq sched put_rq_private() hook with the lock held. Other call paths
do not hold this lock.

Fix this inconsistency by ensuring that the caller frees a merged
request. Then we can do it outside of the lock, making it both more
efficient and fixing the blk-mq-sched problem of invoking parts of
the scheduler with an unknown lock state.
Reported-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>

blk-merge: return the merged request · b973cb7e

Committed by Jens Axboe on Feb 02, 2017

When we attempt to merge request-to-request, we return a 0/1 if we
ended up merging or not. Change that to return the pointer to the
request that we freed. We will use this to move the freeing of
that request out of the merge logic, so that callers can drop
locks before freeing the request.
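
The pattern can be sketched in user-space C as follows. This is an illustration under stated assumptions, not the kernel code: `struct toy_request`, `struct toy_queue`, `toy_attempt_merge` and `toy_merge_and_free` are hypothetical names, and the "lock" is just a flag:

```c
#include <stdlib.h>

/* Toy stand-ins; the field and type names are hypothetical. */
struct toy_request { int nr_sectors; };
struct toy_queue   { int locked; };

/*
 * Merge 'next' into 'req'. Instead of returning 0/1, return the
 * request that is now redundant (or NULL if nothing was merged),
 * leaving the decision of when to free it to the caller.
 */
static struct toy_request *toy_attempt_merge(struct toy_request *req,
					     struct toy_request *next)
{
	if (!req || !next)
		return NULL;
	req->nr_sectors += next->nr_sectors;
	return next;
}

/* Caller: merge under the lock, free only after the lock is dropped. */
static int toy_merge_and_free(struct toy_queue *q, struct toy_request *req,
			      struct toy_request *next)
{
	struct toy_request *to_free;

	q->locked = 1;                 /* take the (toy) lock */
	to_free = toy_attempt_merge(req, next);
	q->locked = 0;                 /* drop it before freeing */

	if (to_free)
		free(to_free);
	return to_free != NULL;
}
```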

There should be no functional changes in this patch.
Signed-off-by: Jens Axboe <axboe@fb.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>

18 Jan, 2017 2 commits

blk-mq-sched: add framework for MQ capable IO schedulers · bd166ef1

Committed by Jens Axboe on Jan 17, 2017

This adds a set of hooks that intercepts the blk-mq path of
allocating/inserting/issuing/completing requests, allowing
us to develop a scheduler within that framework.

We reuse the existing elevator scheduler API on the registration
side, but augment that with the scheduler flagging support for
the blk-mq interface, and with a separate set of ops hooks for MQ
devices.

We split driver and scheduler tags, so we can run the scheduling
independently of device queue depth.
Signed-off-by: Jens Axboe <axboe@fb.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>

block: move existing elevator ops to union · c51ca6cf

Committed by Jens Axboe on Dec 10, 2016

Prep patch for adding MQ ops as well, since doing anon unions with
named initializers doesn't work on older compilers.
Signed-off-by: Jens Axboe <axboe@fb.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>

09 Dec, 2016 1 commit

block: improve handling of the magic discard payload · f9d03f96

Committed by Christoph Hellwig on Dec 08, 2016

Instead of allocating a single unused biovec for discard requests, send
them down without any payload.  Instead we allow the driver to add a
"special" payload using a biovec embedded into struct request (unioned
over other fields never used while in the driver), and overloading
the number of segments for this case.

This has a couple of advantages:

 - we don't have to allocate the bio_vec
 - the amount of special casing for discard requests in the block
   layer is significantly reduced
 - using this same scheme for other request types is trivial,
   which will be important for implementing the new WRITE_ZEROES
   op on devices where it actually requires a payload (e.g. SCSI)
 - we can get rid of playing games with the request length, as
   we'll never touch it and completions will work just fine
 - it will allow us to support ranged discard operations in the
   future by merging non-contiguous discard bios into a single
   request
 - last but not least it removes a lot of code

This patch is the common base for my WIP series for ranged discards and to
remove discard_zeroes_data in favor of always using REQ_OP_WRITE_ZEROES,
so it would be good to get it in quickly.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

01 Dec, 2016 2 commits

block: factor out req_set_nomerge · e0c72300

Committed by Ritesh Harjani on Dec 01, 2016

Factor out the common code for setting the REQ_NOMERGE flag, which is
used in several places, into a helper, req_set_nomerge().
Signed-off-by: Ritesh Harjani <riteshh@codeaurora.org>

Get rid of the inline.
Signed-off-by: Jens Axboe <axboe@fb.com>

block: add support for REQ_OP_WRITE_ZEROES · a6f0788e

Committed by Chaitanya Kulkarni on Nov 30, 2016

This adds a new block layer operation to zero out a range of
LBAs. This allows implementing zeroing for devices that don't use
either discard with a predictable zero pattern or WRITE SAME of zeroes.
The prominent example of that is NVMe with the Write Zeroes command,
but in the future, this should also help with improving the way
zeroing discards work. For this operation, a suitable entry is exported in
sysfs which indicates the maximum number of bytes allowed in one
write zeroes operation by the device.
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@hgst.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

28 Oct, 2016 1 commit

block: split out request-only flags into a new namespace · e8064021

Committed by Christoph Hellwig on Oct 20, 2016

A lot of the REQ_* flags are only used on struct requests, and only of
use to the block layer and a few drivers that dig into struct request
internals.

This patch adds a new req_flags_t rq_flags field to struct request for
them, and thus dramatically shrinks the number of common requests.  It
also removes the unfortunate situation where we have to fit the fields
from the same enum into 32 bits for struct bio and 64 bits for
struct request.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

24 Aug, 2016 1 commit

block: make sure a big bio is split into at most 256 bvecs · 4d70dca4

Committed by Ming Lei on Aug 23, 2016

After arbitrary bio sizes were introduced, an incoming bio may
be very big. We have to split the bio into smaller bios so that
each holds at most BIO_MAX_PAGES bvecs for safety reasons, such
as in bio_clone().
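
A rough sketch of the cap described above, assuming a bio is modelled as a plain array of bvec lengths. `first_split_bvecs` is a hypothetical helper, and the value 256 for BIO_MAX_PAGES is taken from the commit title:

```c
#include <stddef.h>

#define BIO_MAX_PAGES 256 /* per the commit title; illustrative here */

/*
 * Hypothetical helper: given the lengths of a bio's bvecs, return how
 * many leading bvecs go into the first split (never more than
 * BIO_MAX_PAGES) and store their total size in *bytes.
 */
static size_t first_split_bvecs(const unsigned int *bv_len, size_t nr_bvecs,
				unsigned long *bytes)
{
	size_t i;

	*bytes = 0;
	for (i = 0; i < nr_bvecs && i < BIO_MAX_PAGES; i++)
		*bytes += bv_len[i];
	return i;
}
```

A 300-bvec bio would therefore be cut after its first 256 bvecs, with the remaining 44 going into a following bio.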

This patch fixes the following kernel crash:

> [  172.660142] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
> [  172.660229] IP: [<ffffffff811e53b4>] bio_trim+0xf/0x2a
> [  172.660289] PGD 7faf3e067 PUD 7f9279067 PMD 0
> [  172.660399] Oops: 0000 [#1] SMP
> [...]
> [  172.664780] Call Trace:
> [  172.664813]  [<ffffffffa007f3be>] ? raid1_make_request+0x2e8/0xad7 [raid1]
> [  172.664846]  [<ffffffff811f07da>] ? blk_queue_split+0x377/0x3d4
> [  172.664880]  [<ffffffffa005fb5f>] ? md_make_request+0xf6/0x1e9 [md_mod]
> [  172.664912]  [<ffffffff811eb860>] ? generic_make_request+0xb5/0x155
> [  172.664947]  [<ffffffffa0445c89>] ? prio_io+0x85/0x95 [bcache]
> [  172.664981]  [<ffffffffa0448252>] ? register_cache_set+0x355/0x8d0 [bcache]
> [  172.665016]  [<ffffffffa04497d3>] ? register_bcache+0x1006/0x1174 [bcache]

The issue can be reproduced by the following steps:
	- create one raid1 over two virtio-blk
	- build bcache device over the above raid1 and another cache device
	and bucket size is set as 2Mbytes
	- set cache mode as writeback
	- run random write over ext4 on the bcache device

Fixes: 54efd50b(block: make generic_make_request handle arbitrarily sized bios)
Reported-by: Sebastian Roesner <sroesner-kernelorg@roesner-online.de>
Reported-by: Eric Wheeler <bcache@lists.ewheeler.net>
Cc: stable@vger.kernel.org (4.3+)
Cc: Shaohua Li <shli@fb.com>
Acked-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

16 Aug, 2016 1 commit

block: Fix secure erase · 7afafc8a

Committed by Adrian Hunter on Aug 16, 2016

Commit 288dab8a ("block: add a separate operation type for secure
erase") split REQ_OP_SECURE_ERASE from REQ_OP_DISCARD without considering
all the places REQ_OP_DISCARD was being used to mean either. Fix those.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Fixes: 288dab8a ("block: add a separate operation type for secure erase")
Signed-off-by: Jens Axboe <axboe@fb.com>

08 Aug, 2016 1 commit

block: rename bio bi_rw to bi_opf · 1eff9d32

Committed by Jens Axboe on Aug 05, 2016

Since commit 63a4cc24, bio->bi_rw contains flags in the lower
portion and the op code in the higher portions. This means that
old code that relies on manually setting bi_rw is most likely
going to be broken. Instead of letting that brokenness linger,
rename the member, to force old and out-of-tree code to break
at compile time instead of at runtime.

No intended functional changes in this commit.
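
To make the layout concrete, here is a small sketch of packing an op code into the high bits and flags into the low bits of one word. The 3-bit op width, the `toy_` names and the helper signatures are assumptions for illustration and are not copied from the kernel headers:

```c
/* Assumed layout: op code in the top TOY_OP_BITS bits, flags below. */
#define TOY_OP_BITS  3
#define TOY_OP_SHIFT (8 * sizeof(unsigned int) - TOY_OP_BITS)

struct toy_bio { unsigned int bi_opf; };

/* Store op and flags together, in the spirit of bio_set_op_attrs(). */
static void toy_set_op_attrs(struct toy_bio *bio, unsigned int op,
			     unsigned int flags)
{
	bio->bi_opf = (op << TOY_OP_SHIFT) | flags;
}

/* Extract the op code from the high bits. */
static unsigned int toy_bio_op(const struct toy_bio *bio)
{
	return bio->bi_opf >> TOY_OP_SHIFT;
}

/* Extract the flags from the low bits. */
static unsigned int toy_bio_flags(const struct toy_bio *bio)
{
	return bio->bi_opf & ((1U << TOY_OP_SHIFT) - 1);
}
```

Under this layout, old-style code that assigns a small integer directly to the field ends up setting a flag bit and leaving the op code at 0, which is the kind of silent runtime breakage the rename turns into a compile error.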
Signed-off-by: Jens Axboe <axboe@fb.com>

21 Jul, 2016 2 commits

block: Fix front merge check · 17007f39

Committed by Damien Le Moal on Jul 20, 2016

For a front merge, the maximum number of sectors of the
request must be checked against the front merge BIO sector,
not the current sector of the request.
Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

block: do not merge requests without consulting with io scheduler · 72ef799b

Committed by Tahsin Erdogan on Jul 07, 2016

Before merging a bio into an existing request, io scheduler is called to
get its approval first. However, the requests that come from a plug
flush may get merged by block layer without consulting with io
scheduler.

In case of CFQ, this can cause fairness problems. For instance, if a
request gets merged into a low weight cgroup's request, the high weight cgroup
now depends on the low weight cgroup to get scheduled. If the high weight cgroup
needs that io request to complete before submitting more requests, then it
will also lose its timeslice.

The following script demonstrates the problem. Group g1 has a low weight, g2
and g3 have equal high weights but g2's requests are adjacent to g1's
requests so they are subject to merging. Due to these merges, g2 gets
poor disk time allocation.

cat > cfq-merge-repro.sh << "EOF"
#!/bin/bash
set -e

IO_ROOT=/mnt-cgroup/io

mkdir -p $IO_ROOT

if ! mount | grep -qw $IO_ROOT; then
  mount -t cgroup none -oblkio $IO_ROOT
fi

cd $IO_ROOT

for i in g1 g2 g3; do
  if [ -d $i ]; then
    rmdir $i
  fi
done

mkdir g1 && echo 10 > g1/blkio.weight
mkdir g2 && echo 495 > g2/blkio.weight
mkdir g3 && echo 495 > g3/blkio.weight

RUNTIME=10

(echo $BASHPID > g1/cgroup.procs &&
 fio --readonly --name name1 --filename /dev/sdb \
     --rw read --size 64k --bs 64k --time_based \
     --runtime=$RUNTIME --offset=0k &> /dev/null)&

(echo $BASHPID > g2/cgroup.procs &&
 fio --readonly --name name1 --filename /dev/sdb \
     --rw read --size 64k --bs 64k --time_based \
     --runtime=$RUNTIME --offset=64k &> /dev/null)&

(echo $BASHPID > g3/cgroup.procs &&
 fio --readonly --name name1 --filename /dev/sdb \
     --rw read --size 64k --bs 64k --time_based \
     --runtime=$RUNTIME --offset=256k &> /dev/null)&

sleep $((RUNTIME+1))

for i in g1 g2 g3; do
  echo ---- $i ----
  cat $i/blkio.time
done

EOF
# ./cfq-merge-repro.sh
---- g1 ----
8:16 162
---- g2 ----
8:16 165
---- g3 ----
8:16 686

After applying the patch:

# ./cfq-merge-repro.sh
---- g1 ----
8:16 90
---- g2 ----
8:16 445
---- g3 ----
8:16 471
Signed-off-by: Tahsin Erdogan <tahsin@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

09 Jun, 2016 1 commit

block: add a separate operation type for secure erase · 288dab8a

Committed by Christoph Hellwig on Jun 09, 2016

Instead of overloading the discard support with the REQ_SECURE flag.
Use the opportunity to rename the queue flag as well, and remove the
dead checks for this flag in the RAID 1 and RAID 10 drivers that don't
claim support for secure erase.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

08 Jun, 2016 3 commits

block: convert merge/insert code to check for REQ_OPs. · 8fe0d473

Committed by Mike Christie on Jun 05, 2016

This patch converts the block layer merging code to use separate variables
for the operation and flags, and to check req_op for the REQ_OP.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

block, fs, mm, drivers: use bio set/get op accessors · 95fe6c1a

Committed by Mike Christie on Jun 05, 2016

This patch converts the simple bi_rw use cases in the block,
drivers, mm and fs code to set/get the bio operation using
bio_set_op_attrs/bio_op

These should be simple one or two liner cases, so I just did them
in one patch. The next patches handle the more complicated
cases in a module per patch.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

block, drivers, cgroup: use op_is_write helper instead of checking for REQ_WRITE · a8ebb056

Committed by Mike Christie on Jun 05, 2016

We currently set REQ_WRITE/WRITE for all non READ IOs
like discard, flush, writesame, etc. In the next patches where we
no longer set up the op as a bitmap, we will not be able to
detect an operation's direction, such as for writesame, by testing
if REQ_WRITE is set.

This patch converts the drivers and cgroup to use the
op_is_write helper. This should just cover the simple
cases. I did dm, md and bcache in their own patches
because they were more involved.
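
A minimal sketch of such a helper, assuming (as the message describes) that every operation other than a read counts as a write. The `toy_` enum and function are hypothetical illustrations, not the kernel definitions:

```c
#include <stdbool.h>

/* Hypothetical op codes stored as plain values, not as a bitmap. */
enum toy_req_op {
	TOY_OP_READ,
	TOY_OP_WRITE,
	TOY_OP_DISCARD,
	TOY_OP_WRITE_SAME,
	TOY_OP_FLUSH,
};

/*
 * With op codes no longer carrying a REQ_WRITE bit, the data
 * direction has to come from a helper rather than a bit test.
 * Assumption: anything that is not a read is treated as a write.
 */
static bool toy_op_is_write(enum toy_req_op op)
{
	return op != TOY_OP_READ;
}
```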
Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

04 Mar, 2016 1 commit

block: merge: get the 1st and last bvec via helpers · e827091c

Committed by Ming Lei on Feb 26, 2016

This patch applies the two introduced helpers to
figure out the 1st and last bvec.
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

23 Jan, 2016 1 commit

block: fix bio splitting on max sectors · d0e5fbb0

Committed by Ming Lei on Jan 23, 2016

After commit e36f6204 (block: split bios to max possible length),
a bio can be split in the middle of a vector entry, so it
is easy to split out a bio whose size isn't aligned with the block
size, especially when the block size is bigger than 512.

This patch fixes the issue by making the max io size aligned
to logical block size.
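
The alignment step can be sketched as rounding a sector count down to a multiple of the logical block size. This is an illustration only: `align_max_sectors` is a hypothetical helper, sectors are assumed to be 512 bytes, and the logical block size is assumed to be a power of two of at least 512:

```c
/*
 * Hypothetical helper: round max_sectors (in 512-byte sectors) down
 * to a multiple of the logical block size, so a split never ends in
 * the middle of a logical block.
 */
static unsigned int align_max_sectors(unsigned int max_sectors,
				      unsigned int logical_block_size)
{
	/* Sectors per logical block, minus one, used as a bit mask. */
	unsigned int mask = (logical_block_size >> 9) - 1;

	return max_sectors & ~mask;
}
```

For example, with 4096-byte logical blocks (8 sectors each) a limit of 255 sectors is rounded down to 248, while with 512-byte blocks it is left unchanged.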

Fixes: e36f6204 (block: split bios to max possible length)
Reported-by: Stefan Haberland <sth@linux.vnet.ibm.com>
Cc: Keith Busch <keith.busch@intel.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

13 Jan, 2016 1 commit

block: split bios to max possible length · e36f6204

Committed by Keith Busch on Jan 12, 2016

This splits bio in the middle of a vector to form the largest possible
bio at the h/w's desired alignment, and guarantees the bio being split
will have some data.

The criteria for splitting is changed from the max sectors to the h/w's
optimal sector alignment if it is provided. For h/w that advertise their
block storage's underlying chunk size, it's a big performance win to not
submit commands that cross them. If sector alignment is not provided,
this patch uses the max sectors as before.
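
As a sketch of the criterion described above: the largest I/O allowed at a given starting sector is the distance to the next chunk boundary, capped by the max sectors limit. `max_sectors_at` is a hypothetical helper, and the chunk size is assumed to be a power of two (0 meaning the device advertises none):

```c
/*
 * Hypothetical helper: how many sectors may be issued starting at
 * 'offset' without crossing a chunk boundary. Falls back to
 * max_sectors when no chunk size is provided (chunk_sectors == 0).
 */
static unsigned int max_sectors_at(unsigned int max_sectors,
				   unsigned int chunk_sectors,
				   unsigned long long offset)
{
	unsigned int to_boundary;

	if (!chunk_sectors)
		return max_sectors;

	/* Sectors remaining until the next chunk boundary. */
	to_boundary = chunk_sectors -
		      (unsigned int)(offset & (chunk_sectors - 1));

	return to_boundary < max_sectors ? to_boundary : max_sectors;
}
```

With 256-sector chunks, an I/O starting at sector 300 sits 44 sectors into its chunk, so at most 212 sectors can be issued before the boundary.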

This addresses the performance issue that commit d3805611 attempted to
fix; that commit was reverted due to a splitting logic error.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Cc: Jens Axboe <axboe@fb.com>
Cc: Ming Lei <tom.leiming@gmail.com>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: <stable@vger.kernel.org> # 4.4.x-
Signed-off-by: Jens Axboe <axboe@fb.com>

09 Jan, 2016 1 commit

Revert "block: Split bios on chunk boundaries" · 6126eb24

Committed by Jens Axboe on Jan 08, 2016

This reverts commit d3805611.

If we end up splitting on the first segment, we don't adjust
the sector count. That results in hitting a BUG() with attempting
to split 0 sectors.

As this is just a performance issue and not a regression since
the 4.3 release, let's just revert this change. That gives us more
time to test a real fix for 4.5, which would be marked for
stable anyway.

23 Dec, 2015 1 commit

block: Split bios on chunk boundaries · d3805611

Committed by Keith Busch on Dec 22, 2015

For h/w that advertise their block storage's underlying chunk size, it's
a big performance win to not submit commands that cross them. This patch
uses that criteria if it is provided. If it is not provided, this patch
uses the max sectors as before.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

04 Dec, 2015 1 commit

block: add call to split trace point · cda22646

Committed by Mike Krinkin on Dec 03, 2015

There is a split tracepoint that is supposed to be called when
a bio is split, and it was called in the bio_split function until
commit 4b1faf93 ("block: Kill bio_pair_split()").
But now no one reports splits, so this patch adds the call to
trace_block_split back in blk_queue_split, right after the split.
Signed-off-by: Mike Krinkin <krinkin.m.u@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

01 Dec, 2015 1 commit

blk-merge: fix computing bio->bi_seg_front_size in case of single segment · a88d32af

Committed by Ming Lei on Nov 30, 2015

When bio has only one physical segment, we should set bio's
bi_seg_front_size as the real(final) size of the single segment.

Fixes: 02e70742(blk-merge: fix blk_bio_segment_split)
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Tested-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

24 Nov, 2015 3 commits

blk-merge: warn if figured out segment number is bigger than nr_phys_segments · 12e57f59

Committed by Ming Lei on Nov 24, 2015

We have seen many reports of this kind of issue, so add a
warning in blk-merge so that it is triggered easily, without
depending on a warning/bug from drivers.
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

blk-merge: fix blk_bio_segment_split · 02e70742

Committed by Ming Lei on Nov 24, 2015

Commit bdced438 (block: setup bi_phys_segments after
splitting) introduced computing bio->bi_phys_segments
during bio splitting.

Unfortunately neither bio->bi_seg_front_size nor bio->bi_seg_back_size
is computed, so too many physical segments may be obtained
for one request, since both are used to check whether one segment
can span two bios.

This patch fixes the issue by computing the two variables in
blk_bio_segment_split().

Fixes: bdced438(block: setup bi_phys_segments after splitting)
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Reported-by: Mark Salter <msalter@redhat.com>
Tested-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Tested-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

block: fix segment split · 578270bf

Committed by Ming Lei on Nov 24, 2015

Inside blk_bio_segment_split(), previous bvec pointer(bvprvp)
always points to the iterator local variable, which is obviously
wrong, so fix it by pointing to the local variable of 'bvprv'.

Fixes: 5014c311(block: fix bogus compiler warnings in blk-merge.c)
Cc: stable@kernel.org #4.3
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Reported-by: Mark Salter <msalter@redhat.com>
Tested-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Tested-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

22 Oct, 2015 2 commits

block: avoid to merge splitted bio · 6ac45aeb

Committed by Ming Lei on Oct 20, 2015

A split bio is already too large to merge, so mark it
as NOMERGE.
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

block: setup bi_phys_segments after splitting · bdced438

Committed by Ming Lei on Oct 20, 2015

The number of bio->bi_phys_segments is always obtained
during bio splitting, so it is natural to set it up
right after bio splitting; then we can avoid computing
nr_segment again during merge.
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

17 Sep, 2015 1 commit

block: blk-merge: fast-clone bio when splitting rw bios · 52cc6eea

Committed by Ming Lei on Sep 17, 2015

biovecs have been immutable since v3.13, so it isn't necessary
to allocate biovecs for the newly cloned bios. This saves
one extra biovec allocation/copy, and that allocation is often
not fixed-length and a bit more expensive.

For example, if the 'max_sectors_kb' of null_blk's queue is set
to 16 (32 sectors) via sysfs just to force more splits, this patch
can increase throughput by about 70% in the sequential read test over
null_blk (direct io, bs: 1M).

Cc: Christoph Hellwig <hch@infradead.org>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: Ming Lin <ming.l@ssi.samsung.com>
Cc: Dongsu Park <dpark@posteo.net>
Signed-off-by: Ming Lei <ming.lei@canonical.com>

This fixes a performance regression introduced by commit 54efd50b,
and allows us to take full advantage of the fact that we have immutable
bio_vecs. Hand applied, as it rejected violently with commit
5014c311.
Signed-off-by: Jens Axboe <axboe@fb.com>

11 Sep, 2015 1 commit

block: Refuse request/bio merges with gaps in the integrity payload · 7f39add3

Committed by Sagi Grimberg on Sep 11, 2015

If a driver sets the block queue virtual boundary mask, it means that
it cannot handle gaps so we must not allow those in the integrity
payload as well.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>

Fixed up by me to have duplicate integrity merge functions, depending
on whether block integrity is enabled or not. Fixes a compilation
issue with CONFIG_BLK_DEV_INTEGRITY unset.
Signed-off-by: Jens Axboe <axboe@fb.com>

04 Sep, 2015 1 commit

block: Check for gaps on front and back merges · 5e7c4274

Committed by Jens Axboe on Sep 03, 2015

We are checking for gaps to the previous bio_vec, which can
only detect back merge gaps. Moreover, at the point where
we check for a gap, we don't know if we will attempt a back
or a front merge. Thus, check for a gap to prev in a back merge
attempt and check for a gap to next in a front merge attempt.
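
A sketch of the idea, assuming a gap is defined relative to a virtual boundary mask (for example 4095 for a 4 KiB boundary). The `toy_bvec` type and `gap_between` helper are hypothetical, and which pair of vectors is passed depends on the merge direction:

```c
#include <stdbool.h>

/* Toy stand-in for a bio_vec: offset within a page and length. */
struct toy_bvec { unsigned int bv_offset; unsigned int bv_len; };

/*
 * Hypothetical check: two adjacent vectors leave a gap if 'next'
 * does not start on the boundary or 'prev' does not end on it.
 * A mask of 0 means the device has no such constraint.
 *
 * Back merge:  prev = last vector of the request, next = first of the bio.
 * Front merge: prev = last vector of the bio, next = first of the request.
 */
static bool gap_between(unsigned long virt_boundary_mask,
			const struct toy_bvec *prev,
			const struct toy_bvec *next)
{
	if (!virt_boundary_mask)
		return false;

	return (next->bv_offset & virt_boundary_mask) ||
	       ((prev->bv_offset + prev->bv_len) & virt_boundary_mask);
}
```

The same predicate serves both directions; only the roles of the request's and the bio's vectors are swapped, which is why the direction has to be known at the point of the check.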
Signed-off-by: Jens Axboe <axboe@fb.com>
[sagig: Minor rename change]
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>

03 Sep, 2015 1 commit

block: fix bogus compiler warnings in blk-merge.c · 5014c311

Committed by Jens Axboe on Sep 02, 2015

The compiler can't figure out that bvprv is initialized whenever 'prev'
is set to 1 as well. Use a pointer to bvprv instead, setting it to NULL
initially, and get rid of the 'prev' tracking. This dumbs it down
enough that gcc is happy.
Signed-off-by: Jens Axboe <axboe@fb.com>

02 Sep, 2015 1 commit

blk: Fix bio_io_vec index when checking bvec gaps · 2ca495ac

Committed by Keith Busch on Sep 01, 2015

Corrects a coding error from earlier patch.

Reported by: Sagi Grimberg <sagig@dev.mellanox.co.il>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Fixes: 03100aad ("block: Replace SG_GAPS with new queue limits mask")
Signed-off-by: Jens Axboe <axboe@fb.com>

20 Aug, 2015 1 commit

block: Replace SG_GAPS with new queue limits mask · 03100aad

Committed by Keith Busch on Aug 19, 2015

The SG_GAPS queue flag caused checks for bio vector alignment against
PAGE_SIZE, but the device may have different constraints. This patch
adds a queue limit that a driver with such constraints can set, to allow
requests that would otherwise have been unnecessarily split. The new gaps
check takes the request_queue as a parameter to simplify the logic around
invoking this function.

This new limit makes the queue flag redundant, so remove it and
all of its uses. Device-mappers will inherit the correct settings through
blk_stack_limits().
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
