提交 · 82ed4db499b8598f16f8871261bff088d6b0597f · openeuler / Kernel

28 1月, 2017 22 次提交

block: split scsi_request out of struct request · 82ed4db4

由 Christoph Hellwig 提交于 1月 27, 2017

And require all drivers that want to support BLOCK_PC to allocate it
as the first thing of their private data.  To support this the legacy
IDE and BSG code is switched to set cmd_size on their queues to let
the block layer allocate the additional space.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@fb.com>

82ed4db4

block/bsg: move queue creation into bsg_setup_queue · 8ae94eb6

由 Christoph Hellwig 提交于 1月 03, 2017

Simply the boilerplate code needed for bsg nodes a bit.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: NJens Axboe <axboe@fb.com>

8ae94eb6

scsi: allocate scsi_cmnd structures as part of struct request · e9c787e6

由 Christoph Hellwig 提交于 1月 02, 2017

Rely on the new block layer functionality to allocate additional driver
specific data behind struct request instead of implementing it in SCSI
itѕelf.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Acked-by: NMartin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

e9c787e6

scsi: remove __scsi_alloc_queue · d48777a6

由 Christoph Hellwig 提交于 1月 02, 2017

Instead do an internal export of __scsi_init_queue for the transport
classes that export BSG nodes.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Acked-by: NMartin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: NJens Axboe <axboe@fb.com>

d48777a6

scsi: remove scsi_cmd_dma_pool · eeff68c5

由 Christoph Hellwig 提交于 1月 02, 2017

There is no need for GFP_DMA allocations of the scsi_cmnd structures
themselves, all that might be DMAed to or from is the actual payload,
or the sense buffers.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Acked-by: NMartin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

eeff68c5

scsi: respect unchecked_isa_dma for blk-mq · 0a6ac4ee

由 Christoph Hellwig 提交于 1月 03, 2017

Currently blk-mq always allocates the sense buffer using normal GFP_KERNEL
allocation. Refactor the cmd pool code to split the cmd and sense allocation
and share the code to allocate the sense buffers as well as the sense buffer
slab caches between the legacy and blk-mq path.

Note that this switches to lazy allocation of the sense slab caches - the
slab caches (not the actual allocations) won't be destroy until the scsi
module is unloaded instead of keeping track of hosts using them.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Acked-by: NMartin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

0a6ac4ee

scsi: remove gfp_flags member in scsi_host_cmd_pool · 0fbc3e0f

由 Christoph Hellwig 提交于 1月 02, 2017

When using the slab allocator we already decide at cache creation time if
an allocation comes from a GFP_DMA pool using the SLAB_CACHE_DMA flag,
and there is no point passing the kmalloc-family only GFP_DMA flag to
kmem_cache_alloc.  Drop all the infrastructure for doing so.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Acked-by: NMartin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

0fbc3e0f

scsi_dh_hp_sw: switch to scsi_execute_req_flags() · 80e1836c

由 Hannes Reinecke 提交于 11月 03, 2016

Switch to scsi_execute_req_flags() instead of using the block interface
directly.  This will set REQ_QUIET and REQ_PREEMPT, but this is okay as
we're evaluating the errors anyway and should be able to send the command
even if the device is quiesced.
Signed-off-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Acked-by: NMartin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

80e1836c

scsi_dh_emc: switch to scsi_execute_req_flags() · b78205c9

由 Hannes Reinecke 提交于 11月 03, 2016

Switch to scsi_execute_req_flags() and scsi_get_vpd_page() instead of
open-coding it.  Using scsi_execute_req_flags() will set REQ_QUIET and
REQ_PREEMPT, but this is okay as we're evaluating the errors anyway and
should be able to send the command even if the device is quiesced.
Signed-off-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Acked-by: NMartin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

b78205c9

scsi_dh_rdac: switch to scsi_execute_req_flags() · 32782557

由 Hannes Reinecke 提交于 11月 03, 2016

Switch to scsi_execute_req_flags() and scsi_get_vpd_page() instead of
open-coding it.  Using scsi_execute_req_flags() will set REQ_QUIET and
REQ_PREEMPT, but this is okay as we're evaluating the errors anyway and
should be able to send the command even if the device is quiesced.
Signed-off-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Acked-by: NMartin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

32782557

dm: always defer request allocation to the owner of the request_queue · eb8db831

由 Christoph Hellwig 提交于 1月 22, 2017

DM already calls blk_mq_alloc_request on the request_queue of the
underlying device if it is a blk-mq device.  But now that we allow drivers
to allocate additional data and initialize it ahead of time we need to do
the same for all drivers.   Doing so and using the new cmd_size
infrastructure in the block layer greatly simplifies the dm-rq and mpath
code, and should also make arbitrary combinations of SQ and MQ devices
with SQ or MQ device mapper tables easily possible as a further step.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Reviewed-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

eb8db831

dm: remove incomplete BLOCK_PC support · 4bf58435

由 Christoph Hellwig 提交于 1月 10, 2017

DM tries to copy a few fields around for BLOCK_PC requests, but given
that no dm-target ever wires up scsi_cmd_ioctl BLOCK_PC can't actually
be sent to dm.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

4bf58435

block: cleanup tracing · 48b77ad6

由 Christoph Hellwig 提交于 1月 27, 2017

A couple tweaks to the tracing code:

 - trace the request size for all requests
 - trace request sector and nr_sectors only for fs requests, enforced by
   helpers
 - drop SCSI CDB tracing - we have SCSI tracing for this and are going
   to me the CDB out of the generic struct request soon.

With this the tracing code stops to know about BLOCK_PC requests entirely,
it's just FS vs passthrough requests now, where the latter includes any
driver-private requests.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@fb.com>

48b77ad6

block: allow specifying size for extra command data · 6d247d7f

由 Christoph Hellwig 提交于 1月 27, 2017

This mirrors the blk-mq capabilities to allocate extra drivers-specific
data behind struct request by setting a cmd_size field, as well as having
a constructor / destructor for it.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

6d247d7f

block: simplify blk_init_allocated_queue · 5ea708d1

由 Christoph Hellwig 提交于 1月 03, 2017

Return an errno value instead of the passed in queue so that the callers
don't have to keep track of two queues, and move the assignment of the
request_fn and lock to the caller as passing them as argument doesn't
simplify anything.  While we're at it also remove two pointless NULL
assignments, given that the request structure is zeroed on allocation.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NBart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

5ea708d1

block: fix elevator init check · e6f7f93d

由 Christoph Hellwig 提交于 1月 25, 2017

We can't initalize the elevator fields for flushes as flush share space
in struct request with the elevator data.  But currently we can't
communicate that a request is a flush through blk_get_request as we
can only pass READ or WRITE, and the low-level code looks at the
possible NULL bio to check for a flush.

Fix this by allowing to pass any block op and flags, and by checking for
the flush flags in __get_request.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NBart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

e6f7f93d

md: cleanup bio op / flags handling in raid1_write_request · 309bd96a

由 Christoph Hellwig 提交于 1月 25, 2017

No need for the local variables, the bio is still live and we can just
assign the bits we want directly.  Make me wonder why we can't assign
all the bio flags to start with.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NBart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

309bd96a

J
Merge branch 'for-4.11/block' into for-4.11/rq-refactor · f924ba70
由 Jens Axboe 提交于 1月 27, 2017
```
Signed-off-by: NJens Axboe <axboe@fb.com>
```
f924ba70

blk-mq: fix debugfs compilation issues · 400f73b2

由 Omar Sandoval 提交于 1月 27, 2017

This fixes a couple of problems:

1. In the !CONFIG_DEBUG_FS case, the stub definitions were bogus.
2. In the !CONFIG_BLOCK case, blk-mq-debugfs.c shouldn't be compiled at
   all.

Fix the stub definitions and add a CONFIG_BLK_DEBUG_FS Kconfig option.

Fixes: 07e4fead ("blk-mq: create debugfs directory tree")
Signed-off-by: NOmar Sandoval <osandov@fb.com>

Augment Kconfig description.
Signed-off-by: NJens Axboe <axboe@fb.com>

400f73b2

J
block: cleanup remaining manual checks for PREFLUSH|FUA · f3a8ab7d
由 Jens Axboe 提交于 1月 27, 2017
```
Use op_is_flush() where applicable.
Signed-off-by: NJens Axboe <axboe@fb.com>
```
f3a8ab7d

blk-mq-sched: add flush insertion into blk_mq_sched_insert_request() · bd6737f1

由 Jens Axboe 提交于 1月 27, 2017

Instead of letting the caller check this and handle the details
of inserting a flush request, put the logic in the scheduler
insertion function. This fixes direct flush insertion outside
of the usual make_request_fn calls, like from dm via
blk_insert_cloned_request().
Signed-off-by: NJens Axboe <axboe@fb.com>

bd6737f1

block: add a op_is_flush helper · f73f44eb

由 Christoph Hellwig 提交于 1月 27, 2017

This centralizes the checks for bios that needs to be go into the flush
state machine.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: NBart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

f73f44eb

27 1月, 2017 17 次提交

blk-mq-sched: change ->dispatch_requests() to ->dispatch_request() · c13660a0

由 Jens Axboe 提交于 1月 26, 2017

When we invoke dispatch_requests(), the scheduler empties everything
into the passed in list. This isn't always a good thing, since it
means that we remove items that we could have potentially merged
with.

Change the function to dispatch single requests at the time. If
we do that, we can backoff exactly at the point where the device
can't consume more IO, and leave the rest with the scheduler for
better merging and future dispatch decision making.
Signed-off-by: NJens Axboe <axboe@fb.com>
Reviewed-by: NOmar Sandoval <osandov@fb.com>
Tested-by: NHannes Reinecke <hare@suse.com>

c13660a0

blk-mq-sched: fix starvation for multiple hardware queues and shared tags · 50e1dab8

由 Jens Axboe 提交于 1月 26, 2017

If we have both multiple hardware queues and shared tag map between
devices, we need to ensure that we propagate the hardware queue
restart bit higher up. This is because we can get into a situation
where we don't have any IO pending on a hardware queue, yet we fail
getting a tag to start new IO. If that happens, it's not enough to
mark the hardware queue as needing a restart, we need to bubble
that up to the higher level queue as well.
Signed-off-by: NJens Axboe <axboe@fb.com>
Reviewed-by: NOmar Sandoval <osandov@fb.com>
Tested-by: NHannes Reinecke <hare@suse.com>

50e1dab8

blk-mq: release driver tag on a requeue event · 99cf1dc5

由 Jens Axboe 提交于 1月 26, 2017

We don't want to hold on to this resource when we have a scheduler
attached.
Signed-off-by: NJens Axboe <axboe@fb.com>
Reviewed-by: NOmar Sandoval <osandov@fb.com>
Tested-by: NHannes Reinecke <hare@suse.com>

99cf1dc5

blk-mq: fix potential race in queue restart and driver tag allocation · 3c782d67

由 Jens Axboe 提交于 1月 26, 2017

Once we mark the queue as needing a restart, re-check if we can
get a driver tag. This fixes a theoretical issue where the needed
IO completes _after_ blk_mq_get_driver_tag() fails, but before we
manage to set the restart bit.
Signed-off-by: NJens Axboe <axboe@fb.com>
Reviewed-by: NOmar Sandoval <osandov@fb.com>
Tested-by: NHannes Reinecke <hare@suse.com>

3c782d67

blk-mq: improve scheduler queue sync/async running · 0abad774

由 Jens Axboe 提交于 1月 26, 2017

We'll use the same criteria for whether we need to run the queue sync
or async when we have a scheduler, as we do without one.
Signed-off-by: NJens Axboe <axboe@fb.com>
Reviewed-by: NOmar Sandoval <osandov@fb.com>
Tested-by: NHannes Reinecke <hare@suse.com>

0abad774

blk-mq: move hctx and ctx counters from sysfs to debugfs · 4a46f05e

由 Omar Sandoval 提交于 1月 25, 2017

These counters aren't as out-of-place in sysfs as the other stuff, but
debugfs is a slightly better home for them.
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NOmar Sandoval <osandov@fb.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

4a46f05e

blk-mq: move hctx io_poll, stats, and dispatched from sysfs to debugfs · be215473

由 Omar Sandoval 提交于 1月 25, 2017

These statistics _might_ be useful to userspace, but it's better not to
commit to an ABI for these yet. Also, the dispatched file in sysfs
couldn't be cleared, so make it clearable like the others in debugfs.
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NOmar Sandoval <osandov@fb.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

be215473

blk-mq: add tags and sched_tags bitmaps to debugfs · d7e3621a

由 Omar Sandoval 提交于 1月 25, 2017

These can be used to debug issues like tag leaks and stuck requests.
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NOmar Sandoval <osandov@fb.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

d7e3621a

blk-mq: move tags and sched_tags info from sysfs to debugfs · d96b37c0

由 Omar Sandoval 提交于 1月 25, 2017

These are very tied to the blk-mq tag implementation, so exposing them
to sysfs isn't a great idea. Move the debugging information to debugfs
and add basic entries for the number of tags and the number of reserved
tags to sysfs.
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NOmar Sandoval <osandov@fb.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

d96b37c0

blk-mq: export software queue pending map to debugfs · 0bfa5288

由 Omar Sandoval 提交于 1月 25, 2017

This is useful for debugging problems where we've gotten stuck with
requests in the software queues.
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NOmar Sandoval <osandov@fb.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

0bfa5288

sbitmap: add helpers for dumping to a seq_file · 24af1ccf

由 Omar Sandoval 提交于 1月 25, 2017

This is useful debugging information that will be used in the blk-mq
debugfs directory.
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NOmar Sandoval <osandov@fb.com>

Changed 'weight' to 'busy'.
Signed-off-by: NJens Axboe <axboe@fb.com>

24af1ccf

blk-mq: add extra request information to debugfs · 7b393852

由 Omar Sandoval 提交于 1月 25, 2017

The request pointers by themselves aren't super useful.
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NOmar Sandoval <osandov@fb.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

7b393852

blk-mq: move hctx->dispatch and ctx->rq_list from sysfs to debugfs · 950cd7e9

由 Omar Sandoval 提交于 1月 25, 2017

These lists are only useful for debugging; they definitely don't belong
in sysfs. Putting them in debugfs also removes the limitation of a
single page of output.
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NOmar Sandoval <osandov@fb.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

950cd7e9

blk-mq: add hctx->{state,flags} to debugfs · 9abb2ad2

由 Omar Sandoval 提交于 1月 25, 2017

hctx->state could come in handy for bugs where the hardware queue gets
stuck in the stopped state, and hctx->flags is just useful to know.
Signed-off-by: NOmar Sandoval <osandov@fb.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

9abb2ad2

blk-mq: create debugfs directory tree · 07e4fead

由 Omar Sandoval 提交于 1月 25, 2017

In preparation for putting blk-mq debugging information in debugfs,
create a directory tree mirroring the one in sysfs:

    # tree -d /sys/kernel/debug/block
    /sys/kernel/debug/block
    |-- nvme0n1
    |   `-- mq
    |       |-- 0
    |       |   `-- cpu0
    |       |-- 1
    |       |   `-- cpu1
    |       |-- 2
    |       |   `-- cpu2
    |       `-- 3
    |           `-- cpu3
    `-- vda
        `-- mq
            `-- 0
                |-- cpu0
                |-- cpu1
                |-- cpu2
                `-- cpu3

Also add the scaffolding for the actual files that will go in here,
either under the hardware queue or software queue directories.
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NOmar Sandoval <osandov@fb.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

07e4fead

blk-mq-sched: check for successful allocation before assigning tag · b48fda09

由 Jens Axboe 提交于 1月 26, 2017

We don't trigger this from the normal IO path, since we always use
blocking allocations from there. But Bart saw it testing multipath
dm, since that is a heavy user of atomic request allocations in
the map and clone path.
Reported-by: NBart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

b48fda09

blk-mq: don't lose flags passed in to blk_mq_alloc_request() · 5a797e00

由 Jens Axboe 提交于 1月 26, 2017

If we come in from blk_mq_alloc_requst() with NOWAIT set in flags,
we must ensure that we don't later overwrite that in
blk_mq_sched_get_request(). Initialize alloc_data->flags before
passing it in.
Reported-by: NBart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

5a797e00

25 1月, 2017 1 次提交

blk-mq: only apply active queue tag throttling for driver tags · 200e86b3

由 Jens Axboe 提交于 1月 25, 2017

If we have a scheduler attached, we have two sets of tags. We don't
want to apply our active queue throttling for the scheduler side
of tags, that only applies to driver tags since that's the resource
we need to dispatch an IO.
Signed-off-by: NJens Axboe <axboe@fb.com>

200e86b3

openeuler / Kernel 大约 1 年 前同步成功

openeuler / Kernel
大约 1 年前同步成功