Commits · 74d46992e0d9dee7f1f376de0d56d31614c8a17a · openeuler / Kernel

24 Aug, 2017 · 1 commit

block: replace bi_bdev with a gendisk pointer and partitions index · 74d46992

Authored by Christoph Hellwig on Aug 23, 2017

This way we don't need a block_device structure to submit I/O.  The
block_device has different life time rules from the gendisk and
request_queue and is usually only available when the block device node
is open.  Other callers need to explicitly create one (e.g. the lightnvm
passthrough code, or the new nvme multipathing code).

For the actual I/O path all that we need is the gendisk, which exists
once per block device.  But given that the block layer also does
partition remapping we additionally need a partition index, which is
used for said remapping in generic_make_request.

Note that all the block drivers generally want request_queue or
sometimes the gendisk, so this removes a layer of indirection all
over the stack.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

74d46992
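
The partition remapping described in this commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: the `Gendisk` and `Bio` classes, the `part_start` table, and the `partition_remap` helper are hypothetical stand-ins for the kernel structures, and the partition layout is made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Gendisk:
    """Stand-in for struct gendisk: exists once per block device."""
    name: str
    # Hypothetical partition table: partition index -> start sector.
    # Index 0 denotes the whole disk.
    part_start: dict = field(default_factory=lambda: {0: 0})


@dataclass
class Bio:
    """Stand-in for a bio that carries a disk and a partition index."""
    disk: Gendisk
    partno: int
    sector: int


def partition_remap(bio: Bio) -> Bio:
    """Translate a partition-relative sector into a whole-disk sector."""
    if bio.partno != 0:
        bio.sector += bio.disk.part_start[bio.partno]
        bio.partno = 0  # the bio now addresses the whole disk
    return bio
```

In this model no object tied to an open device node is needed to describe the I/O target: the disk and a small integer are enough, and remapping a bio twice is harmless because the index is reset to 0 after the first pass.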

14 Jun, 2017 · 1 commit

dm: missing break in process_queued_bios() · 047385b3

Authored by Dan Carpenter on Jun 14, 2017

This used to be a fall through case, but we shifted code around and I
think we want a break here now.

Fixes: 4e4cbee9 ("block: switch bios to blk_status_t")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

047385b3

09 Jun, 2017 · 5 commits

block: switch bios to blk_status_t · 4e4cbee9

Authored by Christoph Hellwig on Jun 03, 2017

Replace bi_error with a new bi_status to allow for a clear conversion.
Note that device mapper overloaded bi_error with a private value, which
we'll have to keep around at least for now and thus propagate to a
proper blk_status_t value.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

4e4cbee9

block: introduce new block status code type · 2a842aca

Authored by Christoph Hellwig on Jun 03, 2017

Currently we use normal Linux errno values in the block layer, and while
we accept any error a few have overloaded magic meanings. This patch
instead introduces a new blk_status_t value that holds block layer specific
status codes and explicitly explains their meaning. Helpers to convert from
and to the previous special meanings are provided for now, but I suspect
we want to get rid of them in the long run - those drivers that have an
errno input (e.g. networking) usually get errnos that don't know about
the special block layer overloads, and similarly returning them to userspace
will usually return something that strictly speaking isn't correct
for file system operations, but that's left as an exercise for later.

For now the set of errors is a very limited set that closely corresponds
to the previous overloaded errno values, but there is some low hanging
fruit to improve it.

blk_status_t (ab)uses the sparse __bitwise annotations to allow for sparse
typechecking, so that we can easily catch places passing the wrong values.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

2a842aca
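
The idea of a dedicated status type with conversion helpers can be modelled in user space. The sketch below is an illustration only: the `BlkStatus` members, their numeric values, and the errno pairs in the table are assumptions chosen for the example, and the fallback to a generic I/O error for unknown errnos is likewise an assumption of this sketch. A plain `Enum` stands in for the sparse `__bitwise` check, because it refuses to be mixed with raw integers.

```python
import errno
from enum import Enum


class BlkStatus(Enum):
    """A distinct type, so a raw errno cannot pass as a status by accident."""
    OK = 0
    NOTSUPP = 1
    TIMEOUT = 2
    NOSPC = 3
    RESOURCE = 4
    IOERR = 5


# Illustrative subset of a status <-> errno table (assumed pairs).
_STATUS_TO_ERRNO = {
    BlkStatus.OK: 0,
    BlkStatus.NOTSUPP: -errno.EOPNOTSUPP,
    BlkStatus.TIMEOUT: -errno.ETIMEDOUT,
    BlkStatus.NOSPC: -errno.ENOSPC,
    BlkStatus.RESOURCE: -errno.ENOMEM,
    BlkStatus.IOERR: -errno.EIO,
}
_ERRNO_TO_STATUS = {err: sts for sts, err in _STATUS_TO_ERRNO.items()}


def errno_to_blk_status(err: int) -> BlkStatus:
    """Map an errno to a status; unknown values become a generic I/O error."""
    return _ERRNO_TO_STATUS.get(err, BlkStatus.IOERR)


def blk_status_to_errno(status: BlkStatus) -> int:
    """Map a status back to an errno, rejecting anything that is not a status."""
    if not isinstance(status, BlkStatus):
        raise TypeError("expected a BlkStatus, got a raw value")
    return _STATUS_TO_ERRNO[status]
```

The type check in `blk_status_to_errno` plays the role that static checking plays in the commit: passing `-EIO` where a status is expected fails loudly instead of being silently accepted.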

dm: change ->end_io calling convention · 1be56909

Authored by Christoph Hellwig on Jun 03, 2017

Turn the error parameter into a pointer so that target drivers can change
the value, and make sure only DM_ENDIO_* values are returned from the
methods.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

1be56909
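
A rough user-space model of this calling convention follows. It is a sketch, not the DM API: `EndIo`, `ErrorRef`, and `example_end_io` are hypothetical names, the mutable `ErrorRef` object stands in for the pointer argument, and the retry policy is invented for the example.

```python
import errno
from dataclasses import dataclass
from enum import Enum


class EndIo(Enum):
    """Hypothetical stand-ins for the DM_ENDIO_* return values."""
    DONE = "done"
    INCOMPLETE = "incomplete"
    REQUEUE = "requeue"


@dataclass
class ErrorRef:
    """Plays the role of the pointer argument: the callee may overwrite .value."""
    value: int


def example_end_io(error: ErrorRef, can_retry: bool) -> EndIo:
    """A target's end_io method: may rewrite the error, returns only EndIo values."""
    if error.value == 0:
        return EndIo.DONE
    if can_retry:
        return EndIo.REQUEUE
    # Hypothetical policy: report every unrecoverable failure as a plain I/O error.
    error.value = -errno.EIO
    return EndIo.DONE
```

Separating the two channels means the return value only ever says what to do next, while the error travels through the reference and can be adjusted by the target before completion.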

dm: don't return errnos from ->map · 846785e6

Authored by Christoph Hellwig on Jun 03, 2017

Instead use the special DM_MAPIO_KILL return value to return -EIO just
like we do for the request based path.  Note that dm-log-writes returned
-ENOMEM in a few places, which now becomes -EIO instead.  No consumer
treats -ENOMEM special so this shouldn't be an issue (and it should
use a mempool to start with to make guaranteed progress).
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

846785e6
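
The effect of returning a dedicated "kill" value instead of an errno can be sketched as follows. This is a simplified model under assumptions: `MapResult`, `log_writes_map`, and `dispatch` are hypothetical names, and the dispatcher only models the one rule stated in the message, namely that a killed bio completes with -EIO.

```python
import errno
from enum import Enum


class MapResult(Enum):
    """Hypothetical stand-ins for the DM_MAPIO_* return values."""
    SUBMITTED = "submitted"
    REMAPPED = "remapped"
    REQUEUE = "requeue"
    KILL = "kill"


def log_writes_map(alloc_ok: bool) -> MapResult:
    """A target's map method: report failure as KILL, never as an errno."""
    if not alloc_ok:
        return MapResult.KILL  # this path would previously have returned -ENOMEM
    return MapResult.REMAPPED


def dispatch(result: MapResult) -> int:
    """Core-side handling: the errno the bio completes with (0 if not failed)."""
    if not isinstance(result, MapResult):
        raise TypeError("map methods must return a MapResult, not an errno")
    if result is MapResult.KILL:
        return -errno.EIO
    return 0
```

In this model an allocation failure inside the target surfaces to the submitter as -EIO, which matches the behavior change the message describes for dm-log-writes.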

dm mpath: merge do_end_io_bio into multipath_end_io_bio · 14ef1e48

Authored by Christoph Hellwig on Jun 03, 2017

This simplifies the code and especially the error passing a bit and
will help with the next patch.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

14ef1e48

16 May, 2017 · 2 commits

dm mpath: multipath_clone_and_map must not return -EIO · f98e0eb6

Authored by Christoph Hellwig on May 15, 2017

Since 412445ac ("dm: introduce a new DM_MAPIO_KILL return value"), the
clone_and_map_rq methods must not return errno values, so fix it up
to properly return DM_MAPIO_KILL, instead of the -EIO value that snuck
in due to a conflict between two patches.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

f98e0eb6

dm mpath: don't return -EIO from dm_report_EIO · 18a482f5

Authored by Christoph Hellwig on May 15, 2017

Instead just turn the macro into a helper for the warning message.
This removes an unnecessary assignment and will allow the next commit to
fix a place where -EIO is the wrong return value.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

18a482f5

02 May, 2017 · 2 commits

dm rq: change ->rq_end_io calling conventions · 7ed8578a

Authored by Christoph Hellwig on Apr 26, 2017

Instead of returning either a DM_ENDIO_* constant or an error code, add
a new DM_ENDIO_DONE value that means keep errno as is.  This allows us
to easily keep the existing error code in cases where we can't push back,
and it also prepares for the new block level status codes with strict
type checking.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

7ed8578a

dm mpath: merge do_end_io into multipath_end_io · b79f10ee

Authored by Christoph Hellwig on Apr 26, 2017

This simplifies the I/O completion path a bit.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

b79f10ee

28 Apr, 2017 · 8 commits

dm mpath: make it easier to detect unintended I/O request flushes · 86331f39

Authored by Bart Van Assche on Apr 27, 2017

I/O errors triggered by multipathd incorrectly not enabling the no-flush
flag for DM_DEVICE_SUSPEND or DM_DEVICE_RESUME are hard to debug.  Add
more logging to make it easier to debug this.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

86331f39

dm mpath: cleanup QUEUE_IF_NO_PATH bit manipulation by introducing assign_bit() · 9a8ac3ae

Authored by Bart Van Assche on Apr 27, 2017

No functional change but makes the code easier to read.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

9a8ac3ae

dm mpath: micro-optimize the hot path relative to MPATHF_QUEUE_IF_NO_PATH · ca5beb76

Authored by Bart Van Assche on Apr 27, 2017

Instead of checking MPATHF_QUEUE_IF_NO_PATH,
MPATHF_SAVED_QUEUE_IF_NO_PATH and the no_flush flag to decide whether
or not to push back a request (or bio) if there are no paths available,
only clear MPATHF_QUEUE_IF_NO_PATH in queue_if_no_path() if no_flush has
not been set. The result is that only a single bit has to be tested in
the hot path to decide whether or not a request must be pushed back and
also that m->lock does not have to be taken in the hot path.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

ca5beb76
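
The single-bit hot path can be illustrated with a deliberately simplified model. Everything here is an assumption made for the sketch: the bit positions, the pure-function style (flags passed in and returned rather than stored in a locked structure), and the reduction of the slow path to one rule, namely that the queueing bit is cleared only when queueing is disabled and no_flush is not set.

```python
QUEUE_IF_NO_PATH = 1 << 0        # hypothetical bit positions
SAVED_QUEUE_IF_NO_PATH = 1 << 1


def assign_bit(flags: int, bit: int, value: bool) -> int:
    """Return flags with `bit` set when value is true and cleared otherwise."""
    return flags | bit if value else flags & ~bit


def queue_if_no_path(flags: int, queue: bool, no_flush: bool) -> int:
    """Slow path: fold the no_flush state into QUEUE_IF_NO_PATH once.

    The bit is cleared only when queueing is disabled AND no_flush is not set.
    """
    return assign_bit(flags, QUEUE_IF_NO_PATH, queue or no_flush)


def must_push_back(flags: int) -> bool:
    """Hot path: a single bit test, with no other state to consult."""
    return bool(flags & QUEUE_IF_NO_PATH)
```

The design choice is to pay the cost of combining the conditions when the configuration changes, which is rare, so that the per-request check stays a single bit test.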

dm: introduce enum dm_queue_mode to cleanup related code · 7e0d574f

Authored by Bart Van Assche on Apr 27, 2017

Introduce an enumeration type for the queue mode.  This patch does
not change any functionality but makes the DM code easier to read.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

7e0d574f

dm mpath: verify __pg_init_all_paths locking assumptions at runtime · b194679f

Authored by Bart Van Assche on Apr 27, 2017

Verify at runtime that __pg_init_all_paths() is called with
multipath.lock held if lockdep is enabled.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

b194679f

dm mpath: delay requeuing while path initialization is in progress · c1d7ecf7

Authored by Bart Van Assche on Apr 27, 2017

Requeuing a request immediately while path initialization is ongoing
causes high CPU usage, something that is undesired.  Hence delay
requeuing while path initialization is in progress.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

c1d7ecf7

dm mpath: avoid that path removal can trigger an infinite loop · 7083abbb

Authored by Bart Van Assche on Apr 27, 2017

If blk_get_request() fails, check whether the failure is due to a path
being removed.  If that is the case, fail the path by triggering a call
to fail_path().  This avoids that the following scenario can be
encountered while removing paths:
* CPU usage of a kworker thread jumps to 100%.
* Removing the DM device becomes impossible.

Delay requeueing if blk_get_request() returns -EBUSY or -EWOULDBLOCK,
and the queue is not dying, because in these cases immediate requeuing
is inappropriate.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

7083abbb
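
The decision described in this message can be sketched as a small function. This is a model under assumptions rather than the driver's code: `MapResult` and `on_get_request_failure` are hypothetical names, and the sketch treats every allocation failure on a live queue (including the -EBUSY and -EWOULDBLOCK cases named above) as a reason to delay.

```python
from enum import Enum


class MapResult(Enum):
    """Hypothetical stand-ins for the two requeue outcomes."""
    REQUEUE = "requeue"
    DELAY_REQUEUE = "delay_requeue"


def on_get_request_failure(queue_dying: bool):
    """Return (fail_path, map_result) after a failed request allocation."""
    if queue_dying:
        # The path is being removed: fail it so it is not selected again,
        # then requeue so the request can go to another path.
        return True, MapResult.REQUEUE
    # Transient shortage on a live queue: retrying at once would only spin.
    return False, MapResult.DELAY_REQUEUE
```

Failing the path in the dying case is what breaks the loop in this model: without it, the same path would keep being chosen and the allocation would keep failing.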

dm mpath: split and rename activate_path() to prepare for its expanded use · 89bfce76

Authored by Bart Van Assche on Apr 27, 2017

activate_path() is renamed to activate_path_work() which now calls
activate_or_offline_path().  activate_or_offline_path() will be used
by the next commit.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

89bfce76

25 Apr, 2017 · 1 commit

dm mpath: requeue after a small delay if blk_get_request() fails · 06eb061f

Authored by Bart Van Assche on Apr 07, 2017

If blk_get_request() returns ENODEV then multipath_clone_and_map()
causes a request to be requeued immediately. This can cause a kworker
thread to spend 100% of the CPU time of a single core in
__blk_mq_run_hw_queue() and also can cause device removal to never
finish.

Avoid this by only requeuing after a delay if blk_get_request() fails.
Additionally, reduce the requeue delay.

Cc: stable@vger.kernel.org # 4.9+
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

06eb061f

21 Apr, 2017 · 1 commit

dm mpath: don't check for req->errors · 8fc77980

Authored by Christoph Hellwig on Apr 20, 2017

We'll get all proper errors reported through ->end_io and ->errors will
go away soon.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

8fc77980

09 Apr, 2017 · 1 commit

dm: support REQ_OP_WRITE_ZEROES · ac62d620

Authored by Christoph Hellwig on Apr 05, 2017

Copy & paste from the REQ_OP_WRITE_SAME code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

ac62d620

03 Feb, 2017 · 1 commit

dm mpath: cleanup -Wbool-operation warning in choose_pgpath() · d19a55cc

Authored by Mike Snitzer on Jan 06, 2017

Reported-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

d19a55cc

28 Jan, 2017 · 1 commit

dm: always defer request allocation to the owner of the request_queue · eb8db831

Authored by Christoph Hellwig on Jan 22, 2017

DM already calls blk_mq_alloc_request on the request_queue of the
underlying device if it is a blk-mq device.  But now that we allow drivers
to allocate additional data and initialize it ahead of time we need to do
the same for all drivers.   Doing so and using the new cmd_size
infrastructure in the block layer greatly simplifies the dm-rq and mpath
code, and should also make arbitrary combinations of SQ and MQ devices
with SQ or MQ device mapper tables easily possible as a further step.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

eb8db831

09 Dec, 2016 · 1 commit

dm mpath: use hw_handler_params if attached hw_handler is same as requested · 54cd640d

Authored by tang.junhui on Nov 24, 2016

Let the requested m->hw_handler_params be used if the attached hardware
handler is the same handler as requested with m->hw_handler_name.
Signed-off-by: tang.junhui <tang.junhui@zte.com.cn>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

54cd640d

21 Nov, 2016 · 4 commits

dm mpath: do not modify *__clone if blk_mq_alloc_request() fails · 6599c84e

Authored by Bart Van Assche on Nov 15, 2016

Purely cleanup, avoids potential for strange coding bugs.  But in
reality if __multipath_map() fails the caller has no business looking at
*__clone.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

6599c84e

dm mpath: change return type of pg_init_all_paths() from int to void · 4813577f

Authored by Bart Van Assche on Nov 15, 2016

None of the callers of pg_init_all_paths() check its return value.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

4813577f

dm mpath: add checks for priority group count to avoid invalid memory access · cc5bd925

Authored by tang.junhui on Nov 04, 2016

This avoids the potential for invalid memory access, if/when there are
no priority groups, in response to invalid arguments being sent by the
user via DM message (e.g. "switch_group", "disable_group" or
"enable_group").
Signed-off-by: tang.junhui <tang.junhui@zte.com.cn>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

cc5bd925
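
The kind of validation this message describes might look like the sketch below. It is an illustration under assumptions: `parse_group_number` is a hypothetical helper, group numbers are assumed to be 1-based, and the error text is invented. The point is that an empty set of priority groups is rejected before any group is looked up.

```python
def parse_group_number(arg: str, nr_priority_groups: int) -> int:
    """Validate a 1-based priority group number taken from a message argument."""
    if not arg.isdigit():
        raise ValueError("invalid PG number supplied")
    pgnum = int(arg)
    # Reject 0, an empty group list, and numbers past the last group.
    if pgnum == 0 or nr_priority_groups == 0 or pgnum > nr_priority_groups:
        raise ValueError("invalid PG number supplied")
    return pgnum
```

With three groups configured, `parse_group_number("2", 3)` returns 2, while any number is rejected when no groups exist.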

dm mpath: add m->hw_handler_name NULL pointer check in parse_hw_handler() · f97dc421

Authored by tang.junhui on Oct 28, 2016

Avoids false positive of no hardware handler being specified (which is
implied by a NULL m->hw_handler_name).
Signed-off-by: tang.junhui <tang.junhui@zte.com.cn>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

f97dc421

29 Sep, 2016 · 1 commit

dm mpath: always return reservation conflict without failing over · 8ff232c1

Authored by Hannes Reinecke on Jul 15, 2015

If dm-mpath encounters a reservation conflict it should not fail the
path (as communication with the target is not affected) but should
rather retry on another path.  However, in doing so we might be inducing
a ping-pong between paths, with no guarantee of any forward progress.
And arguably a reservation conflict is an unexpected error, so we should
be passing it upwards to allow the application to take appropriate
steps.

This change resolves a show-stopper problem seen with the pNFS SCSI
layout because it is trivial to hit reservation conflict based failover
loops without it.

Doubts were raised about the implications of this change relative to
products like IBM's SVC.  But there is little point withholding a fix
for Linux because a proprietary product may or may not have some issues
in its implementation of how it interfaces with Linux.  In the future,
if there is glaring evidence that this change is certainly problematic
we can revisit it.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Tested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com> # tweaked header

8ff232c1
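
The resulting completion policy can be summarised in a few lines. This is a sketch only: `IoError` and `end_io_action` are hypothetical names, and real completion handling distinguishes many more error classes than the three modelled here.

```python
from enum import Enum


class IoError(Enum):
    """Hypothetical error classes seen by the completion handler."""
    NONE = "none"
    RESERVATION_CONFLICT = "reservation_conflict"
    PATH_FAILURE = "path_failure"


def end_io_action(error: IoError) -> str:
    """Decide what to do with a completed I/O in this simplified model."""
    if error is IoError.NONE:
        return "complete"
    if error is IoError.RESERVATION_CONFLICT:
        # The path still works; retrying elsewhere could ping-pong forever,
        # so the error is passed up to the application instead.
        return "return_error"
    return "fail_path_and_retry"
```

The reservation conflict case neither fails the path nor retries, which is what removes the failover loop described in the message.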

15 Sep, 2016 · 4 commits

dm mpath: delay the requeue of blk-mq requests while all paths down · b88efd43

Authored by Mike Snitzer on Sep 09, 2016

Return DM_MAPIO_DELAY_REQUEUE from .clone_and_map_rq.  Also, return
false from .busy, if all paths are down, so that blk-mq requests get
mapped via .clone_and_map_rq -- which results in DM_MAPIO_DELAY_REQUEUE
being returned to dm-rq.

This change allows for a noticeable reduction in cpu utilization
(reduced kworker load) while all paths are down, e.g.:

system CPU idleness (as measured by fio's --idle-prof=system):
before: system: 86.58%
after:  system: 98.60%
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>

b88efd43

dm mpath: use dm_mq_kick_requeue_list() · 7e48c768

Authored by Mike Snitzer on Sep 14, 2016

When reinstating a path the blk-mq request_queue's requeue_list should
get kicked.  It makes sense to kick the requeue_list as part of the
existing hook (previously only used by bio-based support).

Rename process_queued_bios_list to process_queued_io_list.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>

7e48c768

dm: convert wait loops to use autoremove_wake_function() · 9f4c3f87

Authored by Bart Van Assche on Aug 31, 2016

Use autoremove_wake_function() instead of default_wake_function()
to make the dm wait loops more similar to other wait loops in the
kernel.  This patch does not change any functionality.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

9f4c3f87

dm mpath: check if path's request_queue is dying in activate_path() · f10e06b7

Authored by Mike Snitzer on Sep 01, 2016

If pg_init_retries is set and a request is queued against a multipath
device with all underlying block device request_queues in the "dying"
state then an infinite loop is triggered because activate_path() never
succeeds and hence never calls pg_init_done().

This change avoids that device removal triggers an infinite loop by
failing the activate_path() which causes the "dying" path to be failed.
Reported-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org

f10e06b7

08 Aug, 2016 · 1 commit

block: rename bio bi_rw to bi_opf · 1eff9d32

Authored by Jens Axboe on Aug 05, 2016

Since commit 63a4cc24, bio->bi_rw contains flags in the lower
portion and the op code in the higher portions. This means that
old code that relies on manually setting bi_rw is most likely
going to be broken. Instead of letting that brokenness linger,
rename the member, to force old and out-of-tree code to break
at compile time instead of at runtime.

No intended functional changes in this commit.
Signed-off-by: Jens Axboe <axboe@fb.com>

1eff9d32
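
Why a manual assignment breaks under this layout can be shown with a small bit-packing model. All constants here are assumptions for illustration: a 32-bit field, 3 bits reserved for the op code at the top, and made-up op and flag values. Only the layout idea (flags low, op code high) comes from the message.

```python
WORD_BITS = 32                     # assumed width of the field
REQ_OP_BITS = 3                    # assumed number of bits reserved for the op
OP_SHIFT = WORD_BITS - REQ_OP_BITS

REQ_OP_READ, REQ_OP_WRITE = 0, 1   # illustrative op codes
REQ_SYNC = 1 << 4                  # illustrative flag bit


def make_opf(op: int, flags: int) -> int:
    """Pack an op code into the high bits and flags into the low bits."""
    return (op << OP_SHIFT) | flags


def op_of(opf: int) -> int:
    """Extract the op code from the high bits."""
    return opf >> OP_SHIFT


def flags_of(opf: int) -> int:
    """Extract the flags from the low bits."""
    return opf & ((1 << OP_SHIFT) - 1)
```

In this model, legacy code that stores the value 1 directly, intending a write, ends up with an op code of 0 and a stray low flag bit. Renaming the member turns that silent misbehavior into a compile error for code that still refers to the old name.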

03 Aug, 2016 · 1 commit

dm mpath: add locking to multipath_resume and must_push_back · 1814f2e3

Authored by Mike Snitzer on Jul 25, 2016

Multiple flags were being tested without locking.  Protect against
non-atomic bit changes in m->flags by holding m->lock (while testing or
setting the queue_if_no_path related flags).
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

1814f2e3

11 Jun, 2016 · 4 commits

dm mpath: add optional "queue_mode" feature · e83068a5

Authored by Mike Snitzer on May 24, 2016

Allow a user to specify an optional feature 'queue_mode <mode>' where
<mode> may be "bio", "rq" or "mq" -- which corresponds to bio-based,
request_fn rq-based, and blk-mq rq-based respectively.

If the queue_mode feature isn't specified the default for the
"multipath" target is still "rq" but if dm_mod.use_blk_mq is set to Y
it'll default to mode "mq".

This new queue_mode feature introduces the ability for each multipath
device to have its own queue_mode (whereas before this feature all
multipath devices effectively had to have the same queue_mode).

This commit also goes a long way to eliminate the awkward (ab)use of
DM_TYPE_*, the associated filter_md_type() and other relatively fragile
and difficult to maintain code.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

e83068a5
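
The selection rule for the queue mode can be sketched as a small parser. This is a model, not the target's argument parser: `parse_queue_mode` is a hypothetical helper, the feature list is assumed to arrive as a list of already-split words, and the `use_blk_mq` parameter stands in for the dm_mod.use_blk_mq setting.

```python
VALID_MODES = ("bio", "rq", "mq")


def parse_queue_mode(features: list, use_blk_mq: bool = False) -> str:
    """Return the queue mode selected by a feature list such as ["queue_mode", "bio"]."""
    if "queue_mode" in features:
        idx = features.index("queue_mode")
        if idx + 1 >= len(features):
            raise ValueError("queue_mode requires an argument")
        mode = features[idx + 1]
        if mode not in VALID_MODES:
            raise ValueError(f"unknown queue_mode: {mode!r}")
        return mode
    # Feature not given: fall back to the module-wide default.
    return "mq" if use_blk_mq else "rq"
```

Because the mode is resolved per feature list, two devices can end up with different modes, which is the per-device flexibility the message describes.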

dm mpath: remove bio-based bloat from struct dm_mpath_io · bf661be1

Authored by Mike Snitzer on May 24, 2016

Signed-off-by: Mike Snitzer <snitzer@redhat.com>

bf661be1

dm mpath: reinstate bio-based support · 76e33fe4

Authored by Mike Snitzer on May 19, 2016

Add "multipath-bio" target that offers a bio-based multipath target as
an alternative to the request-based "multipath" target -- but in a
following commit "multipath-bio" will immediately be replaced by a new
"queue_mode" feature for the "multipath" target which will allow
bio-based mode to be selected.

When DM multipath was originally converted from bio-based to
request-based the motivation for the change was better dynamic load
balancing (by leveraging block core's request-based IO schedulers, for
merging and sorting, _before_ DM multipath would make the decision on
where to steer the IO -- based on path load and/or availability).

More background is available in this "Request-based Device-mapper
multipath and Dynamic load balancing" paper:
https://www.kernel.org/doc/ols/2007/ols2007v2-pages-235-244.pdf

But we've now come full circle where significantly faster storage
devices no longer need IOs to be made larger to drive optimal IO
performance. And even if they do there have been changes to the block
and filesystem layers that help ensure upper layers are constructing
larger IOs. In addition, SCSI's differentiated IO errors will propagate
through to bio-based IO completion hooks -- so that eliminates another
historic justification for request-based DM multipath. Lastly, the block
layer's immutable biovec changes have made bio cloning cheaper than it
has ever been; whereas request cloning is still relatively expensive
(both on a CPU usage and memory footprint level).

As such, bio-based DM multipath offers the promise of a more efficient
IO path for high IOPs devices that are, or will be, emerging.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

76e33fe4

dm: move request-based code out to dm-rq.[hc] · 4cc96131

Authored by Mike Snitzer on May 12, 2016

Add some separation between bio-based and request-based DM core code.

'struct mapped_device' and other DM core only structures and functions
have been moved to dm-core.h and all relevant DM core .c files have been
updated to include dm-core.h rather than dm.h

DM targets should _never_ include dm-core.h!

[block core merge conflict resolution from Stephen Rothwell]
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>

4cc96131

openeuler / Kernel · synced successfully more than 1 year ago