1. 22 Jun 2015, 1 commit
    • drivers: xen-blkfront: only talk_to_blkback() when in XenbusStateInitialising · a9b54bb9
      Authored by Bob Liu
      Patch 69b91ede
      "drivers: xen-blkback: delay pending_req allocation to connect_ring"
      exposed a problem in Xen blkfront. There is a race between XenStored
      and the drivers such that we can see two state changes to
      XenbusStateInitWait ('2'):

      vbd vbd-268440320: blkfront:blkback_changed to state 2.
      vbd vbd-268440320: blkfront:blkback_changed to state 2.
      vbd vbd-268440320: blkfront:blkback_changed to state 4.

      As a result, blkback_changed() receives two notifications and calls
      setup_blkring() twice.

      The backend, meanwhile, may act only on the first setup_blkring() and
      read out-dated ring-ref values (or read them while they are being
      updated), which is wrong.

      The end result is that the ring ends up being incorrectly set up.
      
      The other drivers in the tree already have such a check in place; a
      minimal sketch of the guard follows this entry.
      Reported-and-Tested-by: Robert Butera <robert.butera@oracle.com>
      Signed-off-by: Bob Liu <bob.liu@oracle.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      a9b54bb9
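      The essence of the fix is a guard in the frontend's state-change
      handler: a repeated XenbusStateInitWait notification is ignored unless
      the frontend itself is still in XenbusStateInitialising. A minimal
      sketch of that guard (names follow drivers/block/xen-blkfront.c; the
      surrounding cases and error handling are abridged):

      /* Sketch only: the real handler has more cases and error handling. */
      static void blkback_changed(struct xenbus_device *dev,
                                  enum xenbus_state backend_state)
      {
          struct blkfront_info *info = dev_get_drvdata(&dev->dev);

          switch (backend_state) {
          case XenbusStateInitWait:
              /*
               * A second InitWait notify arrives after we have already left
               * XenbusStateInitialising; skip setup in that case so
               * setup_blkring() is not run twice.
               */
              if (dev->state != XenbusStateInitialising)
                  break;
              talk_to_blkback(dev, info);
              break;
          default:
              break;
          }
      }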
  2. 06 Jun 2015, 8 commits
    • xen/block: add multi-page ring support · 86839c56
      Authored by Bob Liu
      Extend xen/block to support multi-page rings, so that more requests
      can be in flight by using more than one page as the request ring
      between blkfront and the backend.
      As a result, performance can improve significantly.

      We saw some impressive improvements on our high-end iSCSI storage
      cluster backend: using 64 pages as the ring, IOPS increased about 15
      times in the throughput test and more than doubled in the latency test.

      The reason is that a single-page ring limits outstanding requests to
      32, but in our case the iSCSI LUN was spread across about 100 physical
      drives, and 32 was really not enough to keep them busy.
      
      Changes in v2:
       - Rebased to 4.0-rc6.
       - Documented how the multi-page ring feature works in linux io/blkif.h.

      Changes in v3:
       - Removed the changes to linux io/blkif.h and followed the protocol
         defined in io/blkif.h of the XEN tree.
       - Rebased to 4.1-rc3.

      Changes in v4:
       - Switched to 'ring-page-order' and 'max-ring-page-order'.
       - Addressed a few comments from Roger.

      Changes in v5:
       - Clarified the 4K granularity in the comments.
       - Addressed more comments from Roger.
      Signed-off-by: Bob Liu <bob.liu@oracle.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      86839c56
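      The negotiation itself goes through xenstore: the backend advertises
      'max-ring-page-order' and the frontend answers with the
      'ring-page-order' it actually uses, plus one grant reference per ring
      page. A hedged sketch of the frontend side (xenbus_scanf and
      xenbus_printf are the real xenbus accessors; negotiate_ring_pages and
      the omitted grant handling are illustrative):

      /* Sketch, not the driver's exact code. */
      static int negotiate_ring_pages(struct xenbus_device *dev,
                                      unsigned int max_order_limit)
      {
          unsigned int max_order = 0, order;
          int err;

          /* Backend publishes the largest ring it supports (log2 of pages). */
          err = xenbus_scanf(XBT_NIL, dev->otherend,
                             "max-ring-page-order", "%u", &max_order);
          if (err != 1)
              max_order = 0;  /* legacy backend: single-page ring only */

          order = min(max_order, max_order_limit);

          /* Frontend publishes the order it actually uses... */
          err = xenbus_printf(XBT_NIL, dev->nodename,
                              "ring-page-order", "%u", order);
          if (err)
              return err;

          /* ...then writes one "ring-refN" entry per ring page (omitted). */
          return 0;
      }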
    • driver: xen-blkfront: move talk_to_blkback to a more suitable place · 8ab0144a
      Authored by Bob Liu
      The major responsibility of talk_to_blkback() is to allocate and
      initialize the request ring and write the ring info to xenstore.
      But this work should only be done after the backend has entered
      'XenbusStateInitWait', as defined in the protocol file.
      See xen/include/public/io/blkif.h in XEN git tree:
      Front                                Back
      =================================    =====================================
      XenbusStateInitialising              XenbusStateInitialising
       o Query virtual device               o Query backend device identification
         properties.                          data.
       o Setup OS device instance.          o Open and validate backend device.
                                            o Publish backend features and
                                              transport parameters.
                                                           |
                                                           |
                                                           V
                                           XenbusStateInitWait
      
      o Query backend features and
        transport parameters.
      o Allocate and initialize the
        request ring.
      
      There is no problem with this yet, but it is a violation of the
      design, and furthermore it would not allow the frontend and backend to
      negotiate the 'multi-page' and 'multi-queue' features.
      
      Changes in v2:
       - Re-wrote the commit message to be clearer.
      Signed-off-by: Bob Liu <bob.liu@oracle.com>
      Acked-by: Roger Pau Monné <roger.pau@citrix.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      8ab0144a
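      Concretely, the move means talk_to_blkback() is no longer called from
      the probe path; the frontend waits until the backend has reached
      XenbusStateInitWait, exactly as the diagram prescribes. A schematic
      sketch (abridged; names follow xen-blkfront):

      /* Sketch: probe now only sets up the OS device instance... */
      static int blkfront_probe(struct xenbus_device *dev,
                                const struct xenbus_device_id *id)
      {
          /* allocate struct blkfront_info, query virtual device properties */
          /* no talk_to_blkback() here anymore */
          return 0;
      }

      /* ...while the ring work moves into the XenbusStateInitWait case of
       * blkback_changed(), as sketched under the a9b54bb9 entry above. */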
    • drivers: xen-blkback: delay pending_req allocation to connect_ring · 69b91ede
      Authored by Bob Liu
      This is a preparatory patch for the multi-page ring feature.
      In connect_ring we know exactly how many pages are used for the shared
      ring, so delay the pending_req allocation until then so that we don't
      waste too much memory (a sizing sketch follows this entry).
      Signed-off-by: Bob Liu <bob.liu@oracle.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      69b91ede
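      A minimal sketch of the idea, which the multi-page series then builds
      on: size the pending_req pool from the ring size once connect_ring()
      knows it, instead of allocating a worst-case pool at probe time
      (XEN_BLKIF_REQS_PER_PAGE and the field names here are illustrative,
      not the driver's exact macros):

      /* Sketch only. */
      static int connect_ring(struct backend_info *be)
      {
          unsigned int nr_ring_pages = 1;  /* from "ring-page-order" */
          unsigned int nr_reqs;

          /* ... read ring-page-order and ring-refs from xenstore ... */

          nr_reqs = nr_ring_pages * XEN_BLKIF_REQS_PER_PAGE;
          be->blkif->pending_reqs = kcalloc(nr_reqs,
                                            sizeof(*be->blkif->pending_reqs),
                                            GFP_KERNEL);
          if (!be->blkif->pending_reqs)
              return -ENOMEM;

          /* ... map the ring pages and start the dispatch thread ... */
          return 0;
      }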
    • NVMe: Automatic namespace rescan · a5768aa8
      Authored by Keith Busch
      Namespaces may be dynamically allocated and deleted or attached and
      detached. This has the driver rescan the device for namespace changes
      after each device reset or namespace change asynchronous event.
      
      There could potentially be many detached namespaces that we don't want
      polluting /dev/ with unusable block handles, so this will delete disks
      if the namespace is not active as indicated by the response from identify
      namespace. This also skips adding the disk if no capacity is provisioned
      to the namespace in the first place.
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      a5768aa8
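      The scan policy reduces to: for each namespace ID, remove the disk if
      Identify Namespace reports it inactive, revalidate it if it is still
      attached, and skip creating a disk when no capacity is provisioned. A
      hedged sketch (the nvme_* helpers here are illustrative placeholders,
      not the driver's exact functions):

      /* Sketch of the per-namespace decision, not the driver's exact code. */
      static void nvme_scan_ns(struct nvme_dev *dev, unsigned int nsid)
      {
          struct nvme_ns *ns = nvme_find_ns(dev, nsid);   /* hypothetical */

          if (ns) {
              if (!nvme_ns_active(ns))      /* per Identify Namespace data */
                  nvme_ns_remove(ns);       /* drop the stale /dev handle */
              else
                  revalidate_disk(ns->disk);
          } else if (nvme_ns_capacity(dev, nsid) > 0) {
              nvme_alloc_ns(dev, nsid);     /* attach a newly active ns */
          }
      }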
    • Merge branch 'for-4.2/core' into for-4.2/drivers · b281ebb8
      Authored by Jens Axboe
      b281ebb8
    • block: add blk_set_queue_dying() to blkdev.h · 3f21c265
      Authored by Jens Axboe
      We export this function and NVMe wants to use it, but for some reason
      it was never added to the block header. Do that.
      Signed-off-by: Jens Axboe <axboe@fb.com>
      3f21c265
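      The change itself is essentially one line in include/linux/blkdev.h,
      along these lines (sketched from the commit description):

      /* Make the already-exported symbol visible to other modules. */
      extern void blk_set_queue_dying(struct request_queue *q);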
    • NVMe: Memory barrier before queue_count is incremented · 36a7e993
      Authored by Jon Derrick
      Protects against reordering and/or preemption that would allow the
      kthread to access the queue descriptor before it is fully set up.
      Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
      Acked-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      36a7e993
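      The pattern is the usual publish ordering: finish initializing the
      object, issue a barrier, then bump the counter that the polling
      kthread reads. A hedged sketch (field names loosely follow the
      4.2-era nvme driver; the exact barrier flavor used by the commit is
      not shown here):

      /* Sketch only. */
      static struct nvme_queue *nvme_alloc_queue(struct nvme_dev *dev, int qid)
      {
          struct nvme_queue *nvmeq = kzalloc(sizeof(*nvmeq), GFP_KERNEL);

          if (!nvmeq)
              return NULL;
          /* ... allocate CQ/SQ memory, init locks, set nvmeq->qid ... */
          dev->queues[qid] = nvmeq;

          /*
           * Make the fully initialized descriptor visible before the
           * kthread can observe it via dev->queue_count.
           */
          mb();
          dev->queue_count++;

          return nvmeq;
      }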
    • NVMe: add sysfs and ioctl controller reset · 4cc06521
      Authored by Keith Busch
      We need the ability to perform an nvme controller reset as discussed on
      the mailing list thread:
      
        http://lists.infradead.org/pipermail/linux-nvme/2015-March/001585.html
      
      This adds a sysfs entry that, when written to, will perform an NVMe
      controller reset if the controller was successfully initialized in the
      first place.
      
      This also adds locking around resetting the device in the async probe
      method so the driver can't schedule two resets.
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Cc: Brandon Schultz <brandon.schulz@hgst.com>
      Cc: David Sariel <david.sariel@pmcs.com>
      
      Updated by Jens to:
      
      1) Merge this with the ioctl reset patch from David Sariel. The ioctl
         path now shares the reset code from the sysfs path.
      
      2) Don't flush work if we fail issuing the reset.
      Signed-off-by: Jens Axboe <axboe@fb.com>
      4cc06521
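      A hedged sketch of the sysfs side: a write-only device attribute whose
      store method invokes the shared reset path (the attribute and helper
      names are illustrative assumptions, not the commit's exact plumbing):

      /* Sketch only. */
      static ssize_t reset_controller_store(struct device *dev,
                                            struct device_attribute *attr,
                                            const char *buf, size_t count)
      {
          struct nvme_dev *ndev = dev_get_drvdata(dev);
          int ret = __nvme_reset(ndev);  /* errors if not initialized */

          return ret ? ret : count;
      }
      static DEVICE_ATTR_WO(reset_controller);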
  3. 02 Jun 2015, 5 commits
    • null_blk: restart request processing on completion handler · 8b70f45e
      Authored by Akinobu Mita
      When irqmode=2 (IRQ completion handler is timer) and queue_mode=1
      (block interface to use is rq), the completion handler should restart
      request handling for any pending requests on a queue, because request
      processing stops once more commands are queued than hw_queue_depth
      (null_rq_prep_fn returns BLKPREP_DEFER).
      
      Without this change, the following command cannot finish.
      
      	# modprobe null_blk irqmode=2 queue_mode=1 hw_queue_depth=1
      	# fio --name=t --rw=read --size=1g --direct=1 \
      	  --ioengine=libaio --iodepth=64 --filename=/dev/nullb0
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Cc: Jens Axboe <axboe@fb.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      8b70f45e
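      The fix amounts to kicking the request queue from the timer completion
      path so requests deferred with BLKPREP_DEFER are prepped again. A
      hedged sketch for the queue_mode=1 (rq) path against the 4.2-era
      single-queue API (blk_start_queue() must be called with the queue lock
      held; the function name is illustrative):

      /* Sketch only: completion side of the timer handler. */
      static void null_complete_rq_cmd(struct nullb_cmd *cmd)
      {
          struct request_queue *q = cmd->rq->q;
          unsigned long flags;

          blk_end_request_all(cmd->rq, 0);    /* complete the request */

          /* Restart ->request_fn processing for any deferred requests. */
          spin_lock_irqsave(q->queue_lock, flags);
          blk_start_queue(q);
          spin_unlock_irqrestore(q->queue_lock, flags);
      }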
    • null_blk: prevent timer handler running on a different CPU where started · 419c21a3
      Authored by Akinobu Mita
      When irqmode=2 (IRQ completion handler is timer), the timer handler
      should be called on the same CPU where the timer was started.

      Since completion_queues are per-CPU and the completion handler only
      touches the completion_queue of the local CPU, we need to prevent the
      handler from running on a CPU other than the one where the timer was
      started. Otherwise, the IO cannot be completed until another
      completion handler happens to run on that CPU.
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Cc: Jens Axboe <axboe@fb.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      419c21a3
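      With hrtimers, the standard way to achieve this is to arm the timer in
      pinned mode so the expiry handler stays on the starting CPU. A hedged
      sketch (the wrapper function is illustrative; HRTIMER_MODE_REL_PINNED
      is the stock hrtimer flag for this):

      /* Sketch only. */
      static void null_cmd_start_timer(struct hrtimer *timer, u64 delay_ns)
      {
          ktime_t kt = ktime_set(0, delay_ns);

          /* _PINNED keeps the timer, and thus the completion handler,
           * on the CPU that filled the per-CPU completion_queue. */
          hrtimer_start(timer, kt, HRTIMER_MODE_REL_PINNED);
      }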
    • NVMe: Remove hctx reliance for multi-namespace · 42483228
      Authored by Keith Busch
      The driver needs to track shared tags to support multiple namespaces
      that may be dynamically allocated or deleted. Relying on the first
      request_queue's hctxs is not appropriate, as we cannot clear
      outstanding tags for all namespaces using that handle, nor can the
      driver easily track every request_queue's hctx as namespaces are
      attached/detached. Instead, this patch gets the shared tag resources
      from the nvme_dev's tagset rather than through a request_queue hctx.
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      42483228
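      Sketched against the companion "blk-mq: Shared tag enhancements"
      commit below (f26cdc85), which added blk_mq_all_tag_busy_iter(): the
      driver can cancel outstanding IOs through the device-wide tagset
      rather than through any namespace's hctx (nvme_cancel_queue_ios stands
      in for the driver's per-request cancel callback):

      /* Sketch only. */
      static void nvme_clear_queue(struct nvme_queue *nvmeq)
      {
          struct blk_mq_tags *tags = nvmeq->dev->tagset.tags[0];

          if (tags)
              blk_mq_all_tag_busy_iter(tags, nvme_cancel_queue_ios, nvmeq);
      }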
    • Merge branch 'for-4.2/core' into for-4.2/drivers · 843e8ddb
      Authored by Jens Axboe
      843e8ddb
    • blk-mq: Shared tag enhancements · f26cdc85
      Authored by Keith Busch
      Storage controllers may expose multiple block devices that share
      hardware resources managed by blk-mq. This patch enhances the shared
      tags so a low-level driver can access the shared resources that are
      not tied to the unshared h/w contexts. This way the LLD can
      dynamically add and delete disks and request queues without having to
      track all the request_queue hctxs in order to iterate outstanding
      tags.
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      f26cdc85
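      From the LLD's perspective the enhancement boils down to an iterator
      over a tagset's busy tags that needs no request_queue hctx at all. A
      hedged sketch of how a driver might use it (the callback signature
      follows the 4.2-era blk-mq API; the lld_* names are illustrative):

      /* Sketch only. */
      static void lld_cancel_rq(struct request *rq, void *data, bool reserved)
      {
          /* driver-specific: fail or requeue 'rq' (illustrative) */
      }

      static void lld_cancel_all(struct blk_mq_tag_set *set)
      {
          int i;

          for (i = 0; i < set->nr_hw_queues; i++)
              if (set->tags[i])
                  blk_mq_all_tag_busy_iter(set->tags[i], lld_cancel_rq, NULL);
      }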
  4. 30 May 2015, 4 commits
  5. 27 May 2015, 1 commit
  6. 23 May 2015, 1 commit
  7. 22 May 2015, 11 commits
  8. 20 May 2015, 9 commits