Commits · 1828d47845ca92ec3fcfd82399f036526afc6a31 · openeuler / qemu

25 Aug, 2017 3 commits

blkdebug: Catch bs->exact_filename overflow · 1828d478

Authored by Max Reitz on Jun 13, 2017

The bs->exact_filename field may not be sufficient to store the full
blkdebug node filename. In this case, we should not generate a filename
at all instead of an unusable one.
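
The overflow check described above can be sketched as follows. This is a minimal illustration, not the blkdebug code: the buffer size DEMO_FILENAME_MAX and the helper demo_refresh_filename() are hypothetical names, and the only real API used is snprintf(), whose return value reports the length the full string would have needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the fixed-size bs->exact_filename buffer. */
#define DEMO_FILENAME_MAX 32

/*
 * Build "blkdebug:<config>:<image>" into out[]. If the result would not
 * fit, leave out[] empty instead of keeping a truncated, unusable name.
 * Returns 1 if a filename was generated, 0 otherwise.
 */
static int demo_refresh_filename(char *out, size_t out_size,
                                 const char *config, const char *image)
{
    assert(out != NULL && out_size > 0);

    int ret = snprintf(out, out_size, "blkdebug:%s:%s", config, image);
    if (ret < 0 || (size_t)ret >= out_size) {
        out[0] = '\0';  /* truncated or failed: generate no filename */
        return 0;
    }
    return 1;
}
```

A caller would treat an empty buffer as "no filename available" rather than trying to reopen the node through a cut-off name.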

Cc: qemu-stable@nongnu.org
Reported-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170613172006.19685-2-mreitz@redhat.com
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit de81d72d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

commit: Fix completion with extra reference · 1dd3ba38

Authored by Kevin Wolf on Jun 09, 2017

commit_complete() can't assume that after its block_job_completed() the
job is actually immediately freed; someone else may still be holding
references. In this case, the op blockers on the intermediate nodes make
the graph reconfiguration in the completion code fail.

Call block_job_remove_all_bdrv() manually so that we know for sure that
any blockers on intermediate nodes are given up.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 4f78a16f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

commit: Fix use after free in completion · f28b8906

Authored by Kevin Wolf on Jun 02, 2017

The final bdrv_set_backing_hd() could be working on already freed nodes
because the commit job drops its references (through BlockBackends) to
both overlay_bs and top already a bit earlier.

One way to trigger the bug is hot unplugging a disk for which
blockdev_mark_auto_del() cancels the block job.

Fix this by taking BDS-level references while we're still using the
nodes.
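
The idea of holding an extra reference while a node is still in use can be sketched with a toy reference counter. DemoNode and the demo_* helpers are hypothetical and only mirror the shape of bdrv_ref()/bdrv_unref(); they are not QEMU's implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted node, standing in for a BlockDriverState. */
typedef struct DemoNode {
    int refcnt;
} DemoNode;

static int demo_nodes_freed;  /* counts frees so the lifetime is observable */

static DemoNode *demo_node_new(void)
{
    DemoNode *n = calloc(1, sizeof(*n));
    assert(n != NULL);
    n->refcnt = 1;  /* the creator (the "job") owns the first reference */
    return n;
}

static void demo_node_ref(DemoNode *n)
{
    n->refcnt++;
}

static void demo_node_unref(DemoNode *n)
{
    assert(n->refcnt > 0);
    if (--n->refcnt == 0) {
        demo_nodes_freed++;
        free(n);
    }
}

/*
 * Completion path: the job's own reference goes away early, but the node
 * is still needed for one final step. Returns the refcount seen there.
 */
static int demo_complete(DemoNode *top)
{
    demo_node_ref(top);              /* extra reference while we use it */
    demo_node_unref(top);            /* job drops its reference early */
    int still_alive = top->refcnt;   /* final step: node is still valid */
    demo_node_unref(top);            /* done: node is freed here */
    return still_alive;
}
```

Without the demo_node_ref() call at the top, the early unref would free the node and the final step would read freed memory.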

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 19ebd13e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

04 Aug, 2017 15 commits

mirror: Drop permissions on s->target on completion · 0ebbef1f

Authored by Kevin Wolf on May 29, 2017

This fixes an assertion failure that was triggered by qemu-iotests 129
on some CI host, while the same test case didn't seem to fail on other
hosts.

Essentially the problem is that the blk_unref(s->target) in
mirror_exit() doesn't necessarily mean that the BlockBackend goes away
immediately. It is possible that the job completion was triggered nested
in mirror_drain(), which looks like this:

    BlockBackend *target = s->target;
    blk_ref(target);
    blk_drain(target);
    blk_unref(target);

In this case, the write permissions for s->target are retained until
after blk_drain(), which makes removing mirror_top_bs fail for the
active commit case (can't have a writable backing file in the chain
without the filter driver).

Explicitly dropping the permissions first means that the additional
reference doesn't hurt and the job can complete successfully even if
called from the nested blk_drain().

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 63c8ef28)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

block: Guarantee that *file is set on bdrv_get_block_status() · 64945cb5

Authored by Eric Blake on Jun 05, 2017

We document that *file is valid if the return is not an error and
includes BDRV_BLOCK_OFFSET_VALID, but forgot to obey this contract
when a driver (such as blkdebug) lacks a callback.  Messed up in
commit 67a0fd2a (v2.6), when we added the file parameter.

Enhance qemu-iotest 177 to cover this, using a sequence that would
print garbage or even SEGV, because it was dereferencing through
uninitialized memory.  [The resulting test output shows that we
have less-than-ideal block status from the blkdebug driver, but
that's a separate fix coming up soon.]

Setting *file on all paths that return BDRV_BLOCK_OFFSET_VALID is
enough to fix the crash, but we can go one step further: always
setting *file, even on error, means that a broken caller that
blindly dereferences file without checking for error is now more
likely to get a reliable SEGV instead of randomly acting on garbage,
making it easier to diagnose such buggy callers.  Adding an
assertion that file is set where expected doesn't hurt either.
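
The "always set the out-parameter, even on error" pattern can be sketched as below. DemoNode, DEMO_OFFSET_VALID and demo_get_status() are hypothetical stand-ins chosen for illustration; the behaviour for a driver without a callback is an assumption, not a copy of bdrv_get_block_status().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag, standing in for BDRV_BLOCK_OFFSET_VALID. */
#define DEMO_OFFSET_VALID 0x1

typedef struct DemoNode {
    int has_callback;         /* does the "driver" implement the query? */
    struct DemoNode *child;   /* node that actually holds the data */
} DemoNode;

/*
 * Query status. *file is written on every path, so a caller that forgets
 * to check the return value dereferences NULL instead of stale garbage.
 */
static int demo_get_status(DemoNode *bs, DemoNode **file)
{
    assert(file != NULL);
    *file = NULL;  /* set unconditionally, before any early return */

    if (bs == NULL) {
        return -1;  /* error path: *file is NULL, never uninitialized */
    }
    if (!bs->has_callback) {
        *file = bs;  /* assumption: fall back to the node itself */
        return DEMO_OFFSET_VALID;
    }
    *file = bs->child;
    assert(*file != NULL);  /* contract: OFFSET_VALID implies *file set */
    return DEMO_OFFSET_VALID;
}
```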

CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 81c219ac)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

block: Simplify BDRV_BLOCK_RAW recursion · 6a3f9c5c

Authored by Eric Blake on May 04, 2017

Since we are already in coroutine context during the body of
bdrv_co_get_block_status(), we can shave off a few layers of
wrappers when recursing to query the protocol when a format driver
returned BDRV_BLOCK_RAW.

Note that we are already using the correct recursion later on in
the same function, when probing whether the protocol layer is sparse
in order to find out if we can add BDRV_BLOCK_ZERO to an existing
BDRV_BLOCK_DATA|BDRV_BLOCK_OFFSET_VALID.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20170504173745.27414-1-eblake@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit ee29d6ad)
* prereq for 81c219ac
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

blkdebug: Add ability to override unmap geometries · 48f2dc06

Authored by Eric Blake on Apr 29, 2017

Make it easier to simulate various unusual hardware setups (for
example, recent commits 3482b9bc and b8d0a980 affect the Dell
Equallogic iSCSI with its 15M preferred and maximum unmap and
write zero sizing, or b2f95fee deals with the Linux loopback
block device having a max_transfer of 64k), by allowing blkdebug
to wrap any other device with further restrictions on various
alignments.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170429191419.30051-9-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 430b26a8)
* prereq for 81c219ac
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

blkdebug: Simplify override logic · 3ae74003

Authored by Eric Blake on Apr 29, 2017

Rather than store into a local variable, then copy to the struct
if the value is valid, then reporting errors otherwise, it is
simpler to just store into the struct and report errors if the
value is invalid.  This however requires that the struct store
a 64-bit number, rather than a narrower type.  Likewise, setting
a sane errno value in ret prior to the sequence of parsing and
jumping to out: on error makes it easier for the next patch to
add a chain of similar checks.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20170429191419.30051-8-eblake@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 3dc834f8)
* prereq for 81c219ac
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

blkdebug: Add pass-through write_zero and discard support · 577cf9e6

Authored by Eric Blake on Apr 29, 2017

In order to test the effects of artificial geometry constraints
on operations like write zero or discard, we first need blkdebug
to manage these actions.  It also allows us to inject errors on
those operations, just like we can for read/write/flush.

We can also test the contract promised by the block layer; namely,
if a device has specified limits on alignment or maximum size,
then those limits must be obeyed (for now, the blkdebug driver
merely inherits limits from whatever it is wrapping, but the next
patch will further enhance it to allow specific limit overrides).

This patch intentionally refuses to service requests smaller than
the requested alignments; this is because an upcoming patch adds
a qemu-iotest to prove that the block layer is correctly handling
fragmentation, but the test only works if there is a way to tell
the difference at artificial alignment boundaries when blkdebug is
using a larger-than-default alignment.  If we let the blkdebug
layer always defer to the underlying layer, which potentially has
a smaller granularity, the iotest will be thwarted.
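
The refusal of sub-alignment requests can be sketched as a small predicate. This is an illustration under assumptions: demo_check_request() is a hypothetical helper, and returning -ENOTSUP for rejected requests is an assumed convention, not something the commit message states.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Decide whether a discard or write-zeroes request may be passed through.
 * Requests smaller than the configured alignment, or not aligned to it,
 * are refused so that a test can observe the block layer's fragmentation.
 * Returns 0 to pass through, -ENOTSUP to refuse (assumed error code).
 */
static int demo_check_request(uint64_t offset, uint64_t bytes, uint64_t align)
{
    assert(align > 0);

    if (bytes < align) {
        return -ENOTSUP;  /* smaller than the requested alignment */
    }
    if (offset % align != 0 || bytes % align != 0) {
        return -ENOTSUP;  /* head or tail crosses an alignment boundary */
    }
    return 0;
}
```

With a 4k alignment, a 512-byte request at offset 0 is refused while an 8k request at offset 4k is passed through.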

Tested by setting up an NBD server with export 'foo', then invoking:
$ ./qemu-io
qemu-io> open -o driver=blkdebug blkdebug::nbd://localhost:10809/foo
qemu-io> d 0 15M
qemu-io> w -z 0 15M

Pre-patch, the server never sees the discard (it was silently
eaten by the block layer); post-patch it is passed across the
wire.  Likewise, pre-patch the write is always passed with
NBD_WRITE (with 15M of zeroes on the wire), while post-patch
it can utilize NBD_WRITE_ZEROES (for less traffic).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170429191419.30051-7-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 63188c24)
* prereq for 81c219ac
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

blkdebug: Refactor error injection · 138cf638

Authored by Eric Blake on Apr 29, 2017

Rather than repeat the logic at each caller of checking if a Rule
exists that warrants an error injection, fold that logic into
inject_error(); and rename it to rule_check() for legibility.
This will help the next patch, which adds two more callers that
need to check rules for the potential of injecting errors.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170429191419.30051-6-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit d157ed5f)
* prereq for 81c219ac
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

blkdebug: Sanity check block layer guarantees · a1a3d603

Authored by Eric Blake on Apr 29, 2017

Commits 04ed95f4 and 1a62d0ac updated the block layer to auto-fragment
any I/O to fit within device boundaries. Additionally, when using a
minimum alignment of 4k, we want to ensure the block layer does proper
read-modify-write rather than requesting I/O on a slice of a sector.
Let's enforce that the contract is obeyed when using blkdebug.  For
now, blkdebug only allows alignment overrides, and just inherits other
limits from whatever device it is wrapping, but a future patch will
further enhance things.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170429191419.30051-5-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit e0ef4395)
* prereq for 81c219ac
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

vvfat: fix qemu-img map and qemu-img convert · 636eacb6

Authored by Hervé Poussineau on May 22, 2017

- bs->total_sectors is the number of sectors of the whole disk
- s->sector_count is the number of sectors of the FAT partition

This fixes the following assert in qemu-img map:
qemu-img.c:2641: get_block_status: Assertion `nb_sectors' failed.

This also fixes an infinite loop in qemu-img convert.

Fixes: 4480e0f9
Fixes: https://bugs.launchpad.net/qemu/+bug/1599539
Cc: qemu-stable@nongnu.org
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 139921aa)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

stream: fix crash in stream_start() when block_job_create() fails · c60a8ed8

Authored by Alberto Garcia on May 15, 2017

The code that tries to reopen a BlockDriverState in stream_start()
when the creation of a new block job fails crashes because it attempts
to dereference a pointer that is known to be NULL.

This is a regression introduced in a170a91f,
likely because the code was copied from stream_complete().

Cc: qemu-stable@nongnu.org
Reported-by: Kashyap Chamarthy <kchamart@redhat.com>
Signed-off-by: Alberto Garcia <berto@igalia.com>
Tested-by: Kashyap Chamarthy <kchamart@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 525989a5)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

curl: avoid recursive locking of BDRVCURLState mutex · c79bef68

Authored by Paolo Bonzini on May 15, 2017

The curl driver has an ugly hack where, if it cannot find an empty CURLState,
it just uses aio_poll to wait for one to be empty.  This is probably
buggy when used together with dataplane, and the simplest way to fix it
is to use coroutines instead.

A more immediate effect of the bug, however, is that it can cause a
recursive call to curl_readv_bh_cb, which then takes the
BDRVCURLState mutex recursively.  This causes a deadlock.

The fix is to unlock the mutex around aio_poll, but for cleanliness we
should also take the mutex around all calls to curl_init_state, even if
reaching the unlock/lock pair is impossible.  The same is true for
curl_clean_state.

Reported-by: Kun Wei <kuwei@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20170515100059.15795-4-pbonzini@redhat.com
Cc: qemu-stable@nongnu.org
Cc: Jeff Cody <jcody@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
(cherry picked from commit 456af346)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

curl: never invoke callbacks with s->mutex held · 4b519b9f

Authored by Paolo Bonzini on May 15, 2017

All curl callbacks go through curl_multi_do, and hence are called with
s->mutex held.  Document that with comments, and make curl_read_cb drop
the lock before invoking the callback.

Likewise for curl_find_buf, where the callback can be invoked by the
caller.

Cc: qemu-stable@nongnu.org
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170515100059.15795-3-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
(cherry picked from commit 34db05e7)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

curl: strengthen assertion in curl_clean_state · f00c08cb

Authored by Paolo Bonzini on May 15, 2017

curl_clean_state should only be called after all AIOCBs have been
completed.  This is not so obvious for the call from curl_detach_aio_context,
so assert that.

Cc: qemu-stable@nongnu.org
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170515100059.15795-2-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
(cherry picked from commit 675a7756)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

block: Add errp to b{lk,drv}_truncate() · 5797a36a

Authored by Max Reitz on Mar 28, 2017

For one thing, this allows us to drop the error message generation from
qemu-img.c and blockdev.c and instead have it unified in
bdrv_truncate().

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170328205129.15138-3-mreitz@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit ed3d2ec9)
* prereq for 698bdfa0
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

block/vhdx: Make vhdx_create() always set errp · 73aa7ad7

Authored by Max Reitz on Mar 28, 2017

This patch makes vhdx_create() always set errp in case of an error. It
also adds errp parameters to vhdx_create_bat() and
vhdx_create_new_region_table() so we can pass on the error object
generated by blk_truncate() as of a future commit.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 20170328205129.15138-2-mreitz@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 55b9392b)
* prereq for 698bdfa0
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

01 Aug, 2017 4 commits

qobject: Use simpler QDict/QList scalar insertion macros · e59084b5

Authored by Eric Blake on Apr 27, 2017

We now have macros in place to make it less verbose to add a scalar
to QDict and QList, so use them.

Patch created mechanically via:
  spatch --sp-file scripts/coccinelle/qobject.cocci \
    --macro-file scripts/cocci-macro-file.h --dir . --in-place
then touched up manually to fix a couple of '?:' back to original
spacing, as well as avoiding a long line in monitor.c.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20170427215821.19397-7-eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
(cherry picked from commit 46f5ac20)
* prereq for fc0932fd
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

qobject: Drop useless QObject casts · 3f308bf3

Authored by Eric Blake on Apr 27, 2017

We have macros in place to make it less verbose to add a subtype
of QObject to both QDict and QList. While we have made cleanups
like this in the past (see commit fcfcd8ff, for example), having
it be automated by Coccinelle makes it easier to maintain.

Patch created mechanically via:
  spatch --sp-file scripts/coccinelle/qobject.cocci \
    --macro-file scripts/cocci-macro-file.h --dir . --in-place
then I verified that no manual touchups were required.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20170427215821.19397-5-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
(cherry picked from commit de6e7951)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

block: Do not unref bs->file on error in BD's open · c1059a3a

Authored by Max Reitz on Apr 13, 2017

The block layer takes care of removing the bs->file child if the block
driver's bdrv_open()/bdrv_file_open() implementation fails. The block
driver therefore does not need to do so, and indeed should not unless it
sets bs->file to NULL afterwards -- because if this is not done, the
bdrv_unref_child() in bdrv_open_inherit() will dereference the freed
memory block at bs->file afterwards, which is not good.

We can now decide whether to add a "bs->file = NULL;" after each of the
offending bdrv_unref_child() invocations, or just drop them altogether.
The latter is simpler, so let's do that.
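
The ownership rule above (only the generic layer releases the child when a driver's open fails) can be sketched like this. DemoState and the demo_* functions are hypothetical; the int pointer merely stands in for the bs->file child.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node state; 'file' stands in for the bs->file child. */
typedef struct DemoState {
    int *file;
} DemoState;

static int demo_free_count;  /* counts releases to expose a double free */

static void demo_unref_child(int *child)
{
    if (child != NULL) {
        demo_free_count++;
        free(child);
    }
}

/* A driver open that fails and, per the rule above, does NOT unref. */
static int demo_driver_open(DemoState *bs)
{
    (void)bs;
    return -1;
}

/* Generic layer: attaches the child and is the single owner of cleanup. */
static int demo_open(DemoState *bs)
{
    bs->file = malloc(sizeof(*bs->file));
    assert(bs->file != NULL);

    int ret = demo_driver_open(bs);
    if (ret < 0) {
        demo_unref_child(bs->file);  /* released exactly once, here */
        bs->file = NULL;
    }
    return ret;
}
```

If demo_driver_open() also released bs->file without clearing the pointer, the cleanup in demo_open() would free the same block a second time.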

Cc: qemu-stable <qemu-stable@nongnu.org>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit de234897)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

dirty-bitmap: Report BlockDirtyInfo.count in bytes, as documented · 4aa16db9

Authored by Eric Blake on Jul 21, 2017

We've been documenting the value in bytes since its introduction
in commit b9a9b3a4 (v1.3), where it was actually reported in bytes.

Commit e4654d2d (v2.0) then removed things from block/qapi.c, in
preparation for a rewrite to a list of dirty sectors in the next
commit 21b56835 in block.c, but the new code mistakenly started
reporting in sectors.
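
The unit mix-up amounts to a missing sector-to-byte conversion. The sketch below assumes 512-byte sectors (a shift of 9 bits); DEMO_SECTOR_BITS and demo_dirty_count_bytes() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 512-byte sectors, i.e. a shift of 9 bits. */
#define DEMO_SECTOR_BITS 9

/* Convert a dirty-sector count into the byte count that is documented. */
static uint64_t demo_dirty_count_bytes(uint64_t dirty_sectors)
{
    /* Guard against overflow before shifting. */
    assert(dirty_sectors <= (UINT64_MAX >> DEMO_SECTOR_BITS));
    return dirty_sectors << DEMO_SECTOR_BITS;
}
```

For example, 128 dirty sectors correspond to 65536 bytes; reporting the raw value 128 would understate the dirty amount by a factor of 512.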

Fixes: https://bugzilla.redhat.com/1441460

CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 6c98c57a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>

18 Apr, 2017 1 commit

block: Walk bs->children carefully in bdrv_drain_recurse · 178bd438

Authored by Fam Zheng on Apr 18, 2017

The recursive bdrv_drain_recurse may run a block job completion BH that
drops nodes. The coming changes will make that more likely, and a
use-after-free would happen without this patch.

Stash the bs pointer and use bdrv_ref/bdrv_unref in addition to
QLIST_FOREACH_SAFE to prevent such a case from happening.
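
The "safe" part of the iteration, stashing the next pointer before the current element may disappear, can be sketched with a plain singly linked list. DemoChild and the demo_* helpers are hypothetical, and removing odd-numbered entries is just a stand-in for "a completion callback dropped this node"; the sketch does not model the additional bdrv_ref/bdrv_unref protection.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical child entry in a singly linked list. */
typedef struct DemoChild {
    int id;
    struct DemoChild *next;
} DemoChild;

static DemoChild *demo_push(DemoChild *head, int id)
{
    DemoChild *c = malloc(sizeof(*c));
    assert(c != NULL);
    c->id = id;
    c->next = head;
    return c;
}

/*
 * Walk the list while elements may be removed during the walk.
 * Returns the number of children that were dropped.
 */
static int demo_drain_children(DemoChild **head)
{
    int dropped = 0;
    DemoChild **link = head;
    DemoChild *child, *next;

    for (child = *head; child != NULL; child = next) {
        next = child->next;        /* stash before child may go away */
        if (child->id % 2 != 0) {  /* stand-in for "node was dropped" */
            *link = next;
            free(child);
            dropped++;
        } else {
            link = &child->next;
        }
    }
    return dropped;
}
```

Reading child->next after the free() instead of before it would be exactly the kind of use-after-free the patch guards against.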

Since bdrv_unref accesses global state that is not protected by the AioContext
lock, we cannot use bdrv_ref/bdrv_unref unconditionally.  Fortunately the
protection is not needed in IOThread because only main loop can modify a graph
with the AioContext lock held.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170418143044.12187-2-famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Tested-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>

11 Apr, 2017 9 commits

block/io: Comment out permission assertions · e3e0003a

Authored by Max Reitz on Apr 11, 2017

In case of block migration, there may be writes to BlockBackends that do
not have the write permission taken. Before this issue is fixed (which
is not going to happen in 2.9), we therefore cannot assert that this is
the case.

Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 20170411145050.31290-1-mreitz@redhat.com
Tested-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

sheepdog: Fix crash in co_read_response() · 5eceb01a

Authored by Kevin Wolf on Apr 11, 2017

This fixes a regression introduced in commit 9d456654.

aio_co_wake() can only be used to reenter a coroutine that was already
previously entered, otherwise co->ctx is uninitialised and we access
garbage. Using it immediately after qemu_coroutine_create() like in
co_read_response() is wrong and causes segfaults.

Replace the call with aio_co_enter(), which gets an explicit AioContext
parameter and works even for new coroutines.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Kashyap Chamarthy <kchamart@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1491919733-21065-1-git-send-email-kwolf@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

iscsi: Fix iscsi_create · 2ec9a782

Authored by Fam Zheng on Apr 10, 2017

Since d5895fcb (iscsi: Split URL into individual options), creating
a qcow2 image on an iSCSI LUN fails:

    qemu-img create -f qcow2 iscsi://$SERVER/$IQN/0 1G
    qemu-img: iscsi://$SERVER/$IQN/0: Could not create image: Invalid
        argument

The problem is iscsi_open now expects that transport_name, portal and
target are already parsed into structured options by
iscsi_parse_filename, but it is not called in iscsi_create.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 20170410075451.21329-1-famz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
[mreitz: Dropped now superfluous
         qdict_put(bs_options, "filename", ...)]
Signed-off-by: Max Reitz <mreitz@redhat.com>

throttle: Remove block from group on hot-unplug · 1606e4cf

Authored by Eric Blake on Apr 06, 2017

When a block device that is part of a throttle group is hot-unplugged,
we forgot to remove it from the throttle group. This leaves stale
memory around, and causes an easily reproducible crash:

$ ./x86_64-softmmu/qemu-system-x86_64 -nodefaults -nographic -qmp stdio \
-device virtio-scsi-pci,bus=pci.0 -drive \
id=drive_image2,if=none,format=raw,file=file2,bps=512000,iops=100,group=foo \
-device scsi-hd,id=image2,drive=drive_image2 -drive \
id=drive_image3,if=none,format=raw,file=file3,bps=512000,iops=100,group=foo \
-device scsi-hd,id=image3,drive=drive_image3
{'execute':'qmp_capabilities'}
{'execute':'device_del','arguments':{'id':'image3'}}
{'execute':'system_reset'}

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1428810
Suggested-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20170406190847.29347-1-eblake@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>

block: pass the right options for BlockDriver.bdrv_open() · 7a9e5119

Authored by Dong Jia Shi on Apr 05, 2017

raw_open() expects the caller to always pass in the right actual
@options parameter. But when trying to apply a snapshot on an RBD
image, bdrv_snapshot_goto() calls raw_open() (by calling the
bdrv_open callback on the BlockDriver) with a NULL @options, and
that will result in a segmentation fault.

For the other non-raw format drivers, it also makes sense to pass
in the actual options, although they don't trigger the problem so
far.

Let's prepare a @options by adding the "file" key-value pair to a
copy of the actual options that were given for the node (i.e.
bs->options), and pass it to the callback.

BlockDriver.bdrv_open() expects bs->file to be NULL and just
overwrites it with the result from bdrv_open_child(). That means we
should actually make sure it's NULL because otherwise the child BDS
will have a reference count that is 1 too high. So we unconditionally
invoke bdrv_unref_child() before calling BlockDriver.bdrv_open(), and
we wrap everything in bdrv_ref()/bdrv_unref() so the BDS isn't
deleted in the meantime.

Suggested-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Message-id: 20170405091909.36357-2-bjsdjshi@linux.vnet.ibm.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

sheepdog: Use bdrv_coroutine_enter before BDRV_POLL_WHILE · 76296dff

Authored by Fam Zheng on Apr 11, 2017

When called from main thread, the coroutine should run in the context of
bs. Use bdrv_coroutine_enter to ensure that.

Signed-off-by: Fam Zheng <famz@redhat.com>

block: Fix bdrv_co_flush early return · 49ca6259

Authored by Fam Zheng on Apr 10, 2017

bdrv_inc_in_flight and bdrv_dec_in_flight are mandatory for
BDRV_POLL_WHILE to work, even for the shortcut case where flush is
unnecessary. Move the if block to below bdrv_dec_in_flight, and BTW fix
the variable declaration position.

Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

block: Use bdrv_coroutine_enter to start I/O coroutines · e92f0e19

Authored by Fam Zheng on Apr 10, 2017

BDRV_POLL_WHILE waits for the started I/O by releasing bs's ctx then polling
the main context, which relies on the yielded coroutine continuing on bs->ctx
before notifying qemu_aio_context with bdrv_wakeup().

Thus, using qemu_coroutine_enter to start I/O is wrong: if the coroutine
is entered from the main loop, co->ctx will be qemu_aio_context. As a result
of the "release, poll, acquire" loop of BDRV_POLL_WHILE, race conditions
happen when both the main thread and the iothread access the same BDS:

  main loop                                iothread
-----------------------------------------------------------------------
  blockdev_snapshot
    aio_context_acquire(bs->ctx)
                                           virtio_scsi_data_plane_handle_cmd
    bdrv_drained_begin(bs->ctx)
    bdrv_flush(bs)
      bdrv_co_flush(bs)                      aio_context_acquire(bs->ctx).enter
        ...
        qemu_coroutine_yield(co)
      BDRV_POLL_WHILE()
        aio_context_release(bs->ctx)
                                             aio_context_acquire(bs->ctx).return
                                               ...
                                                 aio_co_wake(co)
        aio_poll(qemu_aio_context)               ...
          co_schedule_bh_cb()                    ...
            qemu_coroutine_enter(co)             ...

              /* (A) bdrv_co_flush(bs)           /* (B) I/O on bs */
                      continues... */
                                             aio_context_release(bs->ctx)
        aio_context_acquire(bs->ctx)

Note that in above case, bdrv_drained_begin() doesn't do the "release,
poll, acquire" in BDRV_POLL_WHILE, because bs->in_flight == 0.

Fix this by using bdrv_coroutine_enter and enter coroutine in the right
context.

iotests 109 output is updated because the coroutine reenter flow during
mirror job complete is different (now through co_queue_wakeup, instead
of the unconditional qemu_coroutine_switch before), making the end job
len different.

Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>

block: Make bdrv_parent_drained_begin/end public · 14e9559f

Authored by Fam Zheng on Apr 08, 2017

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>

07 Apr, 2017 6 commits

mirror: Fix aio context of mirror_top_bs · 19dd29e8

Authored by Fam Zheng on Apr 07, 2017

mirror_top_bs should be moved to the same AioContext as the source before
it is inserted into the graph.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: Don't check permissions for copy on read · 1bf03e66

Authored by Kevin Wolf on Apr 07, 2017

The assertion is currently failing. We can't require callers to have
write permissions when all they are doing is a read, so comment it out.
Add a FIXME comment in the code so that the check is re-enabled when
copy on read is refactored into its own filter driver.

Reported-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>

block/mirror: Fix use-after-free · 7a25fcd0

Authored by Max Reitz on Apr 03, 2017

If @bs does not have any parents, the only reference to @mirror_top_bs
will be held by the BlockJob object after the bdrv_unref() following
block_job_create(). However, if block_job_create() fails, this reference
will not exist and @mirror_top_bs will have been deleted when we
goto fail.

The issue comes back at all later entries to the fail label: We delete
the BlockJob object before rolling back our changes to the node graph.
This means that we will delete @mirror_top_bs in the process.

All in all, whenever @bs does not have any parents and we go down the
fail path we will dereference @mirror_top_bs after it has been deleted.

Fix this by invoking bdrv_unref() only when block_job_create() was
successful and by bdrv_ref()'ing @mirror_top_bs in the fail path before
deleting the BlockJob object. Finally, bdrv_unref() it at the end of the
fail path after we actually no longer need it.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

commit: Set commit_top_bs->total_sectors · 0d0676a1

Authored by Kevin Wolf on Apr 06, 2017

Like in the mirror filter driver, we also need to set the image size for
the commit filter driver. This is less likely to be a problem in
practice than for the mirror because we're not at the active layer here,
but attaching new parents to a node in the middle of the chain is
possible, so the size needs to be correct anyway.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>

commit: Set commit_top_bs->aio_context · 02be4aeb

Authored by Kevin Wolf on Apr 06, 2017

The filter driver that is inserted by the commit job needs to use the
same AioContext as its parent and child nodes.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>

block: Ignore guest dev permissions during incoming migration · d35ff5e6

Authored by Kevin Wolf on Apr 04, 2017

Usually guest devices don't like other writers to the same image, so
they use blk_set_perm() to prevent this from happening. In the migration
phase before the VM is actually running, though, they don't have a
problem with writes to the image. On the other hand, storage migration
needs to be able to write to the image in this phase, so the restrictive
blk_set_perm() call of qdev devices breaks it.

This patch flags all BlockBackends with a qdev device as
blk->disable_perm during incoming migration, which means that the
requested permissions are stored in the BlockBackend, but not actually
applied to its root node yet.

Once migration has finished and the VM should be resumed, the
permissions are applied. If they cannot be applied (e.g. because the NBD
server used for block migration hasn't been shut down), resuming the VM
fails.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Kashyap Chamarthy <kchamart@redhat.com>

03 Apr, 2017 2 commits

block/parallels: Avoid overflows · 86d1bd70

Authored by Max Reitz on Mar 31, 2017

Change the types of variables in allocate_clusters() to int64_t so we do
not have to worry about potential overflows.

Add an assertion that our accesses to s->bat[] do not result in a buffer
overflow and that the implicit conversion performed when invoking
bat_entry_off() does not result in an integer overflow.
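
Widening to int64_t before multiplying, plus a bounds assertion on the table index, can be sketched as follows. The helpers demo_bat_index() and demo_cluster_offset() and the 512-byte sector size are assumptions for illustration; they are not the parallels driver's functions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Map a sector number to an allocation-table index using 64-bit math,
 * asserting that the index stays inside the table.
 */
static uint32_t demo_bat_index(int64_t sector_num, int64_t cluster_sectors,
                               uint32_t bat_size)
{
    assert(sector_num >= 0 && cluster_sectors > 0);

    int64_t idx = sector_num / cluster_sectors;
    assert(idx < (int64_t)bat_size);  /* no out-of-bounds table access */
    return (uint32_t)idx;
}

/*
 * Byte offset of a cluster. The cast happens before the multiplication,
 * so the product is computed in 64 bits rather than wrapping in 32.
 */
static int64_t demo_cluster_offset(uint32_t idx, uint32_t cluster_sectors)
{
    return (int64_t)idx * cluster_sectors * 512;
}
```

With idx = 4194304 and 2048 sectors per cluster, the product is 2^42 bytes, which would not fit in a 32-bit intermediate.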

Coverity-id: 1307776
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170331170512.10381-1-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Max Reitz <mreitz@redhat.com>

qcow2: Discard unaligned tail when wiping image · 0c1bd469

Authored by Eric Blake on Mar 31, 2017

There is a subtle difference between the fast (qcow2v3 with no
extra data) and slow path (qcow2v2 format [aka 0.10], or when a
snapshot is present) of qcow2_make_empty().  The slow path fails
to discard the final (partial) cluster of an unaligned image.

The problem stems from the fact that qcow2_discard_clusters() was
silently ignoring sub-cluster head and tail on unaligned requests.
A quick audit of all callers shows that qcow2_snapshot_create() has
always passed a cluster-aligned request since the call was added
in commit 1ebf561c; qcow2_co_pdiscard() has passed a cluster-aligned
request since commit ecdbead6 taught the block layer about preferred
discard alignment; and qcow2_make_empty() was fixed to pass an
aligned start (but not necessarily end) in commit a3e1505d.

Asserting that the start is always aligned also points out that we
now have a dead check: rounding the end offset down can never result
in a value less than the aligned start offset (the check was rendered
dead with commit ecdbead6).  Meanwhile, we do not want to round the
end cluster down in the one case of the end offset matching the
(unaligned) file size - that final partial cluster should still be
discarded.
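
The end-offset rule from the paragraph above (round down to a cluster boundary, except when the request ends exactly at the possibly unaligned end of the image) can be sketched as a small helper. demo_discard_end() and its parameters are hypothetical names; this is an illustration of the rule, not the qcow2 code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compute where a discard starting at 'start' should stop.
 * The start must be cluster aligned. The end is rounded down to a
 * cluster boundary unless it coincides with the end of the image, in
 * which case the final partial cluster is discarded as well.
 */
static uint64_t demo_discard_end(uint64_t start, uint64_t bytes,
                                 uint64_t cluster_size, uint64_t image_size)
{
    assert(cluster_size > 0);
    assert(start % cluster_size == 0);  /* caller aligns the start */

    uint64_t end = start + bytes;
    assert(end <= image_size);

    if (end != image_size) {
        end -= end % cluster_size;      /* round down mid-image */
    }
    return end;
}
```

For a 131584-byte image with 64k clusters, a request covering the whole image keeps its unaligned end, while a 70000-byte request is trimmed to 65536.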

With those fixes in place, the fast and slow paths are back in sync
at discarding an entire image; the next patch will update
qemu-iotests to ensure we don't regress.

Note that bdrv_co_pdiscard ignores ALL partial cluster requests,
including the partial cluster at the end of an image; it can be
argued that the partial cluster at the end should be special-cased
so that a guest issuing discard requests at proper alignments
everywhere else can likewise empty the entire image.  But that
optimization is left for another day.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20170331185356.2479-3-eblake@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
