提交 · 9f4793d8f2f1abf20dc1d7d55f65ce8df081ca7a · openeuler / qemu

25 2月, 2019 15 次提交

block: Add bdrv_make_absolute_filename() · 9f4793d8

由 Max Reitz 提交于 2月 01, 2019

This is a general function for making a filename that is relative to a
certain BDS absolute.

It calls bdrv_get_full_backing_filename_from_filename() for now, but
that will be changed in a follow-up patch.
Signed-off-by: NMax Reitz <mreitz@redhat.com>
Reviewed-by: NAlberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-13-mreitz@redhat.com
Signed-off-by: NMax Reitz <mreitz@redhat.com>

9f4793d8

block: bdrv_get_full_backing_filename's ret. val. · 6b6833c1

由 Max Reitz 提交于 2月 01, 2019

Make bdrv_get_full_backing_filename() return an allocated string instead
of placing the result in a caller-provided buffer.
Signed-off-by: NMax Reitz <mreitz@redhat.com>
Reviewed-by: NAlberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-12-mreitz@redhat.com
Signed-off-by: NMax Reitz <mreitz@redhat.com>

6b6833c1

block: bdrv_get_full_backing_filename_from_...'s ret. val. · 645ae7d8

由 Max Reitz 提交于 2月 01, 2019

Make bdrv_get_full_backing_filename_from_filename() return an allocated
string instead of placing the result in a caller-provided buffer.
Signed-off-by: NMax Reitz <mreitz@redhat.com>
Message-id: 20190201192935.18394-11-mreitz@redhat.com
Signed-off-by: NMax Reitz <mreitz@redhat.com>

645ae7d8

block: Make path_combine() return the path · 009b03aa

由 Max Reitz 提交于 2月 01, 2019

Besides being safe for arbitrary path lengths, after some follow-up
patches all callers will want a freshly allocated buffer anyway.

In the meantime, path_combine_deprecated() is added which has the same
interface as path_combine() had before this patch. All callers to that
function will be converted in follow-up patches.
Signed-off-by: NMax Reitz <mreitz@redhat.com>
Reviewed-by: NAlberto Garcia <berto@igalia.com>
Reviewed-by: NKevin Wolf <kwolf@redhat.com>
Message-id: 20190201192935.18394-10-mreitz@redhat.com
Signed-off-by: NMax Reitz <mreitz@redhat.com>

009b03aa

block: Respect backing bs in bdrv_refresh_filename · 90993623

由 Max Reitz 提交于 2月 01, 2019

Basically, bdrv_refresh_filename() should respect all children of a
BlockDriverState. However, generally those children are driver-specific,
so this function cannot handle the general case. On the other hand,
there are only few drivers which use other children than @file and
@backing (that being vmdk, quorum, and blkverify).

Most block drivers only use @file and/or @backing (if they use any
children at all). Both can be implemented directly in
bdrv_refresh_filename.

The user overriding the file's filename is already handled, however, the
user overriding the backing file is not. If this is done, opening the
BDS with the plain filename of its file will not be correct, so we may
not set bs->exact_filename in that case.

iotest 051 contains test cases for overriding the backing file, and so
its output changes with this patch applied.
Signed-off-by: NMax Reitz <mreitz@redhat.com>
Reviewed-by: NAlberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-6-mreitz@redhat.com
Signed-off-by: NMax Reitz <mreitz@redhat.com>

90993623

block: Add BDS.auto_backing_file · 998c2019

由 Max Reitz 提交于 2月 01, 2019

If the backing file is overridden, this most probably does change the
guest-visible data of a BDS. Therefore, we will need to consider this
in bdrv_refresh_filename().

To see whether it has been overridden, we might want to compare
bs->backing_file and bs->backing->bs->filename. However,
bs->backing_file is changed by bdrv_set_backing_hd() (which is just used
to change the backing child at runtime, without modifying the image
header), so bs->backing_file most of the time simply contains a copy of
bs->backing->bs->filename anyway, so it is useless for such a
comparison.

This patch adds an auto_backing_file BDS field which contains the
backing file path as indicated by the image header, which is not changed
by bdrv_set_backing_hd().

Because of bdrv_refresh_filename() magic, however, a BDS's filename may
differ from what has been specified during bdrv_open(). Then, the
comparison between bs->auto_backing_file and bs->backing->bs->filename
may fail even though bs->backing was opened from bs->auto_backing_file.
To mitigate this, we can copy the real BDS's filename (after the whole
bdrv_open() and bdrv_refresh_filename() process) into
bs->auto_backing_file, if we know the former has been opened based on
the latter. This is only possible if no options modifying the backing
file's behavior have been specified, though. To simplify things, this
patch only copies the filename from the backing file if no options have
been specified for it at all.

Furthermore, there are cases where an overlay is created by qemu which
already contains a BDS's filename (e.g. in blockdev-snapshot-sync). We
do not need to worry about updating the overlay's bs->auto_backing_file
there, because we actually wrote a post-bdrv_refresh_filename() filename
into the image header.

So all in all, there will be false negatives where (as of a future
patch) bdrv_refresh_filename() will assume that the backing file differs
from what was specified in the image header, even though it really does
not. However, these cases should be limited to where (1) the user
actually did override something in the backing chain (e.g. by specifying
options for the backing file), or (2) the user executed a QMP command to
change some node's backing file (e.g. change-backing-file or
block-commit with @backing-file given) where the given filename does not
happen to coincide with qemu's idea of the backing BDS's filename.

Then again, (1) really is limited to -drive. With -blockdev or
blockdev-add, you have to adhere to the schema, so a user cannot give
partial "unimportant" options (e.g. by just setting backing.node-name
and leaving the rest to the image header). Therefore, trying to fix
this would mean trying to fix something for -drive only.

To improve on (2), we would need a full infrastructure to "canonicalize"
an arbitrary filename (+ options), so it can be compared against
another. That seems a bit over the top, considering that filenames
nowadays are there mostly for the user's entertainment.
Signed-off-by: NMax Reitz <mreitz@redhat.com>
Reviewed-by: NEric Blake <eblake@redhat.com>
Reviewed-by: NAlberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-5-mreitz@redhat.com
Signed-off-by: NMax Reitz <mreitz@redhat.com>

998c2019

block: Skip implicit nodes for filename info · bb808d5f

由 Max Reitz 提交于 2月 01, 2019

bdrv_refresh_filename() should simply skip all implicit nodes.  They are
supposed to be invisible to the user, so they should not appear in
filename information.
Signed-off-by: NMax Reitz <mreitz@redhat.com>
Reviewed-by: NEric Blake <eblake@redhat.com>
Reviewed-by: NAlberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-4-mreitz@redhat.com
Signed-off-by: NMax Reitz <mreitz@redhat.com>

bb808d5f

block: Use children list in bdrv_refresh_filename · e24518e3

由 Max Reitz 提交于 2月 01, 2019

bdrv_refresh_filename() should invoke itself recursively on all
children, not just on file.

With that change, we can remove the manual invocations in blkverify,
quorum, commit, mirror, and blklogwrites.
Signed-off-by: NMax Reitz <mreitz@redhat.com>
Reviewed-by: NEric Blake <eblake@redhat.com>
Reviewed-by: NAlberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-3-mreitz@redhat.com
Signed-off-by: NMax Reitz <mreitz@redhat.com>

e24518e3

block: Use bdrv_refresh_filename() to pull · f30c66ba

由 Max Reitz 提交于 2月 01, 2019

Before this patch, bdrv_refresh_filename() is used in a pushing manner:
Whenever the BDS graph is modified, the parents of the modified edges
are supposed to be updated (recursively upwards).  However, that is
nonviable, considering that we want child changes not to concern
parents.

Also, in the long run we want a pull model anyway: Here, we would have a
bdrv_filename() function which returns a BDS's filename, freshly
constructed.

This patch is an intermediate step.  It adds bdrv_refresh_filename()
calls before every place a BDS.filename value is used.  The only
exceptions are protocol drivers that use their own filename, which
clearly would not profit from refreshing that filename before.

Also, bdrv_get_encrypted_filename() is removed along the way (as a user
of BDS.filename), since it is completely unused.

In turn, all of the calls to bdrv_refresh_filename() before this patch
are removed, because we no longer have to call this function on graph
changes.
Signed-off-by: NMax Reitz <mreitz@redhat.com>
Message-id: 20190201192935.18394-2-mreitz@redhat.com
Reviewed-by: NEric Blake <eblake@redhat.com>
Signed-off-by: NMax Reitz <mreitz@redhat.com>

f30c66ba

block: fix bdrv_check_perm for non-tree subgraph · f962e961

由 Vladimir Sementsov-Ogievskiy 提交于 2月 23, 2019

bdrv_check_perm in it's recursion checks each node in context of new
permissions for one parent, because of nature of DFS. It works well,
while children subgraph of top-most updated node is a tree, i.e. it
doesn't have any kind of loops. But if we have a loop (not oriented,
of course), i.e. we have two different ways from top-node to some
child-node, then bdrv_check_perm will do wrong thing:

  top
  | \
  |  |
  v  v
  A  B
  |  |
  v  v
  node

It will once check new permissions of node in context of new A
permissions and old B permissions and once visa-versa. It's a wrong way
and may lead to corruption of permission system. We may start with
no-permissions and all-shared for both A->node and B->node relations
and finish up with non shared write permission for both ways.

The following commit will add a test, which shows this bug.

To fix this situation, let's really set BdrvChild permissions during
bdrv_check_perm procedure. And we are happy here, as check-perm is
already written in transaction manner, so we just need to restore
backed-up permissions in _abort.
Signed-off-by: NVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

f962e961

block: improve should_update_child · 2f30b7c3

由 Vladimir Sementsov-Ogievskiy 提交于 2月 23, 2019

As it already said in the comment, we don't want to create loops in
parent->child relations. So, when we try to append @to to @c, we should
check that @c is not in @to children subtree, and we should check it
recursively, not only the first level. The patch provides BFS-based
search, to check the relations.

This is needed for further fleecing-hook filter usage: we need to
append it to source, when the hook is already a parent of target, and
source may be in a backing chain of target (fleecing-scheme). So, on
appending, the hook should not became a child (direct or through
children subtree) of the target.
Signed-off-by: NVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

2f30b7c3

block: Use normal drain for bdrv_set_aio_context() · d70d5954

由 Kevin Wolf 提交于 2月 08, 2019

Now that bdrv_set_aio_context() works inside drained sections, it can
also use the real drain function instead of open coding something
similar.
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

d70d5954

block: Fix AioContext switch for drained node · e64f25f3

由 Kevin Wolf 提交于 2月 08, 2019

When a drained node changes its AioContext, we need to move its
aio_disable_external() to the new context, too.

Without this fix, drain_end will try to reenable the new context, which
has never been disabled, so an assertion failure is triggered.
Signed-off-by: NKevin Wolf <kwolf@redhat.com>
Reviewed-by: NEric Blake <eblake@redhat.com>

e64f25f3

block: Don't poll in bdrv_set_aio_context() · 6c75d761

由 Kevin Wolf 提交于 2月 15, 2019

The explicit aio_poll() call in bdrv_set_aio_context() was added in
commit c2b6428d as a workaround for bdrv_drain() failing to achieve
to actually quiesce everything (specifically the NBD client code to
switch AioContext).

Now that the NBD client has been fixed to complete this operation during
bdrv_drain(), we don't need the workaround any more.

It was wrong anyway: aio_poll() must always be run in the home thread of
the AioContext.
Signed-off-by: NKevin Wolf <kwolf@redhat.com>
Reviewed-by: NEric Blake <eblake@redhat.com>

6c75d761

block: don't set the same context · 57830a49

由 Denis Plotnikov 提交于 2月 15, 2019

Adds a fast path on aio context setting preventing
unnecessary context setting routine.
Also, it prevents issues with cyclic walk of child
bds-es appeared because of registering aio walking
notifiers:

Call stack:

0  __GI_raise
1  __GI_abort
2  __assert_fail_base
3  __GI___assert_fail
4  bdrv_detach_aio_context (bs=0x55f54d65c000)      <<<
5  bdrv_detach_aio_context (bs=0x55f54fc8a800)
6  bdrv_set_aio_context (bs=0x55f54fc8a800, ...)
7  block_job_attached_aio_context
8  bdrv_attach_aio_context (bs=0x55f54d65c000, ...) <<<
9  bdrv_set_aio_context (bs=0x55f54d65c000)
10 blk_set_aio_context
11 virtio_blk_data_plane_stop
12 virtio_bus_stop_ioeventfd
13 virtio_vmstate_change
14 vm_state_notify (running=0, state=RUN_STATE_SHUTDOWN)
15 do_vm_stop (state=RUN_STATE_SHUTDOWN, send_stop=true)
16 vm_stop (state=RUN_STATE_SHUTDOWN)
17 main_loop_should_exit
18 main_loop
19 main

This can happen because of "new" context attachment to VM disk bds.
When attaching a new context the corresponding aio context handler is
called for each of aio_notifiers registered on the VM disk bds context.
Among those handlers, there is the block_job_attached_aio_context handler
which sets a new aio context for the block job bds. When doing so,
the old context is detached from all the block job bds children and one of
them is the VM disk bds, serving as backing store for the blockjob bds,
although the VM disk bds is actually the initializer of that process.
Since the VM disk bds is protected with walking_aio_notifiers flag
from double processing in recursive calls, the assert fires.
Signed-off-by: NDenis Plotnikov <dplotnikov@virtuozzo.com>
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

57830a49

12 2月, 2019 1 次提交

bdrv_query_image_info Error parameter added · 1bf6e9ca

由 Andrey Shinkevich 提交于 2月 08, 2019

Inform a user in case qcow2_get_specific_info fails to obtain
QCOW2 image specific information. This patch is preliminary to
the one "qcow2: Add list of bitmaps to ImageInfoSpecificQCow2".
Signed-off-by: NAndrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: NEric Blake <eblake@redhat.com>
Reviewed-by: NVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: NKevin Wolf <kwolf@redhat.com>
Message-Id: <1549638368-530182-2-git-send-email-andrey.shinkevich@virtuozzo.com>
Signed-off-by: NEric Blake <eblake@redhat.com>

1bf6e9ca

01 2月, 2019 3 次提交

block: Fix invalidate_cache error path for parent activation · 78fc3b3a

由 Kevin Wolf 提交于 1月 31, 2019

bdrv_co_invalidate_cache() clears the BDRV_O_INACTIVE flag before
actually activating a node so that the correct permissions etc. are
taken. In case of errors, the flag must be restored so that the next
call to bdrv_co_invalidate_cache() retries activation.

Restoring the flag was missing in the error path for a failed
parent->role->activate() call. The consequence is that this attempt to
activate all images correctly fails because we still set errp, however
on the next attempt BDRV_O_INACTIVE is already clear, so we return
success without actually retrying the failed action.

An example where this is observable in practice is migration to a QEMU
instance that has a raw format block node attached to a guest device
with share-rw=off (the default) while another process holds
BLK_PERM_WRITE for the same image. In this case, all activation steps
before parent->role->activate() succeed because raw can tolerate other
writers to the image. Only the parent callback (in particular
blk_root_activate()) tries to implement the share-rw=on property and
requests exclusive write permissions. This fails when the migration
completes and correctly displays an error. However, a manual 'cont' will
incorrectly resume the VM without calling blk_root_activate() again.

This case is described in more detail in the following bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=1531888

Fix this by correctly restoring the BDRV_O_INACTIVE flag in the error
path.

Cc: qemu-stable@nongnu.org
Signed-off-by: NKevin Wolf <kwolf@redhat.com>
Tested-by: NMarkus Armbruster <armbru@redhat.com>
Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>

78fc3b3a

block: Apply auto-read-only for ro-whitelist drivers · 8be25de6

由 Kevin Wolf 提交于 1月 22, 2019

If QEMU was configured with a driver in --block-drv-ro-whitelist, trying
to use that driver read-write resulted in an error message even if
auto-read-only=on was set.

Consider auto-read-only=on for the whitelist checking and use it to
automatically degrade to read-only for block drivers on the read-only
whitelist.
Signed-off-by: NKevin Wolf <kwolf@redhat.com>
Reviewed-by: NEric Blake <eblake@redhat.com>
Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>

8be25de6

block: Fix hangs in synchronous APIs with iothreads · 4720cbee

由 Kevin Wolf 提交于 1月 07, 2019

In the block layer, synchronous APIs are often implemented by creating a
coroutine that calls the asynchronous coroutine-based implementation and
then waiting for completion with BDRV_POLL_WHILE().

For this to work with iothreads (more specifically, when the synchronous
API is called in a thread that is not the home thread of the block
device, so that the coroutine will run in a different thread), we must
make sure to call aio_wait_kick() at the end of the operation. Many
places are missing this, so that BDRV_POLL_WHILE() keeps hanging even if
the condition has long become false.

Note that bdrv_dec_in_flight() involves an aio_wait_kick() call. This
corresponds to the BDRV_POLL_WHILE() in the drain functions, but it is
generally not enough for most other operations because they haven't set
the return value in the coroutine entry stub yet. To avoid race
conditions there, we need to kick after setting the return value.

The race window is small enough that the problem doesn't usually surface
in the common path. However, it does surface and causes easily
reproducible hangs if the operation can return early before even calling
bdrv_inc/dec_in_flight, which many of them do (trivial error or no-op
success paths).

The bug in bdrv_truncate(), bdrv_check() and bdrv_invalidate_cache() is
slightly different: These functions even neglected to schedule the
coroutine in the home thread of the node. This avoids the hang, but is
obviously wrong, too. Fix those to schedule the coroutine in the right
AioContext in addition to adding aio_wait_kick() calls.

Cc: qemu-stable@nongnu.org
Signed-off-by: NKevin Wolf <kwolf@redhat.com>
Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>

4720cbee

31 1月, 2019 1 次提交

qapi: add x-debug-query-block-graph · 5d3b4e99

由 Vladimir Sementsov-Ogievskiy 提交于 12月 21, 2018

Add a new command, returning block nodes (and their users) graph.
Signed-off-by: NVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20181221170909.25584-2-vsementsov@virtuozzo.com
Signed-off-by: NMax Reitz <mreitz@redhat.com>

5d3b4e99

14 12月, 2018 7 次提交

block: Assert that flags are up-to-date in bdrv_reopen_prepare() · e6d79c41

由 Alberto Garcia 提交于 11月 12, 2018

Towards the end of bdrv_reopen_queue_child(), before starting to
process the children, the update_flags_from_options() function is
called in order to have BDRVReopenState.flags in sync with the options
from the QDict.

This is necessary because during the reopen process flags must be
updated for all nodes in the queue so bdrv_is_writable_after_reopen()
and the permission checks work correctly.

Because of that, calling update_flags_from_options() again in
bdrv_reopen_prepare() doesn't really change the flags (they are
already up-to-date). But we need to call it in order to remove those
options from QemuOpts and that way indicate that they have been
processed.
Signed-off-by: NAlberto Garcia <berto@igalia.com>
Reviewed-by: NMax Reitz <mreitz@redhat.com>
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

e6d79c41

block: Remove assertions from update_flags_from_options() · 8eb4b07b

由 Alberto Garcia 提交于 11月 12, 2018

This function takes four options (cache.direct, cache.no-flush,
read-only and auto-read-only) from a QemuOpts object and updates the
flags accordingly.

If any of those options is not set (because it was missing from the
original QDict or because it had an invalid value) then the function
aborts with a failed assertion:

   $ qemu-io -c 'reopen -o read-only=foo' hd.qcow2
   block.c:1126: update_flags_from_options: Assertion `qemu_opt_find(opts, BDRV_OPT_CACHE_DIRECT)' failed.
   Aborted

This assertion is unnecessary, and it forces any caller of
bdrv_reopen() to pass all the aforementioned four options. This may
have made sense in order to remove ambiguity when bdrv_reopen() was
taking both flags and options, but that's not the case anymore.

It's also unnecessary if we want to validate the option values,
because bdrv_reopen_prepare() already takes care of that, as we can
see if we remove the assertions:

   $ qemu-io -c 'reopen -o read-only=foo' hd.qcow2
   Parameter 'read-only' expects 'on' or 'off'
Signed-off-by: NAlberto Garcia <berto@igalia.com>
Reviewed-by: NMax Reitz <mreitz@redhat.com>
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

8eb4b07b

block: Stop passing flags to bdrv_reopen_queue_child() · 9aa09ddd

由 Alberto Garcia 提交于 11月 12, 2018

Now that all callers are passing the new options using the QDict we no
longer need the 'flags' parameter.

This patch makes the following changes:

   1) The update_options_from_flags() call is no longer necessary
      so it can be removed.

   2) The update_flags_from_options() call is now used in all cases,
      and is moved down a few lines so it happens after the options
      QDict contains the final set of values.

   3) The flags parameter is removed. Now the flags are initialized
      using the current value (for the top-level node) or the parent
      flags (after inherit_options()). In both cases the initial
      values are updated to reflect the new options in the QDict. This
      happens in bdrv_reopen_queue_child() (as explained above) and in
      bdrv_reopen_prepare().
Signed-off-by: NAlberto Garcia <berto@igalia.com>
Reviewed-by: NMax Reitz <mreitz@redhat.com>
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

9aa09ddd

block: Remove flags parameter from bdrv_reopen_queue() · 2e891722

由 Alberto Garcia 提交于 11月 12, 2018

Now that all callers are passing all flag changes as QDict options,
the flags parameter is no longer necessary, so we can get rid of it.
Signed-off-by: NAlberto Garcia <berto@igalia.com>
Reviewed-by: NMax Reitz <mreitz@redhat.com>
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

2e891722

block: Drop bdrv_reopen() · 295cf237

由 Alberto Garcia 提交于 11月 12, 2018

No one is using this function anymore, so we can safely remove it.
Signed-off-by: NAlberto Garcia <berto@igalia.com>
Reviewed-by: NMax Reitz <mreitz@redhat.com>
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

295cf237

block: Use bdrv_reopen_set_read_only() in bdrv_backing_update_filename() · e94d3dba

由 Alberto Garcia 提交于 11月 12, 2018

This patch replaces the bdrv_reopen() calls that set and remove the
BDRV_O_RDWR flag with the new bdrv_reopen_set_read_only() function.
Signed-off-by: NAlberto Garcia <berto@igalia.com>
Reviewed-by: NMax Reitz <mreitz@redhat.com>
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

e94d3dba

block: Add bdrv_reopen_set_read_only() · 6e1000a8

由 Alberto Garcia 提交于 11月 12, 2018

Most callers of bdrv_reopen() only use it to switch a BlockDriverState
between read-only and read-write, so this patch adds a new function
that does just that.

We also want to get rid of the flags parameter in the bdrv_reopen()
API, so this function sets the "read-only" option and passes the
original flags (which will then be updated in bdrv_reopen_prepare()).
Signed-off-by: NAlberto Garcia <berto@igalia.com>
Reviewed-by: NMax Reitz <mreitz@redhat.com>
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

6e1000a8

27 11月, 2018 1 次提交

block: Don't inactivate children before parents · 9e37271f

由 Kevin Wolf 提交于 11月 23, 2018

bdrv_child_cb_inactivate() asserts that parents are already inactive
when children get inactivated. This precondition is necessary because
parents could still issue requests in their inactivation code.

When block nodes are created individually with -blockdev, all of them
are monitor owned and will be returned by bdrv_next() in an undefined
order (in practice, in the order of their creation, which is usually
children before parents), which obviously fails the assertion:

qemu: block.c:899: bdrv_child_cb_inactivate: Assertion `bs->open_flags & BDRV_O_INACTIVE' failed.

This patch fixes the ordering by skipping nodes with still active
parents in bdrv_inactivate_recurse() because we know that they will be
covered by recursion when the last active parent becomes inactive.

With the correct parents-before-children ordering, we also got rid of
the reason why commit aad0b7a0 introduced two passes, so we can go
back to a single-pass recursion. This is necessary so we can rely on the
BDRV_O_INACTIVE flag to skip nodes with active parents (the flag used
to be set only in pass 2, so we would always skip non-root nodes in
pass 1 because all parents would still be considered active; setting the
flag in pass 1 would mean, that we never skip anything in pass 2 because
all parents are already considered inactive).

Because of the change to single pass, this patch is best reviewed with
whitespace changes ignored.
Signed-off-by: NKevin Wolf <kwolf@redhat.com>
Reviewed-by: NMax Reitz <mreitz@redhat.com>

9e37271f

23 11月, 2018 2 次提交

block: Update BlockDriverState.inherits_from on bdrv_drop_intermediate() · 6bd858b3

由 Alberto Garcia 提交于 10月 31, 2018

The previous patch fixed the inherits_from pointer after block-stream,
and this one does the same for block-commit.

When block-commit finishes and the 'top' node is not the topmost one
from the backing chain then all nodes above 'base' up to and including
'top' are removed from the chain.

The bdrv_drop_intermediate() call converts a chain like this one:

    base <- intermediate <- top <- active

into this one:

    base <- active

In a simple scenario each backing file from the first chain has the
inherits_from attribute pointing to its parent. This means that
reopening 'active' will recursively reopen all its children, whose
options can be changed in the process.

However after the 'block-commit' call base.inherits_from is NULL and
the chain is broken, so 'base' does not inherit from 'active' and will
not be reopened automatically:

   $ qemu-img create -f qcow2 hd0.qcow2 1M
   $ qemu-img create -f qcow2 -b hd0.qcow2 hd1.qcow2
   $ qemu-img create -f qcow2 -b hd1.qcow2 hd2.qcow2
   $ $QEMU -drive if=none,file=hd2.qcow2

   { 'execute': 'block-commit',
     'arguments': {
       'device': 'none0',
       'top': 'hd1.qcow2' } }

   { 'execute': 'human-monitor-command',
     'arguments': {
        'command-line':
          'qemu-io none0 "reopen -o backing.l2-cache-size=2M"' } }

   { "return": "Cannot change the option 'backing.l2-cache-size'\r\n"}

This patch updates base.inherits_from in this scenario, and adds a
test case.
Signed-off-by: NAlberto Garcia <berto@igalia.com>
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

6bd858b3

block: Update BlockDriverState.inherits_from on bdrv_set_backing_hd() · 0065c455

由 Alberto Garcia 提交于 10月 31, 2018

When a BlockDriverState's child is opened (be it a backing file, the
protocol layer, or any other) inherits_from is set to point to the
parent node. Children opened separately and then attached to a parent
don't have this pointer set.

bdrv_reopen_queue_child() uses this to determine whether a node's
children must also be reopened inheriting the options from the parent
or not. If inherits_from points to the parent then the child is
reopened and its options can be changed, like in this example:

   $ qemu-img create -f qcow2 hd0.qcow2 1M
   $ qemu-img create -f qcow2 hd1.qcow2 1M
   $ $QEMU -drive if=none,node-name=hd0,file=hd0.qcow2,\
                  backing.driver=qcow2,backing.file.filename=hd1.qcow2
   (qemu) qemu-io hd0 "reopen -o backing.l2-cache-size=2M"

If the child does not inherit from the parent then it does not get
reopened and its options cannot be changed:

   $ $QEMU -drive if=none,node-name=hd1,file=hd1.qcow2
           -drive if=none,node-name=hd0,file=hd0.qcow2,backing=hd1
   (qemu) qemu-io hd0 "reopen -o backing.l2-cache-size=2M"
   Cannot change the option 'backing.l2-cache-size'

If a disk image has a chain of backing files then all of them are also
connected through their inherits_from pointers (i.e. it's possible to
walk the chain in reverse order from base to top).

However this is broken if the intermediate nodes are removed using
e.g. block-stream because the inherits_from pointer from the base node
becomes NULL:

   $ qemu-img create -f qcow2 hd0.qcow2 1M
   $ qemu-img create -f qcow2 -b hd0.qcow2 hd1.qcow2
   $ qemu-img create -f qcow2 -b hd1.qcow2 hd2.qcow2
   $ $QEMU -drive if=none,file=hd2.qcow2
   (qemu) qemu-io none0 "reopen -o backing.l2-cache-size=2M"
   (qemu) block_stream none0 0 hd0.qcow2
   (qemu) qemu-io none0 "reopen -o backing.l2-cache-size=2M"
   Cannot change the option 'backing.l2-cache-size'

This patch updates the inherits_from pointer if the intermediate nodes
of a backing chain are removed using bdrv_set_backing_hd(), and adds a
test case for this scenario.
Signed-off-by: NAlberto Garcia <berto@igalia.com>
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

0065c455

22 11月, 2018 1 次提交

block: Fix update of BDRV_O_AUTO_RDONLY in update_flags_from_options() · 2a3d4331

由 Alberto Garcia 提交于 11月 12, 2018

Commit e35bdc12 added the auto-read-only option and the
code to update its corresponding flag in update_flags_from_options(),
but forgot to clear the flag if auto-read-only is false.
Signed-off-by: NAlberto Garcia <berto@igalia.com>
Reported-by: NMax Reitz <mreitz@redhat.com>
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

2a3d4331

19 11月, 2018 1 次提交

block: Always abort reopen after prepare succeeded · 9ad08c44

由 Max Reitz 提交于 11月 16, 2018

bdrv_reopen_multiple() does not invoke bdrv_reopen_abort() for the
element of the reopen queue for which bdrv_reopen_prepare() failed,
because it assumes that the prepare function will have rolled back all
changes already.

However, bdrv_reopen_prepare() does not do this in every case: It may
notice an error after BlockDriver.bdrv_reopen_prepare() succeeded, and
it will not invoke BlockDriver.bdrv_reopen_abort() then; and neither
will bdrv_reopen_multiple(), as explained above.

This is wrong because we must always call .bdrv_reopen_commit() or
.bdrv_reopen_abort() after .bdrv_reopen_prepare() has succeeded.
Otherwise, the block driver has no chance to undo what it has done in
its implementation of .bdrv_reopen_prepare().

To fix this, bdrv_reopen_prepare() has to call .bdrv_reopen_abort() if
it wants to return an error after .bdrv_reopen_prepare() has succeeded.
Signed-off-by: NMax Reitz <mreitz@redhat.com>
Reviewed-by: NAlberto Garcia <berto@igalia.com>
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

9ad08c44

05 11月, 2018 4 次提交

block: Require auto-read-only for existing fallbacks · eaa2410f

由 Kevin Wolf 提交于 10月 12, 2018

Some block drivers have traditionally changed their node to read-only
mode without asking the user. This behaviour has been marked deprecated
since 2.11, expecting users to provide an explicit read-only=on option.

Now that we have auto-read-only=on, enable these drivers to make use of
the option.

This is the only use of bdrv_set_read_only(), so we can make it a bit
more specific and turn it into a bdrv_apply_auto_read_only() that is
more convenient for drivers to use.
Signed-off-by: NKevin Wolf <kwolf@redhat.com>
Reviewed-by: NEric Blake <eblake@redhat.com>

eaa2410f

block: Add auto-read-only option · e35bdc12

由 Kevin Wolf 提交于 10月 05, 2018

If a management application builds the block graph node by node, the
protocol layer doesn't inherit its read-only option from the format
layer any more, so it must be set explicitly.

Backing files should work on read-only storage, but at the same time, a
block job like commit should be able to reopen them read-write if they
are on read-write storage. However, without option inheritance, reopen
only changes the read-only option for the root node (typically the
format layer), but not the protocol layer, so reopening fails (the
format layer wants to get write permissions, but the protocol layer is
still read-only).

A simple workaround for the problem in the management tool would be to
open the protocol layer always read-write and to make only the format
layer read-only for backing files. However, sometimes the file is
actually stored on read-only storage and we don't know whether the image
can be opened read-write (for example, for NBD it depends on the server
we're trying to connect to). This adds an option that makes QEMU try to
open the image read-write, but allows it to degrade to a read-only mode
without returning an error.

The documentation for this option is consciously phrased in a way that
allows QEMU to switch to a better model eventually: Instead of trying
when the image is first opened, making the read-only flag dynamic and
changing it automatically whenever the first BLK_PERM_WRITE user is
attached or the last one is detached would be much more useful
behaviour.

Unfortunately, this more useful behaviour is also a lot harder to
implement, and libvirt needs a solution now before it can switch to
-blockdev, so let's start with this easier approach for now.

Instead of adding a new auto-read-only option, turning the existing
read-only into an enum (with a bool alternate for compatibility) was
considered, but it complicated the implementation to the point that it
didn't seem to be worth it.
Signed-off-by: NKevin Wolf <kwolf@redhat.com>
Reviewed-by: NEric Blake <eblake@redhat.com>

e35bdc12

block: Update flags in bdrv_set_read_only() · eeae6a59

由 Kevin Wolf 提交于 10月 09, 2018

To fully change the read-only state of a node, we must not only change
bs->read_only, but also update bs->open_flags.
Signed-off-by: NKevin Wolf <kwolf@redhat.com>
Reviewed-by: NEric Blake <eblake@redhat.com>
Reviewed-by: NAlberto Garcia <berto@igalia.com>

eeae6a59

A
block: replace "discard" literal with BDRV_OPT_DISCARD macro · 415bbca8
由 Alberto Garcia 提交于 10月 03, 2018
```
Signed-off-by: NAlberto Garcia <berto@igalia.com>
Signed-off-by: NKevin Wolf <kwolf@redhat.com>
```
415bbca8

30 10月, 2018 1 次提交

dirty-bitmaps: clean-up bitmaps loading and migration logic · 9c98f145

由 Vladimir Sementsov-Ogievskiy 提交于 10月 29, 2018

This patch aims to bring the following behavior:

1. We don't load bitmaps, when started in inactive mode. It's the case
of incoming migration. In this case we wait for bitmaps migration
through migration channel (if 'dirty-bitmaps' capability is enabled) or
for invalidation (to load bitmaps from the image).

2. We don't remove persistent bitmaps on inactivation. Instead, we only
remove bitmaps after storing. This is the only way to restore bitmaps,
if we decided to resume source after [failed] migration with
'dirty-bitmaps' capability enabled (which means, that bitmaps were not
stored).

3. We load bitmaps on open and any invalidation, it's ok for all cases:
  - normal open
  - migration target invalidation with dirty-bitmaps capability
    (bitmaps are migrating through migration channel, the are not
     stored, so they should have IN_USE flag set and will be skipped
     when loading. However, it would fail if bitmaps are read-only[1])
  - migration target invalidation without dirty-bitmaps capability
    (normal load of the bitmaps, if migrated with shared storage)
  - source invalidation with dirty-bitmaps capability
    (skip because IN_USE)
  - source invalidation without dirty-bitmaps capability
    (bitmaps were dropped, reload them)

[1]: to accurately handle this, migration of read-only bitmaps is
     explicitly forbidden in this patch.

New mechanism for not storing bitmaps when migrate with dirty-bitmaps
capability is introduced: migration filed in BdrvDirtyBitmap.
Signed-off-by: NVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: NJohn Snow <jsnow@redhat.com>

9c98f145

19 10月, 2018 2 次提交

block: Clean up bdrv_img_create()'s error reporting · da7e92ca

由 Markus Armbruster 提交于 10月 17, 2018

bdrv_img_create() takes an Error ** argument and uses it in the
conventional way, except for one place: when qemu_opts_do_parse()
fails, it first reports its error to stderr or the HMP monitor with
error_report_err(), then error_setg()'s a generic error.

When the caller reports that second error similarly, this produces two
consecutive error messages on stderr or the HMP monitor.

When the caller does something else with it, such as send it via QMP,
the first error still goes to stderr or the HMP monitor.  Fortunately,
no such caller exists.

Simply use the first error as is.  Update expected output of
qemu-iotest 049 accordingly.

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Signed-off-by: NMarkus Armbruster <armbru@redhat.com>
Reviewed-by: NEric Blake <eblake@redhat.com>
Message-Id: <20181017082702.5581-37-armbru@redhat.com>
Reviewed-by: NKevin Wolf <kwolf@redhat.com>

da7e92ca

error: Fix use of error_prepend() with &error_fatal, &error_abort · 4b576648

由 Markus Armbruster 提交于 10月 17, 2018

From include/qapi/error.h:

  * Pass an existing error to the caller with the message modified:
  *     error_propagate(errp, err);
  *     error_prepend(errp, "Could not frobnicate '%s': ", name);

Fei Li pointed out that doing error_propagate() first doesn't work
well when @errp is &error_fatal or &error_abort: the error_prepend()
is never reached.

Since I doubt fixing the documentation will stop people from getting
it wrong, introduce error_propagate_prepend(), in the hope that it
lures people away from using its constituents in the wrong order.
Update the instructions in error.h accordingly.

Convert existing error_prepend() next to error_propagate to
error_propagate_prepend().  If any of these get reached with
&error_fatal or &error_abort, the error messages improve.  I didn't
check whether that's the case anywhere.

Cc: Fei Li <fli@suse.com>
Signed-off-by: NMarkus Armbruster <armbru@redhat.com>
Reviewed-by: NPhilippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: NEric Blake <eblake@redhat.com>
Message-Id: <20181017082702.5581-2-armbru@redhat.com>

4b576648

01 10月, 2018 1 次提交

block: Allow changing 'detect-zeroes' on reopen · 543770bd

由 Alberto Garcia 提交于 9月 06, 2018

'detect-zeroes' is one of the basic BlockdevOptions available for all
drivers, but it's not handled by bdrv_reopen_prepare(), so any attempt
to change it results in an error:

   (qemu) qemu-io virtio0 "reopen -o detect-zeroes=on"
   Cannot change the option 'detect-zeroes'

Since there's no reason why we shouldn't allow changing it and the
implementation is simple let's just do it.
Signed-off-by: NAlberto Garcia <berto@igalia.com>
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

543770bd