提交 · e238a99e89e6304a21ea4bf8f4ec3c296624199b · openeuler / qemu

07 5月, 2019 1 次提交

block: Assert that drv->bdrv_child_perm is set in bdrv_child_perm() · 0b3ca76e

由 Alberto Garcia 提交于 4月 04, 2019

There is no need to check for this because all block drivers that have
children implement bdrv_child_perm and all callers already ensure that
bs->drv is set.

Furthermore, if this check would fail then the callers would end up
with uninitialized values for nperm and nshared.

This patch replaces the check with an assertion.
Signed-off-by: NAlberto Garcia <berto@igalia.com>
Message-id: 20190404112953.4058-1-berto@igalia.com
Signed-off-by: NMax Reitz <mreitz@redhat.com>

0b3ca76e

30 4月, 2019 1 次提交

block: Fix AioContext switch for bs->drv == NULL · 1bffe1ae

由 Kevin Wolf 提交于 4月 17, 2019

Even for block nodes with bs->drv == NULL, we can't just ignore a
bdrv_set_aio_context() call. Leaving the node in its old context can
mean that it's still in an iothread context in bdrv_close_all() during
shutdown, resulting in an attempted unlock of the AioContext lock which
we don't hold.

This is an example stack trace of a related crash:

 #0  0x00007ffff59da57f in raise () at /lib64/libc.so.6
 #1  0x00007ffff59c4895 in abort () at /lib64/libc.so.6
 #2  0x0000555555b97b1e in error_exit (err=<optimized out>, msg=msg@entry=0x555555d386d0 <__func__.19059> "qemu_mutex_unlock_impl") at util/qemu-thread-posix.c:36
 #3  0x0000555555b97f7f in qemu_mutex_unlock_impl (mutex=mutex@entry=0x5555568002f0, file=file@entry=0x555555d378df "util/async.c", line=line@entry=507) at util/qemu-thread-posix.c:97
 #4  0x0000555555b92f55 in aio_context_release (ctx=ctx@entry=0x555556800290) at util/async.c:507
 #5  0x0000555555b05cf8 in bdrv_prwv_co (child=child@entry=0x7fffc80012f0, offset=offset@entry=131072, qiov=qiov@entry=0x7fffffffd4f0, is_write=is_write@entry=true, flags=flags@entry=0)
         at block/io.c:833
 #6  0x0000555555b060a9 in bdrv_pwritev (qiov=0x7fffffffd4f0, offset=131072, child=0x7fffc80012f0) at block/io.c:990
 #7  0x0000555555b060a9 in bdrv_pwrite (child=0x7fffc80012f0, offset=131072, buf=<optimized out>, bytes=<optimized out>) at block/io.c:990
 #8  0x0000555555ae172b in qcow2_cache_entry_flush (bs=bs@entry=0x555556810680, c=c@entry=0x5555568cc740, i=i@entry=0) at block/qcow2-cache.c:51
 #9  0x0000555555ae18dd in qcow2_cache_write (bs=bs@entry=0x555556810680, c=0x5555568cc740) at block/qcow2-cache.c:248
 #10 0x0000555555ae15de in qcow2_cache_flush (bs=0x555556810680, c=<optimized out>) at block/qcow2-cache.c:259
 #11 0x0000555555ae16b1 in qcow2_cache_flush_dependency (c=0x5555568a1700, c=0x5555568a1700, bs=0x555556810680) at block/qcow2-cache.c:194
 #12 0x0000555555ae16b1 in qcow2_cache_entry_flush (bs=bs@entry=0x555556810680, c=c@entry=0x5555568a1700, i=i@entry=0) at block/qcow2-cache.c:194
 #13 0x0000555555ae18dd in qcow2_cache_write (bs=bs@entry=0x555556810680, c=0x5555568a1700) at block/qcow2-cache.c:248
 #14 0x0000555555ae15de in qcow2_cache_flush (bs=bs@entry=0x555556810680, c=<optimized out>) at block/qcow2-cache.c:259
 #15 0x0000555555ad242c in qcow2_inactivate (bs=bs@entry=0x555556810680) at block/qcow2.c:2124
 #16 0x0000555555ad2590 in qcow2_close (bs=0x555556810680) at block/qcow2.c:2153
 #17 0x0000555555ab0c62 in bdrv_close (bs=0x555556810680) at block.c:3358
 #18 0x0000555555ab0c62 in bdrv_delete (bs=0x555556810680) at block.c:3542
 #19 0x0000555555ab0c62 in bdrv_unref (bs=0x555556810680) at block.c:4598
 #20 0x0000555555af4d72 in blk_remove_bs (blk=blk@entry=0x5555568103d0) at block/block-backend.c:785
 #21 0x0000555555af4dbb in blk_remove_all_bs () at block/block-backend.c:483
 #22 0x0000555555aae02f in bdrv_close_all () at block.c:3412
 #23 0x00005555557f9796 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4776

The reproducer I used is a qcow2 image on gluster volume, where the
virtual disk size (4 GB) is larger than the gluster volume size (64M),
so we can easily trigger an ENOSPC. This backend is assigned to a
virtio-blk device using an iothread, and then from the guest a
'dd if=/dev/zero of=/dev/vda bs=1G count=1' causes the VM to stop
because of an I/O error. qemu_gluster_co_flush_to_disk() sets
bs->drv = NULL on error, so when virtio-blk stops the dataplane, the
block nodes stay in the iothread AioContext. A 'quit' monitor command
issued from this paused state crashes the process.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1631227
Cc: qemu-stable@nongnu.org
Signed-off-by: NKevin Wolf <kwolf@redhat.com>
Reviewed-by: NEric Blake <eblake@redhat.com>
Reviewed-by: NMax Reitz <mreitz@redhat.com>
Reviewed-by: NStefano Garzarella <sgarzare@redhat.com>

1bffe1ae

08 4月, 2019 1 次提交

block: Forward 'discard' to temporary overlay · 3f48686f

由 Kevin Wolf 提交于 4月 04, 2019

When bdrv_temp_snapshot_options() is called for snapshot=on, the
'discard' option in the options QDict hasn't been parsed and merged into
the flags yet. So copy the dict entry to make sure that the temporary
overlay enables discard when it was requested for the drive.
Signed-off-by: NKevin Wolf <kwolf@redhat.com>
Reviewed-by: NEric Blake <eblake@redhat.com>
Reviewed-by: NStefano Garzarella <sgarzare@redhat.com>
Reviewed-by: NAlberto Garcia <berto@igalia.com>

3f48686f

02 4月, 2019 1 次提交

block: continue until base is found in bdrv_freeze_backing_chain() et al · 0f0998f6

由 Alberto Garcia 提交于 3月 28, 2019

All three functions that handle the BdrvChild.frozen attribute walk
the backing chain from 'bs' to 'base' and stop either when 'base' is
found or at the end of the chain if 'base' is NULL.

However if 'base' is not found then the functions return without
errors as if it was NULL.

This is wrong: if the caller passed an incorrect parameter that means
that there is a bug in the code.
Signed-off-by: NAlberto Garcia <berto@igalia.com>
Reviewed-by: NVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

0f0998f6

19 3月, 2019 1 次提交

block: Silence Coverity in bdrv_drop_intermediate() · 2345bde6

由 Kevin Wolf 提交于 3月 15, 2019

Coverity doesn't like that the return value of bdrv_check_update_perm()
stays unused only in this place (CID 1399710).

Even if checking local_err should be equivalent to checking ret < 0,
let's switch to using the return value to be more consistent (and in
case of a bug somewhere down the call chain, forgetting to assign errp
is more likely than returning 0 for an error case).
Signed-off-by: NKevin Wolf <kwolf@redhat.com>
Reviewed-by: NAlberto Garcia <berto@igalia.com>
Reviewed-by: NPeter Maydell <peter.maydell@linaro.org>
Reviewed-by: NMarkus Armbruster <armbru@redhat.com>

2345bde6

13 3月, 2019 10 次提交

block: Remove the AioContext parameter from bdrv_reopen_multiple() · 5019aece

由 Alberto Garcia 提交于 3月 12, 2019

This parameter has been unused since 1a63a907Signed-off-by: NAlberto Garcia <berto@igalia.com>
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

5019aece

block: Add bdrv_reset_options_allowed() · faf116b4

由 Alberto Garcia 提交于 3月 12, 2019

bdrv_reopen_prepare() receives a BDRVReopenState with (among other
things) a new set of options to be applied to that BlockDriverState.

If an option is missing then it means that we want to reset it to its
default value rather than keep the previous one. This way the state
of the block device after being reopened is comparable to that of a
device added with "blockdev-add" using the same set of options.

Not all options from all drivers can be changed this way, however.
If the user attempts to reset an immutable option to its default value
using this method then we must forbid it.

This new function takes a BlockDriverState and a new set of options
and checks if there's any option that was previously set but is
missing from the new set of options.

If the option is present in both sets we don't need to check that they
have the same value. The loop at the end of bdrv_reopen_prepare()
already takes care of that.
Signed-off-by: NAlberto Garcia <berto@igalia.com>
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

faf116b4

block: Allow changing the backing file on reopen · cb828c31

由 Alberto Garcia 提交于 3月 12, 2019

This patch allows the user to change the backing file of an image that
is being reopened. Here's what it does:

 - In bdrv_reopen_prepare(): check that the value of 'backing' points
   to an existing node or is null. If it points to an existing node it
   also needs to make sure that replacing the backing file will not
   create a cycle in the node graph (i.e. you cannot reach the parent
   from the new backing file).

 - In bdrv_reopen_commit(): perform the actual node replacement by
   calling bdrv_set_backing_hd().

There may be temporary implicit nodes between a BDS and its backing
file (e.g. a commit filter node). In these cases bdrv_reopen_prepare()
looks for the real (non-implicit) backing file and requires that the
'backing' option points to it. Replacing or detaching a backing file
is forbidden if there are implicit nodes in the middle.

Although x-blockdev-reopen is meant to be used like blockdev-add,
there's an important thing that must be taken into account: the only
way to set a new backing file is by using a reference to an existing
node (previously added with e.g. blockdev-add).  If 'backing' contains
a dictionary with a new set of options ({"driver": "qcow2", "file": {
... }}) then it is interpreted that the _existing_ backing file must
be reopened with those options.
Signed-off-by: NAlberto Garcia <berto@igalia.com>
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

cb828c31

block: Allow omitting the 'backing' option in certain cases · bacd9b87

由 Alberto Garcia 提交于 3月 12, 2019

Of all options of type BlockdevRef used to specify children in
BlockdevOptions, 'backing' is the only one that is optional.

For "x-blockdev-reopen" we want that if an option is omitted then it
must be reset to its default value. The default value of 'backing'
means that QEMU opens the backing file specified in the image
metadata, but this is not something that we want to support for the
reopen operation.

Because of this the 'backing' option has to be specified during
reopen, pointing to the existing backing file if we want to keep it,
or pointing to a different one (or NULL) if we want to replace it (to
be implemented in a subsequent patch).

In order to simplify things a bit and not to require that the user
passes the 'backing' option to every single block device even when
it's clearly not necessary, this patch allows omitting this option if
the block device being reopened doesn't have a backing file attached
_and_ no default backing file is specified in the image metadata.
Signed-off-by: NAlberto Garcia <berto@igalia.com>
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

bacd9b87

block: Handle child references in bdrv_reopen_queue() · 8546632e

由 Alberto Garcia 提交于 3月 12, 2019

Children in QMP are specified with BlockdevRef / BlockdevRefOrNull,
which can contain a set of child options, a child reference, or
NULL. In optional attributes like "backing" it can also be missing.

Only the first case (set of child options) is being handled properly
by bdrv_reopen_queue(). This patch deals with all the others.

Here's how these cases should be handled when bdrv_reopen_queue() is
deciding what to do with each child of a BlockDriverState:

1) Set of child options: if the child was implicitly created (i.e
inherits_from points to the parent) then the options are removed
from the parent's options QDict and are passed to the child with
a recursive bdrv_reopen_queue() call. This case was already
working fine.

2) Child reference: there's two possibilites here.

2a) Reference to the current child: if the child was implicitly
created then it is put in the reopen queue, keeping its
current set of options (since this was a child reference
there was no way to specify a different set of options).
If the child is not implicit then it keeps its current set
of options but it is not reopened (and therefore does not
inherit any new option from the parent).

2b) Reference to a different BDS: the current child is not put
in the reopen queue at all. Passing a reference to a
different BDS can be used to replace a child, although at
the moment no driver implements this, so it results in an
error. In any case, the current child is not going to be
reopened (and might in fact disappear if it's replaced)

3) NULL: This is similar to (2b). Although no driver allows this
yet it can be used to detach the current child so it should not
be put in the reopen queue.

4) Missing option: at the moment "backing" is the only case where
this can happen. With "blockdev-add", leaving "backing" out
means that the default backing file is opened. We don't want to
open a new image during reopen, so we require that "backing" is
always present. We'll relax this requirement a bit in the next
patch. If keep_old_opts is true and "backing" is missing then
this behaves like 2a (the current child is reopened).
Signed-off-by: NAlberto Garcia <berto@igalia.com>
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

8546632e

block: Add 'keep_old_opts' parameter to bdrv_reopen_queue() · 077e8e20

由 Alberto Garcia 提交于 3月 12, 2019

The bdrv_reopen_queue() function is used to create a queue with
the BDSs that are going to be reopened and their new options. Once
the queue is ready bdrv_reopen_multiple() is called to perform the
operation.

The original options from each one of the BDSs are kept, with the new
options passed to bdrv_reopen_queue() applied on top of them.

For "x-blockdev-reopen" we want a function that behaves much like
"blockdev-add". We want to ignore the previous set of options so that
only the ones actually specified by the user are applied, with the
rest having their default values.

One of the things that we need is a way to tell bdrv_reopen_queue()
whether we want to keep the old set of options or not, and that's what
this patch does. All current callers are setting this new parameter to
true and x-blockdev-reopen will set it to false.
Signed-off-by: NAlberto Garcia <berto@igalia.com>
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

077e8e20

block: Allow freezing BdrvChild links · 2cad1ebe

由 Alberto Garcia 提交于 3月 12, 2019

Our permission system is useful to define what operations are allowed
on a certain block node and includes things like BLK_PERM_WRITE or
BLK_PERM_RESIZE among others.

One of the permissions is BLK_PERM_GRAPH_MOD which allows "changing
the node that this BdrvChild points to". The exact meaning of this has
never been very clear, but it can be understood as "change any of the
links connected to the node". This can be used to prevent changing a
backing link, but it's too coarse.

This patch adds a new 'frozen' attribute to BdrvChild, which forbids
detaching the link from the node it points to, and new API to freeze
and unfreeze a backing chain.

After this change a few functions can fail, so they need additional
checks.
Signed-off-by: NAlberto Garcia <berto@igalia.com>
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

2cad1ebe

file-posix: Fix bdrv_open_flags() for snapshot=on · 30855137

由 Kevin Wolf 提交于 3月 11, 2019

Using a different read-only setting for bs->open_flags than for the
flags to the driver's open function is just inconsistent and a bad idea.
After this patch, the temporary snapshot keeps being opened read-only if
read-only=on,snapshot=on is passed.

If we wanted to change this behaviour to make only the orginal image
file read-only, but the temporary overlay read-write (as the comment in
the removed code suggests), that change would have to be made in
bdrv_temp_snapshot_options() (where the comment suggests otherwise).

Addressing this inconsistency before introducing dynamic auto-read-only
is important because otherwise we would immediately try to reopen the
temporary overlay even though the file is already unlinked.
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

30855137

block: Make permission changes in reopen less wrong · 69b736e7

由 Kevin Wolf 提交于 3月 05, 2019

The way that reopen interacts with permission changes has one big
problem: Both operations are recursive, and the permissions are changes
for each node in the reopen queue.

For a simple graph that consists just of parent and child,
.bdrv_check_perm will be called twice for the child, once recursively
when adjusting the permissions of parent, and once again when the child
itself is reopened.

Even worse, the first .bdrv_check_perm call happens before
.bdrv_reopen_prepare was called for the child and the second one is
called afterwards.

Making sure that .bdrv_check_perm (and the other permission callbacks)
are called only once is hard. We can cope with multiple calls right now,
but as soon as file-posix gets a dynamic auto-read-only that may need to
open a new file descriptor, we get the additional requirement that all
of them are after the .bdrv_reopen_prepare call.

So reorder things in bdrv_reopen_multiple() to first call
.bdrv_reopen_prepare for all involved nodes and only then adjust
permissions.
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

69b736e7

block: Avoid useless local_err · a4615ab3

由 Kevin Wolf 提交于 3月 05, 2019

Signed-off-by: NKevin Wolf <kwolf@redhat.com>
Reviewed-by: NAlberto Garcia <berto@igalia.com>
Reviewed-by: NEric Blake <eblake@redhat.com>

a4615ab3

08 3月, 2019 1 次提交

block: iterate_format with account of whitelisting · 9ac404c5

由 Andrey Shinkevich 提交于 3月 07, 2019

bdrv_iterate_format (which is currently only used for printing out the
formats supported by the block layer) doesn't take format whitelisting
into account.

This creates a problem for tests: they enumerate supported formats to
decide which tests to enable, but then discover that QEMU doesn't let
them actually use some of those formats.

To avoid that, exclude formats that are not whitelisted from
enumeration, if whitelisting is in use.  Since we have separate
whitelists for r/w and r/o, take this a parameter to
bdrv_iterate_format, and print two lists of supported formats (r/w and
r/o) in main qemu.
Signed-off-by: NRoman Kagan <rkagan@virtuozzo.com>
Signed-off-by: NAndrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

9ac404c5

25 2月, 2019 22 次提交

block: BDS options may lack the "driver" option · 62a01a27

由 Max Reitz 提交于 2月 01, 2019

When BDSs are created by qemu itself (e.g. as filters in block jobs),
they may not have a "driver" option in their options QDict.  When
generating a json:{} filename, however, it must always be present.
Signed-off-by: NMax Reitz <mreitz@redhat.com>
Reviewed-by: NAlberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-31-mreitz@redhat.com
Signed-off-by: NMax Reitz <mreitz@redhat.com>

62a01a27

block: Do not copy exact_filename from format file · fb695c74

由 Max Reitz 提交于 2月 01, 2019

If a format BDS's file BDS is in turn a format BDS, we cannot simply use
the same filename, because when opening a BDS tree based on a filename
alone, qemu will create only one format node on top of one protocol node
(disregarding a potential backing file).
Signed-off-by: NMax Reitz <mreitz@redhat.com>
Reviewed-by: NAlberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-26-mreitz@redhat.com
Signed-off-by: NMax Reitz <mreitz@redhat.com>

fb695c74

block: Purify .bdrv_refresh_filename() · 998b3a1e

由 Max Reitz 提交于 2月 01, 2019

Currently, BlockDriver.bdrv_refresh_filename() is supposed to both
refresh the filename (BDS.exact_filename) and set BDS.full_open_options.
Now that we have generic code in the central bdrv_refresh_filename() for
creating BDS.full_open_options, we can drop the latter part from all
BlockDriver.bdrv_refresh_filename() implementations.

This also means that we can drop all of the existing default code for
this from the global bdrv_refresh_filename() itself.

Furthermore, we now have to call BlockDriver.bdrv_refresh_filename()
after having set BDS.full_open_options, because the block driver's
implementation should now be allowed to depend on BDS.full_open_options
being set correctly.

Finally, with this patch we can drop the @options parameter from
BlockDriver.bdrv_refresh_filename(); also, add a comment on this
function's purpose in block/block_int.h while touching its interface.

This completely obsoletes blklogwrite's implementation of
.bdrv_refresh_filename().
Signed-off-by: NMax Reitz <mreitz@redhat.com>
Message-id: 20190201192935.18394-25-mreitz@redhat.com
Signed-off-by: NMax Reitz <mreitz@redhat.com>

998b3a1e

block: Generically refresh runtime options · 97e2f021

由 Max Reitz 提交于 2月 01, 2019

Instead of having every block driver which implements
bdrv_refresh_filename() copy all of the strong runtime options over to
bs->full_open_options, implement this process generically in
bdrv_refresh_filename().

This patch only adds this new generic implementation, it does not remove
the old functionality. This is done in a follow-up patch.

With this patch, some superfluous information (that should never have
been there) may be removed from some JSON filenames, as can be seen in
the change to iotests 110's and 228's reference outputs.
Signed-off-by: NMax Reitz <mreitz@redhat.com>
Message-id: 20190201192935.18394-24-mreitz@redhat.com
Signed-off-by: NMax Reitz <mreitz@redhat.com>

97e2f021

block: Use bdrv_dirname() for relative filenames · 8df68616

由 Max Reitz 提交于 2月 01, 2019

bdrv_get_full_backing_filename_from_filename() breaks down when it comes
to JSON filenames. Using bdrv_dirname() as the basis is better because
since we have BDS, we can descend through the BDS tree to the protocol
layer, which gives us a greater probability of finding a non-JSON name;
also, bdrv_dirname() is more correct as it allows block drivers to
override the generation of that directory name in a protocol-specific
way.

We still need to keep bdrv_get_full_backing_filename_from_filename(),
though, because it has valid callers which need it during image creation
when no BDS is available yet.

This makes a test case in qemu-iotest 110, which was supposed to fail,
work. That is actually good, but we need to change the reference output
(and the comment in 110) accordingly.
Signed-off-by: NMax Reitz <mreitz@redhat.com>
Reviewed-by: NAlberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-20-mreitz@redhat.com
Signed-off-by: NMax Reitz <mreitz@redhat.com>

8df68616

block: Add bdrv_dirname() · 1e89d0f9

由 Max Reitz 提交于 2月 01, 2019

This function may be implemented by block drivers to derive a directory
name from a BDS. Concatenating this g_free()-able string with a relative
filename must result in a valid (not necessarily existing) filename, so
this is a function that should generally be not implemented by format
drivers, because this is protocol-specific.

If a BDS's driver does not implement this function, bdrv_dirname() will
fall through to the BDS's file if it exists. If it does not, the
exact_filename field will be used to generate a directory name.
Signed-off-by: NMax Reitz <mreitz@redhat.com>
Reviewed-by: NAlberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-15-mreitz@redhat.com
Signed-off-by: NMax Reitz <mreitz@redhat.com>

1e89d0f9

block: Fix bdrv_find_backing_image() · 2d9158ce

由 Max Reitz 提交于 2月 01, 2019

bdrv_find_backing_image() should use bdrv_get_full_backing_filename() or
bdrv_make_absolute_filename() instead of trying to do what those
functions do by itself.

path_combine_deprecated() can now be dropped, so let's do that.
Signed-off-by: NMax Reitz <mreitz@redhat.com>
Reviewed-by: NAlberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-14-mreitz@redhat.com
Signed-off-by: NMax Reitz <mreitz@redhat.com>

2d9158ce

block: Add bdrv_make_absolute_filename() · 9f4793d8

由 Max Reitz 提交于 2月 01, 2019

This is a general function for making a filename that is relative to a
certain BDS absolute.

It calls bdrv_get_full_backing_filename_from_filename() for now, but
that will be changed in a follow-up patch.
Signed-off-by: NMax Reitz <mreitz@redhat.com>
Reviewed-by: NAlberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-13-mreitz@redhat.com
Signed-off-by: NMax Reitz <mreitz@redhat.com>

9f4793d8

block: bdrv_get_full_backing_filename's ret. val. · 6b6833c1

由 Max Reitz 提交于 2月 01, 2019

Make bdrv_get_full_backing_filename() return an allocated string instead
of placing the result in a caller-provided buffer.
Signed-off-by: NMax Reitz <mreitz@redhat.com>
Reviewed-by: NAlberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-12-mreitz@redhat.com
Signed-off-by: NMax Reitz <mreitz@redhat.com>

6b6833c1

block: bdrv_get_full_backing_filename_from_...'s ret. val. · 645ae7d8

由 Max Reitz 提交于 2月 01, 2019

Make bdrv_get_full_backing_filename_from_filename() return an allocated
string instead of placing the result in a caller-provided buffer.
Signed-off-by: NMax Reitz <mreitz@redhat.com>
Message-id: 20190201192935.18394-11-mreitz@redhat.com
Signed-off-by: NMax Reitz <mreitz@redhat.com>

645ae7d8

block: Make path_combine() return the path · 009b03aa

由 Max Reitz 提交于 2月 01, 2019

Besides being safe for arbitrary path lengths, after some follow-up
patches all callers will want a freshly allocated buffer anyway.

In the meantime, path_combine_deprecated() is added which has the same
interface as path_combine() had before this patch. All callers to that
function will be converted in follow-up patches.
Signed-off-by: NMax Reitz <mreitz@redhat.com>
Reviewed-by: NAlberto Garcia <berto@igalia.com>
Reviewed-by: NKevin Wolf <kwolf@redhat.com>
Message-id: 20190201192935.18394-10-mreitz@redhat.com
Signed-off-by: NMax Reitz <mreitz@redhat.com>

009b03aa

block: Respect backing bs in bdrv_refresh_filename · 90993623

由 Max Reitz 提交于 2月 01, 2019

Basically, bdrv_refresh_filename() should respect all children of a
BlockDriverState. However, generally those children are driver-specific,
so this function cannot handle the general case. On the other hand,
there are only few drivers which use other children than @file and
@backing (that being vmdk, quorum, and blkverify).

Most block drivers only use @file and/or @backing (if they use any
children at all). Both can be implemented directly in
bdrv_refresh_filename.

The user overriding the file's filename is already handled, however, the
user overriding the backing file is not. If this is done, opening the
BDS with the plain filename of its file will not be correct, so we may
not set bs->exact_filename in that case.

iotest 051 contains test cases for overriding the backing file, and so
its output changes with this patch applied.
Signed-off-by: NMax Reitz <mreitz@redhat.com>
Reviewed-by: NAlberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-6-mreitz@redhat.com
Signed-off-by: NMax Reitz <mreitz@redhat.com>

90993623

block: Add BDS.auto_backing_file · 998c2019

由 Max Reitz 提交于 2月 01, 2019

If the backing file is overridden, this most probably does change the
guest-visible data of a BDS. Therefore, we will need to consider this
in bdrv_refresh_filename().

To see whether it has been overridden, we might want to compare
bs->backing_file and bs->backing->bs->filename. However,
bs->backing_file is changed by bdrv_set_backing_hd() (which is just used
to change the backing child at runtime, without modifying the image
header), so bs->backing_file most of the time simply contains a copy of
bs->backing->bs->filename anyway, so it is useless for such a
comparison.

This patch adds an auto_backing_file BDS field which contains the
backing file path as indicated by the image header, which is not changed
by bdrv_set_backing_hd().

Because of bdrv_refresh_filename() magic, however, a BDS's filename may
differ from what has been specified during bdrv_open(). Then, the
comparison between bs->auto_backing_file and bs->backing->bs->filename
may fail even though bs->backing was opened from bs->auto_backing_file.
To mitigate this, we can copy the real BDS's filename (after the whole
bdrv_open() and bdrv_refresh_filename() process) into
bs->auto_backing_file, if we know the former has been opened based on
the latter. This is only possible if no options modifying the backing
file's behavior have been specified, though. To simplify things, this
patch only copies the filename from the backing file if no options have
been specified for it at all.

Furthermore, there are cases where an overlay is created by qemu which
already contains a BDS's filename (e.g. in blockdev-snapshot-sync). We
do not need to worry about updating the overlay's bs->auto_backing_file
there, because we actually wrote a post-bdrv_refresh_filename() filename
into the image header.

So all in all, there will be false negatives where (as of a future
patch) bdrv_refresh_filename() will assume that the backing file differs
from what was specified in the image header, even though it really does
not. However, these cases should be limited to where (1) the user
actually did override something in the backing chain (e.g. by specifying
options for the backing file), or (2) the user executed a QMP command to
change some node's backing file (e.g. change-backing-file or
block-commit with @backing-file given) where the given filename does not
happen to coincide with qemu's idea of the backing BDS's filename.

Then again, (1) really is limited to -drive. With -blockdev or
blockdev-add, you have to adhere to the schema, so a user cannot give
partial "unimportant" options (e.g. by just setting backing.node-name
and leaving the rest to the image header). Therefore, trying to fix
this would mean trying to fix something for -drive only.

To improve on (2), we would need a full infrastructure to "canonicalize"
an arbitrary filename (+ options), so it can be compared against
another. That seems a bit over the top, considering that filenames
nowadays are there mostly for the user's entertainment.
Signed-off-by: NMax Reitz <mreitz@redhat.com>
Reviewed-by: NEric Blake <eblake@redhat.com>
Reviewed-by: NAlberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-5-mreitz@redhat.com
Signed-off-by: NMax Reitz <mreitz@redhat.com>

998c2019

block: Skip implicit nodes for filename info · bb808d5f

由 Max Reitz 提交于 2月 01, 2019

bdrv_refresh_filename() should simply skip all implicit nodes.  They are
supposed to be invisible to the user, so they should not appear in
filename information.
Signed-off-by: NMax Reitz <mreitz@redhat.com>
Reviewed-by: NEric Blake <eblake@redhat.com>
Reviewed-by: NAlberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-4-mreitz@redhat.com
Signed-off-by: NMax Reitz <mreitz@redhat.com>

bb808d5f

block: Use children list in bdrv_refresh_filename · e24518e3

由 Max Reitz 提交于 2月 01, 2019

bdrv_refresh_filename() should invoke itself recursively on all
children, not just on file.

With that change, we can remove the manual invocations in blkverify,
quorum, commit, mirror, and blklogwrites.
Signed-off-by: NMax Reitz <mreitz@redhat.com>
Reviewed-by: NEric Blake <eblake@redhat.com>
Reviewed-by: NAlberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-3-mreitz@redhat.com
Signed-off-by: NMax Reitz <mreitz@redhat.com>

e24518e3

block: Use bdrv_refresh_filename() to pull · f30c66ba

由 Max Reitz 提交于 2月 01, 2019

Before this patch, bdrv_refresh_filename() is used in a pushing manner:
Whenever the BDS graph is modified, the parents of the modified edges
are supposed to be updated (recursively upwards).  However, that is
nonviable, considering that we want child changes not to concern
parents.

Also, in the long run we want a pull model anyway: Here, we would have a
bdrv_filename() function which returns a BDS's filename, freshly
constructed.

This patch is an intermediate step.  It adds bdrv_refresh_filename()
calls before every place a BDS.filename value is used.  The only
exceptions are protocol drivers that use their own filename, which
clearly would not profit from refreshing that filename before.

Also, bdrv_get_encrypted_filename() is removed along the way (as a user
of BDS.filename), since it is completely unused.

In turn, all of the calls to bdrv_refresh_filename() before this patch
are removed, because we no longer have to call this function on graph
changes.
Signed-off-by: NMax Reitz <mreitz@redhat.com>
Message-id: 20190201192935.18394-2-mreitz@redhat.com
Reviewed-by: NEric Blake <eblake@redhat.com>
Signed-off-by: NMax Reitz <mreitz@redhat.com>

f30c66ba

block: fix bdrv_check_perm for non-tree subgraph · f962e961

由 Vladimir Sementsov-Ogievskiy 提交于 2月 23, 2019

bdrv_check_perm in it's recursion checks each node in context of new
permissions for one parent, because of nature of DFS. It works well,
while children subgraph of top-most updated node is a tree, i.e. it
doesn't have any kind of loops. But if we have a loop (not oriented,
of course), i.e. we have two different ways from top-node to some
child-node, then bdrv_check_perm will do wrong thing:

  top
  | \
  |  |
  v  v
  A  B
  |  |
  v  v
  node

It will once check new permissions of node in context of new A
permissions and old B permissions and once visa-versa. It's a wrong way
and may lead to corruption of permission system. We may start with
no-permissions and all-shared for both A->node and B->node relations
and finish up with non shared write permission for both ways.

The following commit will add a test, which shows this bug.

To fix this situation, let's really set BdrvChild permissions during
bdrv_check_perm procedure. And we are happy here, as check-perm is
already written in transaction manner, so we just need to restore
backed-up permissions in _abort.
Signed-off-by: NVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

f962e961

block: improve should_update_child · 2f30b7c3

由 Vladimir Sementsov-Ogievskiy 提交于 2月 23, 2019

As it already said in the comment, we don't want to create loops in
parent->child relations. So, when we try to append @to to @c, we should
check that @c is not in @to children subtree, and we should check it
recursively, not only the first level. The patch provides BFS-based
search, to check the relations.

This is needed for further fleecing-hook filter usage: we need to
append it to source, when the hook is already a parent of target, and
source may be in a backing chain of target (fleecing-scheme). So, on
appending, the hook should not became a child (direct or through
children subtree) of the target.
Signed-off-by: NVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

2f30b7c3

block: Use normal drain for bdrv_set_aio_context() · d70d5954

由 Kevin Wolf 提交于 2月 08, 2019

Now that bdrv_set_aio_context() works inside drained sections, it can
also use the real drain function instead of open coding something
similar.
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

d70d5954

block: Fix AioContext switch for drained node · e64f25f3

由 Kevin Wolf 提交于 2月 08, 2019

When a drained node changes its AioContext, we need to move its
aio_disable_external() to the new context, too.

Without this fix, drain_end will try to reenable the new context, which
has never been disabled, so an assertion failure is triggered.
Signed-off-by: NKevin Wolf <kwolf@redhat.com>
Reviewed-by: NEric Blake <eblake@redhat.com>

e64f25f3

block: Don't poll in bdrv_set_aio_context() · 6c75d761

由 Kevin Wolf 提交于 2月 15, 2019

The explicit aio_poll() call in bdrv_set_aio_context() was added in
commit c2b6428d as a workaround for bdrv_drain() failing to achieve
to actually quiesce everything (specifically the NBD client code to
switch AioContext).

Now that the NBD client has been fixed to complete this operation during
bdrv_drain(), we don't need the workaround any more.

It was wrong anyway: aio_poll() must always be run in the home thread of
the AioContext.
Signed-off-by: NKevin Wolf <kwolf@redhat.com>
Reviewed-by: NEric Blake <eblake@redhat.com>

6c75d761

block: don't set the same context · 57830a49

由 Denis Plotnikov 提交于 2月 15, 2019

Adds a fast path on aio context setting preventing
unnecessary context setting routine.
Also, it prevents issues with cyclic walk of child
bds-es appeared because of registering aio walking
notifiers:

Call stack:

0  __GI_raise
1  __GI_abort
2  __assert_fail_base
3  __GI___assert_fail
4  bdrv_detach_aio_context (bs=0x55f54d65c000)      <<<
5  bdrv_detach_aio_context (bs=0x55f54fc8a800)
6  bdrv_set_aio_context (bs=0x55f54fc8a800, ...)
7  block_job_attached_aio_context
8  bdrv_attach_aio_context (bs=0x55f54d65c000, ...) <<<
9  bdrv_set_aio_context (bs=0x55f54d65c000)
10 blk_set_aio_context
11 virtio_blk_data_plane_stop
12 virtio_bus_stop_ioeventfd
13 virtio_vmstate_change
14 vm_state_notify (running=0, state=RUN_STATE_SHUTDOWN)
15 do_vm_stop (state=RUN_STATE_SHUTDOWN, send_stop=true)
16 vm_stop (state=RUN_STATE_SHUTDOWN)
17 main_loop_should_exit
18 main_loop
19 main

This can happen because of "new" context attachment to VM disk bds.
When attaching a new context the corresponding aio context handler is
called for each of aio_notifiers registered on the VM disk bds context.
Among those handlers, there is the block_job_attached_aio_context handler
which sets a new aio context for the block job bds. When doing so,
the old context is detached from all the block job bds children and one of
them is the VM disk bds, serving as backing store for the blockjob bds,
although the VM disk bds is actually the initializer of that process.
Since the VM disk bds is protected with walking_aio_notifiers flag
from double processing in recursive calls, the assert fires.
Signed-off-by: NDenis Plotnikov <dplotnikov@virtuozzo.com>
Signed-off-by: NKevin Wolf <kwolf@redhat.com>

57830a49

12 2月, 2019 1 次提交

bdrv_query_image_info Error parameter added · 1bf6e9ca

由 Andrey Shinkevich 提交于 2月 08, 2019

Inform a user in case qcow2_get_specific_info fails to obtain
QCOW2 image specific information. This patch is preliminary to
the one "qcow2: Add list of bitmaps to ImageInfoSpecificQCow2".
Signed-off-by: NAndrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: NEric Blake <eblake@redhat.com>
Reviewed-by: NVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: NKevin Wolf <kwolf@redhat.com>
Message-Id: <1549638368-530182-2-git-send-email-andrey.shinkevich@virtuozzo.com>
Signed-off-by: NEric Blake <eblake@redhat.com>

1bf6e9ca

01 2月, 2019 1 次提交

block: Fix invalidate_cache error path for parent activation · 78fc3b3a

由 Kevin Wolf 提交于 1月 31, 2019

bdrv_co_invalidate_cache() clears the BDRV_O_INACTIVE flag before
actually activating a node so that the correct permissions etc. are
taken. In case of errors, the flag must be restored so that the next
call to bdrv_co_invalidate_cache() retries activation.

Restoring the flag was missing in the error path for a failed
parent->role->activate() call. The consequence is that this attempt to
activate all images correctly fails because we still set errp, however
on the next attempt BDRV_O_INACTIVE is already clear, so we return
success without actually retrying the failed action.

An example where this is observable in practice is migration to a QEMU
instance that has a raw format block node attached to a guest device
with share-rw=off (the default) while another process holds
BLK_PERM_WRITE for the same image. In this case, all activation steps
before parent->role->activate() succeed because raw can tolerate other
writers to the image. Only the parent callback (in particular
blk_root_activate()) tries to implement the share-rw=on property and
requests exclusive write permissions. This fails when the migration
completes and correctly displays an error. However, a manual 'cont' will
incorrectly resume the VM without calling blk_root_activate() again.

This case is described in more detail in the following bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=1531888

Fix this by correctly restoring the BDRV_O_INACTIVE flag in the error
path.

Cc: qemu-stable@nongnu.org
Signed-off-by: NKevin Wolf <kwolf@redhat.com>
Tested-by: NMarkus Armbruster <armbru@redhat.com>
Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>

78fc3b3a