1. 07 5月, 2019 1 次提交
  2. 30 4月, 2019 1 次提交
    • K
      block: Fix AioContext switch for bs->drv == NULL · 1bffe1ae
      Kevin Wolf 提交于
      Even for block nodes with bs->drv == NULL, we can't just ignore a
      bdrv_set_aio_context() call. Leaving the node in its old context can
      mean that it's still in an iothread context in bdrv_close_all() during
      shutdown, resulting in an attempted unlock of the AioContext lock which
      we don't hold.
      
      This is an example stack trace of a related crash:
      
       #0  0x00007ffff59da57f in raise () at /lib64/libc.so.6
       #1  0x00007ffff59c4895 in abort () at /lib64/libc.so.6
       #2  0x0000555555b97b1e in error_exit (err=<optimized out>, msg=msg@entry=0x555555d386d0 <__func__.19059> "qemu_mutex_unlock_impl") at util/qemu-thread-posix.c:36
       #3  0x0000555555b97f7f in qemu_mutex_unlock_impl (mutex=mutex@entry=0x5555568002f0, file=file@entry=0x555555d378df "util/async.c", line=line@entry=507) at util/qemu-thread-posix.c:97
       #4  0x0000555555b92f55 in aio_context_release (ctx=ctx@entry=0x555556800290) at util/async.c:507
       #5  0x0000555555b05cf8 in bdrv_prwv_co (child=child@entry=0x7fffc80012f0, offset=offset@entry=131072, qiov=qiov@entry=0x7fffffffd4f0, is_write=is_write@entry=true, flags=flags@entry=0)
               at block/io.c:833
       #6  0x0000555555b060a9 in bdrv_pwritev (qiov=0x7fffffffd4f0, offset=131072, child=0x7fffc80012f0) at block/io.c:990
       #7  0x0000555555b060a9 in bdrv_pwrite (child=0x7fffc80012f0, offset=131072, buf=<optimized out>, bytes=<optimized out>) at block/io.c:990
       #8  0x0000555555ae172b in qcow2_cache_entry_flush (bs=bs@entry=0x555556810680, c=c@entry=0x5555568cc740, i=i@entry=0) at block/qcow2-cache.c:51
       #9  0x0000555555ae18dd in qcow2_cache_write (bs=bs@entry=0x555556810680, c=0x5555568cc740) at block/qcow2-cache.c:248
       #10 0x0000555555ae15de in qcow2_cache_flush (bs=0x555556810680, c=<optimized out>) at block/qcow2-cache.c:259
       #11 0x0000555555ae16b1 in qcow2_cache_flush_dependency (c=0x5555568a1700, c=0x5555568a1700, bs=0x555556810680) at block/qcow2-cache.c:194
       #12 0x0000555555ae16b1 in qcow2_cache_entry_flush (bs=bs@entry=0x555556810680, c=c@entry=0x5555568a1700, i=i@entry=0) at block/qcow2-cache.c:194
       #13 0x0000555555ae18dd in qcow2_cache_write (bs=bs@entry=0x555556810680, c=0x5555568a1700) at block/qcow2-cache.c:248
       #14 0x0000555555ae15de in qcow2_cache_flush (bs=bs@entry=0x555556810680, c=<optimized out>) at block/qcow2-cache.c:259
       #15 0x0000555555ad242c in qcow2_inactivate (bs=bs@entry=0x555556810680) at block/qcow2.c:2124
       #16 0x0000555555ad2590 in qcow2_close (bs=0x555556810680) at block/qcow2.c:2153
       #17 0x0000555555ab0c62 in bdrv_close (bs=0x555556810680) at block.c:3358
       #18 0x0000555555ab0c62 in bdrv_delete (bs=0x555556810680) at block.c:3542
       #19 0x0000555555ab0c62 in bdrv_unref (bs=0x555556810680) at block.c:4598
       #20 0x0000555555af4d72 in blk_remove_bs (blk=blk@entry=0x5555568103d0) at block/block-backend.c:785
       #21 0x0000555555af4dbb in blk_remove_all_bs () at block/block-backend.c:483
       #22 0x0000555555aae02f in bdrv_close_all () at block.c:3412
       #23 0x00005555557f9796 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4776
      
      The reproducer I used is a qcow2 image on gluster volume, where the
      virtual disk size (4 GB) is larger than the gluster volume size (64M),
      so we can easily trigger an ENOSPC. This backend is assigned to a
      virtio-blk device using an iothread, and then from the guest a
      'dd if=/dev/zero of=/dev/vda bs=1G count=1' causes the VM to stop
      because of an I/O error. qemu_gluster_co_flush_to_disk() sets
      bs->drv = NULL on error, so when virtio-blk stops the dataplane, the
      block nodes stay in the iothread AioContext. A 'quit' monitor command
      issued from this paused state crashes the process.
      
      Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1631227
      Cc: qemu-stable@nongnu.org
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NEric Blake <eblake@redhat.com>
      Reviewed-by: NMax Reitz <mreitz@redhat.com>
      Reviewed-by: NStefano Garzarella <sgarzare@redhat.com>
      1bffe1ae
  3. 08 4月, 2019 1 次提交
  4. 02 4月, 2019 1 次提交
  5. 19 3月, 2019 1 次提交
  6. 13 3月, 2019 10 次提交
    • A
      block: Remove the AioContext parameter from bdrv_reopen_multiple() · 5019aece
      Alberto Garcia 提交于
      This parameter has been unused since 1a63a907Signed-off-by: NAlberto Garcia <berto@igalia.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      5019aece
    • A
      block: Add bdrv_reset_options_allowed() · faf116b4
      Alberto Garcia 提交于
      bdrv_reopen_prepare() receives a BDRVReopenState with (among other
      things) a new set of options to be applied to that BlockDriverState.
      
      If an option is missing then it means that we want to reset it to its
      default value rather than keep the previous one. This way the state
      of the block device after being reopened is comparable to that of a
      device added with "blockdev-add" using the same set of options.
      
      Not all options from all drivers can be changed this way, however.
      If the user attempts to reset an immutable option to its default value
      using this method then we must forbid it.
      
      This new function takes a BlockDriverState and a new set of options
      and checks if there's any option that was previously set but is
      missing from the new set of options.
      
      If the option is present in both sets we don't need to check that they
      have the same value. The loop at the end of bdrv_reopen_prepare()
      already takes care of that.
      Signed-off-by: NAlberto Garcia <berto@igalia.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      faf116b4
    • A
      block: Allow changing the backing file on reopen · cb828c31
      Alberto Garcia 提交于
      This patch allows the user to change the backing file of an image that
      is being reopened. Here's what it does:
      
       - In bdrv_reopen_prepare(): check that the value of 'backing' points
         to an existing node or is null. If it points to an existing node it
         also needs to make sure that replacing the backing file will not
         create a cycle in the node graph (i.e. you cannot reach the parent
         from the new backing file).
      
       - In bdrv_reopen_commit(): perform the actual node replacement by
         calling bdrv_set_backing_hd().
      
      There may be temporary implicit nodes between a BDS and its backing
      file (e.g. a commit filter node). In these cases bdrv_reopen_prepare()
      looks for the real (non-implicit) backing file and requires that the
      'backing' option points to it. Replacing or detaching a backing file
      is forbidden if there are implicit nodes in the middle.
      
      Although x-blockdev-reopen is meant to be used like blockdev-add,
      there's an important thing that must be taken into account: the only
      way to set a new backing file is by using a reference to an existing
      node (previously added with e.g. blockdev-add).  If 'backing' contains
      a dictionary with a new set of options ({"driver": "qcow2", "file": {
      ... }}) then it is interpreted that the _existing_ backing file must
      be reopened with those options.
      Signed-off-by: NAlberto Garcia <berto@igalia.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      cb828c31
    • A
      block: Allow omitting the 'backing' option in certain cases · bacd9b87
      Alberto Garcia 提交于
      Of all options of type BlockdevRef used to specify children in
      BlockdevOptions, 'backing' is the only one that is optional.
      
      For "x-blockdev-reopen" we want that if an option is omitted then it
      must be reset to its default value. The default value of 'backing'
      means that QEMU opens the backing file specified in the image
      metadata, but this is not something that we want to support for the
      reopen operation.
      
      Because of this the 'backing' option has to be specified during
      reopen, pointing to the existing backing file if we want to keep it,
      or pointing to a different one (or NULL) if we want to replace it (to
      be implemented in a subsequent patch).
      
      In order to simplify things a bit and not to require that the user
      passes the 'backing' option to every single block device even when
      it's clearly not necessary, this patch allows omitting this option if
      the block device being reopened doesn't have a backing file attached
      _and_ no default backing file is specified in the image metadata.
      Signed-off-by: NAlberto Garcia <berto@igalia.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      bacd9b87
    • A
      block: Handle child references in bdrv_reopen_queue() · 8546632e
      Alberto Garcia 提交于
      Children in QMP are specified with BlockdevRef / BlockdevRefOrNull,
      which can contain a set of child options, a child reference, or
      NULL. In optional attributes like "backing" it can also be missing.
      
      Only the first case (set of child options) is being handled properly
      by bdrv_reopen_queue(). This patch deals with all the others.
      
      Here's how these cases should be handled when bdrv_reopen_queue() is
      deciding what to do with each child of a BlockDriverState:
      
         1) Set of child options: if the child was implicitly created (i.e
            inherits_from points to the parent) then the options are removed
            from the parent's options QDict and are passed to the child with
            a recursive bdrv_reopen_queue() call. This case was already
            working fine.
      
         2) Child reference: there's two possibilites here.
      
            2a) Reference to the current child: if the child was implicitly
                created then it is put in the reopen queue, keeping its
                current set of options (since this was a child reference
                there was no way to specify a different set of options).
                If the child is not implicit then it keeps its current set
                of options but it is not reopened (and therefore does not
                inherit any new option from the parent).
      
            2b) Reference to a different BDS: the current child is not put
                in the reopen queue at all. Passing a reference to a
                different BDS can be used to replace a child, although at
                the moment no driver implements this, so it results in an
                error. In any case, the current child is not going to be
                reopened (and might in fact disappear if it's replaced)
      
         3) NULL: This is similar to (2b). Although no driver allows this
            yet it can be used to detach the current child so it should not
            be put in the reopen queue.
      
         4) Missing option: at the moment "backing" is the only case where
            this can happen. With "blockdev-add", leaving "backing" out
            means that the default backing file is opened. We don't want to
            open a new image during reopen, so we require that "backing" is
            always present. We'll relax this requirement a bit in the next
            patch. If keep_old_opts is true and "backing" is missing then
            this behaves like 2a (the current child is reopened).
      Signed-off-by: NAlberto Garcia <berto@igalia.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      8546632e
    • A
      block: Add 'keep_old_opts' parameter to bdrv_reopen_queue() · 077e8e20
      Alberto Garcia 提交于
      The bdrv_reopen_queue() function is used to create a queue with
      the BDSs that are going to be reopened and their new options. Once
      the queue is ready bdrv_reopen_multiple() is called to perform the
      operation.
      
      The original options from each one of the BDSs are kept, with the new
      options passed to bdrv_reopen_queue() applied on top of them.
      
      For "x-blockdev-reopen" we want a function that behaves much like
      "blockdev-add". We want to ignore the previous set of options so that
      only the ones actually specified by the user are applied, with the
      rest having their default values.
      
      One of the things that we need is a way to tell bdrv_reopen_queue()
      whether we want to keep the old set of options or not, and that's what
      this patch does. All current callers are setting this new parameter to
      true and x-blockdev-reopen will set it to false.
      Signed-off-by: NAlberto Garcia <berto@igalia.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      077e8e20
    • A
      block: Allow freezing BdrvChild links · 2cad1ebe
      Alberto Garcia 提交于
      Our permission system is useful to define what operations are allowed
      on a certain block node and includes things like BLK_PERM_WRITE or
      BLK_PERM_RESIZE among others.
      
      One of the permissions is BLK_PERM_GRAPH_MOD which allows "changing
      the node that this BdrvChild points to". The exact meaning of this has
      never been very clear, but it can be understood as "change any of the
      links connected to the node". This can be used to prevent changing a
      backing link, but it's too coarse.
      
      This patch adds a new 'frozen' attribute to BdrvChild, which forbids
      detaching the link from the node it points to, and new API to freeze
      and unfreeze a backing chain.
      
      After this change a few functions can fail, so they need additional
      checks.
      Signed-off-by: NAlberto Garcia <berto@igalia.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      2cad1ebe
    • K
      file-posix: Fix bdrv_open_flags() for snapshot=on · 30855137
      Kevin Wolf 提交于
      Using a different read-only setting for bs->open_flags than for the
      flags to the driver's open function is just inconsistent and a bad idea.
      After this patch, the temporary snapshot keeps being opened read-only if
      read-only=on,snapshot=on is passed.
      
      If we wanted to change this behaviour to make only the orginal image
      file read-only, but the temporary overlay read-write (as the comment in
      the removed code suggests), that change would have to be made in
      bdrv_temp_snapshot_options() (where the comment suggests otherwise).
      
      Addressing this inconsistency before introducing dynamic auto-read-only
      is important because otherwise we would immediately try to reopen the
      temporary overlay even though the file is already unlinked.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      30855137
    • K
      block: Make permission changes in reopen less wrong · 69b736e7
      Kevin Wolf 提交于
      The way that reopen interacts with permission changes has one big
      problem: Both operations are recursive, and the permissions are changes
      for each node in the reopen queue.
      
      For a simple graph that consists just of parent and child,
      .bdrv_check_perm will be called twice for the child, once recursively
      when adjusting the permissions of parent, and once again when the child
      itself is reopened.
      
      Even worse, the first .bdrv_check_perm call happens before
      .bdrv_reopen_prepare was called for the child and the second one is
      called afterwards.
      
      Making sure that .bdrv_check_perm (and the other permission callbacks)
      are called only once is hard. We can cope with multiple calls right now,
      but as soon as file-posix gets a dynamic auto-read-only that may need to
      open a new file descriptor, we get the additional requirement that all
      of them are after the .bdrv_reopen_prepare call.
      
      So reorder things in bdrv_reopen_multiple() to first call
      .bdrv_reopen_prepare for all involved nodes and only then adjust
      permissions.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      69b736e7
    • K
      block: Avoid useless local_err · a4615ab3
      Kevin Wolf 提交于
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NAlberto Garcia <berto@igalia.com>
      Reviewed-by: NEric Blake <eblake@redhat.com>
      a4615ab3
  7. 08 3月, 2019 1 次提交
    • A
      block: iterate_format with account of whitelisting · 9ac404c5
      Andrey Shinkevich 提交于
      bdrv_iterate_format (which is currently only used for printing out the
      formats supported by the block layer) doesn't take format whitelisting
      into account.
      
      This creates a problem for tests: they enumerate supported formats to
      decide which tests to enable, but then discover that QEMU doesn't let
      them actually use some of those formats.
      
      To avoid that, exclude formats that are not whitelisted from
      enumeration, if whitelisting is in use.  Since we have separate
      whitelists for r/w and r/o, take this a parameter to
      bdrv_iterate_format, and print two lists of supported formats (r/w and
      r/o) in main qemu.
      Signed-off-by: NRoman Kagan <rkagan@virtuozzo.com>
      Signed-off-by: NAndrey Shinkevich <andrey.shinkevich@virtuozzo.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      9ac404c5
  8. 25 2月, 2019 22 次提交
    • M
      block: BDS options may lack the "driver" option · 62a01a27
      Max Reitz 提交于
      When BDSs are created by qemu itself (e.g. as filters in block jobs),
      they may not have a "driver" option in their options QDict.  When
      generating a json:{} filename, however, it must always be present.
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      Reviewed-by: NAlberto Garcia <berto@igalia.com>
      Message-id: 20190201192935.18394-31-mreitz@redhat.com
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      62a01a27
    • M
      block: Do not copy exact_filename from format file · fb695c74
      Max Reitz 提交于
      If a format BDS's file BDS is in turn a format BDS, we cannot simply use
      the same filename, because when opening a BDS tree based on a filename
      alone, qemu will create only one format node on top of one protocol node
      (disregarding a potential backing file).
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      Reviewed-by: NAlberto Garcia <berto@igalia.com>
      Message-id: 20190201192935.18394-26-mreitz@redhat.com
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      fb695c74
    • M
      block: Purify .bdrv_refresh_filename() · 998b3a1e
      Max Reitz 提交于
      Currently, BlockDriver.bdrv_refresh_filename() is supposed to both
      refresh the filename (BDS.exact_filename) and set BDS.full_open_options.
      Now that we have generic code in the central bdrv_refresh_filename() for
      creating BDS.full_open_options, we can drop the latter part from all
      BlockDriver.bdrv_refresh_filename() implementations.
      
      This also means that we can drop all of the existing default code for
      this from the global bdrv_refresh_filename() itself.
      
      Furthermore, we now have to call BlockDriver.bdrv_refresh_filename()
      after having set BDS.full_open_options, because the block driver's
      implementation should now be allowed to depend on BDS.full_open_options
      being set correctly.
      
      Finally, with this patch we can drop the @options parameter from
      BlockDriver.bdrv_refresh_filename(); also, add a comment on this
      function's purpose in block/block_int.h while touching its interface.
      
      This completely obsoletes blklogwrite's implementation of
      .bdrv_refresh_filename().
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      Message-id: 20190201192935.18394-25-mreitz@redhat.com
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      998b3a1e
    • M
      block: Generically refresh runtime options · 97e2f021
      Max Reitz 提交于
      Instead of having every block driver which implements
      bdrv_refresh_filename() copy all of the strong runtime options over to
      bs->full_open_options, implement this process generically in
      bdrv_refresh_filename().
      
      This patch only adds this new generic implementation, it does not remove
      the old functionality. This is done in a follow-up patch.
      
      With this patch, some superfluous information (that should never have
      been there) may be removed from some JSON filenames, as can be seen in
      the change to iotests 110's and 228's reference outputs.
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      Message-id: 20190201192935.18394-24-mreitz@redhat.com
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      97e2f021
    • M
      block: Use bdrv_dirname() for relative filenames · 8df68616
      Max Reitz 提交于
      bdrv_get_full_backing_filename_from_filename() breaks down when it comes
      to JSON filenames. Using bdrv_dirname() as the basis is better because
      since we have BDS, we can descend through the BDS tree to the protocol
      layer, which gives us a greater probability of finding a non-JSON name;
      also, bdrv_dirname() is more correct as it allows block drivers to
      override the generation of that directory name in a protocol-specific
      way.
      
      We still need to keep bdrv_get_full_backing_filename_from_filename(),
      though, because it has valid callers which need it during image creation
      when no BDS is available yet.
      
      This makes a test case in qemu-iotest 110, which was supposed to fail,
      work. That is actually good, but we need to change the reference output
      (and the comment in 110) accordingly.
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      Reviewed-by: NAlberto Garcia <berto@igalia.com>
      Message-id: 20190201192935.18394-20-mreitz@redhat.com
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      8df68616
    • M
      block: Add bdrv_dirname() · 1e89d0f9
      Max Reitz 提交于
      This function may be implemented by block drivers to derive a directory
      name from a BDS. Concatenating this g_free()-able string with a relative
      filename must result in a valid (not necessarily existing) filename, so
      this is a function that should generally be not implemented by format
      drivers, because this is protocol-specific.
      
      If a BDS's driver does not implement this function, bdrv_dirname() will
      fall through to the BDS's file if it exists. If it does not, the
      exact_filename field will be used to generate a directory name.
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      Reviewed-by: NAlberto Garcia <berto@igalia.com>
      Message-id: 20190201192935.18394-15-mreitz@redhat.com
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      1e89d0f9
    • M
      block: Fix bdrv_find_backing_image() · 2d9158ce
      Max Reitz 提交于
      bdrv_find_backing_image() should use bdrv_get_full_backing_filename() or
      bdrv_make_absolute_filename() instead of trying to do what those
      functions do by itself.
      
      path_combine_deprecated() can now be dropped, so let's do that.
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      Reviewed-by: NAlberto Garcia <berto@igalia.com>
      Message-id: 20190201192935.18394-14-mreitz@redhat.com
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      2d9158ce
    • M
      block: Add bdrv_make_absolute_filename() · 9f4793d8
      Max Reitz 提交于
      This is a general function for making a filename that is relative to a
      certain BDS absolute.
      
      It calls bdrv_get_full_backing_filename_from_filename() for now, but
      that will be changed in a follow-up patch.
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      Reviewed-by: NAlberto Garcia <berto@igalia.com>
      Message-id: 20190201192935.18394-13-mreitz@redhat.com
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      9f4793d8
    • M
      block: bdrv_get_full_backing_filename's ret. val. · 6b6833c1
      Max Reitz 提交于
      Make bdrv_get_full_backing_filename() return an allocated string instead
      of placing the result in a caller-provided buffer.
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      Reviewed-by: NAlberto Garcia <berto@igalia.com>
      Message-id: 20190201192935.18394-12-mreitz@redhat.com
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      6b6833c1
    • M
      block: bdrv_get_full_backing_filename_from_...'s ret. val. · 645ae7d8
      Max Reitz 提交于
      Make bdrv_get_full_backing_filename_from_filename() return an allocated
      string instead of placing the result in a caller-provided buffer.
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      Message-id: 20190201192935.18394-11-mreitz@redhat.com
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      645ae7d8
    • M
      block: Make path_combine() return the path · 009b03aa
      Max Reitz 提交于
      Besides being safe for arbitrary path lengths, after some follow-up
      patches all callers will want a freshly allocated buffer anyway.
      
      In the meantime, path_combine_deprecated() is added which has the same
      interface as path_combine() had before this patch. All callers to that
      function will be converted in follow-up patches.
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      Reviewed-by: NAlberto Garcia <berto@igalia.com>
      Reviewed-by: NKevin Wolf <kwolf@redhat.com>
      Message-id: 20190201192935.18394-10-mreitz@redhat.com
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      009b03aa
    • M
      block: Respect backing bs in bdrv_refresh_filename · 90993623
      Max Reitz 提交于
      Basically, bdrv_refresh_filename() should respect all children of a
      BlockDriverState. However, generally those children are driver-specific,
      so this function cannot handle the general case. On the other hand,
      there are only few drivers which use other children than @file and
      @backing (that being vmdk, quorum, and blkverify).
      
      Most block drivers only use @file and/or @backing (if they use any
      children at all). Both can be implemented directly in
      bdrv_refresh_filename.
      
      The user overriding the file's filename is already handled, however, the
      user overriding the backing file is not. If this is done, opening the
      BDS with the plain filename of its file will not be correct, so we may
      not set bs->exact_filename in that case.
      
      iotest 051 contains test cases for overriding the backing file, and so
      its output changes with this patch applied.
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      Reviewed-by: NAlberto Garcia <berto@igalia.com>
      Message-id: 20190201192935.18394-6-mreitz@redhat.com
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      90993623
    • M
      block: Add BDS.auto_backing_file · 998c2019
      Max Reitz 提交于
      If the backing file is overridden, this most probably does change the
      guest-visible data of a BDS.  Therefore, we will need to consider this
      in bdrv_refresh_filename().
      
      To see whether it has been overridden, we might want to compare
      bs->backing_file and bs->backing->bs->filename.  However,
      bs->backing_file is changed by bdrv_set_backing_hd() (which is just used
      to change the backing child at runtime, without modifying the image
      header), so bs->backing_file most of the time simply contains a copy of
      bs->backing->bs->filename anyway, so it is useless for such a
      comparison.
      
      This patch adds an auto_backing_file BDS field which contains the
      backing file path as indicated by the image header, which is not changed
      by bdrv_set_backing_hd().
      
      Because of bdrv_refresh_filename() magic, however, a BDS's filename may
      differ from what has been specified during bdrv_open().  Then, the
      comparison between bs->auto_backing_file and bs->backing->bs->filename
      may fail even though bs->backing was opened from bs->auto_backing_file.
      To mitigate this, we can copy the real BDS's filename (after the whole
      bdrv_open() and bdrv_refresh_filename() process) into
      bs->auto_backing_file, if we know the former has been opened based on
      the latter.  This is only possible if no options modifying the backing
      file's behavior have been specified, though.  To simplify things, this
      patch only copies the filename from the backing file if no options have
      been specified for it at all.
      
      Furthermore, there are cases where an overlay is created by qemu which
      already contains a BDS's filename (e.g. in blockdev-snapshot-sync).  We
      do not need to worry about updating the overlay's bs->auto_backing_file
      there, because we actually wrote a post-bdrv_refresh_filename() filename
      into the image header.
      
      So all in all, there will be false negatives where (as of a future
      patch) bdrv_refresh_filename() will assume that the backing file differs
      from what was specified in the image header, even though it really does
      not.  However, these cases should be limited to where (1) the user
      actually did override something in the backing chain (e.g. by specifying
      options for the backing file), or (2) the user executed a QMP command to
      change some node's backing file (e.g. change-backing-file or
      block-commit with @backing-file given) where the given filename does not
      happen to coincide with qemu's idea of the backing BDS's filename.
      
      Then again, (1) really is limited to -drive.  With -blockdev or
      blockdev-add, you have to adhere to the schema, so a user cannot give
      partial "unimportant" options (e.g. by just setting backing.node-name
      and leaving the rest to the image header).  Therefore, trying to fix
      this would mean trying to fix something for -drive only.
      
      To improve on (2), we would need a full infrastructure to "canonicalize"
      an arbitrary filename (+ options), so it can be compared against
      another.  That seems a bit over the top, considering that filenames
      nowadays are there mostly for the user's entertainment.
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      Reviewed-by: NEric Blake <eblake@redhat.com>
      Reviewed-by: NAlberto Garcia <berto@igalia.com>
      Message-id: 20190201192935.18394-5-mreitz@redhat.com
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      998c2019
    • M
      block: Skip implicit nodes for filename info · bb808d5f
      Max Reitz 提交于
      bdrv_refresh_filename() should simply skip all implicit nodes.  They are
      supposed to be invisible to the user, so they should not appear in
      filename information.
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      Reviewed-by: NEric Blake <eblake@redhat.com>
      Reviewed-by: NAlberto Garcia <berto@igalia.com>
      Message-id: 20190201192935.18394-4-mreitz@redhat.com
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      bb808d5f
    • M
      block: Use children list in bdrv_refresh_filename · e24518e3
      Max Reitz 提交于
      bdrv_refresh_filename() should invoke itself recursively on all
      children, not just on file.
      
      With that change, we can remove the manual invocations in blkverify,
      quorum, commit, mirror, and blklogwrites.
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      Reviewed-by: NEric Blake <eblake@redhat.com>
      Reviewed-by: NAlberto Garcia <berto@igalia.com>
      Message-id: 20190201192935.18394-3-mreitz@redhat.com
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      e24518e3
    • M
      block: Use bdrv_refresh_filename() to pull · f30c66ba
      Max Reitz 提交于
      Before this patch, bdrv_refresh_filename() is used in a pushing manner:
      Whenever the BDS graph is modified, the parents of the modified edges
      are supposed to be updated (recursively upwards).  However, that is
      nonviable, considering that we want child changes not to concern
      parents.
      
      Also, in the long run we want a pull model anyway: Here, we would have a
      bdrv_filename() function which returns a BDS's filename, freshly
      constructed.
      
      This patch is an intermediate step.  It adds bdrv_refresh_filename()
      calls before every place a BDS.filename value is used.  The only
      exceptions are protocol drivers that use their own filename, which
      clearly would not profit from refreshing that filename before.
      
      Also, bdrv_get_encrypted_filename() is removed along the way (as a user
      of BDS.filename), since it is completely unused.
      
      In turn, all of the calls to bdrv_refresh_filename() before this patch
      are removed, because we no longer have to call this function on graph
      changes.
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      Message-id: 20190201192935.18394-2-mreitz@redhat.com
      Reviewed-by: NEric Blake <eblake@redhat.com>
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      f30c66ba
    • V
      block: fix bdrv_check_perm for non-tree subgraph · f962e961
      Vladimir Sementsov-Ogievskiy 提交于
      bdrv_check_perm in it's recursion checks each node in context of new
      permissions for one parent, because of nature of DFS. It works well,
      while children subgraph of top-most updated node is a tree, i.e. it
      doesn't have any kind of loops. But if we have a loop (not oriented,
      of course), i.e. we have two different ways from top-node to some
      child-node, then bdrv_check_perm will do wrong thing:
      
        top
        | \
        |  |
        v  v
        A  B
        |  |
        v  v
        node
      
      It will once check new permissions of node in context of new A
      permissions and old B permissions and once visa-versa. It's a wrong way
      and may lead to corruption of permission system. We may start with
      no-permissions and all-shared for both A->node and B->node relations
      and finish up with non shared write permission for both ways.
      
      The following commit will add a test, which shows this bug.
      
      To fix this situation, let's really set BdrvChild permissions during
      bdrv_check_perm procedure. And we are happy here, as check-perm is
      already written in transaction manner, so we just need to restore
      backed-up permissions in _abort.
      Signed-off-by: NVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      f962e961
    • V
      block: improve should_update_child · 2f30b7c3
      Vladimir Sementsov-Ogievskiy 提交于
      As it already said in the comment, we don't want to create loops in
      parent->child relations. So, when we try to append @to to @c, we should
      check that @c is not in @to children subtree, and we should check it
      recursively, not only the first level. The patch provides BFS-based
      search, to check the relations.
      
      This is needed for further fleecing-hook filter usage: we need to
      append it to source, when the hook is already a parent of target, and
      source may be in a backing chain of target (fleecing-scheme). So, on
      appending, the hook should not became a child (direct or through
      children subtree) of the target.
      Signed-off-by: NVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      2f30b7c3
    • K
      block: Use normal drain for bdrv_set_aio_context() · d70d5954
      Kevin Wolf 提交于
      Now that bdrv_set_aio_context() works inside drained sections, it can
      also use the real drain function instead of open coding something
      similar.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      d70d5954
    • K
      block: Fix AioContext switch for drained node · e64f25f3
      Kevin Wolf 提交于
      When a drained node changes its AioContext, we need to move its
      aio_disable_external() to the new context, too.
      
      Without this fix, drain_end will try to reenable the new context, which
      has never been disabled, so an assertion failure is triggered.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NEric Blake <eblake@redhat.com>
      e64f25f3
    • K
      block: Don't poll in bdrv_set_aio_context() · 6c75d761
      Kevin Wolf 提交于
      The explicit aio_poll() call in bdrv_set_aio_context() was added in
      commit c2b6428d as a workaround for bdrv_drain() failing to achieve
      to actually quiesce everything (specifically the NBD client code to
      switch AioContext).
      
      Now that the NBD client has been fixed to complete this operation during
      bdrv_drain(), we don't need the workaround any more.
      
      It was wrong anyway: aio_poll() must always be run in the home thread of
      the AioContext.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NEric Blake <eblake@redhat.com>
      6c75d761
    • D
      block: don't set the same context · 57830a49
      Denis Plotnikov 提交于
      Adds a fast path on aio context setting preventing
      unnecessary context setting routine.
      Also, it prevents issues with cyclic walk of child
      bds-es appeared because of registering aio walking
      notifiers:
      
      Call stack:
      
      0  __GI_raise
      1  __GI_abort
      2  __assert_fail_base
      3  __GI___assert_fail
      4  bdrv_detach_aio_context (bs=0x55f54d65c000)      <<<
      5  bdrv_detach_aio_context (bs=0x55f54fc8a800)
      6  bdrv_set_aio_context (bs=0x55f54fc8a800, ...)
      7  block_job_attached_aio_context
      8  bdrv_attach_aio_context (bs=0x55f54d65c000, ...) <<<
      9  bdrv_set_aio_context (bs=0x55f54d65c000)
      10 blk_set_aio_context
      11 virtio_blk_data_plane_stop
      12 virtio_bus_stop_ioeventfd
      13 virtio_vmstate_change
      14 vm_state_notify (running=0, state=RUN_STATE_SHUTDOWN)
      15 do_vm_stop (state=RUN_STATE_SHUTDOWN, send_stop=true)
      16 vm_stop (state=RUN_STATE_SHUTDOWN)
      17 main_loop_should_exit
      18 main_loop
      19 main
      
      This can happen because of "new" context attachment to VM disk bds.
      When attaching a new context the corresponding aio context handler is
      called for each of aio_notifiers registered on the VM disk bds context.
      Among those handlers, there is the block_job_attached_aio_context handler
      which sets a new aio context for the block job bds. When doing so,
      the old context is detached from all the block job bds children and one of
      them is the VM disk bds, serving as backing store for the blockjob bds,
      although the VM disk bds is actually the initializer of that process.
      Since the VM disk bds is protected with walking_aio_notifiers flag
      from double processing in recursive calls, the assert fires.
      Signed-off-by: NDenis Plotnikov <dplotnikov@virtuozzo.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      57830a49
  9. 12 2月, 2019 1 次提交
  10. 01 2月, 2019 1 次提交
    • K
      block: Fix invalidate_cache error path for parent activation · 78fc3b3a
      Kevin Wolf 提交于
      bdrv_co_invalidate_cache() clears the BDRV_O_INACTIVE flag before
      actually activating a node so that the correct permissions etc. are
      taken. In case of errors, the flag must be restored so that the next
      call to bdrv_co_invalidate_cache() retries activation.
      
      Restoring the flag was missing in the error path for a failed
      parent->role->activate() call. The consequence is that this attempt to
      activate all images correctly fails because we still set errp, however
      on the next attempt BDRV_O_INACTIVE is already clear, so we return
      success without actually retrying the failed action.
      
      An example where this is observable in practice is migration to a QEMU
      instance that has a raw format block node attached to a guest device
      with share-rw=off (the default) while another process holds
      BLK_PERM_WRITE for the same image. In this case, all activation steps
      before parent->role->activate() succeed because raw can tolerate other
      writers to the image. Only the parent callback (in particular
      blk_root_activate()) tries to implement the share-rw=on property and
      requests exclusive write permissions. This fails when the migration
      completes and correctly displays an error. However, a manual 'cont' will
      incorrectly resume the VM without calling blk_root_activate() again.
      
      This case is described in more detail in the following bug report:
      https://bugzilla.redhat.com/show_bug.cgi?id=1531888
      
      Fix this by correctly restoring the BDRV_O_INACTIVE flag in the error
      path.
      
      Cc: qemu-stable@nongnu.org
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Tested-by: NMarkus Armbruster <armbru@redhat.com>
      Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>
      78fc3b3a