1. 25 2月, 2019 15 次提交
    • M
      block: Add bdrv_make_absolute_filename() · 9f4793d8
      Max Reitz 提交于
      This is a general function for making a filename that is relative to a
      certain BDS absolute.
      
      It calls bdrv_get_full_backing_filename_from_filename() for now, but
      that will be changed in a follow-up patch.
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      Reviewed-by: NAlberto Garcia <berto@igalia.com>
      Message-id: 20190201192935.18394-13-mreitz@redhat.com
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      9f4793d8
    • M
      block: bdrv_get_full_backing_filename's ret. val. · 6b6833c1
      Max Reitz 提交于
      Make bdrv_get_full_backing_filename() return an allocated string instead
      of placing the result in a caller-provided buffer.
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      Reviewed-by: NAlberto Garcia <berto@igalia.com>
      Message-id: 20190201192935.18394-12-mreitz@redhat.com
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      6b6833c1
    • M
      block: bdrv_get_full_backing_filename_from_...'s ret. val. · 645ae7d8
      Max Reitz 提交于
      Make bdrv_get_full_backing_filename_from_filename() return an allocated
      string instead of placing the result in a caller-provided buffer.
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      Message-id: 20190201192935.18394-11-mreitz@redhat.com
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      645ae7d8
    • M
      block: Make path_combine() return the path · 009b03aa
      Max Reitz 提交于
      Besides being safe for arbitrary path lengths, after some follow-up
      patches all callers will want a freshly allocated buffer anyway.
      
      In the meantime, path_combine_deprecated() is added which has the same
      interface as path_combine() had before this patch. All callers to that
      function will be converted in follow-up patches.
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      Reviewed-by: NAlberto Garcia <berto@igalia.com>
      Reviewed-by: NKevin Wolf <kwolf@redhat.com>
      Message-id: 20190201192935.18394-10-mreitz@redhat.com
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      009b03aa
    • M
      block: Respect backing bs in bdrv_refresh_filename · 90993623
      Max Reitz 提交于
      Basically, bdrv_refresh_filename() should respect all children of a
      BlockDriverState. However, generally those children are driver-specific,
      so this function cannot handle the general case. On the other hand,
      there are only few drivers which use other children than @file and
      @backing (that being vmdk, quorum, and blkverify).
      
      Most block drivers only use @file and/or @backing (if they use any
      children at all). Both can be implemented directly in
      bdrv_refresh_filename.
      
      The user overriding the file's filename is already handled, however, the
      user overriding the backing file is not. If this is done, opening the
      BDS with the plain filename of its file will not be correct, so we may
      not set bs->exact_filename in that case.
      
      iotest 051 contains test cases for overriding the backing file, and so
      its output changes with this patch applied.
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      Reviewed-by: NAlberto Garcia <berto@igalia.com>
      Message-id: 20190201192935.18394-6-mreitz@redhat.com
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      90993623
    • M
      block: Add BDS.auto_backing_file · 998c2019
      Max Reitz 提交于
      If the backing file is overridden, this most probably does change the
      guest-visible data of a BDS.  Therefore, we will need to consider this
      in bdrv_refresh_filename().
      
      To see whether it has been overridden, we might want to compare
      bs->backing_file and bs->backing->bs->filename.  However,
      bs->backing_file is changed by bdrv_set_backing_hd() (which is just used
      to change the backing child at runtime, without modifying the image
      header), so bs->backing_file most of the time simply contains a copy of
      bs->backing->bs->filename anyway, so it is useless for such a
      comparison.
      
      This patch adds an auto_backing_file BDS field which contains the
      backing file path as indicated by the image header, which is not changed
      by bdrv_set_backing_hd().
      
      Because of bdrv_refresh_filename() magic, however, a BDS's filename may
      differ from what has been specified during bdrv_open().  Then, the
      comparison between bs->auto_backing_file and bs->backing->bs->filename
      may fail even though bs->backing was opened from bs->auto_backing_file.
      To mitigate this, we can copy the real BDS's filename (after the whole
      bdrv_open() and bdrv_refresh_filename() process) into
      bs->auto_backing_file, if we know the former has been opened based on
      the latter.  This is only possible if no options modifying the backing
      file's behavior have been specified, though.  To simplify things, this
      patch only copies the filename from the backing file if no options have
      been specified for it at all.
      
      Furthermore, there are cases where an overlay is created by qemu which
      already contains a BDS's filename (e.g. in blockdev-snapshot-sync).  We
      do not need to worry about updating the overlay's bs->auto_backing_file
      there, because we actually wrote a post-bdrv_refresh_filename() filename
      into the image header.
      
      So all in all, there will be false negatives where (as of a future
      patch) bdrv_refresh_filename() will assume that the backing file differs
      from what was specified in the image header, even though it really does
      not.  However, these cases should be limited to where (1) the user
      actually did override something in the backing chain (e.g. by specifying
      options for the backing file), or (2) the user executed a QMP command to
      change some node's backing file (e.g. change-backing-file or
      block-commit with @backing-file given) where the given filename does not
      happen to coincide with qemu's idea of the backing BDS's filename.
      
      Then again, (1) really is limited to -drive.  With -blockdev or
      blockdev-add, you have to adhere to the schema, so a user cannot give
      partial "unimportant" options (e.g. by just setting backing.node-name
      and leaving the rest to the image header).  Therefore, trying to fix
      this would mean trying to fix something for -drive only.
      
      To improve on (2), we would need a full infrastructure to "canonicalize"
      an arbitrary filename (+ options), so it can be compared against
      another.  That seems a bit over the top, considering that filenames
      nowadays are there mostly for the user's entertainment.
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      Reviewed-by: NEric Blake <eblake@redhat.com>
      Reviewed-by: NAlberto Garcia <berto@igalia.com>
      Message-id: 20190201192935.18394-5-mreitz@redhat.com
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      998c2019
    • M
      block: Skip implicit nodes for filename info · bb808d5f
      Max Reitz 提交于
      bdrv_refresh_filename() should simply skip all implicit nodes.  They are
      supposed to be invisible to the user, so they should not appear in
      filename information.
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      Reviewed-by: NEric Blake <eblake@redhat.com>
      Reviewed-by: NAlberto Garcia <berto@igalia.com>
      Message-id: 20190201192935.18394-4-mreitz@redhat.com
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      bb808d5f
    • M
      block: Use children list in bdrv_refresh_filename · e24518e3
      Max Reitz 提交于
      bdrv_refresh_filename() should invoke itself recursively on all
      children, not just on file.
      
      With that change, we can remove the manual invocations in blkverify,
      quorum, commit, mirror, and blklogwrites.
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      Reviewed-by: NEric Blake <eblake@redhat.com>
      Reviewed-by: NAlberto Garcia <berto@igalia.com>
      Message-id: 20190201192935.18394-3-mreitz@redhat.com
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      e24518e3
    • M
      block: Use bdrv_refresh_filename() to pull · f30c66ba
      Max Reitz 提交于
      Before this patch, bdrv_refresh_filename() is used in a pushing manner:
      Whenever the BDS graph is modified, the parents of the modified edges
      are supposed to be updated (recursively upwards).  However, that is
      nonviable, considering that we want child changes not to concern
      parents.
      
      Also, in the long run we want a pull model anyway: Here, we would have a
      bdrv_filename() function which returns a BDS's filename, freshly
      constructed.
      
      This patch is an intermediate step.  It adds bdrv_refresh_filename()
      calls before every place a BDS.filename value is used.  The only
      exceptions are protocol drivers that use their own filename, which
      clearly would not profit from refreshing that filename before.
      
      Also, bdrv_get_encrypted_filename() is removed along the way (as a user
      of BDS.filename), since it is completely unused.
      
      In turn, all of the calls to bdrv_refresh_filename() before this patch
      are removed, because we no longer have to call this function on graph
      changes.
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      Message-id: 20190201192935.18394-2-mreitz@redhat.com
      Reviewed-by: NEric Blake <eblake@redhat.com>
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      f30c66ba
    • V
      block: fix bdrv_check_perm for non-tree subgraph · f962e961
      Vladimir Sementsov-Ogievskiy 提交于
      bdrv_check_perm in it's recursion checks each node in context of new
      permissions for one parent, because of nature of DFS. It works well,
      while children subgraph of top-most updated node is a tree, i.e. it
      doesn't have any kind of loops. But if we have a loop (not oriented,
      of course), i.e. we have two different ways from top-node to some
      child-node, then bdrv_check_perm will do wrong thing:
      
        top
        | \
        |  |
        v  v
        A  B
        |  |
        v  v
        node
      
      It will once check new permissions of node in context of new A
      permissions and old B permissions and once visa-versa. It's a wrong way
      and may lead to corruption of permission system. We may start with
      no-permissions and all-shared for both A->node and B->node relations
      and finish up with non shared write permission for both ways.
      
      The following commit will add a test, which shows this bug.
      
      To fix this situation, let's really set BdrvChild permissions during
      bdrv_check_perm procedure. And we are happy here, as check-perm is
      already written in transaction manner, so we just need to restore
      backed-up permissions in _abort.
      Signed-off-by: NVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      f962e961
    • V
      block: improve should_update_child · 2f30b7c3
      Vladimir Sementsov-Ogievskiy 提交于
      As it already said in the comment, we don't want to create loops in
      parent->child relations. So, when we try to append @to to @c, we should
      check that @c is not in @to children subtree, and we should check it
      recursively, not only the first level. The patch provides BFS-based
      search, to check the relations.
      
      This is needed for further fleecing-hook filter usage: we need to
      append it to source, when the hook is already a parent of target, and
      source may be in a backing chain of target (fleecing-scheme). So, on
      appending, the hook should not became a child (direct or through
      children subtree) of the target.
      Signed-off-by: NVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      2f30b7c3
    • K
      block: Use normal drain for bdrv_set_aio_context() · d70d5954
      Kevin Wolf 提交于
      Now that bdrv_set_aio_context() works inside drained sections, it can
      also use the real drain function instead of open coding something
      similar.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      d70d5954
    • K
      block: Fix AioContext switch for drained node · e64f25f3
      Kevin Wolf 提交于
      When a drained node changes its AioContext, we need to move its
      aio_disable_external() to the new context, too.
      
      Without this fix, drain_end will try to reenable the new context, which
      has never been disabled, so an assertion failure is triggered.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NEric Blake <eblake@redhat.com>
      e64f25f3
    • K
      block: Don't poll in bdrv_set_aio_context() · 6c75d761
      Kevin Wolf 提交于
      The explicit aio_poll() call in bdrv_set_aio_context() was added in
      commit c2b6428d as a workaround for bdrv_drain() failing to achieve
      to actually quiesce everything (specifically the NBD client code to
      switch AioContext).
      
      Now that the NBD client has been fixed to complete this operation during
      bdrv_drain(), we don't need the workaround any more.
      
      It was wrong anyway: aio_poll() must always be run in the home thread of
      the AioContext.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NEric Blake <eblake@redhat.com>
      6c75d761
    • D
      block: don't set the same context · 57830a49
      Denis Plotnikov 提交于
      Adds a fast path on aio context setting preventing
      unnecessary context setting routine.
      Also, it prevents issues with cyclic walk of child
      bds-es appeared because of registering aio walking
      notifiers:
      
      Call stack:
      
      0  __GI_raise
      1  __GI_abort
      2  __assert_fail_base
      3  __GI___assert_fail
      4  bdrv_detach_aio_context (bs=0x55f54d65c000)      <<<
      5  bdrv_detach_aio_context (bs=0x55f54fc8a800)
      6  bdrv_set_aio_context (bs=0x55f54fc8a800, ...)
      7  block_job_attached_aio_context
      8  bdrv_attach_aio_context (bs=0x55f54d65c000, ...) <<<
      9  bdrv_set_aio_context (bs=0x55f54d65c000)
      10 blk_set_aio_context
      11 virtio_blk_data_plane_stop
      12 virtio_bus_stop_ioeventfd
      13 virtio_vmstate_change
      14 vm_state_notify (running=0, state=RUN_STATE_SHUTDOWN)
      15 do_vm_stop (state=RUN_STATE_SHUTDOWN, send_stop=true)
      16 vm_stop (state=RUN_STATE_SHUTDOWN)
      17 main_loop_should_exit
      18 main_loop
      19 main
      
      This can happen because of "new" context attachment to VM disk bds.
      When attaching a new context the corresponding aio context handler is
      called for each of aio_notifiers registered on the VM disk bds context.
      Among those handlers, there is the block_job_attached_aio_context handler
      which sets a new aio context for the block job bds. When doing so,
      the old context is detached from all the block job bds children and one of
      them is the VM disk bds, serving as backing store for the blockjob bds,
      although the VM disk bds is actually the initializer of that process.
      Since the VM disk bds is protected with walking_aio_notifiers flag
      from double processing in recursive calls, the assert fires.
      Signed-off-by: NDenis Plotnikov <dplotnikov@virtuozzo.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      57830a49
  2. 12 2月, 2019 1 次提交
  3. 01 2月, 2019 3 次提交
    • K
      block: Fix invalidate_cache error path for parent activation · 78fc3b3a
      Kevin Wolf 提交于
      bdrv_co_invalidate_cache() clears the BDRV_O_INACTIVE flag before
      actually activating a node so that the correct permissions etc. are
      taken. In case of errors, the flag must be restored so that the next
      call to bdrv_co_invalidate_cache() retries activation.
      
      Restoring the flag was missing in the error path for a failed
      parent->role->activate() call. The consequence is that this attempt to
      activate all images correctly fails because we still set errp, however
      on the next attempt BDRV_O_INACTIVE is already clear, so we return
      success without actually retrying the failed action.
      
      An example where this is observable in practice is migration to a QEMU
      instance that has a raw format block node attached to a guest device
      with share-rw=off (the default) while another process holds
      BLK_PERM_WRITE for the same image. In this case, all activation steps
      before parent->role->activate() succeed because raw can tolerate other
      writers to the image. Only the parent callback (in particular
      blk_root_activate()) tries to implement the share-rw=on property and
      requests exclusive write permissions. This fails when the migration
      completes and correctly displays an error. However, a manual 'cont' will
      incorrectly resume the VM without calling blk_root_activate() again.
      
      This case is described in more detail in the following bug report:
      https://bugzilla.redhat.com/show_bug.cgi?id=1531888
      
      Fix this by correctly restoring the BDRV_O_INACTIVE flag in the error
      path.
      
      Cc: qemu-stable@nongnu.org
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Tested-by: NMarkus Armbruster <armbru@redhat.com>
      Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>
      78fc3b3a
    • K
      block: Apply auto-read-only for ro-whitelist drivers · 8be25de6
      Kevin Wolf 提交于
      If QEMU was configured with a driver in --block-drv-ro-whitelist, trying
      to use that driver read-write resulted in an error message even if
      auto-read-only=on was set.
      
      Consider auto-read-only=on for the whitelist checking and use it to
      automatically degrade to read-only for block drivers on the read-only
      whitelist.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NEric Blake <eblake@redhat.com>
      Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>
      8be25de6
    • K
      block: Fix hangs in synchronous APIs with iothreads · 4720cbee
      Kevin Wolf 提交于
      In the block layer, synchronous APIs are often implemented by creating a
      coroutine that calls the asynchronous coroutine-based implementation and
      then waiting for completion with BDRV_POLL_WHILE().
      
      For this to work with iothreads (more specifically, when the synchronous
      API is called in a thread that is not the home thread of the block
      device, so that the coroutine will run in a different thread), we must
      make sure to call aio_wait_kick() at the end of the operation. Many
      places are missing this, so that BDRV_POLL_WHILE() keeps hanging even if
      the condition has long become false.
      
      Note that bdrv_dec_in_flight() involves an aio_wait_kick() call. This
      corresponds to the BDRV_POLL_WHILE() in the drain functions, but it is
      generally not enough for most other operations because they haven't set
      the return value in the coroutine entry stub yet. To avoid race
      conditions there, we need to kick after setting the return value.
      
      The race window is small enough that the problem doesn't usually surface
      in the common path. However, it does surface and causes easily
      reproducible hangs if the operation can return early before even calling
      bdrv_inc/dec_in_flight, which many of them do (trivial error or no-op
      success paths).
      
      The bug in bdrv_truncate(), bdrv_check() and bdrv_invalidate_cache() is
      slightly different: These functions even neglected to schedule the
      coroutine in the home thread of the node. This avoids the hang, but is
      obviously wrong, too. Fix those to schedule the coroutine in the right
      AioContext in addition to adding aio_wait_kick() calls.
      
      Cc: qemu-stable@nongnu.org
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>
      4720cbee
  4. 31 1月, 2019 1 次提交
  5. 14 12月, 2018 7 次提交
  6. 27 11月, 2018 1 次提交
    • K
      block: Don't inactivate children before parents · 9e37271f
      Kevin Wolf 提交于
      bdrv_child_cb_inactivate() asserts that parents are already inactive
      when children get inactivated. This precondition is necessary because
      parents could still issue requests in their inactivation code.
      
      When block nodes are created individually with -blockdev, all of them
      are monitor owned and will be returned by bdrv_next() in an undefined
      order (in practice, in the order of their creation, which is usually
      children before parents), which obviously fails the assertion:
      
      qemu: block.c:899: bdrv_child_cb_inactivate: Assertion `bs->open_flags & BDRV_O_INACTIVE' failed.
      
      This patch fixes the ordering by skipping nodes with still active
      parents in bdrv_inactivate_recurse() because we know that they will be
      covered by recursion when the last active parent becomes inactive.
      
      With the correct parents-before-children ordering, we also got rid of
      the reason why commit aad0b7a0 introduced two passes, so we can go
      back to a single-pass recursion. This is necessary so we can rely on the
      BDRV_O_INACTIVE flag to skip nodes with active parents (the flag used
      to be set only in pass 2, so we would always skip non-root nodes in
      pass 1 because all parents would still be considered active; setting the
      flag in pass 1 would mean, that we never skip anything in pass 2 because
      all parents are already considered inactive).
      
      Because of the change to single pass, this patch is best reviewed with
      whitespace changes ignored.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NMax Reitz <mreitz@redhat.com>
      9e37271f
  7. 23 11月, 2018 2 次提交
    • A
      block: Update BlockDriverState.inherits_from on bdrv_drop_intermediate() · 6bd858b3
      Alberto Garcia 提交于
      The previous patch fixed the inherits_from pointer after block-stream,
      and this one does the same for block-commit.
      
      When block-commit finishes and the 'top' node is not the topmost one
      from the backing chain then all nodes above 'base' up to and including
      'top' are removed from the chain.
      
      The bdrv_drop_intermediate() call converts a chain like this one:
      
          base <- intermediate <- top <- active
      
      into this one:
      
          base <- active
      
      In a simple scenario each backing file from the first chain has the
      inherits_from attribute pointing to its parent. This means that
      reopening 'active' will recursively reopen all its children, whose
      options can be changed in the process.
      
      However after the 'block-commit' call base.inherits_from is NULL and
      the chain is broken, so 'base' does not inherit from 'active' and will
      not be reopened automatically:
      
         $ qemu-img create -f qcow2 hd0.qcow2 1M
         $ qemu-img create -f qcow2 -b hd0.qcow2 hd1.qcow2
         $ qemu-img create -f qcow2 -b hd1.qcow2 hd2.qcow2
         $ $QEMU -drive if=none,file=hd2.qcow2
      
         { 'execute': 'block-commit',
           'arguments': {
             'device': 'none0',
             'top': 'hd1.qcow2' } }
      
         { 'execute': 'human-monitor-command',
           'arguments': {
              'command-line':
                'qemu-io none0 "reopen -o backing.l2-cache-size=2M"' } }
      
         { "return": "Cannot change the option 'backing.l2-cache-size'\r\n"}
      
      This patch updates base.inherits_from in this scenario, and adds a
      test case.
      Signed-off-by: NAlberto Garcia <berto@igalia.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      6bd858b3
    • A
      block: Update BlockDriverState.inherits_from on bdrv_set_backing_hd() · 0065c455
      Alberto Garcia 提交于
      When a BlockDriverState's child is opened (be it a backing file, the
      protocol layer, or any other) inherits_from is set to point to the
      parent node. Children opened separately and then attached to a parent
      don't have this pointer set.
      
      bdrv_reopen_queue_child() uses this to determine whether a node's
      children must also be reopened inheriting the options from the parent
      or not. If inherits_from points to the parent then the child is
      reopened and its options can be changed, like in this example:
      
         $ qemu-img create -f qcow2 hd0.qcow2 1M
         $ qemu-img create -f qcow2 hd1.qcow2 1M
         $ $QEMU -drive if=none,node-name=hd0,file=hd0.qcow2,\
                        backing.driver=qcow2,backing.file.filename=hd1.qcow2
         (qemu) qemu-io hd0 "reopen -o backing.l2-cache-size=2M"
      
      If the child does not inherit from the parent then it does not get
      reopened and its options cannot be changed:
      
         $ $QEMU -drive if=none,node-name=hd1,file=hd1.qcow2
                 -drive if=none,node-name=hd0,file=hd0.qcow2,backing=hd1
         (qemu) qemu-io hd0 "reopen -o backing.l2-cache-size=2M"
         Cannot change the option 'backing.l2-cache-size'
      
      If a disk image has a chain of backing files then all of them are also
      connected through their inherits_from pointers (i.e. it's possible to
      walk the chain in reverse order from base to top).
      
      However this is broken if the intermediate nodes are removed using
      e.g. block-stream because the inherits_from pointer from the base node
      becomes NULL:
      
         $ qemu-img create -f qcow2 hd0.qcow2 1M
         $ qemu-img create -f qcow2 -b hd0.qcow2 hd1.qcow2
         $ qemu-img create -f qcow2 -b hd1.qcow2 hd2.qcow2
         $ $QEMU -drive if=none,file=hd2.qcow2
         (qemu) qemu-io none0 "reopen -o backing.l2-cache-size=2M"
         (qemu) block_stream none0 0 hd0.qcow2
         (qemu) qemu-io none0 "reopen -o backing.l2-cache-size=2M"
         Cannot change the option 'backing.l2-cache-size'
      
      This patch updates the inherits_from pointer if the intermediate nodes
      of a backing chain are removed using bdrv_set_backing_hd(), and adds a
      test case for this scenario.
      Signed-off-by: NAlberto Garcia <berto@igalia.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      0065c455
  8. 22 11月, 2018 1 次提交
  9. 19 11月, 2018 1 次提交
    • M
      block: Always abort reopen after prepare succeeded · 9ad08c44
      Max Reitz 提交于
      bdrv_reopen_multiple() does not invoke bdrv_reopen_abort() for the
      element of the reopen queue for which bdrv_reopen_prepare() failed,
      because it assumes that the prepare function will have rolled back all
      changes already.
      
      However, bdrv_reopen_prepare() does not do this in every case: It may
      notice an error after BlockDriver.bdrv_reopen_prepare() succeeded, and
      it will not invoke BlockDriver.bdrv_reopen_abort() then; and neither
      will bdrv_reopen_multiple(), as explained above.
      
      This is wrong because we must always call .bdrv_reopen_commit() or
      .bdrv_reopen_abort() after .bdrv_reopen_prepare() has succeeded.
      Otherwise, the block driver has no chance to undo what it has done in
      its implementation of .bdrv_reopen_prepare().
      
      To fix this, bdrv_reopen_prepare() has to call .bdrv_reopen_abort() if
      it wants to return an error after .bdrv_reopen_prepare() has succeeded.
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      Reviewed-by: NAlberto Garcia <berto@igalia.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      9ad08c44
  10. 05 11月, 2018 4 次提交
    • K
      block: Require auto-read-only for existing fallbacks · eaa2410f
      Kevin Wolf 提交于
      Some block drivers have traditionally changed their node to read-only
      mode without asking the user. This behaviour has been marked deprecated
      since 2.11, expecting users to provide an explicit read-only=on option.
      
      Now that we have auto-read-only=on, enable these drivers to make use of
      the option.
      
      This is the only use of bdrv_set_read_only(), so we can make it a bit
      more specific and turn it into a bdrv_apply_auto_read_only() that is
      more convenient for drivers to use.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NEric Blake <eblake@redhat.com>
      eaa2410f
    • K
      block: Add auto-read-only option · e35bdc12
      Kevin Wolf 提交于
      If a management application builds the block graph node by node, the
      protocol layer doesn't inherit its read-only option from the format
      layer any more, so it must be set explicitly.
      
      Backing files should work on read-only storage, but at the same time, a
      block job like commit should be able to reopen them read-write if they
      are on read-write storage. However, without option inheritance, reopen
      only changes the read-only option for the root node (typically the
      format layer), but not the protocol layer, so reopening fails (the
      format layer wants to get write permissions, but the protocol layer is
      still read-only).
      
      A simple workaround for the problem in the management tool would be to
      open the protocol layer always read-write and to make only the format
      layer read-only for backing files. However, sometimes the file is
      actually stored on read-only storage and we don't know whether the image
      can be opened read-write (for example, for NBD it depends on the server
      we're trying to connect to). This adds an option that makes QEMU try to
      open the image read-write, but allows it to degrade to a read-only mode
      without returning an error.
      
      The documentation for this option is consciously phrased in a way that
      allows QEMU to switch to a better model eventually: Instead of trying
      when the image is first opened, making the read-only flag dynamic and
      changing it automatically whenever the first BLK_PERM_WRITE user is
      attached or the last one is detached would be much more useful
      behaviour.
      
      Unfortunately, this more useful behaviour is also a lot harder to
      implement, and libvirt needs a solution now before it can switch to
      -blockdev, so let's start with this easier approach for now.
      
      Instead of adding a new auto-read-only option, turning the existing
      read-only into an enum (with a bool alternate for compatibility) was
      considered, but it complicated the implementation to the point that it
      didn't seem to be worth it.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NEric Blake <eblake@redhat.com>
      e35bdc12
    • K
      block: Update flags in bdrv_set_read_only() · eeae6a59
      Kevin Wolf 提交于
      To fully change the read-only state of a node, we must not only change
      bs->read_only, but also update bs->open_flags.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NEric Blake <eblake@redhat.com>
      Reviewed-by: NAlberto Garcia <berto@igalia.com>
      eeae6a59
    • A
      415bbca8
  11. 30 10月, 2018 1 次提交
    • V
      dirty-bitmaps: clean-up bitmaps loading and migration logic · 9c98f145
      Vladimir Sementsov-Ogievskiy 提交于
      This patch aims to bring the following behavior:
      
      1. We don't load bitmaps, when started in inactive mode. It's the case
      of incoming migration. In this case we wait for bitmaps migration
      through migration channel (if 'dirty-bitmaps' capability is enabled) or
      for invalidation (to load bitmaps from the image).
      
      2. We don't remove persistent bitmaps on inactivation. Instead, we only
      remove bitmaps after storing. This is the only way to restore bitmaps,
      if we decided to resume source after [failed] migration with
      'dirty-bitmaps' capability enabled (which means, that bitmaps were not
      stored).
      
      3. We load bitmaps on open and any invalidation, it's ok for all cases:
        - normal open
        - migration target invalidation with dirty-bitmaps capability
          (bitmaps are migrating through migration channel, the are not
           stored, so they should have IN_USE flag set and will be skipped
           when loading. However, it would fail if bitmaps are read-only[1])
        - migration target invalidation without dirty-bitmaps capability
          (normal load of the bitmaps, if migrated with shared storage)
        - source invalidation with dirty-bitmaps capability
          (skip because IN_USE)
        - source invalidation without dirty-bitmaps capability
          (bitmaps were dropped, reload them)
      
      [1]: to accurately handle this, migration of read-only bitmaps is
           explicitly forbidden in this patch.
      
      New mechanism for not storing bitmaps when migrate with dirty-bitmaps
      capability is introduced: migration filed in BdrvDirtyBitmap.
      Signed-off-by: NVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Signed-off-by: NJohn Snow <jsnow@redhat.com>
      9c98f145
  12. 19 10月, 2018 2 次提交
    • M
      block: Clean up bdrv_img_create()'s error reporting · da7e92ca
      Markus Armbruster 提交于
      bdrv_img_create() takes an Error ** argument and uses it in the
      conventional way, except for one place: when qemu_opts_do_parse()
      fails, it first reports its error to stderr or the HMP monitor with
      error_report_err(), then error_setg()'s a generic error.
      
      When the caller reports that second error similarly, this produces two
      consecutive error messages on stderr or the HMP monitor.
      
      When the caller does something else with it, such as send it via QMP,
      the first error still goes to stderr or the HMP monitor.  Fortunately,
      no such caller exists.
      
      Simply use the first error as is.  Update expected output of
      qemu-iotest 049 accordingly.
      
      Cc: Kevin Wolf <kwolf@redhat.com>
      Cc: Max Reitz <mreitz@redhat.com>
      Signed-off-by: NMarkus Armbruster <armbru@redhat.com>
      Reviewed-by: NEric Blake <eblake@redhat.com>
      Message-Id: <20181017082702.5581-37-armbru@redhat.com>
      Reviewed-by: NKevin Wolf <kwolf@redhat.com>
      da7e92ca
    • M
      error: Fix use of error_prepend() with &error_fatal, &error_abort · 4b576648
      Markus Armbruster 提交于
      From include/qapi/error.h:
      
        * Pass an existing error to the caller with the message modified:
        *     error_propagate(errp, err);
        *     error_prepend(errp, "Could not frobnicate '%s': ", name);
      
      Fei Li pointed out that doing error_propagate() first doesn't work
      well when @errp is &error_fatal or &error_abort: the error_prepend()
      is never reached.
      
      Since I doubt fixing the documentation will stop people from getting
      it wrong, introduce error_propagate_prepend(), in the hope that it
      lures people away from using its constituents in the wrong order.
      Update the instructions in error.h accordingly.
      
      Convert existing error_prepend() next to error_propagate to
      error_propagate_prepend().  If any of these get reached with
      &error_fatal or &error_abort, the error messages improve.  I didn't
      check whether that's the case anywhere.
      
      Cc: Fei Li <fli@suse.com>
      Signed-off-by: NMarkus Armbruster <armbru@redhat.com>
      Reviewed-by: NPhilippe Mathieu-Daudé <philmd@redhat.com>
      Reviewed-by: NEric Blake <eblake@redhat.com>
      Message-Id: <20181017082702.5581-2-armbru@redhat.com>
      4b576648
  13. 01 10月, 2018 1 次提交