1. 09 2月, 2018 6 次提交
  2. 22 12月, 2017 5 次提交
    • K
      block: Keep nodes drained between reopen_queue/multiple · 1a63a907
      Kevin Wolf 提交于
      The bdrv_reopen*() implementation doesn't like it if the graph is
      changed between queuing nodes for reopen and actually reopening them
      (one of the reasons is that queuing can be recursive).
      
      So instead of draining the device only in bdrv_reopen_multiple(),
      require that callers already drained all affected nodes, and assert this
      in bdrv_reopen_queue().
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NFam Zheng <famz@redhat.com>
      1a63a907
    • K
      block: Allow graph changes in subtree drained section · d736f119
      Kevin Wolf 提交于
      We need to remember how many of the drain sections in which a node is
      were recursive (i.e. subtree drain rather than node drain), so that they
      can be correctly applied when children are added or removed during the
      drained section.
      
      With this change, it is safe to modify the graph even inside a
      bdrv_subtree_drained_begin/end() section.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      d736f119
    • K
      block: Don't notify parents in drain call chain · 0152bf40
      Kevin Wolf 提交于
      This is in preparation for subtree drains, i.e. drained sections that
      affect not only a single node, but recursively all child nodes, too.
      
      Calling the parent callbacks for drain is pointless when we just came
      from that parent node recursively and leads to multiple increases of
      bs->quiesce_counter in a single drain call. Don't do it.
      
      In order for this to work correctly, the parent callback must be called
      for every bdrv_drain_begin/end() call, not only for the outermost one:
      
      If we have a node N with two parents A and B, recursive draining of A
      should cause the quiesce_counter of B to increase because its child N is
      drained independently of B. If now B is recursively drained, too, A must
      increase its quiesce_counter because N is drained independently of A
      only now, even if N is going from quiesce_counter 1 to 2.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      0152bf40
    • F
      block: Open backing image in force share mode for size probe · cc954f01
      Fam Zheng 提交于
      Management tools create overlays of running guests with qemu-img:
      
        $ qemu-img create -b /image/in/use.qcow2 -f qcow2 /overlay/image.qcow2
      
      but this doesn't work anymore due to image locking:
      
          qemu-img: /overlay/image.qcow2: Failed to get shared "write" lock
          Is another process using the image?
          Could not open backing image to determine size.
      Use the force share option to allow this use case again.
      
      Cc: qemu-stable@nongnu.org
      Signed-off-by: NFam Zheng <famz@redhat.com>
      Reviewed-by: NEric Blake <eblake@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      cc954f01
    • K
      block: Formats don't need CONSISTENT_READ with NO_IO · 5fbfabd3
      Kevin Wolf 提交于
      Commit 1f4ad7d3 fixed 'qemu-img info' for raw images that are currently
      in use as a mirror target. It is not enough for image formats, though,
      as these still unconditionally request BLK_PERM_CONSISTENT_READ.
      
      As this permission is geared towards whether the guest-visible data is
      consistent, and has no impact on whether the metadata is sane, and
      'qemu-img info' does not read guest-visible data (except for the raw
      format), it makes sense to not require BLK_PERM_CONSISTENT_READ if there
      is not going to be any guest I/O performed, regardless of image format.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      5fbfabd3
  3. 19 12月, 2017 1 次提交
    • P
      block: avoid recursive AioContext acquire in bdrv_inactivate_all() · bd6458e4
      Paolo Bonzini 提交于
      BDRV_POLL_WHILE() does not support recursive AioContext locking.  It
      only releases the AioContext lock once regardless of how many times the
      caller has acquired it.  This results in a hang since the IOThread does
      not make progress while the AioContext is still locked.
      
      The following steps trigger the hang:
      
        $ qemu-system-x86_64 -M accel=kvm -m 1G -cpu host \
                             -object iothread,id=iothread0 \
                             -device virtio-scsi-pci,iothread=iothread0 \
                             -drive if=none,id=drive0,file=test.img,format=raw \
                             -device scsi-hd,drive=drive0 \
                             -drive if=none,id=drive1,file=test.img,format=raw \
                             -device scsi-hd,drive=drive1
        $ qemu-system-x86_64 ...same options... \
                             -incoming tcp::1234
        (qemu) migrate tcp:127.0.0.1:1234
        ...hang...
      Tested-by: NStefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: NEric Blake <eblake@redhat.com>
      Message-id: 20171207201320.19284-2-stefanha@redhat.com
      Signed-off-by: NStefan Hajnoczi <stefanha@redhat.com>
      bd6458e4
  4. 21 11月, 2017 2 次提交
    • A
      block: Close a BlockDriverState completely even when bs->drv is NULL · 50a3efb0
      Alberto Garcia 提交于
      bdrv_close() skips much of its logic when bs->drv is NULL. This is
      fine when we're closing a BlockDriverState that has just been created
      (because e.g the initialization process failed), but it's not enough
      in other cases.
      
      For example, when a valid qcow2 image is found to be corrupted then
      QEMU marks it as such in the file header and then sets bs->drv to
      NULL in order to make the BlockDriverState unusable. When that BDS is
      later closed then many of its data structures are not freed (leaking
      their memory) and none of its children are detached. This results in
      bdrv_close_all() failing to close all BDSs and making this assertion
      fail when QEMU is being shut down:
      
         bdrv_close_all: Assertion `QTAILQ_EMPTY(&all_bdrv_states)' failed.
      
      This patch makes bdrv_close() do the full uninitialization process
      in all cases. This fixes the problem with corrupted images and still
      works fine with freshly created BDSs.
      Signed-off-by: NAlberto Garcia <berto@igalia.com>
      Message-id: 20171106145345.12038-1-berto@igalia.com
      Reviewed-by: NEric Blake <eblake@redhat.com>
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      50a3efb0
    • K
      block: Don't use BLK_PERM_CONSISTENT_READ for format probing · dacaa162
      Kevin Wolf 提交于
      For format probing, we don't really care whether all of the image
      content is consistent. The only thing we're looking at is the image
      header, and specifically the magic numbers that are expected to never
      change, no matter how inconsistent the guest visible disk content is.
      
      Therefore, don't request BLK_PERM_CONSISTENT_READ. This allows to use
      format probing, e.g. in the context of 'qemu-img info', even while the
      guest visible data in the image is inconsistent during a running block
      job.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NFam Zheng <famz@redhat.com>
      dacaa162
  5. 18 11月, 2017 4 次提交
    • M
      block: Make bdrv_next() keep strong references · 5e003f17
      Max Reitz 提交于
      On one hand, it is a good idea for bdrv_next() to return a strong
      reference because ideally nearly every pointer should be refcounted.
      This fixes intermittent failure of iotest 194.
      
      On the other, it is absolutely necessary for bdrv_next() itself to keep
      a strong reference to both the BB (in its first phase) and the BDS (at
      least in the second phase) because when called the next time, it will
      dereference those objects to get a link to the next one.  Therefore, it
      needs these objects to stay around until then.  Just storing the pointer
      to the next in the iterator is not really viable because that pointer
      might become invalid as well.
      
      Both arguments taken together means we should probably just invoke
      bdrv_ref() and blk_ref() in bdrv_next().  This means we have to assert
      that bdrv_next() is always called from the main loop, but that was
      probably necessary already before this patch and judging from the
      callers, it also looks to actually be the case.
      
      Keeping these strong references means however that callers need to give
      them up if they decide to abort the iteration early.  They can do so
      through the new bdrv_next_cleanup() function.
      Suggested-by: NKevin Wolf <kwolf@redhat.com>
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      Message-id: 20171110172545.32609-1-mreitz@redhat.com
      Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      5e003f17
    • M
      block: Guard against NULL bs->drv · d470ad42
      Max Reitz 提交于
      We currently do not guard everywhere against a NULL bs->drv where we
      should be doing so.  Most of the places fixed here just do not care
      about that case at all.
      
      Some care implicitly, e.g. through a prior function call to
      bdrv_getlength() which would always fail for an ejected BDS.  Add an
      assert there to make it more obvious.
      
      Other places seem to care, but do so insufficiently: Freeing clusters in
      a qcow2 image is an error-free operation, but it may leave the image in
      an unusable state anyway.  Giving qcow2_free_clusters() an error code is
      not really viable, it is much easier to note that bs->drv may be NULL
      even after a successful driver call.  This concerns bdrv_co_flush(), and
      the way the check is added to bdrv_co_pdiscard() (in every iteration
      instead of only once).
      
      Finally, some places employ at least an assert(bs->drv); somewhere, that
      may be reasonable (such as in the reopen code), but in
      bdrv_has_zero_init(), it is definitely not.  Returning 0 there in case
      of an ejected BDS saves us much headache instead.
      Reported-by: NR. Nageswara Sastry <nasastry@in.ibm.com>
      Buglink: https://bugs.launchpad.net/qemu/+bug/1728660Signed-off-by: NMax Reitz <mreitz@redhat.com>
      Message-id: 20171110203111.7666-4-mreitz@redhat.com
      Reviewed-by: NEric Blake <eblake@redhat.com>
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      d470ad42
    • M
      block: qobject_is_equal() in bdrv_reopen_prepare() · 54fd1b0d
      Max Reitz 提交于
      Currently, bdrv_reopen_prepare() assumes that all BDS options are
      strings. However, this is not the case if the BDS has been created
      through the json: pseudo-protocol or blockdev-add.
      
      Note that the user-invokable reopen command is an HMP command, so you
      can only specify strings there. Therefore, specifying a non-string
      option with the "same" value as it was when originally created will now
      return an error because the values are supposedly similar (and there is
      no way for the user to circumvent this but to just not specify the
      option again -- however, this is still strictly better than just
      crashing).
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      Reviewed-by: NEric Blake <eblake@redhat.com>
      Message-id: 20171114180128.17076-5-mreitz@redhat.com
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      54fd1b0d
    • K
      block: Fix permissions in image activation · dafe0960
      Kevin Wolf 提交于
      Inactive images generally request less permissions for their image files
      than they would if they were active (in particular, write permissions).
      Activating the image involves extending the permissions, therefore.
      
      drv->bdrv_invalidate_cache() can already require write access to the
      image file, so we have to update the permissions earlier than that.
      The current code does it only later, so we have to move up this part.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NMax Reitz <mreitz@redhat.com>
      dafe0960
  6. 17 11月, 2017 2 次提交
  7. 26 10月, 2017 1 次提交
    • P
      block: don't add 'driver' to options when referring to backing via node name · 6bff597b
      Peter Krempa 提交于
      When referring to a backing file of an image via node name
      bdrv_open_backing_file would add the 'driver' option to the option list
      filling it with the backing format driver. This breaks construction of
      the backing chain via -blockdev, as bdrv_open_inherit reports an error
      if both 'reference' and 'options' are provided.
      
      $ qemu-img create -f raw /tmp/backing.raw 64M
      $ qemu-img create -f qcow2 -F raw -b /tmp/backing.raw /tmp/test.qcow2
      $ qemu-system-x86_64 \
        -blockdev driver=file,filename=/tmp/backing.raw,node-name=backing \
        -blockdev driver=qcow2,file.driver=file,file.filename=/tmp/test.qcow2,node-name=root,backing=backing
      qemu-system-x86_64: -blockdev driver=qcow2,file.driver=file,file.filename=/tmp/test.qcow2,node-name=root,backing=backing: Could not open backing file: Cannot reference an existing block device with additional options or a new filename
      Signed-off-by: NPeter Krempa <pkrempa@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      6bff597b
  8. 06 10月, 2017 5 次提交
    • K
      commit: Remove overlay_bs · bde70715
      Kevin Wolf 提交于
      We don't need to make any assumptions about the graph layout above the
      top node of the commit operation any more. Remove the use of
      bdrv_find_overlay() and related variables from the commit job code.
      
      bdrv_drop_intermediate() doesn't use the 'active' parameter any more, so
      we can just drop it.
      
      The overlay node was previously added to the block job to get a
      BLK_PERM_GRAPH_MOD. We really need to respect those permissions in
      bdrv_drop_intermediate() now, but as long as we haven't figured out yet
      how BLK_PERM_GRAPH_MOD is actually supposed to work, just leave a TODO
      comment there.
      
      With this change, it is now possible to perform another block job on an
      overlay node without conflicts. qemu-iotests 030 is changed accordingly.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NEric Blake <eblake@redhat.com>
      bde70715
    • K
      commit: Support multiple roots above top node · 61f09cea
      Kevin Wolf 提交于
      This changes the commit block job to support operation in a graph where
      there is more than a single active layer that references the top node.
      
      This involves inserting the commit filter node not only on the path
      between the given active node and the top node, but between the top node
      and all of its parents.
      
      On completion, bdrv_drop_intermediate() must consider all parents for
      updating the backing file link. These parents may be backing files
      themselves and as such read-only; reopen them temporarily if necessary.
      Previously this was achieved by the bdrv_reopen() calls in the commit
      block job that made overlay_bs read-write for the whole duration of the
      block job, even though write access is only needed on completion.
      
      Now that we consider all parents, overlay_bs is meaningless. It is left
      in place in this commit, but we'll remove it soon.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      61f09cea
    • K
      block: Introduce BdrvChildRole.update_filename · 6858eba0
      Kevin Wolf 提交于
      There is no good reason for bdrv_drop_intermediate() to know the active
      layer above the subchain it is operating on - even more so, because
      the assumption that there is a single active layer above it is not
      generally true.
      
      In order to prepare removal of the active parameter, use a BdrvChildRole
      callback to update the backing file string in the overlay image instead
      of directly calling bdrv_change_backing_file().
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NEric Blake <eblake@redhat.com>
      6858eba0
    • E
      dirty-bitmap: Avoid size query failure during truncate · 1b6cc579
      Eric Blake 提交于
      We've previously fixed several places where we failed to account
      for possible errors from bdrv_nb_sectors().  Fix another one by
      making bdrv_dirty_bitmap_truncate() take the new size from the
      caller instead of querying itself; then adjust the sole caller
      bdrv_truncate() to pass the size just determined by a successful
      resize, or to reuse the size given to the original truncate
      operation when refresh_total_sectors() was not able to confirm the
      actual size (the two sizes can potentially differ according to
      rounding constraints), thus avoiding sizing the bitmaps to -1.
      This also fixes a bug where not all failure paths in
      bdrv_truncate() would set errp.
      
      Note that bdrv_truncate() is still a bit awkward.  We may want
      to revisit it later and clean up things to better guarantee that
      a resize attempt either fails cleanly up front, or cannot fail
      after guest-visible changes have been made (if temporary changes
      are made, then they need to be cleanly rolled back).  But that
      is a task for another day; for now, the goal is the bare minimum
      fix to ensure that just bdrv_dirty_bitmap_truncate() cannot fail.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Reviewed-by: NVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Reviewed-by: NJohn Snow <jsnow@redhat.com>
      Reviewed-by: NFam Zheng <famz@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      1b6cc579
    • E
      block: Make bdrv_img_create() size selection easier to read · a8b42a1c
      Eric Blake 提交于
      All callers of bdrv_img_create() pass in a size, or -1 to read the
      size from the backing file.  We then set that size as the QemuOpt
      default, which means we will reuse that default rather than the
      final parameter to qemu_opt_get_size() several lines later.  But
      it is rather confusing to read subsequent checks of 'size == -1'
      when it looks (without seeing the full context) like size defaults
      to 0; it also doesn't help that a size of 0 is valid (for some
      formats).
      
      Rework the logic to make things more legible.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Reviewed-by: NJohn Snow <jsnow@redhat.com>
      Reviewed-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NFam Zheng <famz@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      a8b42a1c
  9. 26 9月, 2017 5 次提交
  10. 05 9月, 2017 2 次提交
  11. 04 9月, 2017 3 次提交
  12. 23 8月, 2017 1 次提交
  13. 08 8月, 2017 3 次提交