1. 01 10月, 2018 1 次提交
    • A
      block: Remove child references from bs->{options,explicit_options} · 50196d7a
      Alberto Garcia 提交于
      Block drivers allow opening their children using a reference to an
      existing BlockDriverState. These references remain stored in the
      'options' and 'explicit_options' QDicts, but we don't need to keep
      them once everything is open.
      
      What is more important, these values can become wrong if the children
      change:
      
          $ qemu-img create -f qcow2 hd0.qcow2 10M
          $ qemu-img create -f qcow2 hd1.qcow2 10M
          $ qemu-img create -f qcow2 hd2.qcow2 10M
          $ $QEMU -drive if=none,file=hd0.qcow2,node-name=hd0 \
                  -drive if=none,file=hd1.qcow2,node-name=hd1,backing=hd0 \
                  -drive file=hd2.qcow2,node-name=hd2,backing=hd1
      
      After this hd2 has hd1 as its backing file. Now let's remove it using
      block_stream:
      
          (qemu) block_stream hd2 0 hd0.qcow2
      
      Now hd0 is the backing file of hd2, but hd2's options QDicts still
      contain backing=hd1.
      Signed-off-by: NAlberto Garcia <berto@igalia.com>
      Reviewed-by: NMax Reitz <mreitz@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      50196d7a
  2. 25 9月, 2018 2 次提交
    • K
      block: Use a single global AioWait · cfe29d82
      Kevin Wolf 提交于
      When draining a block node, we recurse to its parent and for subtree
      drains also to its children. A single AIO_WAIT_WHILE() is then used to
      wait for bdrv_drain_poll() to become true, which depends on all of the
      nodes we recursed to. However, if the respective child or parent becomes
      quiescent and calls bdrv_wakeup(), only the AioWait of the child/parent
      is checked, while AIO_WAIT_WHILE() depends on the AioWait of the
      original node.
      
      Fix this by using a single AioWait for all callers of AIO_WAIT_WHILE().
      
      This may mean that the draining thread gets a few more unnecessary
      wakeups because an unrelated operation got completed, but we already
      wake it up when something _could_ have changed rather than only if it
      has certainly changed.
      
      Apart from that, drain is a slow path anyway. In theory it would be
      possible to use wakeups more selectively and still correctly, but the
      gains are likely not worth the additional complexity. In fact, this
      patch is a nice simplification for some places in the code.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NEric Blake <eblake@redhat.com>
      Reviewed-by: NMax Reitz <mreitz@redhat.com>
      cfe29d82
    • A
      block: Fix use after free error in bdrv_open_inherit() · 8961be33
      Alberto Garcia 提交于
      When a block device is opened with BDRV_O_SNAPSHOT and the
      bdrv_append_temp_snapshot() call fails then the error code path tries
      to unref the already destroyed 'options' QDict.
      
      This can be reproduced easily by setting TMPDIR to a location where
      the QEMU process can't write:
      
         $ TMPDIR=/nonexistent $QEMU -drive driver=null-co,snapshot=on
      Signed-off-by: NAlberto Garcia <berto@igalia.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      8961be33
  3. 25 8月, 2018 1 次提交
    • M
      qjson: Have qobject_from_json() & friends reject empty and blank · dd98e848
      Markus Armbruster 提交于
      The last case where qobject_from_json() & friends return null without
      setting an error is empty or blank input.  Callers:
      
      * block.c's parse_json_protocol() reports "Could not parse the JSON
        options".  It's marked as a work-around, because it also covered
        actual bugs, but they got fixed in the previous few commits.
      
      * qobject_input_visitor_new_str() reports "JSON parse error".  Also
        marked as work-around.  The recent fixes have made this unreachable,
        because it currently gets called only for input starting with '{'.
      
      * check-qjson.c's empty_input() and blank_input() demonstrate the
        behavior.
      
      * The other callers are not affected since they only pass input with
        exactly one JSON value or, in the case of negative tests, one error.
      
      Fail with "Expecting a JSON value" instead of returning null, and
      simplify callers.
      Signed-off-by: NMarkus Armbruster <armbru@redhat.com>
      Reviewed-by: NEric Blake <eblake@redhat.com>
      Message-Id: <20180823164025.12553-48-armbru@redhat.com>
      dd98e848
  4. 15 8月, 2018 5 次提交
    • A
      block: Simplify append_open_options() · 261dbcb1
      Alberto Garcia 提交于
      This function returns a BDS's driver-specific options, excluding also
      those from its children. Since we have just removed all children
      options from bs->options there's no need to do this last step.
      
      We allow references to children, though ("backing": "node0"), so those
      we still have to remove.
      Signed-off-by: NAlberto Garcia <berto@igalia.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      261dbcb1
    • A
      block: Update bs->options if bdrv_reopen() succeeds · 4c8350fe
      Alberto Garcia 提交于
      If bdrv_reopen() succeeds then bs->explicit_options is updated with
      the new values, but bs->options never changes.
      
      Here's an example:
      
         { "execute": "blockdev-add",
           "arguments": {
             "driver": "qcow2",
             "node-name": "hd0",
             "overlap-check": "all",
             "file": {
               "driver": "file",
               "filename": "hd0.qcow2"
             }
           }
         }
      
      After this, both bs->options and bs->explicit_options contain
      "overlap-check": "all".
      
      Now let's change that using qemu-io's reopen command:
      
         (qemu) qemu-io hd0 "reopen -o overlap-check=none"
      
      After this, bs->explicit_options contains the new value but
      bs->options still keeps the old one.
      
      This patch updates bs->options after a BDS has been successfully
      reopened.
      Signed-off-by: NAlberto Garcia <berto@igalia.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      4c8350fe
    • A
      block: Simplify bdrv_reopen_abort() · 1bab38e7
      Alberto Garcia 提交于
      If a bdrv_reopen_multiple() call fails, then the explicit_options
      QDict has to be deleted for every entry in the reopen queue. This must
      happen regardless of whether that entry's bdrv_reopen_prepare() call
      succeeded or not.
      
      This patch simplifies the cleanup code a bit.
      Signed-off-by: NAlberto Garcia <berto@igalia.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      1bab38e7
    • A
      block: Remove children options from bs->{options,explicit_options} · 2f624b80
      Alberto Garcia 提交于
      When bdrv_open_inherit() opens a BlockDriverState the options QDict
      can contain options for some of its children, passed in the form of
      child-name.option=value
      
      So while each child is opened with that subset of options, those same
      options remain stored in the parent BDS, leaving (at least) two copies
      of each one of them ("child-name.option=value" in the parent and
      "option=value" in the child).
      
      Having the children options stored in the parent is unnecessary and it
      can easily lead to an inconsistent state:
      
        $ qemu-img create -f qcow2 hd0.qcow2 10M
        $ qemu-img create -f qcow2 -b hd0.qcow2 hd1.qcow2
        $ qemu-img create -f qcow2 -b hd1.qcow2 hd2.qcow2
      
        $ $QEMU -drive file=hd2.qcow2,node-name=hd2,backing.node-name=hd1
      
      This opens a chain of images hd0 <- hd1 <- hd2. Now let's remove hd1
      using block_stream:
      
        (qemu) block_stream hd2 0 hd0.qcow2
      
      After this hd2 contains backing.node-name=hd1, which is no longer
      correct because hd1 doesn't exist anymore.
      
      This patch removes all children options from the parent dictionaries
      at the end of bdrv_open_inherit() and bdrv_reopen_queue_child().
      Signed-off-by: NAlberto Garcia <berto@igalia.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      2f624b80
    • V
  5. 23 7月, 2018 1 次提交
  6. 10 7月, 2018 1 次提交
    • K
      block: Poll after drain on attaching a node · 4be6a6d1
      Kevin Wolf 提交于
      Commit dcf94a23 ('block: Don't poll in parent drain callbacks')
      removed polling in bdrv_child_cb_drained_begin() on the grounds that the
      original bdrv_drain() already will poll and BdrvChildRole.drained_begin
      calls must not cause graph changes (and therefore must not call
      aio_poll() or the recursion through the graph will break.
      
      This reasoning is correct for calls through bdrv_do_drained_begin().
      However, BdrvChildRole.drained_begin is also called when a node that is
      already in a drained section (i.e. bdrv_do_drained_begin() has already
      returned and therefore can't poll any more) is attached to a new parent.
      In this case, we must explicitly poll to have all requests completed
      before the drained new child can be attached to the parent.
      
      In bdrv_replace_child_noperm(), we know that we're not inside the
      recursion of bdrv_do_drained_begin() because graph changes are not
      allowed there, and bdrv_replace_child_noperm() is a graph change. The
      call of BdrvChildRole.drained_begin() must therefore be followed by a
      BDRV_POLL_WHILE() that waits for the completion of requests.
      Reported-by: NMax Reitz <mreitz@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      4be6a6d1
  7. 05 7月, 2018 2 次提交
  8. 29 6月, 2018 2 次提交
    • K
      block: Move bdrv_truncate() implementation to io.c · 3d9f2d2a
      Kevin Wolf 提交于
      This moves the bdrv_truncate() implementation from block.c to block/io.c
      so it can have access to the tracked requests infrastructure.
      
      This involves making refresh_total_sectors() public (in block_int.h).
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>
      3d9f2d2a
    • K
      block: Convert .bdrv_truncate callback to coroutine_fn · 061ca8a3
      Kevin Wolf 提交于
      bdrv_truncate() is an operation that can block (even for a quite long
      time, depending on the PreallocMode) in I/O paths that shouldn't block.
      Convert it to a coroutine_fn so that we have the infrastructure for
      drivers to make their .bdrv_co_truncate implementation asynchronous.
      
      This change could potentially introduce new race conditions because
      bdrv_truncate() isn't necessarily executed atomically any more. Whether
      this is a problem needs to be evaluated for each block driver that
      supports truncate:
      
      * file-posix/win32, gluster, iscsi, nfs, rbd, ssh, sheepdog: The
        protocol drivers are trivially safe because they don't actually yield
        yet, so there is no change in behaviour.
      
      * copy-on-read, crypto, raw-format: Essentially just filter drivers that
        pass the request to a child node, no problem.
      
      * qcow2: The implementation modifies metadata, so it needs to hold
        s->lock to be safe with concurrent I/O requests. In order to avoid
        double locking, this requires pulling the locking out into
        preallocate_co() and using qcow2_write_caches() instead of
        bdrv_flush().
      
      * qed: Does a single header update, this is fine without locking.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>
      061ca8a3
  9. 18 6月, 2018 6 次提交
    • M
      block: Generalize should_update_child() rule · ec9f10fe
      Max Reitz 提交于
      Currently, bdrv_replace_node() refuses to create loops from one BDS to
      itself if the BDS to be replaced is the backing node of the BDS to
      replace it: Say there is a node A and a node B.  Replacing B by A means
      making all references to B point to A.  If B is a child of A (i.e. A has
      a reference to B), that would mean we would have to make this reference
      point to A itself -- so we'd create a loop.
      
      bdrv_replace_node() (through should_update_child()) refuses to do so if
      B is the backing node of A.  There is no reason why we should create
      loops if B is not the backing node of A, though.  The BDS graph should
      never contain loops, so we should always refuse to create them.
      
      If B is a child of A and B is to be replaced by A, we should simply
      leave B in place there because it is the most sensible choice.
      
      A more specific argument would be: Putting filter drivers into the BDS
      graph is basically the same as appending an overlay to a backing chain.
      But the main child BDS of a filter driver is not "backing" but "file",
      so restricting the no-loop rule to backing nodes would fail here.
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      Reviewed-by: NFam Zheng <famz@redhat.com>
      Reviewed-by: NAlberto Garcia <berto@igalia.com>
      Message-id: 20180613181823.13618-7-mreitz@redhat.com
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      ec9f10fe
    • K
      block: Allow graph changes in bdrv_drain_all_begin/end sections · 0f12264e
      Kevin Wolf 提交于
      bdrv_drain_all_*() used bdrv_next() to iterate over all root nodes and
      did a subtree drain for each of them. This works fine as long as the
      graph is static, but sadly, reality looks different.
      
      If the graph changes so that root nodes are added or removed, we would
      have to compensate for this. bdrv_next() returns each root node only
      once even if it's the root node for multiple BlockBackends or for a
      monitor-owned block driver tree, which would only complicate things.
      
      The much easier and more obviously correct way is to fundamentally
      change the way the functions work: Iterate over all BlockDriverStates,
      no matter who owns them, and drain them individually. Compensation is
      only necessary when a new BDS is created inside a drain_all section.
      Removal of a BDS doesn't require any action because it's gone afterwards
      anyway.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      0f12264e
    • K
      block: ignore_bds_parents parameter for drain functions · 6cd5c9d7
      Kevin Wolf 提交于
      In the future, bdrv_drained_all_begin/end() will drain all invidiual
      nodes separately rather than whole subtrees. This means that we don't
      want to propagate the drain to all parents any more: If the parent is a
      BDS, it will already be drained separately. Recursing to all parents is
      unnecessary work and would make it an O(n²) operation.
      
      Prepare the drain function for the changed drain_all by adding an
      ignore_bds_parents parameter to the internal implementation that
      prevents the propagation of the drain to BDS parents. We still (have to)
      propagate it to non-BDS parents like BlockBackends or Jobs because those
      are not drained separately.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      6cd5c9d7
    • K
      block: Don't poll in parent drain callbacks · dcf94a23
      Kevin Wolf 提交于
      bdrv_do_drained_begin() is only safe if we have a single
      BDRV_POLL_WHILE() after quiescing all affected nodes. We cannot allow
      that parent callbacks introduce a nested polling loop that could cause
      graph changes while we're traversing the graph.
      
      Split off bdrv_do_drained_begin_quiesce(), which only quiesces a single
      node without waiting for its requests to complete. These requests will
      be waited for in the BDRV_POLL_WHILE() call down the call chain.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      dcf94a23
    • K
      block: Drain recursively with a single BDRV_POLL_WHILE() · fe4f0614
      Kevin Wolf 提交于
      Anything can happen inside BDRV_POLL_WHILE(), including graph
      changes that may interfere with its callers (e.g. child list iteration
      in recursive callers of bdrv_do_drained_begin).
      
      Switch to a single BDRV_POLL_WHILE() call for the whole subtree at the
      end of bdrv_do_drained_begin() to avoid such effects. The recursion
      happens now inside the loop condition. As the graph can only change
      between bdrv_drain_poll() calls, but not inside of it, doing the
      recursion here is safe.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      fe4f0614
    • K
      block: Really pause block jobs on drain · 89bd0305
      Kevin Wolf 提交于
      We already requested that block jobs be paused in .bdrv_drained_begin,
      but no guarantee was made that the job was actually inactive at the
      point where bdrv_drained_begin() returned.
      
      This introduces a new callback BdrvChildRole.bdrv_drained_poll() and
      uses it to make bdrv_drain_poll() consider block jobs using the node to
      be drained.
      
      For the test case to work as expected, we have to switch from
      block_job_sleep_ns() to qemu_co_sleep_ns() so that the test job is even
      considered active and must be waited for when draining the node.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      89bd0305
  10. 15 6月, 2018 1 次提交
    • M
      block: Add block-specific QDict header · 609f45ea
      Max Reitz 提交于
      There are numerous QDict functions that have been introduced for and are
      used only by the block layer.  Move their declarations into an own
      header file to reflect that.
      
      While qdict_extract_subqdict() is in fact used outside of the block
      layer (in util/qemu-config.c), it is still a function related very
      closely to how the block layer works with nested QDicts, namely by
      sometimes flattening them.  Therefore, its declaration is put into this
      header as well and util/qemu-config.c includes it with a comment stating
      exactly which function it needs.
      Suggested-by: NMarkus Armbruster <armbru@redhat.com>
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      Message-Id: <20180509165530.29561-7-mreitz@redhat.com>
      [Copyright note tweaked, superfluous includes dropped]
      Signed-off-by: NMarkus Armbruster <armbru@redhat.com>
      Reviewed-by: NKevin Wolf <kwolf@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      609f45ea
  11. 11 6月, 2018 2 次提交
  12. 23 5月, 2018 2 次提交
  13. 04 5月, 2018 2 次提交
  14. 20 3月, 2018 3 次提交
  15. 19 3月, 2018 2 次提交
  16. 09 3月, 2018 5 次提交
  17. 03 3月, 2018 2 次提交