1. 24 Feb 2017, 5 commits
  2. 12 Feb 2017, 2 commits
    • block: bdrv_invalidate_cache: invalidate children first · 16e977d5
      Vladimir Sementsov-Ogievskiy authored
      The current implementation invalidates the parent bds first and then
      its children. This leads to the following bug:
      
      after incoming migration, in bdrv_invalidate_cache_all:
      1. the parent bds is invalidated - it is reopened with BDRV_O_INACTIVE cleared
      2. the child is not yet invalidated
      3. the parent sees that its BDRV_O_INACTIVE is cleared
      4. the parent writes to the child
      5. assertion failure in bdrv_co_pwritev, because BDRV_O_INACTIVE is still set for the child
      
      This patch fixes it by simply changing the invalidation sequence:
      invalidate the children first.
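      
      The fixed ordering can be pictured with a minimal, self-contained model
      (illustrative names and types only, not the actual QEMU code; the bug is
      reduced to a single parent/child pair):
      
      /* Toy model: clear the child's "inactive" flag before the parent's,
       * so the parent can safely write to the child right after its own
       * invalidation. */
      #include <assert.h>
      #include <stdbool.h>
      #include <stdio.h>
      
      typedef struct Node {
          bool inactive;              /* stands in for BDRV_O_INACTIVE */
          struct Node *child;
      } Node;
      
      static void write_to(Node *n)
      {
          assert(!n->inactive);       /* models the assertion in bdrv_co_pwritev */
          printf("write ok\n");
      }
      
      static void invalidate(Node *n)
      {
          if (!n) {
              return;
          }
          invalidate(n->child);       /* fixed order: children first */
          n->inactive = false;
          if (n->child) {
              write_to(n->child);     /* safe: the child is already active */
          }
      }
      
      int main(void)
      {
          Node child  = { .inactive = true, .child = NULL };
          Node parent = { .inactive = true, .child = &child };
          invalidate(&parent);
          return 0;
      }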
      Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
      Message-id: 20170131112308.54189-1-vsementsov@virtuozzo.com
      Reviewed-by: Max Reitz <mreitz@redhat.com>
      Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: Max Reitz <mreitz@redhat.com>
      16e977d5
    • block: check full backing filename when searching protocol filenames · 418661e0
      Jeff Cody authored
      In bdrv_find_backing_image(), if we are searching an image for a backing
      file that contains a protocol, we currently only compare unmodified
      paths.
      
      However, some management software will change the backing filename to be
      a relative filename in a path.  QEMU is able to handle this fine,
      because internally it will use path_combine to put together the full
      protocol URI.
      
      However, this can prevent a QAPI command that needs to use
      bdrv_find_backing_image() from matching the image when it is searched
      for by its full URI.
      
      When searching for a protocol filename, if the straight comparison
      fails, this patch will also compare against the full backing filename to
      see if that is a match.
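      
      The matching idea can be sketched with a small self-contained program
      (the filenames and the combine helper are illustrative only; QEMU itself
      uses path_combine() and the real BDS fields):
      
      #include <stdio.h>
      #include <string.h>
      
      /* Crude stand-in for path_combine(): replace the last path component
       * of 'base' with the relative name 'rel'. */
      static void combine(char *dst, size_t len, const char *base, const char *rel)
      {
          const char *p = strrchr(base, '/');
          size_t dir_len = p ? (size_t)(p - base) + 1 : 0;
          snprintf(dst, len, "%.*s%s", (int)dir_len, base, rel);
      }
      
      int main(void)
      {
          /* hypothetical example values */
          const char *image_filename = "nbd://server/exports/overlay.qcow2";
          const char *backing_file   = "base.qcow2";  /* made relative by mgmt software */
          const char *searched       = "nbd://server/exports/base.qcow2";
      
          char full[256];
          combine(full, sizeof(full), image_filename, backing_file);
      
          /* straight comparison first, then the combined full backing filename */
          if (strcmp(backing_file, searched) == 0 || strcmp(full, searched) == 0) {
              printf("match: %s\n", full);
          }
          return 0;
      }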
      Signed-off-by: Jeff Cody <jcody@redhat.com>
      Message-id: c2d025adca8a2b665189e6f4cf080f44126d0b6b.1485392617.git.jcody@redhat.com
      Reviewed-by: Max Reitz <mreitz@redhat.com>
      Signed-off-by: Max Reitz <mreitz@redhat.com>
      418661e0
  3. 01 Feb 2017, 1 commit
  4. 25 Jan 2017, 1 commit
  5. 11 Nov 2016, 2 commits
  6. 31 Oct 2016, 2 commits
    • block: Support streaming to an intermediate layer · 61b49e48
      Alberto Garcia authored
      This makes sure that the image we are streaming into is open in
      read-write mode during the operation.
      
      Operation blockers are also set in all intermediate nodes, since they
      will be removed from the chain afterwards.
      
      Finally, this also unblocks the stream operation in backing files.
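      
      The blocker handling can be pictured with a minimal, self-contained
      model (illustrative types and names, not the actual stream job code):
      
      #include <stdbool.h>
      #include <stdio.h>
      
      typedef struct Node {
          const char *name;
          bool blocked;               /* stands in for an operation blocker */
          struct Node *backing;
      } Node;
      
      /* Block operations on every node strictly between 'target' and 'base':
       * these are the nodes that will be dropped from the chain once the
       * stream job completes. */
      static void block_intermediate(Node *target, Node *base)
      {
          for (Node *n = target->backing; n && n != base; n = n->backing) {
              n->blocked = true;
              printf("blocked %s\n", n->name);
          }
      }
      
      int main(void)
      {
          Node base   = { "base",   false, NULL };
          Node mid    = { "mid",    false, &base };
          Node target = { "target", false, &mid };
      
          block_intermediate(&target, &base);
          return 0;
      }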
      Signed-off-by: Alberto Garcia <berto@igalia.com>
      Reviewed-by: Kevin Wolf <kwolf@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
      61b49e48
    • block: Pause all jobs during bdrv_reopen_multiple() · 40840e41
      Alberto Garcia authored
      When a BlockDriverState is about to be reopened it can trigger certain
      operations that need to write to disk. During this process a different
      block job can be woken up. If that block job completes and also needs
      to call bdrv_reopen() it can happen that it needs to do it on the same
      BlockDriverState that is still in the process of being reopened.
      
      This can have fatal consequences, like in this example:
      
        1) Block job A starts and sleeps after a while.
        2) Block job B starts and tries to reopen node1 (a qcow2 file).
        3) Reopening node1 means flushing and replacing its qcow2 cache.
        4) While the qcow2 cache is being flushed, job A wakes up.
        5) Job A completes and reopens node1, replacing its cache.
        6) Job B resumes, but the cache that was being flushed no longer
           exists.
      
      This patch splits the bdrv_drain_all() call to keep all block jobs
      paused during bdrv_reopen_multiple(), so that step 4 can never happen
      and the operation is safe.
      
      Note that this scenario can only happen if both bdrv_reopen() calls
      are made by block jobs on the same backing chain. Otherwise there's no
      chance that the same BlockDriverState appears in both reopen queues.
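      
      A minimal, self-contained sketch of the pause/resume bracketing
      (illustrative only; the real code pauses block jobs via the drain
      infrastructure around bdrv_reopen_multiple()):
      
      #include <stdbool.h>
      #include <stdio.h>
      
      typedef struct Job {
          const char *id;
          bool paused;
      } Job;
      
      static void pause_all(Job *jobs, int n)
      {
          for (int i = 0; i < n; i++) {
              jobs[i].paused = true;
          }
      }
      
      static void resume_all(Job *jobs, int n)
      {
          for (int i = 0; i < n; i++) {
              jobs[i].paused = false;
          }
      }
      
      static void reopen_multiple(Job *jobs, int n)
      {
          pause_all(jobs, n);     /* no job can wake up while we reopen */
          printf("reopening nodes...\n");
          resume_all(jobs, n);    /* only resumed once the reopen is done */
      }
      
      int main(void)
      {
          Job jobs[] = { { "job-A", false }, { "job-B", false } };
          reopen_multiple(jobs, 2);
          return 0;
      }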
      Signed-off-by: Alberto Garcia <berto@igalia.com>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: Kevin Wolf <kwolf@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
      40840e41
  7. 28 Oct 2016, 2 commits
  8. 07 Oct 2016, 2 commits
    • block: Add qdev ID to DEVICE_TRAY_MOVED · 2d76e724
      Kevin Wolf authored
      The event currently only contains the BlockBackend name. However, with
      anonymous BlockBackends, this is always the empty string. Add the qdev
      ID (or if none was given, the QOM path) so that the user can still see
      which device caused the event.
      
      Event generation has to be moved from bdrv_eject() to the BlockBackend
      because the BDS doesn't know the attached device, but that's easy
      because blk_eject() is the only user of it.
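      
      The naming fallback can be shown with a tiny self-contained sketch
      (the values and the helper are hypothetical, not the actual event code):
      
      #include <stdio.h>
      
      /* Prefer the qdev ID; fall back to the QOM path when no ID was given,
       * so anonymous BlockBackends still yield a usable device name. */
      static const char *device_name(const char *qdev_id, const char *qom_path)
      {
          return (qdev_id && qdev_id[0]) ? qdev_id : qom_path;
      }
      
      int main(void)
      {
          printf("DEVICE_TRAY_MOVED device=%s\n",
                 device_name("", "/machine/peripheral-anon/device[0]"));
          return 0;
      }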
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
      Reviewed-by: Max Reitz <mreitz@redhat.com>
      2d76e724
    • block: Add bdrv_runtime_opts to query-command-line-options · c5f3014b
      Kevin Wolf authored
      Recently we moved a few options from QemuOptsLists in blockdev.c to
      bdrv_runtime_opts in block.c in order to make them accessible using
      blockdev-add. However, this has the side effect that these options are
      now missing from query-command-line-options, and libvirt consequently
      disables the corresponding feature.
      
      This problem was reported as a regression for the 'discard' option,
      introduced in commit 818584a4. However, it is more general than that.
      
      Fix it by adding bdrv_runtime_opts to the list of QemuOptsLists that are
      returned in query-command-line-options. For the future, libvirt is
      advised to use QMP schema introspection for block device options.
      Reported-by: Michal Privoznik <mprivozn@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
      Tested-by: Michal Privoznik <mprivozn@redhat.com>
      Tested-by: Gerd Hoffmann <kraxel@redhat.com>
      c5f3014b
  9. 29 Sep 2016, 2 commits
  10. 23 Sep 2016, 5 commits
  11. 21 Sep 2016, 1 commit
    • blockdev: Add dynamic module loading for block drivers · 88d88798
      Marc Mari authored
      Extend the current module interface to allow for block drivers to be
      loaded dynamically on request. The only block drivers that can be
      converted into modules are the drivers that don't perform any init
      operation except for registering themselves.
      
      In addition, only the protocol drivers are being modularized, as they
      are the only ones which see significant performance benefits. The format
      drivers do not generally link to external libraries, so modularizing
      them is of no benefit from a performance perspective.
      
      All the necessary module information is located in a new structure found
      in module_block.h
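      
      The lookup-table idea can be illustrated with a small self-contained
      model (entry names and the helper are made up; the real table is the
      generated module_block.h consumed by the block layer):
      
      #include <stdio.h>
      #include <string.h>
      
      typedef struct ModuleEntry {
          const char *protocol;       /* protocol a driver registers for */
          const char *library_name;   /* module that provides the driver */
      } ModuleEntry;
      
      static const ModuleEntry block_modules[] = {
          { "nbd",  "block-nbd"  },   /* hypothetical entries */
          { "curl", "block-curl" },
      };
      
      static const char *module_for_protocol(const char *proto)
      {
          /* signed comparison against the table length, so an empty table
           * does not break the build (see the note below) */
          for (int i = 0; i < (int)(sizeof(block_modules) / sizeof(block_modules[0])); i++) {
              if (!strcmp(block_modules[i].protocol, proto)) {
                  return block_modules[i].library_name;
              }
          }
          return NULL;
      }
      
      int main(void)
      {
          const char *mod = module_for_protocol("nbd");
          printf("would load module: %s\n", mod ? mod : "(none)");
          return 0;
      }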
      
      This spoils the purpose of 5505e8b7 (block/dmg: make it modular).
      
      Before this patch, if module build is enabled, block-dmg.so is linked
      against libbz2, whereas the main binary is not. Downstream, in theory,
      this means only the qemu-block-extra package depends on libbz2, while
      the main QEMU package need not. With this patch, we (temporarily)
      change that, so the main QEMU binary depends on libbz2 again.
      Signed-off-by: Marc Marí <markmb@redhat.com>
      Signed-off-by: Colin Lord <clord@redhat.com>
      Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
      Message-id: 1471008424-16465-4-git-send-email-clord@redhat.com
      Reviewed-by: Max Reitz <mreitz@redhat.com>
      [mreitz: Do a signed comparison against the length of
       block_driver_modules[], so it will not cause a compile error when
       empty]
      Signed-off-by: Max Reitz <mreitz@redhat.com>
      88d88798
  12. 13 Sep 2016, 1 commit
  13. 06 Sep 2016, 1 commit
    • nbd-server: Use a separate BlockBackend · cd7fca95
      Kevin Wolf authored
      The builtin NBD server uses its own BlockBackend now instead of reusing
      the monitor/guest device one.
      
      This means that it has its own writethrough setting now. The builtin
      NBD server always uses writeback caching now regardless of whether the
      guest device has WCE enabled. qemu-nbd respects the cache mode given on
      the command line.
      
      We still need to keep a reference to the monitor BB because we put an
      eject notifier on it, but we don't use it for any I/O.
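      
      Conceptually (a toy model with made-up types, not the actual nbd-server
      code), the split looks like this:
      
      #include <stdbool.h>
      #include <stdio.h>
      
      typedef struct Backend {
          const char *owner;
          bool writethrough;
      } Backend;
      
      int main(void)
      {
          Backend guest_blk = { "guest-device", true  };  /* guest's WCE setting */
          Backend nbd_blk   = { "nbd-server",   false };  /* export: writeback */
      
          (void)guest_blk;   /* kept only so an eject notifier can be attached */
          printf("nbd export writethrough=%d\n", nbd_blk.writethrough);
          return 0;
      }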
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Reviewed-by: Max Reitz <mreitz@redhat.com>
      cd7fca95
  14. 20 Jul 2016, 1 commit
  15. 19 Jul 2016, 1 commit
    • block: ignore flush requests when storage is clean · 3ff2f67a
      Evgeny Yakovlev authored
      Some guests (win2008 server, for example) do a lot of unnecessary
      flushing when the underlying media has not changed. This adds extra
      overhead on the host due to the resulting fsync/fdatasync calls.
      
      This change introduces a write generation scheme in BlockDriverState.
      The current write generation is checked against the last flushed
      generation to avoid unnecessary flushes.
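      
      The generation scheme can be pictured with a minimal, self-contained
      model (field and function names are illustrative only):
      
      #include <stdio.h>
      
      typedef struct DiskState {
          unsigned write_gen;     /* bumped on every completed write */
          unsigned flushed_gen;   /* generation recorded at the last flush */
      } DiskState;
      
      static void do_write(DiskState *s)
      {
          s->write_gen++;
      }
      
      static void do_flush(DiskState *s)
      {
          if (s->flushed_gen == s->write_gen) {
              printf("flush skipped: storage is clean\n");
              return;
          }
          printf("fsync issued\n");
          s->flushed_gen = s->write_gen;
      }
      
      int main(void)
      {
          DiskState s = { 0, 0 };
          do_write(&s);
          do_flush(&s);   /* fsync issued */
          do_flush(&s);   /* skipped: nothing written since the last flush */
          return 0;
      }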
      
      The problem with excessive flushing was found by a performance test
      which does parallel directory tree creation (from 2 processes).
      Results improved from 0.424 loops/sec to 0.432 loops/sec.
      Each loop creates 10^3 directories with 10 files in each.
      
      This affected some blkdebug testcases that expected error logs from
      failure-injected flushes, which are now skipped entirely
      (tests 026, 071, 089).
      
      This also affects the performance of block jobs: BLOCK_JOB_READY
      events for drive-mirror and active block-commit commands now arrive
      faster, before the QMP send returns successfully to the caller
      (tests 141, 144).
      Signed-off-by: Evgeny Yakovlev <eyakovlev@virtuozzo.com>
      Signed-off-by: Denis V. Lunev <den@openvz.org>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Message-id: 1468870792-7411-5-git-send-email-den@openvz.org
      CC: Kevin Wolf <kwolf@redhat.com>
      CC: Max Reitz <mreitz@redhat.com>
      CC: Stefan Hajnoczi <stefanha@redhat.com>
      CC: Fam Zheng <famz@redhat.com>
      CC: John Snow <jsnow@redhat.com>
      Signed-off-by: John Snow <jsnow@redhat.com>
      3ff2f67a
  16. 13 Jul 2016, 1 commit
    • coroutine: move entry argument to qemu_coroutine_create · 0b8b8753
      Paolo Bonzini authored
      In practice the entry argument is always known at creation time, and
      it is confusing that sometimes qemu_coroutine_enter is used with a
      non-NULL argument to re-enter a coroutine (this happens in
      block/sheepdog.c and tests/test-coroutine.c).  So pass the opaque value
      at creation time, for consistency with e.g. aio_bh_new.
      
      Mostly done with the following semantic patch:
      
      @ entry1 @
      expression entry, arg, co;
      @@
      - co = qemu_coroutine_create(entry);
      + co = qemu_coroutine_create(entry, arg);
        ...
      - qemu_coroutine_enter(co, arg);
      + qemu_coroutine_enter(co);
      
      @ entry2 @
      expression entry, arg;
      identifier co;
      @@
      - Coroutine *co = qemu_coroutine_create(entry);
      + Coroutine *co = qemu_coroutine_create(entry, arg);
        ...
      - qemu_coroutine_enter(co, arg);
      + qemu_coroutine_enter(co);
      
      @ entry3 @
      expression entry, arg;
      @@
      - qemu_coroutine_enter(qemu_coroutine_create(entry), arg);
      + qemu_coroutine_enter(qemu_coroutine_create(entry, arg));
      
      @ reentry @
      expression co;
      @@
      - qemu_coroutine_enter(co, NULL);
      + qemu_coroutine_enter(co);
      
      except for the aforementioned few places where the semantic patch
      stumbled (as expected) and for test_co_queue, which would otherwise
      produce an uninitialized variable warning.
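      
      As a concrete illustration of the shape of the change (a toy model of
      the coroutine API, not QEMU's implementation):
      
      #include <stdio.h>
      
      typedef void CoroutineEntry(void *opaque);
      
      typedef struct Coroutine {
          CoroutineEntry *entry;
          void *opaque;               /* now stored at creation time */
      } Coroutine;
      
      static Coroutine coroutine_create(CoroutineEntry *entry, void *opaque)
      {
          return (Coroutine){ entry, opaque };
      }
      
      static void coroutine_enter(Coroutine *co)
      {
          co->entry(co->opaque);      /* no per-enter argument anymore */
      }
      
      static void my_entry(void *opaque)
      {
          printf("entered with %s\n", (const char *)opaque);
      }
      
      int main(void)
      {
          Coroutine co = coroutine_create(my_entry, (void *)"arg");
          coroutine_enter(&co);
          return 0;
      }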
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: Fam Zheng <famz@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
      0b8b8753
  17. 05 Jul 2016, 5 commits
  18. 20 Jun 2016, 2 commits
  19. 16 Jun 2016, 3 commits
    • block/mirror: Fix target backing BDS · 274fccee
      Max Reitz authored
      Currently, we are trying to move the backing BDS from the source to the
      target in bdrv_replace_in_backing_chain() which is called from
      mirror_exit(). However, mirror_complete() already tries to open the
      target's backing chain with a call to bdrv_open_backing_file().
      
      First, we should only set the target's backing BDS once. Second, the
      mirroring block job has a better idea of what to set it to than the
      generic code in bdrv_replace_in_backing_chain() (in fact, the latter's
      conditions on when to move the backing BDS from source to target are not
      really correct).
      
      Therefore, remove that code from bdrv_replace_in_backing_chain() and
      leave it to mirror_complete().
      
      Depending on what kind of mirroring is performed, we furthermore want to
      use different strategies to open the target's backing chain:
      
      - If blockdev-mirror is used, we can assume the user made sure that the
        target already has the correct backing chain. In particular, we should
        not try to open a backing file if the target does not have any yet.
      
      - If drive-mirror with mode=absolute-paths is used, we can and should
        reuse the already existing chain of nodes that the source BDS is in.
        In case of sync=full, no backing BDS is required; with sync=top, we
        just link the source's backing BDS to the target, and with sync=none,
        we use the source BDS as the target's backing BDS.
        We should not try to open these backing files anew because this would
        lead to two BDSs existing per physical file in the backing chain, and
        we would like to avoid such concurrent access.
      
      - If drive-mirror with mode=existing is used, we have to use the
        information provided in the physical image file which means opening
        the target's backing chain completely anew, just as it has been done
        already.
        If the target's backing chain shares images with the source, this may
        lead to multiple BDSs per physical image file. But since we cannot
        reliably ascertain this case, there is nothing we can do about it.
      Signed-off-by: Max Reitz <mreitz@redhat.com>
      Message-id: 20160610185750.30956-3-mreitz@redhat.com
      Reviewed-by: Kevin Wolf <kwolf@redhat.com>
      Reviewed-by: Fam Zheng <famz@redhat.com>
      Signed-off-by: Max Reitz <mreitz@redhat.com>
      274fccee
    • block: Allow replacement of a BDS by its overlay · 9bd910e2
      Max Reitz authored
      change_parent_backing_link() asserts that the BDS to be replaced is not
      used as a backing file. However, we may want to replace a BDS by its
      overlay, in which case that very link should not be redirected.
      
      For instance, when doing a sync=none drive-mirror operation, we may have
      the following BDS/BB forest before block job completion:
      
        target
      
        base <- source <- BlockBackend
      
      During job completion, we want to establish the source BDS as the
      target's backing node:
      
                target
                  |
                  v
        base <- source <- BlockBackend
      
      This makes the target a valid replacement for the source:
      
                target <- BlockBackend
                  |
                  v
        base <- source
      
      Without this modification to change_parent_backing_link() we have to
      inject the target into the graph before the source is its backing node,
      thus temporarily creating a wrong graph:
      
        target <- BlockBackend
      
        base <- source
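      
      The required behaviour can be sketched with a small self-contained
      model (illustrative names only, not the actual
      change_parent_backing_link() code): every parent of the source is
      redirected to the target, except the target itself, whose backing link
      keeps pointing at the source.
      
      #include <stdio.h>
      
      typedef struct Node {
          const char *name;
          struct Node *child;     /* the node this parent currently uses */
      } Node;
      
      static void replace_child(Node **parents, int n, Node *from, Node *to)
      {
          for (int i = 0; i < n; i++) {
              if (parents[i] == to) {
                  continue;       /* keep target->backing == source */
              }
              if (parents[i]->child == from) {
                  parents[i]->child = to;
                  printf("%s now uses %s\n", parents[i]->name, to->name);
              }
          }
      }
      
      int main(void)
      {
          Node source = { "source", NULL };
          Node target = { "target", &source };        /* backing link */
          Node blk    = { "BlockBackend", &source };
          Node *parents[] = { &target, &blk };
      
          replace_child(parents, 2, &source, &target);
          return 0;
      }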
      Signed-off-by: Max Reitz <mreitz@redhat.com>
      Message-id: 20160610185750.30956-2-mreitz@redhat.com
      Reviewed-by: Kevin Wolf <kwolf@redhat.com>
      Reviewed-by: Fam Zheng <famz@redhat.com>
      Signed-off-by: Max Reitz <mreitz@redhat.com>
      9bd910e2
    • block: Fix snapshot=on with aio=native · 41869044
      Kevin Wolf authored
      snapshot=on creates a temporary overlay that is always opened with
      cache=unsafe (the cache mode specified by the user is only for the
      actual image file and its children). This means that we must not inherit
      the BDRV_O_NATIVE_AIO flag for the temporary overlay because trying to
      use Linux AIO with cache=unsafe results in an error.
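      
      The flag handling amounts to something like the following toy model
      (the flag values and helper are illustrative stand-ins for the real
      BDRV_O_* inheritance logic):
      
      #include <stdio.h>
      
      #define O_NATIVE_AIO  0x1   /* stand-in for BDRV_O_NATIVE_AIO */
      #define O_NO_FLUSH    0x2   /* part of what cache=unsafe implies */
      
      static int overlay_flags(int parent_flags)
      {
          int flags = parent_flags;
          flags &= ~O_NATIVE_AIO;  /* would conflict with cache=unsafe */
          flags |= O_NO_FLUSH;     /* temporary overlay is always unsafe */
          return flags;
      }
      
      int main(void)
      {
          printf("overlay flags: %#x\n", overlay_flags(O_NATIVE_AIO));
          return 0;
      }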
      
      Reproducer without this patch:
      
      $ x86_64-softmmu/qemu-system-x86_64 -drive file=/tmp/test.qcow2,cache=none,aio=native,snapshot=on
      qemu-system-x86_64: -drive file=/tmp/test.qcow2,cache=none,aio=native,snapshot=on: aio=native was
      specified, but it requires cache.direct=on, which was not specified.
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
      41869044