1. 22 April 2016 (1 commit)
• mirror: Workaround for unexpected iohandler events during completion · ab27c3b5
  Authored by Fam Zheng
      Commit 5a7e7a0b moved mirror_exit to a BH handler but didn't add any
      protection against new requests that could sneak in just before the
      BH is dispatched. For example (assuming a code base at that commit):
      
              main_loop_wait # 1
                os_host_main_loop_wait
                  g_main_context_dispatch
                    aio_ctx_dispatch
                      aio_dispatch
                        ...
                          mirror_run
                            bdrv_drain
          (a)               block_job_defer_to_main_loop
                qemu_iohandler_poll
                  virtio_queue_host_notifier_read
                    ...
                      virtio_submit_multiwrite
          (b)           blk_aio_multiwrite
      
              main_loop_wait # 2
                <snip>
                      aio_dispatch
                        aio_bh_poll
          (c)             mirror_exit
      
At (a) we know the BDS has no pending requests. However, the same
main_loop_wait call is going to dispatch iohandlers (EventNotifier
events), which may lead to new I/O from the guest. So the invariant is
already broken at (c). Data loss.
      
Commit f3926945 made iohandlers use the AIO API.  The order of
virtio_queue_host_notifier_read and block_job_defer_to_main_loop within
one main_loop_wait call became unpredictable; even worse, if the host
notifier event arrives at the next main_loop_wait call, the order of
mirror_exit and virtio_queue_host_notifier_read is just as
unpredictable. As shown below, that commit made the bug easier to
trigger:
      
          - Bug case 1:
      
              main_loop_wait # 1
                os_host_main_loop_wait
                  g_main_context_dispatch
                    aio_ctx_dispatch (qemu_aio_context)
                      ...
                        mirror_run
                          bdrv_drain
          (a)             block_job_defer_to_main_loop
                    aio_ctx_dispatch (iohandler_ctx)
                      virtio_queue_host_notifier_read
                        ...
                          virtio_submit_multiwrite
          (b)               blk_aio_multiwrite
      
              main_loop_wait # 2
                ...
                      aio_dispatch
                        aio_bh_poll
          (c)             mirror_exit
      
          - Bug case 2:
      
              main_loop_wait # 1
                os_host_main_loop_wait
                  g_main_context_dispatch
                    aio_ctx_dispatch (qemu_aio_context)
                      ...
                        mirror_run
                          bdrv_drain
          (a)             block_job_defer_to_main_loop
      
              main_loop_wait # 2
                ...
                  aio_ctx_dispatch (iohandler_ctx)
                    virtio_queue_host_notifier_read
                      ...
                        virtio_submit_multiwrite
          (b)             blk_aio_multiwrite
                    aio_dispatch
                      aio_bh_poll
          (c)           mirror_exit
      
In both cases, (b) breaks the invariant that (a) establishes and (c) relies on.
      
Until then, the request loss had been silent. Later, commit 3f09bfbc
added assertions at (c) to check the invariant (in
bdrv_replace_in_backing_chain), and Max reported an assertion failure
that first became visible there, triggered by active committing while
the guest was running bonnie++.
      
QEMU 2.5 added bdrv_drained_begin at (a) to protect the dataplane case
from similar problems, but the main loop bug went unnoticed until now.
      
As a band-aid, this patch temporarily disables the iohandler context's
external events, together with those of the BDS's own AioContext (bs->ctx).
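A minimal sketch of the idea, assuming the APIs of this era
(bdrv_drained_begin, aio_disable_external/aio_enable_external and
iohandler_get_aio_context all exist at this point; the exact placement
in block/mirror.c is illustrative):

    /* In mirror_run(), at (a): in addition to the existing
     * bdrv_drained_begin(bs), stop external events on the main-loop
     * iohandler context, so no guest request dispatched by an iohandler
     * can sneak in before the mirror_exit BH runs. */
    aio_disable_external(iohandler_get_aio_context());
    block_job_defer_to_main_loop(&s->common, mirror_exit, data);

    /* In mirror_exit(), once the device switch is done, re-enable them. */
    aio_enable_external(iohandler_get_aio_context());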
      
      Launchpad Bug: 1570134
      
      Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2. 20 April 2016 (3 commits)
3. 11 April 2016 (1 commit)
4. 30 March 2016 (1 commit)
5. 23 March 2016 (1 commit)
• include/qemu/osdep.h: Don't include qapi/error.h · da34e65c
  Authored by Markus Armbruster
      Commit 57cb38b3 included qapi/error.h into qemu/osdep.h to get the
      Error typedef.  Since then, we've moved to include qemu/osdep.h
      everywhere.  Its file comment explains: "To avoid getting into
      possible circular include dependencies, this file should not include
      any other QEMU headers, with the exceptions of config-host.h,
      compiler.h, os-posix.h and os-win32.h, all of which are doing a
      similar job to this file and are under similar constraints."
      qapi/error.h doesn't do a similar job, and it doesn't adhere to
      similar constraints: it includes qapi-types.h.  That's in excess of
      100KiB of crap most .c files don't actually need.
      
      Add the typedef to qemu/typedefs.h, and include that instead of
      qapi/error.h.  Include qapi/error.h in .c files that need it and don't
      get it now.  Include qapi-types.h in qom/object.h for uint16List.
      
      Update scripts/clean-includes accordingly.  Update it further to match
      reality: replace config.h by config-target.h, add sysemu/os-posix.h,
      sysemu/os-win32.h.  Update the list of includes in the qemu/osdep.h
      comment quoted above similarly.
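For illustration, the heart of the change is a one-line forward
declaration; this typedef is what gets added to qemu/typedefs.h (the
comment here is added for explanation):

    /* include/qemu/typedefs.h: a forward declaration is enough for every
     * header and .c file that only passes Error pointers around. */
    typedef struct Error Error;

Files that actually create or inspect Error objects include qapi/error.h
themselves.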
      
      This reduces the number of objects depending on qapi/error.h from "all
      of them" to less than a third.  Unfortunately, the number depending on
      qapi-types.h shrinks only a little.  More work is needed for that one.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
[Fix compilation without the spice devel packages. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
6. 01 March 2016 (2 commits)
7. 03 February 2016 (1 commit)
• block: Add "file" output parameter to block status query functions · 67a0fd2a
  Authored by Fam Zheng
The added parameter can be used to return the BDS pointer that the
valid offset refers to. Its value should be ignored unless
BDRV_BLOCK_OFFSET_VALID is set in the returned status.
      
      Until block drivers fill in the right value, let's clear it explicitly
      right before calling .bdrv_get_block_status.
      
      The "bs->file" condition in bdrv_co_get_block_status is kept now to keep iotest
      case 102 passing, and will be fixed once all drivers return the right file
      pointer.
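A sketch of how a caller uses the new out-parameter (the signature
matches this series; the surrounding usage is illustrative):

    BlockDriverState *file = NULL;
    int pnum;
    int64_t ret = bdrv_get_block_status(bs, sector_num, nb_sectors,
                                        &pnum, &file);
    if (ret >= 0 && (ret & BDRV_BLOCK_OFFSET_VALID)) {
        /* The data lives in 'file' at offset (ret & BDRV_BLOCK_OFFSET_MASK);
         * 'file' must be ignored when the flag is clear. */
    }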
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1453780743-16806-2-git-send-email-famz@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
8. 20 January 2016 (1 commit)
9. 22 December 2015 (1 commit)
10. 18 December 2015 (2 commits)
• block: Allow references for backing files · d9b7b057
  Authored by Kevin Wolf
      For bs->file, using references to existing BDSes has been possible for a
      while already. This patch enables the same for bs->backing.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
• mirror: Error out when a BDS would get two BBs · 40365552
  Authored by Kevin Wolf
bdrv_replace_in_backing_chain() asserts that the old and the new
BlockDriverState do not both have a BlockBackend attached to them,
because both would have to end up pointing to the new BDS and we don't
support more than one BB per BDS yet.
      
      Before we can safely allow references to existing nodes as backing
      files, we need to make sure that even if a backing file has a BB on it,
      this doesn't crash qemu.
      
      There are probably also some cases with the 'replaces' option set where
      drive-mirror could fail this assertion today. They are fixed with this
      error check as well.
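A hedged sketch of the kind of check this adds in mirror_start_job()
(the field accesses and the error message are illustrative, not a quote
of the patch):

    /* Refuse setups where both the node to be replaced and the target
     * already have a BlockBackend: bdrv_replace_in_backing_chain() cannot
     * point two BBs at a single BDS. */
    if (to_replace->blk && target->blk) {
        error_setg(errp, "Can't replace '%s': both it and the target are "
                   "attached to a BlockBackend",
                   bdrv_get_device_name(to_replace));
        return;
    }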
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
11. 02 December 2015 (1 commit)
• mirror: Quiesce source during "mirror_exit" · 176c3699
  Authored by Fam Zheng
With dataplane, ioeventfd events can be dispatched after mirror_run
releases the dirty bitmap but before mirror_exit actually does the
device switch, because the iothread is still running; this causes
silent data loss.
      
Fix this by adding a bdrv_drained_begin/end pair around that window, so
that no new external requests are handled.
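A minimal sketch of the shape of the fix (the function names are real
for this period; the placement is illustrative):

    /* mirror_run(): quiesce the source before deferring to the BH... */
    bdrv_drained_begin(bs);
    block_job_defer_to_main_loop(&s->common, mirror_exit, data);

    /* ...mirror_exit(): lift the quiescence after the device switch. */
    bdrv_drained_end(src);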
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
12. 12 November 2015 (1 commit)
13. 11 November 2015 (1 commit)
14. 24 October 2015 (1 commit)
15. 16 October 2015 (4 commits)
16. 12 October 2015 (1 commit)
17. 02 October 2015 (1 commit)
• block: mirror - fix full sync mode when target does not support zero init · 5279efeb
  Authored by Jeff Cody
During a mirror with sync="full", if the target device does not support
zero init, the operation may result in a corrupted image.
      
      This is due to how the initial dirty bitmap is set up prior to copying
      data - we did not mark sectors as dirty that are unallocated.  This
      means those unallocated sectors are skipped over on the target, and for
      a device without zero init, invalid data may reside in those holes.
      
      If both of the following conditions are true, then we will explicitly
      mark all sectors as dirty:
      
          1.) sync = "full"
          2.) bdrv_has_zero_init(target) == false
      
      If the target does support zero init, but a target image is passed in
      with data already present (i.e. an "existing" image), it is assumed the
      data present in the existing image is valid data for those sectors.
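A sketch of the resulting logic (bdrv_has_zero_init and
bdrv_set_dirty_bitmap are real APIs of this era; 'end' stands for the
job length in sectors and the placement is illustrative):

    /* sync=full (no base) onto a target without zero init: start from an
     * all-dirty bitmap, so unallocated source sectors are written out as
     * zeroes instead of being skipped. */
    if (base == NULL && !bdrv_has_zero_init(target_bs)) {
        bdrv_set_dirty_bitmap(s->dirty_bitmap, 0, end);
    }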
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 91ed4bc5bda7e2b09eb508b07c83f4071fe0b3c9.1443705220.git.jcody@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
18. 02 September 2015 (1 commit)
• block: more check for replaced node · e12f3784
  Authored by Wen Congyang
We use mirror+replace to fix a quorum's broken child: bs (s->common.bs)
is the quorum node, to_replace is the broken child, and the new child is
target_bs. Without this patch, the node to replace could be any node,
including a top-level BDS with a BlockBackend or another quorum's child.
This patch simply checks that the broken child is actually part of the
quorum BDS being mirrored.
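A hedged sketch of the added validation (bdrv_recurse_is_first_non_filter()
is the existing quorum-aware check of this era; the exact call site and
message are illustrative):

    /* Only accept a replacement target that sits under the BDS being
     * mirrored, e.g. a child of the quorum node. */
    if (!bdrv_recurse_is_first_non_filter(parent_bs, to_replace_bs)) {
        error_setg(errp, "Node '%s' is not part of '%s'",
                   node_name, bdrv_get_device_name(parent_bs));
        return NULL;
    }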
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Message-id: 55A86486.1000404@cn.fujitsu.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
19. 14 August 2015 (1 commit)
• mirror: Fix coroutine reentrance · e424aff5
  Authored by Kevin Wolf
      This fixes a regression introduced by commit dcfb3beb ("mirror: Do zero
      write on target if sectors not allocated"), which was reported to cause
      aborts with the message "Co-routine re-entered recursively".
      
      The cause for this bug is the following code in mirror_iteration_done():
      
          if (s->common.busy) {
              qemu_coroutine_enter(s->common.co, NULL);
          }
      
      This has always been ugly because - unlike most places that reenter - it
      doesn't have a specific yield that it pairs with, but is more
      uncontrolled.  What we really mean here is "reenter the coroutine if
      it's in one of the four explicit yields in mirror.c".
      
This used to be equivalent to checking s->common.busy because neither
mirror_run() nor mirror_iteration() called any function that could
yield. However, since commit dcfb3beb this no longer holds true:
bdrv_get_block_status_above() can yield.
      
      So what happens is that bdrv_get_block_status_above() wants to take a
      lock that is already held, so it adds itself to the queue of waiting
      coroutines and yields. Instead of being woken up by the unlock function,
      however, it gets woken up by mirror_iteration_done(), which is obviously
      wrong.
      
In most situations the code happens to cope fairly well with this, but
in this specific case the unlock must already have scheduled the
coroutine for wakeup when mirror_iteration_done() reentered it. The
coroutine then happened to process the scheduled restarts and tried to
reenter itself recursively.
      
      This patch fixes the problem by pairing the reenter in
      mirror_iteration_done() with specific yields instead of abusing
      s->common.busy.
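A sketch of the pairing (a flag of this shape is what the fix
introduces; the exact name is illustrative):

    /* Around each of the four explicit yields in mirror.c: */
    s->waiting_for_io = true;
    qemu_coroutine_yield();
    s->waiting_for_io = false;

    /* In mirror_iteration_done(), reenter only those yields: */
    if (s->waiting_for_io) {
        qemu_coroutine_enter(s->common.co, NULL);
    }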
      
      Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1439455310-11263-1-git-send-email-kwolf@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
20. 06 August 2015 (1 commit)
• block/mirror: limit qiov to IOV_MAX elements · cae98cb8
  Authored by Stefan Hajnoczi
If the mirror job has more free buffers than IOV_MAX,
preadv(2)/pwritev(2) may fail with EINVAL.
      
      It is possible to trigger this by setting granularity to a low value
      like 8192.
      
      This patch stops appending chunks once IOV_MAX is reached.
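A sketch of the cap (IOV_MAX comes from <limits.h>; the variable name is
illustrative):

    /* mirror_iteration(): before appending another chunk to the request's
     * scatter-gather list, stop once IOV_MAX entries have been collected;
     * the remaining dirty chunks are picked up by a later iteration. */
    if (nb_chunks >= IOV_MAX) {
        break;
    }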
      
      The spurious EINVAL failure can be reproduced with a qcow2 image file
      and the following QMP invocation:
      
        qmp.command('drive-mirror', device='virtio0', target='/tmp/r7.s1',
                    granularity=8192, sync='full', mode='absolute-paths',
                    format='raw')
      
while the guest is running dd if=/dev/zero of=/var/tmp/foo oflag=direct
bs=4k.
      
      Cc: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1435761950-26714-1-git-send-email-stefanha@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
21. 22 July 2015 (1 commit)
22. 15 July 2015 (2 commits)
23. 07 July 2015 (1 commit)
24. 02 July 2015 (3 commits)
25. 23 June 2015 (2 commits)
26. 28 April 2015 (4 commits)