1. 20 7月, 2016 3 次提交
  2. 13 7月, 2016 6 次提交
    • S
      Improve block job rate limiting for small bandwidth values · f14a39cc
      Sascha Silbe 提交于
      ratelimit_calculate_delay() previously reset the accounting every time
      slice, no matter how much data had been processed before. This had (at
      least) two consequences:
      
      1. The minimum speed is rather large, e.g. 5 MiB/s for commit and stream.
      
         Not sure if there are real-world use cases where this would be a
         problem. Mirroring and backup over a slow link (e.g. DSL) would
         come to mind, though.
      
      2. Tests for block job operations (e.g. cancel) were rather racy
      
         All block jobs currently use a time slice of 100ms. That's a
         reasonable value to get smooth output during regular
         operation. However this also meant that the state of block jobs
         changed every 100ms, no matter how low the configured limit was. On
         busy hosts, qemu often transferred additional chunks until the test
         case had a chance to cancel the job.
      
      Fix the block job rate limit code to delay for more than one time
      slice to address the above issues. To make it easier to handle
      oversized chunks we switch the semantics from returning a delay
      _before_ the current request to a delay _after_ the current
      request. If necessary, this delay consists of multiple time slice
      units.
      
      Since the mirror job sends multiple chunks in one go even if the rate
      limit was exceeded in between, we need to keep track of the start of
      the current time slice so we can correctly re-compute the delay for
      the updated amount of data.
      
      The minimum bandwidth now is 1 data unit per time slice. The block
      jobs are currently passing the amount of data transferred in sectors
      and using 100ms time slices, so this translates to 5120
      bytes/second. With chunk sizes usually being O(512KiB), tests have
      plenty of time (O(100s)) to operate on block jobs. The chance of a
      race condition now is fairly remote, except possibly on insanely
      loaded systems.
      Signed-off-by: NSascha Silbe <silbe@linux.vnet.ibm.com>
      Message-id: 1467127721-9564-2-git-send-email-silbe@linux.vnet.ibm.com
      Reviewed-by: NMax Reitz <mreitz@redhat.com>
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      f14a39cc
    • P
      coroutine: move entry argument to qemu_coroutine_create · 0b8b8753
      Paolo Bonzini 提交于
      In practice the entry argument is always known at creation time, and
      it is confusing that sometimes qemu_coroutine_enter is used with a
      non-NULL argument to re-enter a coroutine (this happens in
      block/sheepdog.c and tests/test-coroutine.c).  So pass the opaque value
      at creation time, for consistency with e.g. aio_bh_new.
      
      Mostly done with the following semantic patch:
      
      @ entry1 @
      expression entry, arg, co;
      @@
      - co = qemu_coroutine_create(entry);
      + co = qemu_coroutine_create(entry, arg);
        ...
      - qemu_coroutine_enter(co, arg);
      + qemu_coroutine_enter(co);
      
      @ entry2 @
      expression entry, arg;
      identifier co;
      @@
      - Coroutine *co = qemu_coroutine_create(entry);
      + Coroutine *co = qemu_coroutine_create(entry, arg);
        ...
      - qemu_coroutine_enter(co, arg);
      + qemu_coroutine_enter(co);
      
      @ entry3 @
      expression entry, arg;
      @@
      - qemu_coroutine_enter(qemu_coroutine_create(entry), arg);
      + qemu_coroutine_enter(qemu_coroutine_create(entry, arg));
      
      @ reentry @
      expression co;
      @@
      - qemu_coroutine_enter(co, NULL);
      + qemu_coroutine_enter(co);
      
      except for the aforementioned few places where the semantic patch
      stumbled (as expected) and for test_co_queue, which would otherwise
      produce an uninitialized variable warning.
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: NFam Zheng <famz@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      0b8b8753
    • A
      commit: Add 'job-id' parameter to 'block-commit' · fd62c609
      Alberto Garcia 提交于
      This patch adds a new optional 'job-id' parameter to 'block-commit',
      allowing the user to specify the ID of the block job to be created.
      Signed-off-by: NAlberto Garcia <berto@igalia.com>
      Reviewed-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NMax Reitz <mreitz@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      fd62c609
    • A
      mirror: Add 'job-id' parameter to 'blockdev-mirror' and 'drive-mirror' · 71aa9867
      Alberto Garcia 提交于
      This patch adds a new optional 'job-id' parameter to 'blockdev-mirror'
      and 'drive-mirror', allowing the user to specify the ID of the block
      job to be created.
      
      The HMP 'drive_mirror' command remains unchanged.
      Signed-off-by: NAlberto Garcia <berto@igalia.com>
      Reviewed-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NMax Reitz <mreitz@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      71aa9867
    • A
      blockjob: Add 'job_id' parameter to block_job_create() · 7f0317cf
      Alberto Garcia 提交于
      When a new job is created, the job ID is taken from the device name of
      the BDS. This patch adds a new 'job_id' parameter to let the caller
      provide one instead.
      
      This patch also verifies that the ID is always unique and well-formed.
      This causes problems in a couple of places where no ID is being set,
      because the BDS does not have a device name.
      
      In the case of test_block_job_start() (from test-blockjob-txn.c) we
      can simply use this new 'job_id' parameter to set the missing ID.
      
      In the case of img_commit() (from qemu-img.c) we still don't have the
      API to make commit_active_start() set the job ID, so we solve it by
      setting a default value. We'll get rid of this as soon as we extend
      the API.
      Signed-off-by: NAlberto Garcia <berto@igalia.com>
      Reviewed-by: NMax Reitz <mreitz@redhat.com>
      Reviewed-by: NKevin Wolf <kwolf@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      7f0317cf
    • A
      blockjob: Update description of the 'id' field · 9df229c3
      Alberto Garcia 提交于
      The 'id' field of the BlockJob structure will be able to hold any ID,
      not only a device name. This patch updates the description of that
      field and the error messages where it is being used.
      
      Soon we'll add the ability to set an arbitrary ID when creating a
      block job.
      Signed-off-by: NAlberto Garcia <berto@igalia.com>
      Reviewed-by: NMax Reitz <mreitz@redhat.com>
      Reviewed-by: NKevin Wolf <kwolf@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      9df229c3
  3. 29 6月, 2016 4 次提交
  4. 20 6月, 2016 1 次提交
  5. 16 6月, 2016 3 次提交
    • M
      block/mirror: Fix target backing BDS · 274fccee
      Max Reitz 提交于
      Currently, we are trying to move the backing BDS from the source to the
      target in bdrv_replace_in_backing_chain() which is called from
      mirror_exit(). However, mirror_complete() already tries to open the
      target's backing chain with a call to bdrv_open_backing_file().
      
      First, we should only set the target's backing BDS once. Second, the
      mirroring block job has a better idea of what to set it to than the
      generic code in bdrv_replace_in_backing_chain() (in fact, the latter's
      conditions on when to move the backing BDS from source to target are not
      really correct).
      
      Therefore, remove that code from bdrv_replace_in_backing_chain() and
      leave it to mirror_complete().
      
      Depending on what kind of mirroring is performed, we furthermore want to
      use different strategies to open the target's backing chain:
      
      - If blockdev-mirror is used, we can assume the user made sure that the
        target already has the correct backing chain. In particular, we should
        not try to open a backing file if the target does not have any yet.
      
      - If drive-mirror with mode=absolute-paths is used, we can and should
        reuse the already existing chain of nodes that the source BDS is in.
        In case of sync=full, no backing BDS is required; with sync=top, we
        just link the source's backing BDS to the target, and with sync=none,
        we use the source BDS as the target's backing BDS.
        We should not try to open these backing files anew because this would
        lead to two BDSs existing per physical file in the backing chain, and
        we would like to avoid such concurrent access.
      
      - If drive-mirror with mode=existing is used, we have to use the
        information provided in the physical image file which means opening
        the target's backing chain completely anew, just as it has been done
        already.
        If the target's backing chain shares images with the source, this may
        lead to multiple BDSs per physical image file. But since we cannot
        reliably ascertain this case, there is nothing we can do about it.
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      Message-id: 20160610185750.30956-3-mreitz@redhat.com
      Reviewed-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NFam Zheng <famz@redhat.com>
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      274fccee
    • K
      block: Byte-based bdrv_co_do_copy_on_readv() · 244483e6
      Kevin Wolf 提交于
      In a first step to convert the common I/O path to work on bytes rather
      than sectors, this converts the copy-on-read logic that is used by
      bdrv_aligned_preadv().
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NEric Blake <eblake@redhat.com>
      Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>
      244483e6
    • E
      block: Avoid bogus flags during mirroring · 73698c30
      Eric Blake 提交于
      Commit e253f4b8 converted mirroring from sector-based bdrv_aio_*
      to byte-based blk_aio_*, but failed to account for the subtle
      difference in signatures (the former takes a semi-redundant length,
      the latter takes a flags parameter).  Since all of our flags are
      currently smaller in size than BDRV_SECTOR_SIZE, it has no ill
      effects until we either perform sub-sector mirroring, or we start
      asserting that no unexpected flags are set.  I found it while
      testing new asserts when qemu-iotests 132 started warning about an
      unknown flag 0x200000.
      Signed-off-by: NEric Blake <eblake@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      73698c30
  6. 26 5月, 2016 3 次提交
  7. 19 5月, 2016 3 次提交
    • K
      block: Remove BlockDriverState.blk · 1f0c461b
      Kevin Wolf 提交于
      This patch removes the remaining users of bs->blk, which will allow us
      to have multiple BBs on top of a single BDS. In the meantime, all checks
      that are currently in place to prevent the user from creating such
      setups can be switched to bdrv_has_blk() instead of accessing BDS.blk.
      
      Future patches can allow them and e.g. enable users to mirror to a block
      device that already has a BlockBackend on it.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NMax Reitz <mreitz@redhat.com>
      1f0c461b
    • K
      blockjob: Don't touch BDS iostatus · 66a0fae4
      Kevin Wolf 提交于
      Block jobs don't actually make use of the iostatus for their BDSes, but
      they manage a separate block job iostatus. Still, they require that it
      is enabled for the source BDS and they enable it automatically for the
      target and set the error handling mode - which ends up never being used
      by the job.
      
      This patch removes all of the BDS iostatus handling from the block job,
      which removes another few bs->blk accesses.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NMax Reitz <mreitz@redhat.com>
      66a0fae4
    • K
      blockjob: Don't set iostatus of target · 81e254dc
      Kevin Wolf 提交于
      When block job errors were introduced, we assigned the iostatus of the
      target BDS "just in case". The field has never been accessible for the
      user because the target isn't listed in query-block.
      
      Before we can allow the user to have a second BlockBackend on the
      target, we need to clean this up. If anything, we would want to set the
      iostatus for the internal BB of the job (which we can always do later),
      but certainly not for a separate BB which the job doesn't even use.
      
      As a nice side effect, this gets us rid of another bs->blk use.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NMax Reitz <mreitz@redhat.com>
      81e254dc
  8. 22 4月, 2016 1 次提交
    • F
      mirror: Workaround for unexpected iohandler events during completion · ab27c3b5
      Fam Zheng 提交于
      Commit 5a7e7a0b moved mirror_exit to a BH handler but didn't add any
      protection against new requests that could sneak in just before the
      BH is dispatched. For example (assuming a code base at that commit):
      
              main_loop_wait # 1
                os_host_main_loop_wait
                  g_main_context_dispatch
                    aio_ctx_dispatch
                      aio_dispatch
                        ...
                          mirror_run
                            bdrv_drain
          (a)               block_job_defer_to_main_loop
                qemu_iohandler_poll
                  virtio_queue_host_notifier_read
                    ...
                      virtio_submit_multiwrite
          (b)           blk_aio_multiwrite
      
              main_loop_wait # 2
                <snip>
                      aio_dispatch
                        aio_bh_poll
          (c)             mirror_exit
      
      At (a) we know the BDS has no pending request. However, the same
      main_loop_wait call is going to dispatch iohandlers (EventNotifier
      events), which may lead to a new I/O from guest. So the invariant is
      already broken at (c). Data loss.
      
      Commit f3926945 made iohandler to use aio API.  The order of
      virtio_queue_host_notifier_read and block_job_defer_to_main_loop within
      a main_loop_wait becomes unpredictable, and even worse, if the host
      notifier event arrives at the next main_loop_wait call, the
      unpredictable order between mirror_exit and
      virtio_queue_host_notifier_read is also a trouble. As shown below, this
      commit made the bug easier to trigger:
      
          - Bug case 1:
      
              main_loop_wait # 1
                os_host_main_loop_wait
                  g_main_context_dispatch
                    aio_ctx_dispatch (qemu_aio_context)
                      ...
                        mirror_run
                          bdrv_drain
          (a)             block_job_defer_to_main_loop
                    aio_ctx_dispatch (iohandler_ctx)
                      virtio_queue_host_notifier_read
                        ...
                          virtio_submit_multiwrite
          (b)               blk_aio_multiwrite
      
              main_loop_wait # 2
                ...
                      aio_dispatch
                        aio_bh_poll
          (c)             mirror_exit
      
          - Bug case 2:
      
              main_loop_wait # 1
                os_host_main_loop_wait
                  g_main_context_dispatch
                    aio_ctx_dispatch (qemu_aio_context)
                      ...
                        mirror_run
                          bdrv_drain
          (a)             block_job_defer_to_main_loop
      
              main_loop_wait # 2
                ...
                  aio_ctx_dispatch (iohandler_ctx)
                    virtio_queue_host_notifier_read
                      ...
                        virtio_submit_multiwrite
          (b)             blk_aio_multiwrite
                    aio_dispatch
                      aio_bh_poll
          (c)           mirror_exit
      
      In both cases, (b) breaks the invariant wanted by (a) and (c).
      
      Until then, the request loss has been silent. Later, 3f09bfbc added
      asserts at (c) to check the invariant (in
      bdrv_replace_in_backing_chain), and Max reported an assertion failure
      first visible there, by doing active committing while the guest is
      running bonnie++.
      
      2.5 added bdrv_drained_begin at (a) to protect the dataplane case from
      similar problems, but we never realize the main loop bug until now.
      
      As a bandage, this patch disables iohandler's external events
      temporarily together with bs->ctx.
      
      Launchpad Bug: 1570134
      
      Cc: qemu-stable@nongnu.org
      Signed-off-by: NFam Zheng <famz@redhat.com>
      Reviewed-by: NJeff Cody <jcody@redhat.com>
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      ab27c3b5
  9. 20 4月, 2016 3 次提交
  10. 11 4月, 2016 1 次提交
  11. 30 3月, 2016 1 次提交
  12. 23 3月, 2016 1 次提交
    • M
      include/qemu/osdep.h: Don't include qapi/error.h · da34e65c
      Markus Armbruster 提交于
      Commit 57cb38b3 included qapi/error.h into qemu/osdep.h to get the
      Error typedef.  Since then, we've moved to include qemu/osdep.h
      everywhere.  Its file comment explains: "To avoid getting into
      possible circular include dependencies, this file should not include
      any other QEMU headers, with the exceptions of config-host.h,
      compiler.h, os-posix.h and os-win32.h, all of which are doing a
      similar job to this file and are under similar constraints."
      qapi/error.h doesn't do a similar job, and it doesn't adhere to
      similar constraints: it includes qapi-types.h.  That's in excess of
      100KiB of crap most .c files don't actually need.
      
      Add the typedef to qemu/typedefs.h, and include that instead of
      qapi/error.h.  Include qapi/error.h in .c files that need it and don't
      get it now.  Include qapi-types.h in qom/object.h for uint16List.
      
      Update scripts/clean-includes accordingly.  Update it further to match
      reality: replace config.h by config-target.h, add sysemu/os-posix.h,
      sysemu/os-win32.h.  Update the list of includes in the qemu/osdep.h
      comment quoted above similarly.
      
      This reduces the number of objects depending on qapi/error.h from "all
      of them" to less than a third.  Unfortunately, the number depending on
      qapi-types.h shrinks only a little.  More work is needed for that one.
      Signed-off-by: NMarkus Armbruster <armbru@redhat.com>
      [Fix compilation without the spice devel packages. - Paolo]
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      da34e65c
  13. 01 3月, 2016 2 次提交
  14. 03 2月, 2016 1 次提交
    • F
      block: Add "file" output parameter to block status query functions · 67a0fd2a
      Fam Zheng 提交于
      The added parameter can be used to return the BDS pointer which the
      valid offset is referring to. Its value should be ignored unless
      BDRV_BLOCK_OFFSET_VALID in ret is set.
      
      Until block drivers fill in the right value, let's clear it explicitly
      right before calling .bdrv_get_block_status.
      
      The "bs->file" condition in bdrv_co_get_block_status is kept now to keep iotest
      case 102 passing, and will be fixed once all drivers return the right file
      pointer.
      Signed-off-by: NFam Zheng <famz@redhat.com>
      Message-id: 1453780743-16806-2-git-send-email-famz@redhat.com
      Reviewed-by: NMax Reitz <mreitz@redhat.com>
      Signed-off-by: NMax Reitz <mreitz@redhat.com>
      67a0fd2a
  15. 20 1月, 2016 1 次提交
  16. 22 12月, 2015 1 次提交
  17. 18 12月, 2015 2 次提交
    • K
      block: Allow references for backing files · d9b7b057
      Kevin Wolf 提交于
      For bs->file, using references to existing BDSes has been possible for a
      while already. This patch enables the same for bs->backing.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NMax Reitz <mreitz@redhat.com>
      d9b7b057
    • K
      mirror: Error out when a BDS would get two BBs · 40365552
      Kevin Wolf 提交于
      bdrv_replace_in_backing_chain() asserts that not both old and new
      BlockDdriverState have a BlockBackend attached to them because both
      would have to end up pointing to the new BDS and we don't support more
      than one BB per BDS yet.
      
      Before we can safely allow references to existing nodes as backing
      files, we need to make sure that even if a backing file has a BB on it,
      this doesn't crash qemu.
      
      There are probably also some cases with the 'replaces' option set where
      drive-mirror could fail this assertion today. They are fixed with this
      error check as well.
      Signed-off-by: NKevin Wolf <kwolf@redhat.com>
      Reviewed-by: NMax Reitz <mreitz@redhat.com>
      40365552
  18. 02 12月, 2015 1 次提交
    • F
      mirror: Quiesce source during "mirror_exit" · 176c3699
      Fam Zheng 提交于
      With dataplane, the ioeventfd events could be dispatched after
      mirror_run releases the dirty bitmap, but before mirror_exit actually
      does the device switch, because the iothread will still be running, and
      it will cause silent data loss.
      
      Fix this by adding a bdrv_drained_begin/end pair around the window, so
      that no new external request will be handled.
      Signed-off-by: NFam Zheng <famz@redhat.com>
      Signed-off-by: NJeff Cody <jcody@redhat.com>
      176c3699
  19. 12 11月, 2015 1 次提交
  20. 11 11月, 2015 1 次提交