1. 18 Jul 2013, 1 commit
  2. 13 Jul 2013, 1 commit
    • C
      Force auto-convergence of live migration · 7ca1dfad
      Authored by Chegu Vinod
      If a user chooses to turn on the auto-converge migration capability,
      these changes detect the lack of convergence and throttle down the
      guest, i.e. force the VCPUs out of the guest for some duration
      and let the migration thread catch up and help the migration converge.
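      The throttling decision can be sketched as a tiny self-contained C program (illustrative names only; this is not QEMU's actual migration code, which also tracks how long the throttle stays active):

      ```c
      #include <assert.h>
      #include <stdbool.h>

      /* Hypothetical sketch of the auto-converge check: if the guest dirties
       * pages at least as fast as the migration thread can transfer them,
       * the migration cannot converge, so the VCPUs should be throttled. */
      static bool should_throttle(unsigned long dirty_pages_per_sec,
                                  unsigned long transfer_pages_per_sec)
      {
          return dirty_pages_per_sec >= transfer_pages_per_sec;
      }

      int main(void)
      {
          /* 67551 dirty pages/s (as in the log below) vs a slower link. */
          assert(should_throttle(67551, 50000));
          /* A fast enough link keeps up; no throttling needed. */
          assert(!should_throttle(1000, 50000));
          return 0;
      }
      ```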
      
      Verified the convergence using the following:
       - Java Warehouse workload running on a 20VCPU/256G guest (~80% busy)
       - OLTP-like workload running on an 80VCPU/512G guest (~80% busy)
      
      Sample results with the Java warehouse workload (migrate speed set to 20Gb and
      migrate downtime set to 4 seconds).
      
       (qemu) info migrate
       capabilities: xbzrle: off auto-converge: off  <----
       Migration status: active
       total time: 1487503 milliseconds
       expected downtime: 519 milliseconds
       transferred ram: 383749347 kbytes
       remaining ram: 2753372 kbytes
       total ram: 268444224 kbytes
       duplicate: 65461532 pages
       skipped: 64901568 pages
       normal: 95750218 pages
       normal bytes: 383000872 kbytes
       dirty pages rate: 67551 pages
      
       ---
      
       (qemu) info migrate
       capabilities: xbzrle: off auto-converge: on   <----
       Migration status: completed
       total time: 241161 milliseconds
       downtime: 6373 milliseconds
       transferred ram: 28235307 kbytes
       remaining ram: 0 kbytes
       total ram: 268444224 kbytes
       duplicate: 64946416 pages
       skipped: 64903523 pages
       normal: 7044971 pages
       normal bytes: 28179884 kbytes
      Signed-off-by: Chegu Vinod <chegu_vinod@hp.com>
      Signed-off-by: Juan Quintela <quintela@redhat.com>
  3. 28 Jun 2013, 1 commit
    • D
      block: add basic backup support to block driver · 98d2c6f2
      Authored by Dietmar Maurer
      backup_start() creates a block job that copies a point-in-time snapshot
      of a block device to a target block device.
      
      We call backup_do_cow() for each write during backup. That function
      reads the original data from the block device before it gets
      overwritten.  The data is then written to the target device.
      
      Currently backup cluster size is hardcoded to 65536 bytes.
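      The copy-before-write idea can be sketched in self-contained C (in-memory buffers stand in for block devices; names are illustrative and do not match the real backup_do_cow() signature):

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <string.h>

      #define CLUSTER_SIZE 8          /* tiny stand-in for the real 65536 bytes */
      #define NUM_CLUSTERS 4

      static unsigned char source[NUM_CLUSTERS * CLUSTER_SIZE];
      static unsigned char target[NUM_CLUSTERS * CLUSTER_SIZE];
      static bool copied[NUM_CLUSTERS];   /* clusters already saved to the target */

      /* Copy-before-write: save the original data of a cluster to the target
       * before a guest write lands, but only the first time it is touched. */
      static void backup_do_cow(int cluster)
      {
          if (!copied[cluster]) {
              memcpy(&target[cluster * CLUSTER_SIZE],
                     &source[cluster * CLUSTER_SIZE], CLUSTER_SIZE);
              copied[cluster] = true;
          }
      }

      static void guest_write(int cluster, unsigned char byte)
      {
          backup_do_cow(cluster);                   /* intercept the write */
          memset(&source[cluster * CLUSTER_SIZE], byte, CLUSTER_SIZE);
      }

      int main(void)
      {
          memset(source, 0xAA, sizeof(source));     /* point-in-time contents */
          guest_write(1, 0xBB);                     /* guest overwrites cluster 1 */
          guest_write(1, 0xCC);                     /* second write: no re-copy */
          assert(target[1 * CLUSTER_SIZE] == 0xAA); /* target kept the snapshot */
          assert(source[1 * CLUSTER_SIZE] == 0xCC); /* guest sees its own data */
          return 0;
      }
      ```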
      
      [I made a number of changes to Dietmar's original patch and folded them
      in to make code review easy.  Here is the full list:
      
       * Drop BackupDumpFunc interface in favor of a target block device
       * Detect zero clusters with buffer_is_zero() and use bdrv_co_write_zeroes()
       * Use 0 delay instead of 1us, like other block jobs
       * Unify creation/start functions into backup_start()
       * Simplify cleanup, free bitmap in backup_run() instead of cb function
       * Use HBitmap to avoid duplicating bitmap code
       * Use bdrv_getlength() instead of accessing ->total_sectors directly
       * Delete the backup.h header file, it is no longer necessary
       * Move ./backup.c to block/backup.c
       * Remove #ifdefed out code
       * Coding style and whitespace cleanups
       * Use bdrv_add_before_write_notifier() instead of blockjob-specific hooks
       * Keep our own in-flight CowRequest list instead of using block.c
         tracked requests.  This means a little code duplication but is much
         simpler than trying to share the tracked requests list and use the
         backup block size.
       * Add on_source_error and on_target_error error handling.
       * Use trace events instead of DPRINTF()
      
      -- stefanha]
      Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
  4. 25 May 2013, 1 commit
  5. 24 May 2013, 1 commit
    • S
      coroutine: stop using AioContext in CoQueue · 02ffb504
      Authored by Stefan Hajnoczi
      qemu_co_queue_next(&queue) arranges that the next queued coroutine is
      run at a later point in time.  This deferred restart is useful because
      the caller may not want to transfer control yet.
      
      This behavior was implemented using QEMUBH in the past, which meant that
      CoQueue (and hence CoMutex and CoRwlock) had a dependency on the
      AioContext event loop.  This hidden dependency causes trouble when we
      move to a world with multiple event loops - now qemu_co_queue_next()
      needs to know which event loop to schedule the QEMUBH in.
      
      After pondering how to stash AioContext I realized the best solution is
      to not use AioContext at all.  This patch implements the deferred
      restart behavior purely in terms of coroutines and no longer uses
      QEMUBH.
      
      Here is how it works:
      
      Each Coroutine has a wakeup queue that starts out empty.  When
      qemu_co_queue_next() is called, the next coroutine is added to our
      wakeup queue.  The wakeup queue is processed when we yield or terminate.
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
  6. 20 May 2013, 1 commit
  7. 14 May 2013, 2 commits
  8. 13 May 2013, 1 commit
  9. 03 May 2013, 2 commits
  10. 02 May 2013, 1 commit
  11. 23 Apr 2013, 1 commit
  12. 19 Apr 2013, 1 commit
  13. 16 Apr 2013, 6 commits
    • G
      use libusb for usb-host · 2b2325ff
      Authored by Gerd Hoffmann
      Reimplement usb-host on top of libusb.
      Reasons to do this:
      
       (1) Largely rewritten from scratch, nice opportunity to kill historical
           cruft.
       (2) Offload usbfs handling to libusb.
       (3) Have a single portable code base instead of bsd + linux variants.
       (4) Bring usb-host support to any platform supported by libusbx.
      
      For now this goes side-by-side with the existing code.  That is only to
      simplify regression testing though; at the end of the day I want to remove
      the old code and support libusb exclusively.  Merge early in the 1.5 cycle,
      remove the old code after the 1.5 release or something like that.
      
      Thanks to qdev the old and new code can coexist nicely on Linux.  Just
      use "-device usb-host-linux" to get the old Linux driver instead of the
      libusb one (which takes over the "usb-host" name).
      
      The BSD driver isn't qdev'ified, so it isn't that easy there.
      I didn't bother making it runtime-switchable, so you have to rebuild
      qemu with --disable-libusb to get back the old code.
      Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
    • G
      xhci: fix portsc writes · bdfce20d
      Authored by Gerd Hoffmann
      Check for port reset first and skip everything else then.
      Add sanity checks for PLS updates.
      Add PLC notification when entering PLS_U0 state.
      
      This gets host-initiated port resume going on win8.
      Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
    • G
      console: gui timer fixes · 0f7b2864
      Authored by Gerd Hoffmann
      Make the gui update rate adaptation code in gui_update() actually work.
      Sprinkle in a tracepoint so you can see the code at work.  Remove
      the update rate adaptation code in vnc and make vnc simply use the
      generic bits instead.
      Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
    • G
      console: add trace events · 437fe106
      Authored by Gerd Hoffmann
      Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
    • G
      hw/vmware_vga.c: various vmware vga fixes. · eb2f9b02
      Authored by Gerd Hoffmann
      Hardcode depth to 32 bpp.  It effectively was that way before, because
      that is the default surface depth; this just makes it explicit in the
      code.
      
      Rename depth to new_depth to make it consistent with the new_width +
      new_height names.  In theory we can make new_depth changeable (i.e.
      allow the guest to fill in -- say -- 16 there).  In practice guests
      don't try; the X server refuses to start if you ask it to use 16bpp
      depth (via DefaultDepth in the Screen section).
      
      Always return the correct rmask+gmask+bmask values for the given
      new_depth.
      
      Fix mode setting to also verify new_depth, to make sure we have a
      correct DisplaySurface even if the current video mode happens to be
      16bpp (set by vgabios via the bochs vbe interface).  While at it,
      switch over to qemu_create_displaysurface_from, so the surface is
      backed by guest-visible video memory and we save a memcpy.
      Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
    • G
      7a6404cd
  14. 03 Apr 2013, 1 commit
  15. 28 Mar 2013, 3 commits
  16. 18 Mar 2013, 2 commits
  17. 15 Mar 2013, 1 commit
  18. 13 Mar 2013, 1 commit
    • G
      Add search path support for qemu data files. · 4524051c
      Authored by Gerd Hoffmann
      This patch allows specifying multiple directories where qemu should look
      for data files.  To implement that, the behavior of the -L switch is
      slightly different now: instead of replacing the data directory, the
      path specified is appended to the data directory list.  So when
      specifying -L multiple times, all directories specified will be checked,
      in the order they are given on the command line, instead of just the
      last one.
      
      Additionally, the default paths are always appended to the data
      directory list.  This allows pointing -L at an incomplete directory
      (such as the seabios out/ directory); anything not found there will be
      loaded from the default paths, so you don't have to create a symlink
      farm for all the rom blobs.
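      The lookup order described above can be sketched as follows (the directory paths and the exists() stub are hypothetical; the real code checks the filesystem):

      ```c
      #include <assert.h>
      #include <stdio.h>
      #include <string.h>

      /* Directories given with -L come first, in command-line order,
       * followed by the built-in default.  All values are illustrative. */
      static const char *data_dirs[] = { "/custom/seabios/out", "/usr/share/qemu" };

      static int exists(const char *path)
      {
          /* Stub: pretend only the default directory contains vgabios.bin. */
          return strcmp(path, "/usr/share/qemu/vgabios.bin") == 0;
      }

      /* Try each directory in order; return 0 and fill `out` on first hit. */
      static int find_datafile(const char *name, char *out, size_t outlen)
      {
          for (size_t i = 0; i < sizeof(data_dirs) / sizeof(data_dirs[0]); i++) {
              snprintf(out, outlen, "%s/%s", data_dirs[i], name);
              if (exists(out)) {
                  return 0;
              }
          }
          return -1;    /* not found in any directory */
      }

      int main(void)
      {
          char path[256];
          assert(find_datafile("vgabios.bin", path, sizeof(path)) == 0);
          /* Not in the -L directory, so the default path was used. */
          assert(strcmp(path, "/usr/share/qemu/vgabios.bin") == 0);
          return 0;
      }
      ```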
      
      For troubleshooting, a tracepoint has been added that logs which blob
      was loaded from which location.
      Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
      Message-id: 1362739344-8068-1-git-send-email-kraxel@redhat.com
      Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
  19. 11 Mar 2013, 1 commit
  20. 06 Mar 2013, 1 commit
  21. 19 Feb 2013, 1 commit
  22. 30 Jan 2013, 3 commits
  23. 26 Jan 2013, 6 commits
    • P
      mirror: support arbitrarily-sized iterations · 884fea4e
      Authored by Paolo Bonzini
      Yet another optimization is to extend the mirroring iteration to include more
      adjacent dirty blocks.  This limits the number of I/O operations and makes
      mirroring efficient even with a small granularity.  Most of the infrastructure
      is already in place; we only need to put a loop around the computation of
      the origin and sector count of the iteration.
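      The loop described above can be sketched like this (illustrative only, not the actual mirror_iteration() code): starting from the first dirty block, the extent is widened while the next block is also dirty, so one large I/O replaces many small ones.

      ```c
      #include <assert.h>
      #include <stdbool.h>

      /* Toy dirty bitmap: blocks 4-6 are an adjacent dirty run, 9 is isolated. */
      static bool dirty[16] = { [4] = true, [5] = true, [6] = true, [9] = true };

      /* Extend one iteration over adjacent dirty blocks: return the origin
       * and store how many consecutive dirty blocks follow it. */
      static int extend_iteration(int start, int max, int *count)
      {
          int n = 0;
          while (start + n < max && dirty[start + n]) {
              n++;
          }
          *count = n;
          return start;
      }

      int main(void)
      {
          int count;
          int origin = extend_iteration(4, 16, &count);
          assert(origin == 4 && count == 3);  /* blocks 4-6 coalesce; 9 does not */
          return 0;
      }
      ```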
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    • P
      mirror: support more than one in-flight AIO operation · 402a4741
      Authored by Paolo Bonzini
      With AIO support in place, we can start copying more than one chunk
      in parallel.  This patch introduces the required infrastructure for
      this: the buffer is split into multiple granularity-sized chunks,
      and there is a free list to access them.
      
      Because of copy-on-write, a single operation may already require
      multiple chunks to be available on the free list.
      
      In addition, two different iterations on the HBitmap may want to
      copy the same cluster.  We avoid this by keeping a bitmap of in-flight
      I/O operations, and blocking until the previous iteration completes.
      This should be a pretty rare occurrence, though; as long as there is
      no overlap the next iteration can start before the previous one finishes.
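      The in-flight check can be sketched as follows (illustrative names; the real code blocks on a coroutine queue until the previous copy finishes rather than returning false):

      ```c
      #include <assert.h>
      #include <stdbool.h>

      /* Bitmap of chunks currently being copied by some iteration. */
      static bool in_flight[16];

      /* Claim a chunk for copying; an overlapping iteration must wait. */
      static bool try_start_copy(int chunk)
      {
          if (in_flight[chunk]) {
              return false;          /* overlap: the caller would block here */
          }
          in_flight[chunk] = true;
          return true;
      }

      static void copy_done(int chunk)
      {
          in_flight[chunk] = false;  /* wake anyone waiting on this chunk */
      }

      int main(void)
      {
          assert(try_start_copy(3));     /* first iteration claims chunk 3 */
          assert(!try_start_copy(3));    /* overlapping iteration must wait */
          copy_done(3);
          assert(try_start_copy(3));     /* after completion it can proceed */
          return 0;
      }
      ```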
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    • P
      mirror: switch mirror_iteration to AIO · bd48bde8
      Authored by Paolo Bonzini
      There is really no change in the behavior of the job here, since
      there is still a maximum of one in-flight I/O operation between
      the source and the target.  However, this patch already introduces
      the AIO callbacks (which are unmodified in the next patch)
      and some of the logic to count in-flight operations and only
      complete the job when there is none.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    • P
      mirror: perform COW if the cluster size is bigger than the granularity · b812f671
      Authored by Paolo Bonzini
      When mirroring runs, the backing files for the target may not yet be
      ready.  However, this means that a copy-on-write operation on the target
      would fill the missing sectors with zeros.  Copy-on-write only happens
      if the granularity of the dirty bitmap is smaller than the cluster size
      (and only for clusters that are allocated in the source after the job
      has started copying).  So far, the granularity was fixed to 1MB; to avoid
      the problem we detected the situation and required the backing files to
      be available in that case only.
      
      However, we want to lower the granularity for efficiency, so we need
      a better solution.  The solution is to always copy a whole cluster the
      first time it is touched.  The code keeps a bitmap of clusters that
      have already been allocated by the mirroring job, and only does "manual"
      copy-on-write if the chunk being copied is zero in the bitmap.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    • P
      block: implement dirty bitmap using HBitmap · 8f0720ec
      Authored by Paolo Bonzini
      This actually uses the dirty bitmap in the block layer, and converts
      mirroring to use an HBitmapIter.
      
      Reviewed-by: Laszlo Ersek <lersek@redhat.com> (except block/mirror.c parts)
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    • P
      add hierarchical bitmap data type and test cases · e7c033c3
      Authored by Paolo Bonzini
      An HBitmap provides an array of bits.  The bits are stored as usual in an
      array of unsigned longs, but HBitmap is also optimized to provide fast
      iteration over set bits; going from one bit to the next is O(logB n)
      worst case, with B = sizeof(long) * CHAR_BIT: the result is low enough
      that the number of levels is in fact fixed.
      
      In order to do this, it stacks multiple bitmaps with progressively coarser
      granularity; in all levels except the last, bit N is set iff the N-th
      unsigned long is nonzero in the immediately next level.  When iteration
      completes on the last level it can examine the 2nd-last level to quickly
      skip entire words, and even do so recursively to skip blocks of 64 words or
      powers thereof (32 on 32-bit machines).
      
      Given an index in the bitmap, it can be split into groups of bits like
      this (for the 64-bit case):
      
           bits 0-57 => word in the last bitmap     | bits 58-63 => bit in the word
           bits 0-51 => word in the 2nd-last bitmap | bits 52-57 => bit in the word
           bits 0-45 => word in the 3rd-last bitmap | bits 46-51 => bit in the word
      
      So it is easy to move up simply by shifting the index right by
      log2(BITS_PER_LONG) bits.  To move down, you shift the index left
      similarly, and add the word index within the group.  Iteration uses
      ffs (find first set bit) to find the next word to examine; this
      operation can be done in constant time in most current architectures.
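      The shifting described above can be checked with a short self-contained program (64-bit longs assumed; BITS_PER_LEVEL = log2(64) = 6):

      ```c
      #include <assert.h>

      /* The index arithmetic described above: moving up a level shifts the
       * index right by log2(BITS_PER_LONG); the masked-off low bits select
       * the bit within that level's word. */
      #define BITS_PER_LEVEL 6                       /* log2(64) on 64-bit */
      #define LEVEL_MASK ((1UL << BITS_PER_LEVEL) - 1)

      int main(void)
      {
          unsigned long index = 0x123456789UL;       /* arbitrary bit position */

          unsigned long word_last = index >> BITS_PER_LEVEL;       /* word in last bitmap */
          unsigned long bit_last  = index & LEVEL_MASK;            /* bit within the word */
          unsigned long word_2nd  = index >> (2 * BITS_PER_LEVEL); /* one level further up */

          /* Moving down reconstructs the index: shift left, add the bit offset. */
          assert(((word_last << BITS_PER_LEVEL) | bit_last) == index);
          /* Each level up is just one more right shift. */
          assert(word_2nd == (word_last >> BITS_PER_LEVEL));
          return 0;
      }
      ```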
      
      Setting or clearing a range of m bits on all levels, the work to perform
      is O(m + m/W + m/W^2 + ...), which is O(m) like on a regular bitmap.
      
      When iterating on a bitmap, each bit (on any level) is only visited
      once.  Hence, the total cost of visiting a bitmap with m bits in it is
      the number of bits that are set in all bitmaps.  Unless the bitmap is
      extremely sparse, this is also O(m + m/W + m/W^2 + ...), so the amortized
      cost of advancing from one bit to the next is usually constant.
      Reviewed-by: Laszlo Ersek <lersek@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>