1. 16 Jan, 2017 2 commits
  2. 04 Jan, 2017 2 commits
    • aio: self-tune polling time · 82a41186
      Stefan Hajnoczi committed
      This patch is based on the algorithm for the kvm.ko halt_poll_ns
      parameter in Linux.  The initial polling time is zero.
      
      If the event loop is woken up within the maximum polling time it means
      polling could be effective, so grow polling time.
      
      If the event loop is woken up beyond the maximum polling time it means
      polling is not effective, so shrink polling time.
      
      If the event loop makes progress within the current polling time then
      the sweet spot has been reached.
      
      This algorithm adjusts the polling time so it can adapt to variations in
      workloads.  The goal is to reach the sweet spot while also recognizing
      when polling would hurt more than help.
      
      Two new trace events, poll_grow and poll_shrink, are added for observing
      polling time adjustment.
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Message-id: 20161201192652.9509-13-stefanha@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
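
      The grow/shrink rules in the commit message above can be sketched
      roughly as follows.  This is a minimal illustration, not the code from
      the patch: the PollTuner type, the poll_tuner_adjust() name, the 4000 ns
      first step, the doubling factor, and resetting to zero on shrink are all
      assumptions made for the example.

      ```c
      #include <assert.h>
      #include <stdint.h>

      /* Hypothetical tuning state; not QEMU's AioContext. */
      typedef struct {
          int64_t poll_ns;      /* current polling time, initially zero */
          int64_t poll_max_ns;  /* maximum polling time */
      } PollTuner;

      #define POLL_START_NS 4000  /* assumed first non-zero polling time */
      #define POLL_GROW     2     /* assumed growth factor */

      /*
       * Adjust the polling time after one event loop iteration.
       * block_ns is how long the loop waited before it was woken up.
       */
      static void poll_tuner_adjust(PollTuner *t, int64_t block_ns)
      {
          assert(t->poll_max_ns >= 0);

          if (block_ns <= t->poll_ns) {
              /* Progress within the current polling time: sweet spot. */
              return;
          }
          if (block_ns > t->poll_max_ns) {
              /* Woken up beyond the maximum: polling is not effective. */
              t->poll_ns = 0;  /* assumed shrink policy: stop polling */
              return;
          }
          /* Woken up within the maximum: polling could help, so grow. */
          t->poll_ns = t->poll_ns ? t->poll_ns * POLL_GROW : POLL_START_NS;
          if (t->poll_ns > t->poll_max_ns) {
              t->poll_ns = t->poll_max_ns;
          }
      }
      ```

      Starting from zero means a workload that never benefits from polling
      pays nothing, while a latency-sensitive workload ramps up within a few
      iterations.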
    • aio: add polling mode to AioContext · 4a1cba38
      Stefan Hajnoczi committed
      The AioContext event loop uses ppoll(2) or epoll_wait(2) to monitor file
      descriptors or until a timer expires.  In cases like virtqueues, Linux
      AIO, and ThreadPool it is technically possible to wait for events via
      polling (i.e. continuously checking for events without blocking).
      
      Polling can be faster than blocking syscalls because file descriptors,
      the process scheduler, and system calls are bypassed.
      
      The main disadvantage to polling is that it increases CPU utilization.
      In a classic polling configuration a full host CPU thread might run at
      100% to respond to events as quickly as possible.  This patch implements
      a timeout so we fall back to blocking syscalls if polling detects no
      activity.  After the timeout no CPU cycles are wasted on polling until
      the next event loop iteration.
      
      The run_poll_handlers_begin() and run_poll_handlers_end() trace events
      are added to aid performance analysis and troubleshooting.  If you need
      to know whether polling mode is being used, trace these events to find
      out.
      
      Note that the AioContext is now re-acquired before disabling notify_me
      in the non-polling case.  This makes the code cleaner since notify_me
      was enabled outside the non-polling AioContext release region.  This
      change is correct since it's safe to keep notify_me enabled longer
      (disabling is an optimization) but potentially causes unnecessary
      event_notifier_set() calls.  I think the chance of performance regression
      is small here.
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Message-id: 20161201192652.9509-4-stefanha@redhat.com
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
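
      The "poll with a timeout, then fall back to blocking" idea described
      above might look roughly like this.  It is a sketch under assumptions,
      not the patch itself: PollFn, try_poll(), now_ns(), and the
      countdown_poll() example handler are hypothetical names, and the
      blocking ppoll(2)/epoll_wait(2) fallback is left to the caller.

      ```c
      #define _POSIX_C_SOURCE 200809L
      #include <assert.h>
      #include <stdbool.h>
      #include <stddef.h>
      #include <stdint.h>
      #include <time.h>

      /* Hypothetical poll handler: returns true if it made progress. */
      typedef bool PollFn(void *opaque);

      static int64_t now_ns(void)
      {
          struct timespec ts;
          clock_gettime(CLOCK_MONOTONIC, &ts);
          return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
      }

      /*
       * Busy-poll until the handler reports progress or max_ns elapses.
       * Returns true on progress.  On false the caller is expected to
       * fall back to a blocking syscall, so no further CPU cycles are
       * spent polling in this event loop iteration.
       */
      static bool try_poll(PollFn *poll, void *opaque, int64_t max_ns)
      {
          assert(poll != NULL);

          if (max_ns <= 0) {
              return false;  /* polling disabled */
          }
          int64_t deadline = now_ns() + max_ns;
          do {
              if (poll(opaque)) {
                  return true;
              }
          } while (now_ns() < deadline);
          return false;
      }

      /* Example handler: reports progress once *remaining reaches zero. */
      static bool countdown_poll(void *opaque)
      {
          int *remaining = opaque;
          if (*remaining == 0) {
              return true;
          }
          (*remaining)--;
          return false;
      }
      ```

      Bounding the busy loop with a deadline is what keeps a quiet event
      loop from pinning a host CPU thread at 100%.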
  3. 31 Oct, 2016 1 commit
    • memory: Don't use memcpy for ram_device regions · 4a2e242b
      Alex Williamson committed
      With a vfio assigned device we lay down a base MemoryRegion registered
      as an IO region, giving us read & write accessors.  If the region
      supports mmap, we lay down a higher priority sub-region MemoryRegion
      on top of the base layer initialized as a RAM device pointer to the
      mmap.  Finally, if we have any quirks for the device (i.e. address
      ranges that need additional virtualization support), we put another IO
      sub-region on top of the mmap MemoryRegion.  When this is flattened,
      we now potentially have sub-page mmap MemoryRegions exposed which
      cannot be directly mapped through KVM.
      
      This is as expected, but a subtle detail of this is that we end up
      with two different access mechanisms through QEMU.  If we disable the
      mmap MemoryRegion, we make use of the IO MemoryRegion and service
      accesses using pread and pwrite to the vfio device file descriptor.
      If the mmap MemoryRegion is enabled and results in one of these
      sub-page gaps, QEMU handles the access as RAM, using memcpy to the
      mmap.  Using either pread/pwrite or the mmap directly should be
      correct, but using memcpy causes us problems.  I expect that not only
      does memcpy not necessarily honor the original width and alignment in
      performing a copy, but it potentially also uses processor instructions
      not intended for MMIO spaces.  It turns out that this has been a
      problem for Realtek NIC assignment, which has such a quirk that
      creates a sub-page mmap MemoryRegion access.
      
      To resolve this, we disable memory_access_is_direct() for ram_device
      regions since QEMU assumes that it can use memcpy for those regions.
      Instead we access through MemoryRegionOps, which replaces the memcpy
      with simple de-references of standard sizes to the host memory.
      
      With this patch we attempt to provide unrestricted access to the RAM
      device, allowing byte through qword access as well as unaligned
      access.  The assumption here is that accesses initiated by the VM are
      driven by a device specific driver, which knows the device
      capabilities.  If unaligned accesses are not supported by the device,
      we don't want them to work in a VM by performing multiple aligned
      accesses to compose the unaligned access.  A down-side of this
      philosophy is that the xp command from the monitor attempts to use
      the largest available access width, unaware of the underlying
      device.  Using memcpy had this same restriction, but at least now an
      operator can dump individual registers, even if blocks of device
      memory may result in access widths beyond the capabilities of a
      given device (RTL NICs only support up to dword).
      Reported-by: Thorsten Kohfeldt <thorsten.kohfeldt@gmx.de>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      Acked-by: Paolo Bonzini <pbonzini@redhat.com>
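
      To illustrate what "simple de-references of standard sizes" means in
      the message above, here is a minimal sketch of size-dispatched
      accessors.  The ram_device_read() and ram_device_write() names and
      their signatures are assumptions for this example and do not reproduce
      QEMU's MemoryRegionOps callbacks.

      ```c
      #include <assert.h>
      #include <stdint.h>

      /*
       * Read `size` bytes at host + addr with a single access of exactly
       * that width.  volatile keeps the compiler from splitting or merging
       * the access.  Unaligned addresses are passed through unchanged and
       * rely on the host and device tolerating them.
       */
      static uint64_t ram_device_read(void *host, uint64_t addr, unsigned size)
      {
          volatile uint8_t *p = (volatile uint8_t *)host + addr;

          switch (size) {
          case 1: return *(volatile uint8_t *)p;
          case 2: return *(volatile uint16_t *)p;
          case 4: return *(volatile uint32_t *)p;
          case 8: return *(volatile uint64_t *)p;
          default:
              assert(0 && "unsupported access size");
              return ~(uint64_t)0;
          }
      }

      /* Write the low `size` bytes of data with a single access. */
      static void ram_device_write(void *host, uint64_t addr, uint64_t data,
                                   unsigned size)
      {
          volatile uint8_t *p = (volatile uint8_t *)host + addr;

          switch (size) {
          case 1: *(volatile uint8_t *)p = (uint8_t)data; break;
          case 2: *(volatile uint16_t *)p = (uint16_t)data; break;
          case 4: *(volatile uint32_t *)p = (uint32_t)data; break;
          case 8: *(volatile uint64_t *)p = data; break;
          default:
              assert(0 && "unsupported access size");
          }
      }
      ```

      Unlike memcpy, each call here issues one load or store of the width
      the guest asked for, which is the property the commit relies on for
      MMIO-backed regions.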
  4. 12 Oct, 2016 2 commits
  5. 29 Sep, 2016 5 commits
  6. 27 Sep, 2016 4 commits
  7. 19 Sep, 2016 1 commit
  8. 13 Aug, 2016 1 commit
  9. 22 Jul, 2016 1 commit
  10. 29 Jun, 2016 1 commit
  11. 21 Jun, 2016 20 commits