1. 27 Apr, 2013 4 commits
    • kvm: add device control API · 852b6d57
      Scott Wood committed
      Currently, devices that are emulated inside KVM are configured in a
      hardcoded manner based on an assumption that any given architecture
      only has one way to do it.  If there's any need to access device state,
      it is done through inflexible one-purpose-only IOCTLs (e.g.
      KVM_GET/SET_LAPIC).  Defining new IOCTLs for every little thing is
      cumbersome and depletes a limited numberspace.
      
      This API provides a mechanism to instantiate a device of a certain
      type, returning an ID that can be used to set/get attributes of the
      device.  Attributes may include configuration parameters (e.g.
      register base address), device state, operational commands, etc.  It
      is similar to the ONE_REG API, except that it acts on devices rather
      than vcpus.
      
      Both device types and individual attributes can be probed without
      creating the device or getting/setting the attribute, so there is no
      need to manage enumerated capabilities separately.
      Signed-off-by: Scott Wood <scottwood@freescale.com>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      852b6d57
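
The mechanism described in this commit message is exposed to userspace through `<linux/kvm.h>` as the `KVM_CREATE_DEVICE` ioctl on a VM file descriptor and the `KVM_HAS_DEVICE_ATTR` / `KVM_SET_DEVICE_ATTR` / `KVM_GET_DEVICE_ATTR` ioctls on the device handle. The following is a minimal sketch of how a caller might use it, not code from the commit. The helper names are hypothetical, the handle is assumed to come back as a file descriptor in the `fd` field (as in mainline headers), and the attribute payload is assumed to be a single 64-bit value, which depends on the device.

```c
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Probe whether the VM behind vm_fd can create a device of the given type
 * without actually creating it. Returns 1 if supported, 0 otherwise.
 */
int kvm_device_type_supported(int vm_fd, uint32_t type)
{
    struct kvm_create_device cd;

    memset(&cd, 0, sizeof(cd));
    cd.type = type;
    cd.flags = KVM_CREATE_DEVICE_TEST;  /* test only, nothing is created */
    return ioctl(vm_fd, KVM_CREATE_DEVICE, &cd) == 0;
}

/* Create the device and return its handle, or -1 with errno set. */
int kvm_device_create(int vm_fd, uint32_t type)
{
    struct kvm_create_device cd;

    memset(&cd, 0, sizeof(cd));
    cd.type = type;
    if (ioctl(vm_fd, KVM_CREATE_DEVICE, &cd) < 0)
        return -1;
    return (int)cd.fd;
}

/*
 * Set one attribute whose payload is assumed to be a 64-bit value.
 * Returns 0 on success, -1 with errno set.
 */
int kvm_device_set_u64(int dev_fd, uint32_t group, uint64_t attr,
                       uint64_t value)
{
    struct kvm_device_attr da;

    memset(&da, 0, sizeof(da));
    da.group = group;
    da.attr = attr;
    da.addr = (uint64_t)(uintptr_t)&value;  /* userspace pointer to payload */

    /* Check that the attribute exists before trying to write it. */
    if (ioctl(dev_fd, KVM_HAS_DEVICE_ATTR, &da) < 0)
        return -1;
    return ioctl(dev_fd, KVM_SET_DEVICE_ATTR, &da);
}
```

A caller would typically obtain `vm_fd` from `KVM_CREATE_VM`, call `kvm_device_type_supported()` first, and fall back to another configuration path when it returns 0.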
    • KVM: Move irqfd resample cap handling to generic code · 7df35f54
      Alexander Graf committed
      Now that we have most irqfd code completely platform agnostic, let's move
      irqfd's resample capability return to generic code as well.
      Signed-off-by: Alexander Graf <agraf@suse.de>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      7df35f54
    • KVM: Move irq routing to generic code · aa8d5944
      Alexander Graf committed
      The IRQ routing set ioctl lives in the hacky device assignment code inside
      of KVM today. This is definitely the wrong place for it. Move it to the much
      more natural kvm_main.c.
      Signed-off-by: Alexander Graf <agraf@suse.de>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      aa8d5944
    • KVM: Introduce CONFIG_HAVE_KVM_IRQ_ROUTING · a725d56a
      Alexander Graf committed
      Quite a bit of code in KVM has been conditionalized on availability of
      IOAPIC emulation. However, most of it is generically applicable to
      platforms that don't have an IOAPIC, but a different type of irq chip.
      
      Make code that relies only on IRQ routing, not on an APIC itself,
      conditional on CONFIG_HAVE_KVM_IRQ_ROUTING, so that we can reuse it later.
      Signed-off-by: Alexander Graf <agraf@suse.de>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      a725d56a
  2. 17 Apr, 2013 2 commits
  3. 16 Apr, 2013 1 commit
  4. 08 Apr, 2013 2 commits
  5. 11 Mar, 2013 2 commits
  6. 06 Mar, 2013 1 commit
  7. 05 Mar, 2013 5 commits
  8. 11 Feb, 2013 1 commit
  9. 05 Feb, 2013 2 commits
    • KVM: set_memory_region: Disallow changing read-only attribute later · 75d61fbc
      Takuya Yoshikawa committed
      As Xiao pointed out, changing the read-only attribute of an existing
      slot has a few problems:
       - kvm_arch_commit_memory_region() write protects the memory slot only
         for GET_DIRTY_LOG when modifying the flags.
       - FNAME(sync_page) uses the old spte value to set a new one without
         checking KVM_MEM_READONLY flag.
      
      Since we flush all shadow pages when creating a new slot, the simplest
      fix is to disallow such problematic flag changes: this is safe because
      no one is doing such things.
      Reviewed-by: Gleb Natapov <gleb@redhat.com>
      Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
      Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
      Cc: Alex Williamson <alex.williamson@redhat.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      75d61fbc
    • KVM: set_memory_region: Identify the requested change explicitly · f64c0398
      Takuya Yoshikawa committed
      KVM_SET_USER_MEMORY_REGION forces __kvm_set_memory_region() to identify
      what kind of change is being requested by checking the arguments.  The
      current code does this checking at various points in code and each
      condition being used there is not easy to understand at first glance.
      
      This patch consolidates these checks and introduces an enum to name the
      possible changes to clean up the code.
      
      Although this does not introduce any functional changes, there is one
      change which optimizes the code a bit: if we have nothing to change, the
      new code returns 0 immediately.
      
      Note that the return value for this case cannot be changed since QEMU
      relies on it: we noticed this when we changed it to -EINVAL and got a
      section mismatch error at the final stage of live migration.
      Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      f64c0398
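
The consolidation described in this commit message, naming the requested change with an enum and returning early when nothing changes, can be illustrated with a simplified userspace model. This is a sketch under stated assumptions, not the kernel code: `struct slot`, `enum mr_change`, its constants, and `classify_change()` are hypothetical names, and `MR_NONE` / `MR_INVALID` stand in for the "return 0 immediately" and "reject the request" outcomes.

```c
#include <stdint.h>

/* Hypothetical names for the kinds of change a request can ask for. */
enum mr_change {
    MR_CREATE,      /* slot did not exist, request has pages */
    MR_DELETE,      /* slot existed, request has zero pages */
    MR_MOVE,        /* same size and backing, different guest base */
    MR_FLAGS_ONLY,  /* only the flags differ */
    MR_NONE,        /* nothing to change: caller returns 0 immediately */
    MR_INVALID      /* unsupported combination: caller rejects it */
};

/* Simplified stand-in for a memory slot or a requested region. */
struct slot {
    uint64_t base_gfn;        /* first guest frame number */
    uint64_t npages;          /* 0 means "no slot" / "delete" */
    uint64_t userspace_addr;  /* host virtual address of the backing */
    uint32_t flags;
};

/* Decide in one place what kind of change is being requested. */
enum mr_change classify_change(const struct slot *old, const struct slot *req)
{
    if (req->npages) {
        if (!old->npages)
            return MR_CREATE;
        /* Modifying an existing slot: size and backing must not change. */
        if (req->userspace_addr != old->userspace_addr ||
            req->npages != old->npages)
            return MR_INVALID;
        if (req->base_gfn != old->base_gfn)
            return MR_MOVE;
        if (req->flags != old->flags)
            return MR_FLAGS_ONLY;
        return MR_NONE;
    }
    /* Zero pages: delete an existing slot, or reject a non-existent one. */
    return old->npages ? MR_DELETE : MR_INVALID;
}
```

With the decision made once up front, the rest of the function can switch on the result instead of re-deriving it from the arguments at several points.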
  10. 29 Jan, 2013 2 commits
    • kvm: Handle yield_to failure return code for potential undercommit case · c45c528e
      Raghavendra K T committed
      yield_to returns -ESRCH when the run queue length of both the source
      and the target of yield_to is one. When we see three successive
      failures of yield_to, we assume we are in a potential undercommit case
      and abort from the PLE handler.
      The assumption is backed by the low probability of a wrong decision
      even in worst-case scenarios such as an average runqueue length
      between 1 and 2.
      
      More detail on rationale behind using three tries:
      if p is the probability of finding rq length one on a particular cpu,
      and if we do n tries, then probability of exiting ple handler is:
      
       p^(n+1) [ because we would have come across one source with rq length
      1 and n target cpu rqs  with length 1 ]
      
      so
      num tries:         probability of aborting ple handler (1.5x overcommit)
       1                 1/4
       2                 1/8
       3                 1/16
      
      More tries would lower this probability further, but at the cost of
      extra overhead.
      Also, having tried three times means we have already iterated over
      three good eligible vcpus along with many non-eligible candidates. In
      the worst case, if we iterate over all the vcpus, we reduce 1x
      performance and overcommit performance takes a hit.
      
      Note that we do not update the last boosted vcpu in failure cases.
      Thanks to Avi for raising the question of aborting after the first
      failure of yield_to.
      Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
      Tested-by: Chegu Vinod <chegu_vinod@hp.com>
      Signed-off-by: Gleb Natapov <gleb@redhat.com>
      c45c528e
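
The rationale in this commit message can be checked with a small model. The sketch below computes the p^(n+1) abort probability and simulates a candidate scan that gives up after a fixed number of -ESRCH results. It is an illustration under assumptions, not the kernel's PLE handler: `ple_abort_probability()` and `ple_scan()` are hypothetical names, and the array of precomputed results stands in for calling yield_to on each candidate vcpu (positive means the yield succeeded, 0 means the candidate was skipped, negative means -ESRCH).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Probability of aborting after `tries` failed attempts, where p is the
 * probability that a given cpu has runqueue length one: one factor for
 * the source runqueue and one for each target tried, i.e. p^(tries + 1).
 */
double ple_abort_probability(double p, int tries)
{
    double result = p;  /* the source runqueue has length one */

    for (int i = 0; i < tries; i++)
        result *= p;    /* each target tried also has length one */
    return result;
}

/*
 * Walk the candidates in order. Returns the index of the first candidate
 * that was successfully boosted, or -1 if the scan ended without a boost,
 * either by running out of candidates or by hitting max_failures -ESRCH
 * results. *attempts reports how many candidates were examined.
 */
int ple_scan(const int *yield_results, size_t n, int max_failures,
             size_t *attempts)
{
    int failures_left = max_failures;

    assert(max_failures > 0);
    *attempts = 0;
    for (size_t i = 0; i < n; i++) {
        int yielded = yield_results[i];

        (*attempts)++;
        if (yielded > 0)
            return (int)i;  /* boosted: stop here */
        if (yielded < 0 && --failures_left == 0)
            break;          /* likely undercommit: abort the scan */
    }
    return -1;
}
```

With p = 1/2, `ple_abort_probability()` reproduces the 1/4, 1/8, 1/16 column of the table in the commit message for one, two, and three tries.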
    • x86, apicv: add virtual interrupt delivery support · c7c9c56c
      Yang Zhang committed
      Virtual interrupt delivery spares KVM from injecting vAPIC interrupts
      manually; injection is fully taken care of by the hardware. This
      requires some special handling in the existing interrupt injection path:
      
      - For a pending interrupt, instead of injecting it directly, we may need
        to update architecture-specific indicators before resuming the guest.
      
      - A pending interrupt that is masked by ISR should also be considered
        in the above update, since the hardware will decide when to inject
        it at the right time. The current has_interrupt and get_interrupt
        only return a valid vector from the injection point of view.
      Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: Kevin Tian <kevin.tian@intel.com>
      Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
      Signed-off-by: Gleb Natapov <gleb@redhat.com>
      c7c9c56c
  11. 27 Jan, 2013 1 commit
  12. 17 Jan, 2013 3 commits
  13. 14 Jan, 2013 1 commit
  14. 24 Dec, 2012 1 commit
  15. 23 Dec, 2012 1 commit
  16. 14 Dec, 2012 7 commits
  17. 30 Nov, 2012 1 commit
    • KVM: Fix user memslot overlap check · 5419369e
      Alex Williamson committed
      Prior to memory slot sorting this loop compared all of the user memory
      slots for overlap with new entries.  With memory slot sorting, we're
      just checking some number of entries in the array that may or may not
      be user slots.  Instead, walk all the slots with kvm_for_each_memslot,
      which has the added benefit of terminating early when we hit the first
      empty slot, and skip comparison to private slots.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      5419369e
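
The walk described in this commit message can be sketched as a simplified userspace model: visit the slots in array order, stop at the first empty slot, skip private slots, and compare only user slots for overlap. This is an illustration, not the kernel code. `struct memslot`, `USER_SLOTS`, and `overlaps_user_slot()` are hypothetical, and the sketch assumes that private slots are the ones whose id is at or above `USER_SLOTS` and that the array is sorted so that empty slots come last.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define USER_SLOTS 32  /* hypothetical: ids below this are user slots */

struct memslot {
    int id;
    uint64_t base_gfn;  /* first guest frame number */
    uint64_t npages;    /* 0 marks an empty slot */
};

/*
 * Return true if new_slot overlaps any existing user slot other than the
 * one it replaces (same id). Ranges are half-open: [base, base + npages).
 */
bool overlaps_user_slot(const struct memslot *slots, size_t nslots,
                        const struct memslot *new_slot)
{
    uint64_t new_end = new_slot->base_gfn + new_slot->npages;

    for (size_t i = 0; i < nslots; i++) {
        const struct memslot *s = &slots[i];

        if (s->npages == 0)
            break;      /* sorted array: the first empty slot ends the walk */
        if (s->id >= USER_SLOTS || s->id == new_slot->id)
            continue;   /* skip private slots and the slot being replaced */
        if (new_end > s->base_gfn &&
            new_slot->base_gfn < s->base_gfn + s->npages)
            return true;
    }
    return false;
}
```

Keying the skip on the slot id rather than on the array position is the point of the fix: after sorting, position in the array no longer says whether a slot is a user slot.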
  18. 28 Nov, 2012 2 commits
    • KVM: x86: add kvm_arch_vcpu_postcreate callback, move TSC initialization · 42897d86
      Marcelo Tosatti committed
      TSC initialization will soon make use of online_vcpus.
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      42897d86
    • KVM: x86: implement PVCLOCK_TSC_STABLE_BIT pvclock flag · d828199e
      Marcelo Tosatti committed
      KVM added a global variable to guarantee monotonicity in the guest.
      One of the reasons for that is that the time between
      
      	1. ktime_get_ts(&timespec);
      	2. rdtscll(tsc);
      
      is variable. That is, given a host with stable TSC, suppose that
      two VCPUs read the same time via ktime_get_ts() above.
      
      The time required to execute 2. is not the same on those two instances
      executing in different VCPUS (cache misses, interrupts...).
      
      If the TSC value that is used by the host to interpolate when
      calculating the monotonic time is the same value used to calculate
      the tsc_timestamp value stored in the pvclock data structure, and
      a single <system_timestamp, tsc_timestamp> tuple is visible to all
      vcpus simultaneously, this problem disappears. See comment on top
      of pvclock_update_vm_gtod_copy for details.
      
      Monotonicity is then guaranteed by synchronicity of the host TSCs
      and guest TSCs.
      
      Set TSC stable pvclock flag in that case, allowing the guest to read
      clock from userspace.
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      d828199e
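
To make the argument in this commit message concrete, the sketch below shows how a guest might turn a raw TSC reading into nanoseconds from a single shared <system_timestamp, tsc_timestamp> tuple. It is a simplified model, not the kernel's pvclock code: `struct pvclock_snapshot` and `pvclock_read_ns()` are hypothetical, with field names loosely modeled on the pvclock time-info structure, and the scale factor is assumed to be a 32.32 fixed-point multiplier applied after a binary shift.

```c
#include <stdint.h>

/* Hypothetical, simplified copy of the tuple shared by all vcpus. */
struct pvclock_snapshot {
    uint64_t tsc_timestamp;   /* TSC value when the tuple was taken */
    uint64_t system_time_ns;  /* host time in ns at that same instant */
    uint32_t tsc_to_ns_mul;   /* 32.32 fixed-point ns per (shifted) tick */
    int8_t   tsc_shift;       /* binary pre-scale applied to the delta */
};

/* Convert a raw TSC reading into guest time in nanoseconds. */
uint64_t pvclock_read_ns(const struct pvclock_snapshot *s, uint64_t tsc)
{
    uint64_t delta = tsc - s->tsc_timestamp;
    uint64_t hi, lo;

    if (s->tsc_shift < 0)
        delta >>= -s->tsc_shift;
    else
        delta <<= s->tsc_shift;

    /*
     * (delta * mul) >> 32, split into halves so that the intermediate
     * product does not overflow 64 bits.
     */
    hi = delta >> 32;
    lo = delta & 0xffffffffu;
    return s->system_time_ns +
           hi * s->tsc_to_ns_mul + ((lo * s->tsc_to_ns_mul) >> 32);
}
```

Because every vcpu evaluates the same function on the same tuple, two readings differ only through their TSC values, so synchronized TSCs give readings that never go backwards across vcpus.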
  19. 14 Nov, 2012 1 commit