1. 28 1月, 2013 1 次提交
    • F
      kvm: Prepare to add generic guest entry/exit callbacks · c11f11fc
      Frederic Weisbecker 提交于
      Do some ground preparatory work before adding guest_enter()
      and guest_exit() context tracking callbacks. Those will
      be later used to read the guest cputime safely when we
      run in full dynticks mode.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Gleb Natapov <gleb@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      c11f11fc
  2. 06 12月, 2012 1 次提交
  3. 05 12月, 2012 1 次提交
  4. 28 11月, 2012 2 次提交
    • M
      KVM: x86: add kvm_arch_vcpu_postcreate callback, move TSC initialization · 42897d86
      Marcelo Tosatti 提交于
      TSC initialization will soon make use of online_vcpus.
      Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
      42897d86
    • M
      KVM: x86: implement PVCLOCK_TSC_STABLE_BIT pvclock flag · d828199e
      Marcelo Tosatti 提交于
      KVM added a global variable to guarantee monotonicity in the guest.
      One of the reasons for that is that the time between
      
      	1. ktime_get_ts(&timespec);
      	2. rdtscll(tsc);
      
      Is variable. That is, given a host with stable TSC, suppose that
      two VCPUs read the same time via ktime_get_ts() above.
      
      The time required to execute 2. is not the same on those two instances
      executing in different VCPUS (cache misses, interrupts...).
      
      If the TSC value that is used by the host to interpolate when
      calculating the monotonic time is the same value used to calculate
      the tsc_timestamp value stored in the pvclock data structure, and
      a single <system_timestamp, tsc_timestamp> tuple is visible to all
      vcpus simultaneously, this problem disappears. See comment on top
      of pvclock_update_vm_gtod_copy for details.
      
      Monotonicity is then guaranteed by synchronicity of the host TSCs
      and guest TSCs.
      
      Set TSC stable pvclock flag in that case, allowing the guest to read
      clock from userspace.
      Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
      d828199e
  5. 19 11月, 2012 1 次提交
    • F
      vtime: Remove the underscore prefix invasion · fd25b4c2
      Frederic Weisbecker 提交于
      Prepending irq-unsafe vtime APIs with underscores was actually
      a bad idea as the result is a big mess in the API namespace that
      is even waiting to be further extended. Also these helpers
      are always called from irq safe callers except kvm. Just
      provide a vtime_account_system_irqsafe() for this specific
      case so that we can remove the underscore prefix on other
      vtime functions.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Reviewed-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      fd25b4c2
  6. 01 11月, 2012 1 次提交
    • X
      KVM: x86: fix vcpu->mmio_fragments overflow · 87da7e66
      Xiao Guangrong 提交于
      After commit b3356bf0 (KVM: emulator: optimize "rep ins" handling),
      the pieces of io data can be collected and write them to the guest memory
      or MMIO together
      
      Unfortunately, kvm splits the mmio access into 8 bytes and store them to
      vcpu->mmio_fragments. If the guest uses "rep ins" to move large data, it
      will cause vcpu->mmio_fragments overflow
      
      The bug can be exposed by isapc (-M isapc):
      
      [23154.818733] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
      [ ......]
      [23154.858083] Call Trace:
      [23154.859874]  [<ffffffffa04f0e17>] kvm_get_cr8+0x1d/0x28 [kvm]
      [23154.861677]  [<ffffffffa04fa6d4>] kvm_arch_vcpu_ioctl_run+0xcda/0xe45 [kvm]
      [23154.863604]  [<ffffffffa04f5a1a>] ? kvm_arch_vcpu_load+0x17b/0x180 [kvm]
      
      Actually, we can use one mmio_fragment to store a large mmio access then
      split it when we pass the mmio-exit-info to userspace. After that, we only
      need two entries to store mmio info for the cross-mmio pages access
      Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
      Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
      87da7e66
  7. 30 10月, 2012 2 次提交
    • X
      KVM: do not treat noslot pfn as a error pfn · 81c52c56
      Xiao Guangrong 提交于
      This patch filters noslot pfn out from error pfns based on Marcelo comment:
      noslot pfn is not a error pfn
      
      After this patch,
      - is_noslot_pfn indicates that the gfn is not in slot
      - is_error_pfn indicates that the gfn is in slot but the error is occurred
        when translate the gfn to pfn
      - is_error_noslot_pfn indicates that the pfn either it is error pfns or it
        is noslot pfn
      And is_invalid_pfn can be removed, it makes the code more clean
      Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
      Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
      81c52c56
    • F
      kvm: Directly account vtime to system on guest switch · b080935c
      Frederic Weisbecker 提交于
      Switching to or from guest context is done on ioctl context.
      So by the time we call kvm_guest_enter() or kvm_guest_exit()
      we know we are not running the idle task.
      
      As a result, we can directly account the cputime using
      vtime_account_system().
      
      There are two good reasons to do this:
      
      * We avoid some useless checks on guest switch. It optimizes
      a bit this fast path.
      
      * In the case of CONFIG_IRQ_TIME_ACCOUNTING, calling vtime_account()
      checks for irq time to account. This is pointless since we know
      we are not in an irq on guest switch. This is wasting cpu cycles
      for no good reason. vtime_account_system() OTOH is a no-op in
      this config option.
      
      * We can remove the irq disable/enable around kvm guest switch in s390.
      
      A further optimization may consist in introducing a vtime_account_guest()
      that directly calls account_guest_time().
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Joerg Roedel <joerg.roedel@amd.com>
      Cc: Alexander Graf <agraf@suse.de>
      Cc: Xiantao Zhang <xiantao.zhang@intel.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      b080935c
  8. 23 10月, 2012 1 次提交
  9. 09 10月, 2012 1 次提交
  10. 06 10月, 2012 1 次提交
  11. 25 9月, 2012 1 次提交
    • F
      cputime: Use a proper subsystem naming for vtime related APIs · bf9fae9f
      Frederic Weisbecker 提交于
      Use a naming based on vtime as a prefix for virtual based
      cputime accounting APIs:
      
      - account_system_vtime() -> vtime_account()
      - account_switch_vtime() -> vtime_task_switch()
      
      It makes it easier to allow for further declension such
      as vtime_account_system(), vtime_account_idle(), ... if we
      want to find out the context we account to from generic code.
      
      This also make it better to know on which subsystem these APIs
      refer to.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      bf9fae9f
  12. 23 9月, 2012 1 次提交
    • A
      KVM: Add resampling irqfds for level triggered interrupts · 7a84428a
      Alex Williamson 提交于
      To emulate level triggered interrupts, add a resample option to
      KVM_IRQFD.  When specified, a new resamplefd is provided that notifies
      the user when the irqchip has been resampled by the VM.  This may, for
      instance, indicate an EOI.  Also in this mode, posting of an interrupt
      through an irqfd only asserts the interrupt.  On resampling, the
      interrupt is automatically de-asserted prior to user notification.
      This enables level triggered interrupts to be posted and re-enabled
      from vfio with no userspace intervention.
      
      All resampling irqfds can make use of a single irq source ID, so we
      reserve a new one for this interface.
      Signed-off-by: NAlex Williamson <alex.williamson@redhat.com>
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      7a84428a
  13. 18 9月, 2012 1 次提交
    • M
      KVM: make processes waiting on vcpu mutex killable · 9fc77441
      Michael S. Tsirkin 提交于
      vcpu mutex can be held for unlimited time so
      taking it with mutex_lock on an ioctl is wrong:
      one process could be passed a vcpu fd and
      call this ioctl on the vcpu used by another process,
      it will then be unkillable until the owner exits.
      
      Call mutex_lock_killable instead and return status.
      Note: mutex_lock_interruptible would be even nicer,
      but I am not sure all users are prepared to handle EINTR
      from these ioctls. They might misinterpret it as an error.
      
      Cleanup paths expect a vcpu that can't be used by
      any userspace so this will always succeed - catch bugs
      by calling BUG_ON.
      
      Catch callers that don't check return state by adding
      __must_check.
      Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
      9fc77441
  14. 06 9月, 2012 1 次提交
  15. 28 8月, 2012 1 次提交
  16. 22 8月, 2012 6 次提交
  17. 06 8月, 2012 9 次提交
  18. 26 7月, 2012 2 次提交
  19. 23 7月, 2012 2 次提交
  20. 20 7月, 2012 4 次提交