1. 21 Jul, 2007 1 commit
    • A
      KVM: MMU: Store nx bit for large page shadows · d55e2cb2
      Committed by Avi Kivity
      We need to distinguish between large page shadows which have the nx bit set
      and those which don't.  The problem shows up when booting a newer smp Linux
      kernel, where the trampoline page (which is in real mode, which uses the
      same shadow pages as large pages) is using the same mapping as a kernel data
      page, which is mapped using nx, causing kvm to spin on that page.
      Signed-off-by: Avi Kivity <avi@qumranet.com>
  2. 16 Jul, 2007 19 commits
  3. 15 Jun, 2007 1 commit
  4. 22 May, 2007 1 commit
    • A
      Detach sched.h from mm.h · e8edc6e0
      Committed by Alexey Dobriyan
      The first thing mm.h does is include sched.h, solely for the can_do_mlock()
      inline function, which dereferences "current". By dealing with can_do_mlock(),
      mm.h can be detached from sched.h, which is good. See below for why.
      
      This patch
      a) removes unconditional inclusion of sched.h from mm.h
      b) makes can_do_mlock() a normal function in mm/mlock.c
      c) exports can_do_mlock() to not break compilation
      d) adds sched.h inclusions back to files that were getting it indirectly.
      e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were
         getting them indirectly
      
      Net result is:
      a) mm.h users would get less code to open, read, preprocess, parse, ... if
         they don't need sched.h
      b) sched.h stops being a dependency for a significant number of files:
         on x86_64 allmodconfig, touching sched.h results in a recompile of 4083 files;
         after the patch it's only 3744 (-8.3%).
      
      Cross-compile tested on
      
      	all arm defconfigs, all mips defconfigs, all powerpc defconfigs,
      	alpha alpha-up
      	arm
      	i386 i386-up i386-defconfig i386-allnoconfig
      	ia64 ia64-up
      	m68k
      	mips
      	parisc parisc-up
      	powerpc powerpc-up
      	s390 s390-up
      	sparc sparc-up
      	sparc64 sparc64-up
      	um-x86_64
      	x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig
      
      as well as my two usual configs.
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  5. 03 May, 2007 17 commits
    • A
      KVM: Remove extraneous guest entry on mmio read · e7df56e4
      Committed by Avi Kivity
      When emulating an mmio read, we actually emulate twice: once to determine
      the physical address of the mmio, and, after we've exited to userspace to
      get the mmio value, we emulate again to place the value in the result
      register and update any flags.
      
      But we don't really need to enter the guest again for that, only to take
      an immediate vmexit.  So, if we detect that we're doing an mmio read,
      emulate a single instruction before entering the guest again.
      Signed-off-by: Avi Kivity <avi@qumranet.com>
    • A
      KVM: VMX: Properly shadow the CR0 register in the vcpu struct · 25c4c276
      Committed by Anthony Liguori
      Set all of the host mask bits for CR0 so that we can maintain a proper
      shadow of CR0.  This exposes CR0.TS, paving the way for lazy fpu handling.
      Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
      Signed-off-by: Avi Kivity <avi@qumranet.com>
    • A
      KVM: Lazy FPU support for SVM · 7807fa6c
      Committed by Anthony Liguori
      Avoid saving and restoring the guest fpu state on every exit.  This
      shaves ~100 cycles off the guest/host switch.
      Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
      Signed-off-by: Avi Kivity <avi@qumranet.com>
    • A
      KVM: Per-vcpu statistics · 1165f5fe
      Committed by Avi Kivity
      Make the exit statistics per-vcpu instead of global.  This gives a 3.5%
      boost when running one virtual machine per core on my two socket dual core
      (4 cores total) machine.
      Signed-off-by: Avi Kivity <avi@qumranet.com>
    • A
      KVM: Use slab caches to allocate mmu data structures · b5a33a75
      Committed by Avi Kivity
      Better leak detection, statistics, memory use, speed -- goodness all
      around.
      Signed-off-by: Avi Kivity <avi@qumranet.com>
    • A
      KVM: Add physical memory aliasing feature · e8207547
      Committed by Avi Kivity
      With this, we can specify that accesses to one physical memory range will
      be remapped to another.  This is useful for the vga window at 0xa0000 which
      is used as a movable window into the (much larger) framebuffer.
      Signed-off-by: Avi Kivity <avi@qumranet.com>
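The remapping described in this commit can be pictured as a small translation table consulted before any other lookup. The following is a minimal sketch only: the `mem_alias` structure, the `unalias_gfn` name, and the target frame used in the test are hypothetical and are not taken from the kvm source.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical alias record: a run of guest frames redirected elsewhere. */
struct mem_alias {
    uint64_t base_gfn;   /* first guest frame number of the aliased window */
    uint64_t npages;     /* length of the window in pages */
    uint64_t target_gfn; /* first guest frame number the window maps to */
};

/*
 * Translate a guest frame number through the alias table.
 * Frames outside every window are returned unchanged.
 */
static uint64_t unalias_gfn(const struct mem_alias *aliases, size_t n,
                            uint64_t gfn)
{
    for (size_t i = 0; i < n; i++) {
        const struct mem_alias *a = &aliases[i];

        if (gfn >= a->base_gfn && gfn - a->base_gfn < a->npages)
            return a->target_gfn + (gfn - a->base_gfn);
    }
    return gfn;
}
```

For the vga case, a window starting at frame 0xa0 (address 0xa0000) could be pointed at whichever part of the framebuffer is currently selected by rewriting `target_gfn`.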
    • A
      KVM: Simply gfn_to_page() · 954bbbc2
      Committed by Avi Kivity
      Mapping a guest page to a host page is a common operation.  Currently,
      one has first to find the memory slot where the page belongs (gfn_to_memslot),
      then locate the page itself (gfn_to_page()).
      
      This is clumsy, and also won't work well with memory aliases.  So simplify
      gfn_to_page() not to require memory slot translation first, and instead do it
      internally.
      Signed-off-by: Avi Kivity <avi@qumranet.com>
    • D
      KVM: Add mmu cache clear function · e0fa826f
      Committed by Dor Laor
      Functions that play around with the physical memory map
      need a way to clear mappings to possibly nonexistent or
      invalid memory.  Both the mmu cache and the processor tlb
      are cleared.
      Signed-off-by: Dor Laor <dor.laor@qumranet.com>
      Signed-off-by: Avi Kivity <avi@qumranet.com>
    • A
      KVM: SVM: Ensure timestamp counter monotonicity · 0cc5064d
      Committed by Avi Kivity
      When a vcpu is migrated from one cpu to another, its timestamp counter
      may lose its monotonic property if the host has unsynced timestamp counters.
      This can confuse the guest, sometimes to the point of refusing to boot.
      
      As the rdtsc instruction is rather fast on AMD processors (7-10 cycles),
      we can simply record the last host tsc when we drop the cpu, and adjust
      the vcpu tsc offset when we detect that we've migrated to a different cpu.
      Signed-off-by: Avi Kivity <avi@qumranet.com>
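The record-and-adjust scheme from this commit message can be sketched as follows. This is an illustration under stated assumptions rather than the actual SVM code: the `vcpu_tsc` structure and the `vcpu_put`/`vcpu_load` helpers are hypothetical names, and the host tsc is passed in as a parameter instead of being read with rdtsc.

```c
#include <stdint.h>

/* Hypothetical per-vcpu state; field names are illustrative only. */
struct vcpu_tsc {
    int      last_cpu;      /* host cpu the vcpu last ran on, -1 if none */
    uint64_t last_host_tsc; /* host tsc sampled when the vcpu left that cpu */
    uint64_t tsc_offset;    /* guest tsc = host tsc + tsc_offset (mod 2^64) */
};

/* Value the guest would observe for a given host tsc reading. */
static uint64_t guest_tsc(const struct vcpu_tsc *v, uint64_t host_tsc)
{
    return host_tsc + v->tsc_offset;
}

/* Called when the vcpu stops running on a host cpu: record the host tsc. */
static void vcpu_put(struct vcpu_tsc *v, int cpu, uint64_t host_tsc)
{
    v->last_cpu = cpu;
    v->last_host_tsc = host_tsc;
}

/* Called when the vcpu starts running on a host cpu. */
static void vcpu_load(struct vcpu_tsc *v, int cpu, uint64_t host_tsc)
{
    if (v->last_cpu != -1 && v->last_cpu != cpu) {
        /* Shift the offset by the gap between the two hosts' counters. */
        v->tsc_offset += v->last_host_tsc - host_tsc;
    }
    v->last_cpu = cpu;
}
```

With this adjustment the guest resumes at the value it last saw, even if the new host cpu's counter is far behind the old one.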
    • A
      KVM: MMU: Fix hugepage pdes mapping same physical address with different access · d28c6cfb
      Committed by Avi Kivity
      The kvm mmu keeps a shadow page for hugepage pdes; if several such pdes map
      the same physical address, they share the same shadow page.  This is a fairly
      common case (kernel mappings on i386 nonpae Linux, for example).
      
      However, if the two pdes map the same memory but with different permissions, kvm
      will happily use the cached shadow page.  If the access through the more
      permissive pde occurs after the access through the strict pde, an endless pagefault
      loop is generated and the guest makes no progress.
      
      Fix by making the access permissions part of the cache lookup key.
      
      The fix allows Xen pae to boot on kvm and run guest domains.
      
      Thanks to Jeremy Fitzhardinge for reporting the bug and testing the fix.
      Signed-off-by: Avi Kivity <avi@qumranet.com>
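Making the access permissions part of the cache lookup key might look roughly like the sketch below. All names here (`shadow_key`, `shadow_get`, the `ACC_*` bits, the fixed-size cache) are hypothetical stand-ins chosen for illustration; the real kvm mmu uses its own structures and a hash table.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ACC_WRITE 0x1u /* hypothetical permission bits */
#define ACC_USER  0x2u

/* Hypothetical lookup key for a cached shadow page. */
struct shadow_key {
    uint64_t gfn;    /* guest frame number the hugepage pde maps */
    uint8_t  access; /* permission bits of the guest pde */
};

struct shadow_page {
    struct shadow_key key;
    bool in_use;
};

#define CACHE_SLOTS 8
static struct shadow_page cache[CACHE_SLOTS];

/* Two keys match only if both the frame and the permissions agree. */
static bool key_equal(struct shadow_key a, struct shadow_key b)
{
    return a.gfn == b.gfn && a.access == b.access;
}

/* Return the cached shadow page for key, claiming a free slot on a miss. */
static struct shadow_page *shadow_get(struct shadow_key key)
{
    struct shadow_page *free_slot = NULL;

    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        if (cache[i].in_use && key_equal(cache[i].key, key))
            return &cache[i];
        if (!cache[i].in_use && free_slot == NULL)
            free_slot = &cache[i];
    }
    if (free_slot != NULL) {
        free_slot->key = key;
        free_slot->in_use = true;
    }
    return free_slot; /* NULL when the cache is full */
}
```

Because the permissions take part in the comparison, a read-only pde and a writable pde that map the same frame no longer share a shadow page.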
    • A
      KVM: Remove set_cr0_no_modeswitch() arch op · f6528b03
      Committed by Avi Kivity
      set_cr0_no_modeswitch() was a hack to avoid corrupting segment registers.
      As we now cache the protected mode values on entry to real mode, this
      isn't an issue anymore, and it interferes with reboot (which usually _is_
      a modeswitch).
      Signed-off-by: Avi Kivity <avi@qumranet.com>
    • A
      KVM: MMU: Remove global pte tracking · aac01224
      Committed by Avi Kivity
      The initial, noncaching, version of the kvm mmu flushed all nonglobal
      shadow page table translations (much like a native tlb flush).  The new
      implementation flushes translations only when they change, rendering global
      pte tracking superfluous.
      
      This removes the unused tracking mechanism and storage space.
      Signed-off-by: Avi Kivity <avi@qumranet.com>
    • A
      KVM: Avoid guest virtual addresses in string pio userspace interface · 039576c0
      Committed by Avi Kivity
      The current string pio interface communicates using guest virtual addresses,
      relying on userspace to translate addresses and to check permissions.  This
      interface cannot fully support guest smp, as the check needs to take into
      account two pages at once in case an unaligned string transfer straddles a
      page boundary.
      
      Change the interface not to communicate guest addresses at all; instead use
      a buffer page (mmaped by userspace) and do transfers there.  The kernel
      manages the virtual to physical translation and can perform the checks
      atomically by taking the appropriate locks.
      Signed-off-by: Avi Kivity <avi@qumranet.com>
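The page-straddling case mentioned in this commit message comes down to simple address arithmetic. The sketch below assumes 4 KiB pages; the `GUEST_PAGE_SIZE` constant and both helper names are hypothetical and only illustrate why a single transfer may need two pages checked together.

```c
#include <stdbool.h>
#include <stdint.h>

#define GUEST_PAGE_SIZE 4096u /* assumed page size for this sketch */

/* True when [addr, addr + len) touches more than one page. */
static bool straddles_page(uint64_t addr, uint64_t len)
{
    if (len == 0)
        return false;
    return (addr / GUEST_PAGE_SIZE) != ((addr + len - 1) / GUEST_PAGE_SIZE);
}

/* Number of bytes of the transfer that fall in the first page. */
static uint64_t bytes_in_first_page(uint64_t addr, uint64_t len)
{
    uint64_t room = GUEST_PAGE_SIZE - (addr % GUEST_PAGE_SIZE);

    return len < room ? len : room;
}
```

A 4-byte transfer starting at 0xffe, for instance, puts 2 bytes in one page and 2 in the next, so both translations have to be validated under the same lock.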
    • A
      KVM: Add guest mode signal mask · 1961d276
      Committed by Avi Kivity
      Allow a special signal mask to be used while executing in guest mode.  This
      allows signals to be used to interrupt a vcpu without requiring signal
      delivery to a userspace handler, which is quite expensive.  Userspace still
      receives -EINTR and can get the signal via sigwait().
      Signed-off-by: Avi Kivity <avi@qumranet.com>
    • A
      KVM: Handle cpuid in the kernel instead of punting to userspace · 06465c5a
      Committed by Avi Kivity
      KVM used to handle cpuid by letting userspace decide what values to
      return to the guest.  We now handle cpuid completely in the kernel.  We
      still let userspace decide which values the guest will see by having
      userspace set up the value table beforehand (this is necessary to allow
      management software to set the cpu features to the least common denominator,
      so that live migration can work).
      
      The motivation for the change is that kvm kernel code can be impacted by
      cpuid features, for example the x86 emulator.
      Signed-off-by: Avi Kivity <avi@qumranet.com>
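A value table that is set up beforehand and consulted on each cpuid can be sketched as a simple array lookup. The `cpuid_entry` layout and both function names are assumptions made for this example, not the kvm ABI; `cpuid_common` shows one way management software might compute a least common denominator of two hosts' feature bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: register values returned for one cpuid leaf. */
struct cpuid_entry {
    uint32_t function; /* value of eax on entry to cpuid */
    uint32_t eax, ebx, ecx, edx;
};

/* Find the entry for a leaf, or NULL if the table has no such leaf. */
static const struct cpuid_entry *cpuid_find(const struct cpuid_entry *table,
                                            size_t n, uint32_t function)
{
    for (size_t i = 0; i < n; i++) {
        if (table[i].function == function)
            return &table[i];
    }
    return NULL;
}

/* Keep only the ecx/edx feature bits that both hosts report. */
static struct cpuid_entry cpuid_common(struct cpuid_entry a,
                                       struct cpuid_entry b)
{
    struct cpuid_entry out = a;

    out.ecx = a.ecx & b.ecx;
    out.edx = a.edx & b.edx;
    return out;
}
```

How a missing leaf should be handled is left open here; the sketch simply reports it to the caller.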
    • A
      KVM: Do not communicate to userspace through cpu registers during PIO · 46fc1477
      Committed by Avi Kivity
      Currently when passing a PIO emulation request to userspace, we
      rely on userspace updating %rax (on 'in' instructions) and %rsi/%rdi/%rcx
      (on string instructions).  This (a) requires two extra ioctls for getting
      and setting the registers and (b) is unfriendly to non-x86 archs, when
      they get kvm ports.
      
      So fix by doing the register fixups in the kernel and passing to userspace
      only an abstract description of the PIO to be done.
      Signed-off-by: Avi Kivity <avi@qumranet.com>
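One way to picture the split between an abstract PIO description and the kernel-side register fixup is shown below. The `pio_in` structure and `pio_in_fixup` are hypothetical and do not mirror the real interface; the sketch also simplifies x86 semantics by merging only the low bytes into rax, ignoring the zero-extension that a 32-bit register write performs in 64-bit mode.

```c
#include <stdint.h>

/* Hypothetical abstract description of one 'in' request. */
struct pio_in {
    uint16_t port;    /* I/O port being read */
    uint8_t  size;    /* access width in bytes: 1, 2 or 4 */
    uint8_t  data[4]; /* bytes supplied by userspace, least significant first */
};

/*
 * Kernel-side fixup: merge the low `size` bytes of the result into rax,
 * leaving the remaining bytes untouched. Userspace never sees rax.
 */
static uint64_t pio_in_fixup(uint64_t rax, const struct pio_in *req)
{
    uint64_t value = 0;
    uint64_t mask = 0;

    for (unsigned i = 0; i < req->size && i < sizeof(req->data); i++) {
        value |= (uint64_t)req->data[i] << (8 * i);
        mask |= (uint64_t)0xff << (8 * i);
    }
    return (rax & ~mask) | value;
}
```

Userspace only fills in `data`; no register get/set ioctls are needed, and the description carries nothing x86-specific beyond the port number.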
    • A
      KVM: Use a shared page for kernel/user communication when runing a vcpu · 9a2bb7f4
      Committed by Avi Kivity
      Instead of passing a 'struct kvm_run' back and forth between the kernel and
      userspace, allocate a page and allow the user to mmap() it.  This reduces
      needless copying and makes the interface expandable by providing lots of
      free space.
      Signed-off-by: Avi Kivity <avi@qumranet.com>
  6. 04 Mar, 2007 1 commit
    • A
      KVM: Per-vcpu inodes · bccf2150
      Committed by Avi Kivity
      Allocate a distinct inode for every vcpu in a VM.  This has the following
      benefits:
      
       - the filp cachelines are no longer bounced when f_count is incremented on
         every ioctl()
       - the API and internal code are distinctly clearer; for example, on the
         KVM_GET_REGS ioctl, there is no need to copy the vcpu number from
         userspace and then copy the registers back; the vcpu identity is derived
         from the fd used to make the call
      
      Right now the performance benefits are completely theoretical since (a) we
      don't support more than one vcpu per VM and (b) virtualization hardware
      inefficiencies completely overwhelm any cacheline bouncing effects.  But
      both of these will change, and we need to prepare the API today.
      Signed-off-by: Avi Kivity <avi@qumranet.com>