1. 08 4月, 2015 14 次提交
    • N
      KVM: x86: DR0-DR3 are not clear on reset · ae561ede
      Nadav Amit 提交于
      DR0-DR3 are not cleared as they should during reset and when they are set from
      userspace.  It appears to be caused by c77fb5fe ("KVM: x86: Allow the guest
      to run with dirty debug registers").
      
      Force their reload on these situations.
      Signed-off-by: NNadav Amit <namit@cs.technion.ac.il>
      Message-Id: <1427933438-12782-4-git-send-email-namit@cs.technion.ac.il>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      ae561ede
    • N
      KVM: x86: BSP in MSR_IA32_APICBASE is writable · 58d269d8
      Nadav Amit 提交于
      After reset, the CPU can change the BSP, which will be used upon INIT.  Reset
      should return the BSP which QEMU asked for, and therefore handled accordingly.
      
      To quote: "If the MP protocol has completed and a BSP is chosen, subsequent
      INITs (either to a specific processor or system wide) do not cause the MP
      protocol to be repeated."
      [Intel SDM 8.4.2: MP Initialization Protocol Requirements and Restrictions]
      Signed-off-by: NNadav Amit <namit@cs.technion.ac.il>
      Message-Id: <1427933438-12782-3-git-send-email-namit@cs.technion.ac.il>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      58d269d8
    • R
      KVM: x86: simplify kvm_apic_map · 3b5a5ffa
      Radim Krčmář 提交于
      recalculate_apic_map() uses two passes over all VCPUs.  This is a relic
      from time when we selected a global mode in the first pass and set up
      the optimized table in the second pass (to have a consistent mode).
      
      Recent changes made mixed mode unoptimized and we can do it in one pass.
      Format of logical MDA is a function of the mode, so we encode it in
      apic_logical_id() and drop obsoleted variables from the struct.
      Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
      Message-Id: <1423766494-26150-5-git-send-email-rkrcmar@redhat.com>
      [Add lid_bits temporary in apic_logical_id. - Paolo]
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      3b5a5ffa
    • R
      KVM: x86: avoid logical_map when it is invalid · 3548a259
      Radim Krčmář 提交于
      We want to support mixed modes and the easiest solution is to avoid
      optimizing those weird and unlikely scenarios.
      Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
      Message-Id: <1423766494-26150-4-git-send-email-rkrcmar@redhat.com>
      [Add comment above KVM_APIC_MODE_* defines. - Paolo]
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      3548a259
    • R
      KVM: x86: fix mixed APIC mode broadcast · 9ea369b0
      Radim Krčmář 提交于
      Broadcast allowed only one global APIC mode, but mixed modes are
      theoretically possible.  x2APIC IPI doesn't mean 0xff as broadcast,
      the rest does.
      
      x2APIC broadcasts are accepted by xAPIC.  If we take SDM to be logical,
      even addreses beginning with 0xff should be accepted, but real hardware
      disagrees.  This patch aims for simple code by considering most of real
      behavior as undefined.
      Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
      Message-Id: <1423766494-26150-3-git-send-email-rkrcmar@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      9ea369b0
    • R
      KVM: x86: use MDA for interrupt matching · 03d2249e
      Radim Krčmář 提交于
      In mixed modes, we musn't deliver xAPIC IPIs like x2APIC and vice versa.
      Instead of preserving the information in apic_send_ipi(), we regain it
      by converting all destinations into correct MDA in the slow path.
      This allows easier reasoning about subsequent matching.
      
      Our kvm_apic_broadcast() had an interesting design decision: it didn't
      consider IOxAPIC 0xff as broadcast in x2APIC mode ...
      everything worked because IOxAPIC can't set that in physical mode and
      logical mode considered it as a message for first 8 VCPUs.
      This patch interprets IOxAPIC 0xff as x2APIC broadcast.
      Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
      Message-Id: <1423766494-26150-2-git-send-email-rkrcmar@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      03d2249e
    • E
      KVM: nVMX: remove unnecessary double caching of MAXPHYADDR · 92d71bc6
      Eugene Korenevsky 提交于
      After speed-up of cpuid_maxphyaddr() it can be called frequently:
      instead of heavyweight enumeration of CPUID entries it returns a cached
      pre-computed value. It is also inlined now. So caching its result became
      unnecessary and can be removed.
      Signed-off-by: NEugene Korenevsky <ekorenevsky@gmail.com>
      Message-Id: <20150329205644.GA1258@gnote>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      92d71bc6
    • E
      KVM: nVMX: checks for address bits beyond MAXPHYADDR on VM-entry · 9090422f
      Eugene Korenevsky 提交于
      On each VM-entry CPU should check the following VMCS fields for zero bits
      beyond physical address width:
      -  APIC-access address
      -  virtual-APIC address
      -  posted-interrupt descriptor address
      This patch adds these checks required by Intel SDM.
      Signed-off-by: NEugene Korenevsky <ekorenevsky@gmail.com>
      Message-Id: <20150329205627.GA1244@gnote>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      9090422f
    • E
      KVM: x86: cache maxphyaddr CPUID leaf in struct kvm_vcpu · 5a4f55cd
      Eugene Korenevsky 提交于
      cpuid_maxphyaddr(), which performs lot of memory accesses is called
      extensively across KVM, especially in nVMX code.
      
      This patch adds a cached value of maxphyaddr to vcpu.arch to reduce the
      pressure onto CPU cache and simplify the code of cpuid_maxphyaddr()
      callers. The cached value is initialized in kvm_arch_vcpu_init() and
      reloaded every time CPUID is updated by usermode. It is obvious that
      these reloads occur infrequently.
      Signed-off-by: NEugene Korenevsky <ekorenevsky@gmail.com>
      Message-Id: <20150329205612.GA1223@gnote>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      5a4f55cd
    • R
      KVM: vmx: pass error code with internal error #2 · 80f0e95d
      Radim Krčmář 提交于
      Exposing the on-stack error code with internal error is cheap and
      potentially useful.
      Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
      Message-Id: <1428001865-32280-1-git-send-email-rkrcmar@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      80f0e95d
    • R
      x86: vdso: fix pvclock races with task migration · 80f7fdb1
      Radim Krčmář 提交于
      If we were migrated right after __getcpu, but before reading the
      migration_count, we wouldn't notice that we read TSC of a different
      VCPU, nor that KVM's bug made pvti invalid, as only migration_count
      on source VCPU is increased.
      
      Change vdso instead of updating migration_count on destination.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
      Fixes: 0a4e6be9 ("x86: kvm: Revert "remove sched notifier for cross-cpu migrations"")
      Message-Id: <1428000263-11892-1-git-send-email-rkrcmar@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      80f7fdb1
    • P
      KVM: x86: optimize delivery of TSC deadline timer interrupt · 9c8fd1ba
      Paolo Bonzini 提交于
      The newly-added tracepoint shows the following results on
      the tscdeadline_latency test:
      
              qemu-kvm-8387  [002]  6425.558974: kvm_vcpu_wakeup:      poll time 10407 ns
              qemu-kvm-8387  [002]  6425.558984: kvm_vcpu_wakeup:      poll time 0 ns
              qemu-kvm-8387  [002]  6425.561242: kvm_vcpu_wakeup:      poll time 10477 ns
              qemu-kvm-8387  [002]  6425.561251: kvm_vcpu_wakeup:      poll time 0 ns
      
      and so on.  This is because we need to go through kvm_vcpu_block again
      after the timer IRQ is injected.  Avoid it by polling once before
      entering kvm_vcpu_block.
      
      On my machine (Xeon E5 Sandy Bridge) this removes about 500 cycles (7%)
      from the latency of the TSC deadline timer.
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      9c8fd1ba
    • P
      KVM: x86: extract blocking logic from __vcpu_run · 362c698f
      Paolo Bonzini 提交于
      Rename the old __vcpu_run to vcpu_run, and extract part of it to a new
      function vcpu_block.
      
      The next patch will add a new condition in vcpu_block, avoid extra
      indentation.
      Reviewed-by: NDavid Matlack <dmatlack@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      362c698f
    • W
      kvm: x86: fix x86 eflags fixed bit · 35fd68a3
      Wanpeng Li 提交于
      Guest can't be booted w/ ept=0, there is a message dumped as below:
      
      If you're running a guest on an Intel machine without unrestricted mode
      support, the failure can be most likely due to the guest entering an invalid
      state for Intel VT. For example, the guest maybe running in big real mode
      which is not supported on less recent Intel processors.
      
      EAX=00000011 EBX=f000d2f6 ECX=00006cac EDX=000f8956
      ESI=bffbdf62 EDI=00000000 EBP=00006c68 ESP=00006c68
      EIP=0000d187 EFL=00000004 [-----P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
      ES =e000 000e0000 ffffffff 00809300 DPL=0 DS16 [-WA]
      CS =f000 000f0000 ffffffff 00809b00 DPL=0 CS16 [-RA]
      SS =0000 00000000 ffffffff 00809300 DPL=0 DS16 [-WA]
      DS =0000 00000000 ffffffff 00809300 DPL=0 DS16 [-WA]
      FS =0000 00000000 ffffffff 00809300 DPL=0 DS16 [-WA]
      GS =0000 00000000 ffffffff 00809300 DPL=0 DS16 [-WA]
      LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
      TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
      GDT=     000f6a80 00000037
      IDT=     000f6abe 00000000
      CR0=00000011 CR2=00000000 CR3=00000000 CR4=00000000
      DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
      DR6=00000000ffff0ff0 DR7=0000000000000400
      EFER=0000000000000000
      Code=01 1e b8 6a 2e 0f 01 16 74 6a 0f 20 c0 66 83 c8 01 0f 22 c0 <66> ea 8f d1 0f 00 08 00 b8 10 00 00 00 8e d8 8e c0 8e d0 8e e0 8e e8 89 c8 ff e2 89 c1 b8X
      
      X86 eflags bit 1 is fixed set, which means that 1 << 1 is set instead of 1,
      this patch fix it.
      Signed-off-by: NWanpeng Li <wanpeng.li@linux.intel.com>
      Message-Id: <1428473294-6633-1-git-send-email-wanpeng.li@linux.intel.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      35fd68a3
  2. 01 4月, 2015 1 次提交
  3. 30 3月, 2015 7 次提交
  4. 27 3月, 2015 4 次提交
  5. 24 3月, 2015 2 次提交
  6. 19 3月, 2015 1 次提交
  7. 18 3月, 2015 2 次提交
  8. 14 3月, 2015 2 次提交
  9. 13 3月, 2015 1 次提交
  10. 11 3月, 2015 2 次提交
  11. 10 3月, 2015 3 次提交
  12. 06 3月, 2015 1 次提交
    • Q
      x86/fpu/xsaves: Fix improper uses of __ex_table · 06c8173e
      Quentin Casasnovas 提交于
      Commit:
      
        f31a9f7c ("x86/xsaves: Use xsaves/xrstors to save and restore xsave area")
      
      introduced alternative instructions for XSAVES/XRSTORS and commit:
      
        adb9d526 ("x86/xsaves: Add xsaves and xrstors support for booting time")
      
      added support for the XSAVES/XRSTORS instructions at boot time.
      
      Unfortunately both failed to properly protect them against faulting:
      
      The 'xstate_fault' macro will use the closest label named '1'
      backward and that ends up in the .altinstr_replacement section
      rather than in .text. This means that the kernel will never find
      in the __ex_table the .text address where this instruction might
      fault, leading to serious problems if userspace manages to
      trigger the fault.
      Signed-off-by: NQuentin Casasnovas <quentin.casasnovas@oracle.com>
      Signed-off-by: NJamie Iles <jamie.iles@oracle.com>
      [ Improved the changelog, fixed some whitespace noise. ]
      Acked-by: NBorislav Petkov <bp@alien8.de>
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Cc: <stable@vger.kernel.org>
      Cc: Allan Xavier <mr.a.xavier@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: adb9d526 ("x86/xsaves: Add xsaves and xrstors support for booting time")
      Fixes: f31a9f7c ("x86/xsaves: Use xsaves/xrstors to save and restore xsave area")
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      06c8173e