1. 08 Apr, 2015 15 commits
    • KVM: x86: fix mixed APIC mode broadcast · 9ea369b0
      Radim Krčmář authored
      Broadcast allowed only one global APIC mode, but mixed modes are
      theoretically possible.  x2APIC IPI doesn't mean 0xff as broadcast,
      the rest does.
      
      x2APIC broadcasts are accepted by xAPIC.  If we take SDM to be logical,
      even addresses beginning with 0xff should be accepted, but real hardware
      disagrees.  This patch aims for simple code by considering most of real
      behavior as undefined.
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      Message-Id: <1423766494-26150-3-git-send-email-rkrcmar@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: use MDA for interrupt matching · 03d2249e
      Radim Krčmář authored
      In mixed modes, we mustn't deliver xAPIC IPIs like x2APIC and vice versa.
      Instead of preserving the information in apic_send_ipi(), we regain it
      by converting all destinations into correct MDA in the slow path.
      This allows easier reasoning about subsequent matching.
      
      Our kvm_apic_broadcast() had an interesting design decision: it didn't
      consider IOxAPIC 0xff as broadcast in x2APIC mode ...
      everything worked because IOxAPIC can't set that in physical mode and
      logical mode considered it as a message for first 8 VCPUs.
      This patch interprets IOxAPIC 0xff as x2APIC broadcast.
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      Message-Id: <1423766494-26150-2-git-send-email-rkrcmar@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
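      As a rough illustration of the two APIC commits above, the following Python sketch models how a destination could be converted into an MDA (message destination address) and then matched for broadcast. The helper names `to_mda` and `is_broadcast`, and the choice to model an xAPIC destination as an 8-bit ID shifted into bits 31:24, are assumptions made for this sketch; it is not the kernel code.

```python
APIC_BROADCAST = 0xFF          # xAPIC broadcast destination (8-bit ID)
X2APIC_BROADCAST = 0xFFFFFFFF  # x2APIC broadcast destination (32-bit ID)


def to_mda(dest_id, is_ipi, source_x2apic, target_x2apic):
    """Hypothetical helper: convert a destination ID into an MDA.

    For an IPI the format follows the sender's APIC mode; for other
    sources (for example an IOxAPIC) it follows the target's mode.
    """
    x2apic_mda = source_x2apic if is_ipi else target_x2apic
    # An IOxAPIC can only express 0xff, so treat it as x2APIC broadcast.
    if not is_ipi and dest_id == APIC_BROADCAST and x2apic_mda:
        return X2APIC_BROADCAST
    if x2apic_mda:
        return dest_id
    # Assumption: xAPIC destinations live in bits 31:24 of the MDA.
    return (dest_id << 24) & 0xFFFFFFFF


def is_broadcast(mda, target_x2apic):
    """Hypothetical helper: does the target treat this MDA as broadcast?"""
    if target_x2apic:
        return mda == X2APIC_BROADCAST
    return ((mda >> 24) & 0xFF) == APIC_BROADCAST
```

      Under these assumptions an IOxAPIC destination of 0xff reaches an x2APIC target as a broadcast, an x2APIC IPI to 0xff does not, and an x2APIC broadcast is still accepted by an xAPIC target.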
    • kvm/ppc/mpic: drop unused IRQ_testbit · 19456060
      Arseny Solokha authored
      Drop unused static procedure which doesn't have callers within its
      translation unit. It had been already removed independently in QEMU[1]
      from the OpenPIC implementation borrowed from the kernel.
      
      [1] https://lists.gnu.org/archive/html/qemu-devel/2014-06/msg01812.html
      Signed-off-by: Arseny Solokha <asolokha@kb.kras.ru>
      Cc: Alexander Graf <agraf@suse.de>
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Message-Id: <1424768706-23150-3-git-send-email-asolokha@kb.kras.ru>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: nVMX: remove unnecessary double caching of MAXPHYADDR · 92d71bc6
      Eugene Korenevsky authored
      After speed-up of cpuid_maxphyaddr() it can be called frequently:
      instead of heavyweight enumeration of CPUID entries it returns a cached
      pre-computed value. It is also inlined now. So caching its result became
      unnecessary and can be removed.
      Signed-off-by: Eugene Korenevsky <ekorenevsky@gmail.com>
      Message-Id: <20150329205644.GA1258@gnote>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: nVMX: checks for address bits beyond MAXPHYADDR on VM-entry · 9090422f
      Eugene Korenevsky authored
      On each VM entry, the CPU should check the following VMCS fields for zero bits
      beyond the physical address width:
      -  APIC-access address
      -  virtual-APIC address
      -  posted-interrupt descriptor address
      This patch adds these checks required by Intel SDM.
      Signed-off-by: Eugene Korenevsky <ekorenevsky@gmail.com>
      Message-Id: <20150329205627.GA1244@gnote>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
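      The check described in this commit can be sketched in a few lines of Python. The function names and the dictionary of field values are hypothetical and only illustrate the idea that every bit at or above the physical address width must be zero.

```python
def beyond_maxphyaddr(addr, maxphyaddr):
    """Return True if addr has any bit set at position maxphyaddr or above."""
    return (addr >> maxphyaddr) != 0


def invalid_vmcs_addresses(fields, maxphyaddr):
    """Return the names of the fields whose address exceeds the width.

    `fields` is a hypothetical mapping of VMCS field name to address.
    """
    return [name for name, addr in fields.items()
            if beyond_maxphyaddr(addr, maxphyaddr)]


# Example values chosen for illustration only.
example_fields = {
    "APIC-access address": 0xFEE00000,
    "virtual-APIC address": 1 << 40,
    "posted-interrupt descriptor address": 0x1000,
}
```

      With a 36-bit physical address width, only the second example field would be rejected.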
    • KVM: x86: cache maxphyaddr CPUID leaf in struct kvm_vcpu · 5a4f55cd
      Eugene Korenevsky authored
      cpuid_maxphyaddr(), which performs a lot of memory accesses, is called
      extensively across KVM, especially in nVMX code.
      
      This patch adds a cached value of maxphyaddr to vcpu.arch to reduce the
      pressure on the CPU cache and simplify the code of cpuid_maxphyaddr()
      callers. The cached value is initialized in kvm_arch_vcpu_init() and
      reloaded every time CPUID is updated by usermode. These reloads occur
      infrequently.
      Signed-off-by: Eugene Korenevsky <ekorenevsky@gmail.com>
      Message-Id: <20150329205612.GA1223@gnote>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
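      A minimal Python sketch of the caching pattern described here: the expensive lookup runs only when CPUID is updated, and readers use the stored value. The class `VcpuSketch`, the simplified mapping of CPUID leaf to EAX value, and the fallback width of 36 bits when leaf 0x80000008 is absent are all assumptions made for illustration.

```python
class VcpuSketch:
    """Hypothetical stand-in for a VCPU that caches maxphyaddr."""

    DEFAULT_MAXPHYADDR = 36  # assumed fallback when the leaf is missing

    def __init__(self, cpuid_entries):
        self.cpuid_entries = {}
        self.maxphyaddr = self.DEFAULT_MAXPHYADDR
        self.set_cpuid(cpuid_entries)

    def _query_maxphyaddr(self):
        # The slow path: look the leaf up among the CPUID entries.
        eax = self.cpuid_entries.get(0x80000008)
        if eax is None:
            return self.DEFAULT_MAXPHYADDR
        # Bits 7:0 of EAX hold the physical address width.
        return eax & 0xFF

    def set_cpuid(self, cpuid_entries):
        # Called when userspace updates CPUID; refresh the cached value.
        self.cpuid_entries = dict(cpuid_entries)
        self.maxphyaddr = self._query_maxphyaddr()
```

      Callers then read `vcpu.maxphyaddr` directly instead of repeating the lookup.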
    • KVM: vmx: pass error code with internal error #2 · 80f0e95d
      Radim Krčmář authored
      Exposing the on-stack error code with internal error is cheap and
      potentially useful.
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      Message-Id: <1428001865-32280-1-git-send-email-rkrcmar@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • x86: vdso: fix pvclock races with task migration · 80f7fdb1
      Radim Krčmář authored
      If we were migrated right after __getcpu, but before reading the
      migration_count, we wouldn't notice that we read TSC of a different
      VCPU, nor that KVM's bug made pvti invalid, as only migration_count
      on source VCPU is increased.
      
      Change vdso instead of updating migration_count on destination.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      Fixes: 0a4e6be9 ("x86: kvm: Revert "remove sched notifier for cross-cpu migrations"")
      Message-Id: <1428000263-11892-1-git-send-email-rkrcmar@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: remove kvm_read_hva and kvm_read_hva_atomic · 3180a7fc
      Paolo Bonzini authored
      The corresponding write functions just use __copy_to_user.  Do the
      same on the read side.
      
      This reverts what's left of commit 86ab8cff (KVM: introduce
      gfn_to_hva_read/kvm_read_hva/kvm_read_hva_atomic, 2012-08-21)
      
      Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Message-Id: <1427976500-28533-1-git-send-email-pbonzini@redhat.com>
    • KVM: x86: optimize delivery of TSC deadline timer interrupt · 9c8fd1ba
      Paolo Bonzini authored
      The newly-added tracepoint shows the following results on
      the tscdeadline_latency test:
      
              qemu-kvm-8387  [002]  6425.558974: kvm_vcpu_wakeup:      poll time 10407 ns
              qemu-kvm-8387  [002]  6425.558984: kvm_vcpu_wakeup:      poll time 0 ns
              qemu-kvm-8387  [002]  6425.561242: kvm_vcpu_wakeup:      poll time 10477 ns
              qemu-kvm-8387  [002]  6425.561251: kvm_vcpu_wakeup:      poll time 0 ns
      
      and so on.  This is because we need to go through kvm_vcpu_block again
      after the timer IRQ is injected.  Avoid it by polling once before
      entering kvm_vcpu_block.
      
      On my machine (Xeon E5 Sandy Bridge) this removes about 500 cycles (7%)
      from the latency of the TSC deadline timer.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: extract blocking logic from __vcpu_run · 362c698f
      Paolo Bonzini authored
      Rename the old __vcpu_run to vcpu_run, and extract part of it to a new
      function vcpu_block.
      
      The next patch will add a new condition to vcpu_block; extracting it now
      avoids extra indentation.
      Reviewed-by: David Matlack <dmatlack@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • kvm: x86: fix x86 eflags fixed bit · 35fd68a3
      Wanpeng Li authored
      A guest can't be booted with ept=0; the following message is dumped:
      
      If you're running a guest on an Intel machine without unrestricted mode
      support, the failure can be most likely due to the guest entering an invalid
      state for Intel VT. For example, the guest maybe running in big real mode
      which is not supported on less recent Intel processors.
      
      EAX=00000011 EBX=f000d2f6 ECX=00006cac EDX=000f8956
      ESI=bffbdf62 EDI=00000000 EBP=00006c68 ESP=00006c68
      EIP=0000d187 EFL=00000004 [-----P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
      ES =e000 000e0000 ffffffff 00809300 DPL=0 DS16 [-WA]
      CS =f000 000f0000 ffffffff 00809b00 DPL=0 CS16 [-RA]
      SS =0000 00000000 ffffffff 00809300 DPL=0 DS16 [-WA]
      DS =0000 00000000 ffffffff 00809300 DPL=0 DS16 [-WA]
      FS =0000 00000000 ffffffff 00809300 DPL=0 DS16 [-WA]
      GS =0000 00000000 ffffffff 00809300 DPL=0 DS16 [-WA]
      LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
      TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
      GDT=     000f6a80 00000037
      IDT=     000f6abe 00000000
      CR0=00000011 CR2=00000000 CR3=00000000 CR4=00000000
      DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
      DR6=00000000ffff0ff0 DR7=0000000000000400
      EFER=0000000000000000
      Code=01 1e b8 6a 2e 0f 01 16 74 6a 0f 20 c0 66 83 c8 01 0f 22 c0 <66> ea 8f d1 0f 00 08 00 b8 10 00 00 00 8e d8 8e c0 8e d0 8e e0 8e e8 89 c8 ff e2 89 c1 b8X
      
      x86 EFLAGS bit 1 is always set, which means that the value to set is
      1 << 1 rather than 1. This patch fixes it.
      Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
      Message-Id: <1428473294-6633-1-git-send-email-wanpeng.li@linux.intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
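      The distinction between a bit index and a bit mask that this fix turns on can be shown with a short Python sketch. The constant and function names are chosen for illustration and are not taken from the patch.

```python
EFLAGS_FIXED_BIT = 1                     # index of the always-set bit
EFLAGS_FIXED = 1 << EFLAGS_FIXED_BIT     # mask for that bit, i.e. 0x2


def has_fixed_bit(eflags):
    """Return True if the always-set bit 1 is present in eflags."""
    return (eflags & EFLAGS_FIXED) != 0


def with_fixed_bit(eflags):
    """Return eflags with the always-set bit 1 forced on."""
    return eflags | EFLAGS_FIXED
```

      The register dump above shows EFL=00000004, a value in which bit 1 is clear; OR-ing in the index (1) instead of the mask (2) would set bit 0 and leave bit 1 clear.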
    • Merge tag 'kvm-s390-next-20150331' of... · 7f22b45d
      Paolo Bonzini authored
      Merge tag 'kvm-s390-next-20150331' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
      
      Features and fixes for 4.1 (kvm/next)
      
      1. Assorted changes
      1.1 allow more feature bits for the guest
      1.2 Store breaking event address on program interrupts
      
      2. Interrupt handling rework
      2.1 Fix copy_to_user while holding a spinlock (cc stable)
      2.2 Rework floating interrupts to follow the priorities
      2.3 Allow to inject all local interrupts via new ioctl
      2.4 allow to get/set the full local irq state, e.g. for migration
          and introspection
    • Merge tag 'kvm-arm-for-4.1' of... · bf0fb67c
      Paolo Bonzini authored
      Merge tag 'kvm-arm-for-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into 'kvm-next'
      
      KVM/ARM changes for v4.1:
      
      - fixes for live migration
      - irqfd support
      - kvm-io-bus & vgic rework to enable ioeventfd
      - page ageing for stage-2 translation
      - various cleanups
    • Merge tag 'kvm-arm-fixes-4.0-rc5' of... · 8999602d
      Paolo Bonzini authored
      Merge tag 'kvm-arm-fixes-4.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into 'kvm-next'
      
      Fixes for KVM/ARM for 4.0-rc5.
      
      Fixes page refcounting issues in our Stage-2 page table management code,
      fixes a missing unlock in a gicv3 error path, and fixes a race that can
      cause lost interrupts if signals are pending just prior to entering the
      guest.
  2. 01 Apr, 2015 7 commits
  3. 31 Mar, 2015 6 commits
    • KVM: s390: enable more features that need no hypervisor changes · a3ed8dae
      Christian Borntraeger authored
      After some review about what these facilities do, the following
      facilities will work under KVM and can, therefore, be reported
      to the guest if the cpu model and the host cpu provide this bit.
      
      There are plans underway to make the whole bit thing more readable,
      but it's not yet finished. So here are some last bit changes and
      we enhance the KVM mask with:
      
      9 The sense-running-status facility is installed in the
        z/Architecture architectural mode.
        ---> handled by SIE or KVM
      
      10 The conditional-SSKE facility is installed in the
         z/Architecture architectural mode.
        ---> handled by SIE. KVM will retry SIE
      
      13 The IPTE-range facility is installed in the
         z/Architecture architectural mode.
        ---> handled by SIE. KVM will retry SIE
      
      36 The enhanced-monitor facility is installed in the
         z/Architecture architectural mode.
        ---> handled by SIE
      
      47 The CMPSC-enhancement facility is installed in the
         z/Architecture architectural mode.
        ---> handled by SIE
      
      48 The decimal-floating-point zoned-conversion facility
         is installed in the z/Architecture architectural mode.
        ---> handled by SIE
      
      49 The execution-hint, load-and-trap, miscellaneous-
         instruction-extensions and processor-assist
        ---> handled by SIE
      
      51 The local-TLB-clearing facility is installed in the
         z/Architecture architectural mode.
        ---> handled by SIE
      
      52 The interlocked-access facility 2 is installed.
        ---> handled by SIE
      
      53 The load/store-on-condition facility 2 and load-and-
         zero-rightmost-byte facility are installed in the
         z/Architecture architectural mode.
        ---> handled by SIE
      
      57 The message-security-assist-extension-5 facility is
        installed in the z/Architecture architectural mode.
        ---> handled by SIE
      
      66 The reset-reference-bits-multiple facility is installed
        in the z/Architecture architectural mode.
        ---> handled by SIE. KVM will retry SIE
      
      80 The decimal-floating-point packed-conversion
         facility is installed in the z/Architecture architectural
         mode.
        ---> handled by SIE
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Tested-by: Michael Mueller <mimu@linux.vnet.ibm.com>
      Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
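      To make the "KVM mask" idea concrete, here is a small Python sketch that builds a facility mask from the facility numbers listed in this commit message. It relies on the MSB-first numbering of s390 facility bits (facility 0 is the most significant bit of the first doubleword). The function names and the two-doubleword mask size are assumptions for this sketch, and the list contains only the bits added by this commit, not the full mask.

```python
# Facility numbers listed in the commit message above.
ADDED_FACILITIES = [9, 10, 13, 36, 47, 48, 49, 51, 52, 53, 57, 66, 80]


def facility_mask(facilities, nr_words=2):
    """Build a list of 64-bit doublewords with the given facilities set."""
    words = [0] * nr_words
    for nr in facilities:
        # Facility bits are numbered from the most significant bit.
        words[nr // 64] |= 1 << (63 - nr % 64)
    return words


def has_facility(words, nr):
    """Return True if facility number nr is set in the mask."""
    return bool(words[nr // 64] & (1 << (63 - nr % 64)))
```

      A guest-visible facility list would then be the bitwise AND of such a mask with what the CPU model and the host CPU provide.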
    • KVM: s390: store the breaking-event address on pgm interrupts · 2ba45968
      David Hildenbrand authored
      If the PER-3 facility is installed, the breaking-event address is to be
      stored in the low core.
      
      There is no facility bit for PER-3 in stfl(e) and Linux always uses the
      value at address 272 no matter if PER-3 is available or not.
      We can't hide its existence from the guest. All program interrupts
      injected via the SIE automatically store this information if the PER-3
      facility is available in the hypervisor. Also the itdb contains the
      address automatically.
      
      As there is no switch to turn this mechanism off, let's simply make it
      consistent and also store the breaking event address in case of manual
      program interrupt injection.
      Reviewed-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
      Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
      Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
    • KVM: arm/arm64: enable KVM_CAP_IOEVENTFD · d44758c0
      Nikolay Nikolaev authored
      As the infrastructure for eventfd has now been merged, report the
      ioeventfd capability as being supported.
      Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
      [maz: grouped the case entry with the others, fixed commit log]
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
    • KVM: arm/arm64: rework MMIO abort handling to use KVM MMIO bus · 950324ab
      Andre Przywara authored
      Currently we have struct kvm_exit_mmio for encapsulating MMIO abort
      data to be passed on from syndrome decoding all the way down to the
      VGIC register handlers. Now as we switch the MMIO handling to be
      routed through the KVM MMIO bus, it does not make sense anymore to
      use that structure already from the beginning. So we keep the data in
      local variables until we put them into the kvm_io_bus framework.
      Then we fill kvm_exit_mmio in the VGIC only, making it a VGIC private
      structure. That way we replace the data buffer in that structure
      with a pointer pointing to a single location in a local variable, so
      we get rid of some copying on the way.
      With all of the virtual GIC emulation code now being registered with
      the kvm_io_bus, we can remove all of the old MMIO handling code and
      its dispatching functionality.
      
      I didn't bother to rename kvm_exit_mmio (to vgic_mmio or something),
      because that touches a lot of code lines without any good reason.
      
      This is based on an original patch by Nikolay.
      Signed-off-by: Andre Przywara <andre.przywara@arm.com>
      Cc: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
    • KVM: arm/arm64: prepare GICv3 emulation to use kvm_io_bus MMIO handling · fb8f61ab
      Andre Przywara authored
      Using the framework provided by the recent vgic.c changes, we
      register a kvm_io_bus device on mapping the virtual GICv3 resources.
      The distributor mapping is pretty straightforward, but the
      redistributors need some more love, since they need to be tagged with
      the respective redistributor (read: VCPU) they are connected with.
      We use the kvm_io_bus framework to register one device per VCPU.
      Signed-off-by: Andre Przywara <andre.przywara@arm.com>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
    • KVM: arm/arm64: merge GICv3 RD_base and SGI_base register frames · 0ba10d53
      Andre Przywara authored
      Currently we handle the redistributor registers in two separate MMIO
      regions, one for the overall behaviour and SPIs and one for the
      SGIs/PPIs. The latter forces the creation of _two_ KVM I/O bus
      devices for each redistributor.
      Since the spec mandates those two pages to be contiguous, we could as
      well merge them and save the churn with the second KVM I/O bus device.
      Signed-off-by: Andre Przywara <andre.przywara@arm.com>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
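      The three GICv3 commits above route MMIO through a bus of registered devices, with one merged region per redistributor. A toy Python model of that arrangement is sketched below. `IoBusSketch`, `register_redistributors`, the example base address in the usage note and the tuple returned by the handler are all hypothetical; the only architectural detail assumed is that RD_base and SGI_base are two contiguous 64 KiB frames.

```python
SZ_64K = 0x10000


class IoBusSketch:
    """Hypothetical address-range dispatcher standing in for an I/O bus."""

    def __init__(self):
        self.devices = []  # list of (base, length, handler)

    def register(self, base, length, handler):
        self.devices.append((base, length, handler))

    def read(self, addr):
        # Dispatch to the first device whose range contains addr.
        for base, length, handler in self.devices:
            if base <= addr < base + length:
                return handler(addr - base)
        return None  # no device claimed the access


def register_redistributors(bus, rdist_base, nr_vcpus):
    """Register one merged RD_base + SGI_base device per VCPU."""
    for vcpu_id in range(nr_vcpus):
        def handler(offset, vcpu_id=vcpu_id):
            frame = "RD_base" if offset < SZ_64K else "SGI_base"
            return (vcpu_id, frame, offset % SZ_64K)

        base = rdist_base + vcpu_id * 2 * SZ_64K
        bus.register(base, 2 * SZ_64K, handler)
```

      For example, after `register_redistributors(bus, 0x08100000, 2)` the bus holds two devices rather than four, and each handler knows which VCPU it belongs to.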
  4. 30 Mar, 2015 8 commits
  5. 28 Mar, 2015 4 commits
    • MIPS: KVM: Wire up MSA capability · d952bd07
      James Hogan authored
      Now that the code is in place for KVM to support MIPS SIMD Architecture
      (MSA) in MIPS guests, wire up the new KVM_CAP_MIPS_MSA capability.
      
      For backwards compatibility, the capability must be explicitly enabled
      in order to detect or make use of MSA from the guest.
      
      The capability is not supported if the hardware supports MSA vector
      partitioning, since the extra support cannot be tested yet and it
      extends the state that the userland program would have to save.
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Acked-by: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Cc: linux-api@vger.kernel.org
      Cc: linux-doc@vger.kernel.org
    • MIPS: KVM: Expose MSA registers · ab86bd60
      James Hogan authored
      Add KVM register numbers for the MIPS SIMD Architecture (MSA) registers,
      and implement access to them with the KVM_GET_ONE_REG / KVM_SET_ONE_REG
      ioctls when the MSA capability is enabled (exposed in a later patch) and
      present in the guest according to its Config3.MSAP bit.
      
      The MSA vector registers use the same register numbers as the FPU
      registers except with a different size (128 bits). Since MSA depends on
      Status.FR=1, these registers are inaccessible when Status.FR=0. These
      registers are returned as a single native-endian 128-bit value, rather
      than least significant half first with each 64-bit half native endian as
      the kernel uses internally.
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Cc: linux-api@vger.kernel.org
      Cc: linux-doc@vger.kernel.org
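      The two byte layouts contrasted in this commit message can be compared with a short Python sketch. The function names are hypothetical; `byteorder` stands for the host's native endianness.

```python
MASK64 = (1 << 64) - 1


def internal_layout(value, byteorder):
    """Least significant 64-bit half first, each half in native endianness."""
    low = value & MASK64
    high = value >> 64
    return low.to_bytes(8, byteorder) + high.to_bytes(8, byteorder)


def exposed_layout(value, byteorder):
    """A single native-endian 128-bit value, as returned to userspace."""
    return value.to_bytes(16, byteorder)
```

      On a little-endian host the two layouts are byte-for-byte identical; on a big-endian host the two 64-bit halves appear in swapped order, which is why the commit message spells out the difference.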
    • MIPS: KVM: Add MSA exception handling · c2537ed9
      James Hogan authored
      Add guest exception handling for MIPS SIMD Architecture (MSA) floating
      point exceptions and MSA disabled exceptions.
      
      MSA floating point exceptions from the guest need passing to the guest
      kernel, so for these a guest MSAFPE is emulated.
      
      MSA disabled exceptions are normally handled by passing a reserved
      instruction exception to the guest (because no guest MSA was supported),
      but the hypervisor can now handle them if the guest has MSA by passing
      an MSA disabled exception to the guest, or if the guest has MSA enabled
      by transparently restoring the guest MSA context and enabling MSA and
      the FPU.
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
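      The three outcomes for an MSA disabled exception described in this commit message can be written as a small decision function. This Python sketch uses hypothetical names and string constants purely to show the branching; it does not reflect the structure of the actual handler.

```python
RESERVED_INSTRUCTION = "deliver reserved instruction exception to guest"
MSA_DISABLED = "deliver MSA disabled exception to guest"
RESTORE_MSA = "restore guest MSA context and enable MSA and the FPU"


def handle_msa_disabled(guest_has_msa, guest_msa_enabled):
    """Pick the action for an MSA disabled exception raised by the guest."""
    if not guest_has_msa:
        # No guest MSA support: the guest sees a reserved instruction.
        return RESERVED_INSTRUCTION
    if not guest_msa_enabled:
        # Guest has MSA but has not enabled it: forward the exception.
        return MSA_DISABLED
    # Guest has MSA enabled: handle it transparently in the hypervisor.
    return RESTORE_MSA
```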
    • MIPS: KVM: Emulate MSA bits in COP0 interface · 2b6009d6
      James Hogan authored
      Emulate MSA related parts of COP0 interface so that the guest will be
      able to enable/disable MSA (Config5.MSAEn) once the MSA capability has
      been wired up.
      
      As with the FPU (Status.CU1) setting Config5.MSAEn has no immediate
      effect if the MSA state isn't live, as MSA state is restored lazily on
      first use. Changes after the MSA state has been restored take immediate
      effect, so that the guest can start getting MSA disabled exceptions
      right away for guest MSA operations. The MSA state is saved lazily too,
      as MSA may get re-enabled in the near future anyway.
      
      A special case is also added for when Status.CU1 is set while FR=0 and
      the MSA state is live. In this case we are at risk of getting reserved
      instruction exceptions if we try and save the MSA state, so we lose the
      MSA state sooner while MSA is still usable.
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org