1. 01 10月, 2015 32 次提交
  2. 28 9月, 2015 1 次提交
  3. 25 9月, 2015 3 次提交
    • P
      KVM: x86: fix off-by-one in reserved bits check · 58c95070
      Paolo Bonzini 提交于
      29ecd660 ("KVM: x86: avoid uninitialized variable warning",
      2015-09-06) introduced a not-so-subtle problem, which probably
      escaped review because it was not part of the patch context.
      
      Before the patch, leaf was always equal to iterator.level.  After,
      it is equal to iterator.level - 1 in the call to is_shadow_zero_bits_set,
      and when is_shadow_zero_bits_set does another "-1" the check on
      reserved bits becomes incorrect.  Using "iterator.level" in the call
      fixes this call trace:
      
      WARNING: CPU: 2 PID: 17000 at arch/x86/kvm/mmu.c:3385 handle_mmio_page_fault.part.93+0x1a/0x20 [kvm]()
      Modules linked in: tun sha256_ssse3 sha256_generic drbg binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd fam15h_power amd64_edac_mod k10temp edac_core amdkfd amd_iommu_v2 radeon acpi_cpufreq
      [...]
      Call Trace:
        dump_stack+0x4e/0x84
        warn_slowpath_common+0x95/0xe0
        warn_slowpath_null+0x1a/0x20
        handle_mmio_page_fault.part.93+0x1a/0x20 [kvm]
        tdp_page_fault+0x231/0x290 [kvm]
        ? emulator_pio_in_out+0x6e/0xf0 [kvm]
        kvm_mmu_page_fault+0x36/0x240 [kvm]
        ? svm_set_cr0+0x95/0xc0 [kvm_amd]
        pf_interception+0xde/0x1d0 [kvm_amd]
        handle_exit+0x181/0xa70 [kvm_amd]
        ? kvm_arch_vcpu_ioctl_run+0x68b/0x1730 [kvm]
        kvm_arch_vcpu_ioctl_run+0x6f6/0x1730 [kvm]
        ? kvm_arch_vcpu_ioctl_run+0x68b/0x1730 [kvm]
        ? preempt_count_sub+0x9b/0xf0
        ? mutex_lock_killable_nested+0x26f/0x490
        ? preempt_count_sub+0x9b/0xf0
        kvm_vcpu_ioctl+0x358/0x710 [kvm]
        ? __fget+0x5/0x210
        ? __fget+0x101/0x210
        do_vfs_ioctl+0x2f4/0x560
        ? __fget_light+0x29/0x90
        SyS_ioctl+0x4c/0x90
        entry_SYSCALL_64_fastpath+0x16/0x73
      ---[ end trace 37901c8686d84de6 ]---
      Reported-by: NBorislav Petkov <bp@alien8.de>
      Tested-by: NBorislav Petkov <bp@alien8.de>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      58c95070
    • P
      KVM: x86: use correct page table format to check nested page table reserved bits · 6fec2144
      Paolo Bonzini 提交于
      Intel CPUID on AMD host or vice versa is a weird case, but it can
      happen.  Handle it by checking the host CPU vendor instead of the
      guest's in reset_tdp_shadow_zero_bits_mask.  For speed, the
      check uses the fact that Intel EPT has an X (executable) bit while
      AMD NPT has NX.
      Reported-by: NBorislav Petkov <bp@alien8.de>
      Tested-by: NBorislav Petkov <bp@alien8.de>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      6fec2144
    • P
      KVM: svm: do not call kvm_set_cr0 from init_vmcb · 79a8059d
      Paolo Bonzini 提交于
      kvm_set_cr0 may want to call kvm_zap_gfn_range and thus access the
      memslots array (SRCU protected).  Using a mini SRCU critical section
      is ugly, and adding it to kvm_arch_vcpu_create doesn't work because
      the VMX vcpu_create callback calls synchronize_srcu.
      
      Fixes this lockdep splat:
      
      ===============================
      [ INFO: suspicious RCU usage. ]
      4.3.0-rc1+ #1 Not tainted
      -------------------------------
      include/linux/kvm_host.h:488 suspicious rcu_dereference_check() usage!
      
      other info that might help us debug this:
      rcu_scheduler_active = 1, debug_locks = 0
      1 lock held by qemu-system-i38/17000:
       #0:  (&(&kvm->mmu_lock)->rlock){+.+...}, at: kvm_zap_gfn_range+0x24/0x1a0 [kvm]
      
      [...]
      Call Trace:
       dump_stack+0x4e/0x84
       lockdep_rcu_suspicious+0xfd/0x130
       kvm_zap_gfn_range+0x188/0x1a0 [kvm]
       kvm_set_cr0+0xde/0x1e0 [kvm]
       init_vmcb+0x760/0xad0 [kvm_amd]
       svm_create_vcpu+0x197/0x250 [kvm_amd]
       kvm_arch_vcpu_create+0x47/0x70 [kvm]
       kvm_vm_ioctl+0x302/0x7e0 [kvm]
       ? __lock_is_held+0x51/0x70
       ? __fget+0x101/0x210
       do_vfs_ioctl+0x2f4/0x560
       ? __fget_light+0x29/0x90
       SyS_ioctl+0x4c/0x90
       entry_SYSCALL_64_fastpath+0x16/0x73
      Reported-by: NBorislav Petkov <bp@alien8.de>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      79a8059d
  4. 21 9月, 2015 1 次提交
  5. 18 9月, 2015 1 次提交
    • I
      kvm: svm: reset mmu on VCPU reset · ebae871a
      Igor Mammedov 提交于
      When INIT/SIPI sequence is sent to VCPU which before that
      was in use by OS, VMRUN might fail with:
      
       KVM: entry failed, hardware error 0xffffffff
       EAX=00000000 EBX=00000000 ECX=00000000 EDX=000006d3
       ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
       EIP=00000000 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
       ES =0000 00000000 0000ffff 00009300
       CS =9a00 0009a000 0000ffff 00009a00
       [...]
       CR0=60000010 CR2=b6f3e000 CR3=01942000 CR4=000007e0
       [...]
       EFER=0000000000000000
      
      with corresponding SVM error:
       KVM: FAILED VMRUN WITH VMCB:
       [...]
       cpl:            0                efer:         0000000000001000
       cr0:            0000000080010010 cr2:          00007fd7fe85bf90
       cr3:            0000000187d0c000 cr4:          0000000000000020
       [...]
      
      What happens is that VCPU state right after offlinig:
      CR0: 0x80050033  EFER: 0xd01  CR4: 0x7e0
        -> long mode with CR3 pointing to longmode page tables
      
      and when VCPU gets INIT/SIPI following transition happens
      CR0: 0 -> 0x60000010 EFER: 0x0  CR4: 0x7e0
        -> paging disabled with stale CR3
      
      However SVM under the hood puts VCPU in Paged Real Mode*
      which effectively translates CR0 0x60000010 -> 80010010 after
      
         svm_vcpu_reset()
             -> init_vmcb()
                 -> kvm_set_cr0()
                     -> svm_set_cr0()
      
      but from  kvm_set_cr0() perspective CR0: 0 -> 0x60000010
      only caching bits are changed and
      commit d81135a5
       ("KVM: x86: do not reset mmu if CR0.CD and CR0.NW are changed")'
      regressed svm_vcpu_reset() which relied on MMU being reset.
      
      As result VMRUN after svm_vcpu_reset() tries to run
      VCPU in Paged Real Mode with stale MMU context (longmode page tables),
      which causes some AMD CPUs** to bail out with VMEXIT_INVALID.
      
      Fix issue by unconditionally resetting MMU context
      at init_vmcb() time.
      
      	* AMD64 Architecture Programmer’s Manual,
      	    Volume 2: System Programming, rev: 3.25
      	      15.19 Paged Real Mode
      	** Opteron 1216
      Signed-off-by: NIgor Mammedov <imammedo@redhat.com>
      Fixes: d81135a5
      Cc: stable@vger.kernel.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      ebae871a
  6. 16 9月, 2015 2 次提交
    • W
      KVM: vmx: fix VPID is 0000H in non-root operation · 04bb92e4
      Wanpeng Li 提交于
      Reference SDM 28.1:
      
      The current VPID is 0000H in the following situations:
      - Outside VMX operation. (This includes operation in system-management
        mode under the default treatment of SMIs and SMM with VMX operation;
        see Section 34.14.)
      - In VMX root operation.
      - In VMX non-root operation when the “enable VPID” VM-execution control
        is 0.
      
      The VPID should never be 0000H in non-root operation when "enable VPID"
      VM-execution control is 1. However, commit 34a1cd60 ("kvm: x86: vmx:
      move some vmx setting from vmx_init() to hardware_setup()") remove the
      codes which reserve 0000H for VMX root operation.
      
      This patch fix it by again reserving 0000H for VMX root operation.
      
      Cc: stable@vger.kernel.org # 3.19+
      Fixes: 34a1cd60Reported-by: NWincy Van <fanwenyi0529@gmail.com>
      Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      04bb92e4
    • P
      KVM: add halt_attempted_poll to VCPU stats · 62bea5bf
      Paolo Bonzini 提交于
      This new statistic can help diagnosing VCPUs that, for any reason,
      trigger bad behavior of halt_poll_ns autotuning.
      
      For example, say halt_poll_ns = 480000, and wakeups are spaced exactly
      like 479us, 481us, 479us, 481us. Then KVM always fails polling and wastes
      10+20+40+80+160+320+480 = 1110 microseconds out of every
      479+481+479+481+479+481+479 = 3359 microseconds. The VCPU then
      is consuming about 30% more CPU than it would use without
      polling.  This would show as an abnormally high number of
      attempted polling compared to the successful polls.
      
      Acked-by: Christian Borntraeger <borntraeger@de.ibm.com<
      Reviewed-by: NDavid Matlack <dmatlack@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      62bea5bf