1. 01 Feb, 2018 2 commits
  2. 27 Jan, 2018 2 commits
  3. 25 Jan, 2018 2 commits
  4. 17 Jan, 2018 1 commit
  5. 12 Jan, 2018 2 commits
  6. 11 Jan, 2018 3 commits
  7. 05 Jan, 2018 1 commit
  8. 21 Dec, 2017 1 commit
  9. 18 Dec, 2017 1 commit
    • KVM: Fix stack-out-of-bounds read in write_mmio · e39d200f
      Wanpeng Li committed
      Reported by syzkaller:
      
        BUG: KASAN: stack-out-of-bounds in write_mmio+0x11e/0x270 [kvm]
        Read of size 8 at addr ffff8803259df7f8 by task syz-executor/32298
      
        CPU: 6 PID: 32298 Comm: syz-executor Tainted: G           OE    4.15.0-rc2+ #18
        Hardware name: LENOVO ThinkCentre M8500t-N000/SHARKBAY, BIOS FBKTC1AUS 02/16/2016
        Call Trace:
         dump_stack+0xab/0xe1
         print_address_description+0x6b/0x290
         kasan_report+0x28a/0x370
         write_mmio+0x11e/0x270 [kvm]
         emulator_read_write_onepage+0x311/0x600 [kvm]
         emulator_read_write+0xef/0x240 [kvm]
         emulator_fix_hypercall+0x105/0x150 [kvm]
         em_hypercall+0x2b/0x80 [kvm]
         x86_emulate_insn+0x2b1/0x1640 [kvm]
         x86_emulate_instruction+0x39a/0xb90 [kvm]
         handle_exception+0x1b4/0x4d0 [kvm_intel]
         vcpu_enter_guest+0x15a0/0x2640 [kvm]
         kvm_arch_vcpu_ioctl_run+0x549/0x7d0 [kvm]
         kvm_vcpu_ioctl+0x479/0x880 [kvm]
         do_vfs_ioctl+0x142/0x9a0
         SyS_ioctl+0x74/0x80
         entry_SYSCALL_64_fastpath+0x23/0x9a
      
      The hypercall-patching path writes the 3-byte opcode 0F 01 C1 (vmcall)
      into guest memory. However, the write_mmio tracepoint always prints
      8 bytes through *(u64 *)val, because KVM splits MMIO accesses into
      chunks of at most 8 bytes. This leaks 5 bytes from the kernel stack
      (CVE-2017-17741). This patch fixes it by accessing only the bytes
      that are actually operated on.
      
      Before patch:
      
      syz-executor-5567  [007] .... 51370.561696: kvm_mmio: mmio write len 3 gpa 0x10 val 0x1ffff10077c1010f
      
      After patch:
      
      syz-executor-13416 [002] .... 51302.299573: kvm_mmio: mmio write len 3 gpa 0x10 val 0xc1010f
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Tested-by: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      e39d200f
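The idea behind the write_mmio fix can be sketched in plain C. This is only an illustration under stated assumptions, not the kernel patch: `mmio_trace_value` is a hypothetical helper that builds the traced value from the first `len` bytes of the buffer and never reads past them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): assemble the value to
 * trace from only the first `len` bytes, so memory beyond the access
 * is never read. Bytes are combined in little-endian order, which is
 * how an x86 guest lays out the access.
 */
static uint64_t mmio_trace_value(const unsigned char *val, size_t len)
{
	uint64_t data = 0;
	size_t i;

	assert(val != NULL && len <= sizeof(data));
	for (i = 0; i < len; i++)
		data |= (uint64_t)val[i] << (8 * i);
	return data;
}
```

With the 3-byte vmcall opcode followed by unrelated bytes, a length of 3 yields 0xc1010f, matching the "After patch" trace line, while a length of 8 would reproduce the leaked value from the "Before patch" line.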
  10. 17 Dec, 2017 2 commits
  11. 15 Dec, 2017 1 commit
    • KVM/x86: Check input paging mode when cs.l is set · f2981033
      Lan Tianyu committed
      Reported by syzkaller:
          WARNING: CPU: 0 PID: 27962 at arch/x86/kvm/emulate.c:5631 x86_emulate_insn+0x557/0x15f0 [kvm]
          Modules linked in: kvm_intel kvm [last unloaded: kvm]
          CPU: 0 PID: 27962 Comm: syz-executor Tainted: G    B   W        4.15.0-rc2-next-20171208+ #32
          Hardware name: Intel Corporation S1200SP/S1200SP, BIOS S1200SP.86B.01.03.0006.040720161253 04/07/2016
          RIP: 0010:x86_emulate_insn+0x557/0x15f0 [kvm]
          RSP: 0018:ffff8807234476d0 EFLAGS: 00010282
          RAX: 0000000000000000 RBX: ffff88072d0237a0 RCX: ffffffffa0065c4d
          RDX: 1ffff100e5a046f9 RSI: 0000000000000003 RDI: ffff88072d0237c8
          RBP: ffff880723447728 R08: ffff88072d020000 R09: ffffffffa008d240
          R10: 0000000000000002 R11: ffffed00e7d87db3 R12: ffff88072d0237c8
          R13: ffff88072d023870 R14: ffff88072d0238c2 R15: ffffffffa008d080
          FS:  00007f8a68666700(0000) GS:ffff880802200000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: 000000002009506c CR3: 000000071fec4005 CR4: 00000000003626f0
          Call Trace:
           x86_emulate_instruction+0x3bc/0xb70 [kvm]
           ? reexecute_instruction.part.162+0x130/0x130 [kvm]
           vmx_handle_exit+0x46d/0x14f0 [kvm_intel]
           ? trace_event_raw_event_kvm_entry+0xe7/0x150 [kvm]
           ? handle_vmfunc+0x2f0/0x2f0 [kvm_intel]
           ? wait_lapic_expire+0x25/0x270 [kvm]
           vcpu_enter_guest+0x720/0x1ef0 [kvm]
           ...
      
      When CS.L is set, the vCPU should run in 64-bit paging mode.
      The current KVM set_sregs function performs no such check when
      userspace supplies sreg values, which leads to unexpected behavior.
      This patch adds checks for CS.L, EFER.LME, EFER.LMA and CR4.PAE
      when accepting SREG inputs from userspace, in order to avoid
      that behavior.
      Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Jim Mattson <jmattson@google.com>
      Signed-off-by: Tianyu Lan <tianyu.lan@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      f2981033
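A consistency check of the kind described in the CS.L commit might look roughly like the following. This is a userspace sketch, not the code from the patch: `struct sregs_sketch` and `sregs_consistent` are hypothetical names, and only the bit positions (CR0.PG, CR4.PAE, EFER.LME, EFER.LMA) are architectural facts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural bit positions. */
#define CR0_PG   (1ULL << 31)
#define CR4_PAE  (1ULL << 5)
#define EFER_LME (1ULL << 8)
#define EFER_LMA (1ULL << 10)

/* Hypothetical, trimmed-down stand-in for the registers userspace supplies. */
struct sregs_sketch {
	uint64_t cr0, cr4, efer;
	bool cs_l;	/* CS.L: the code segment is 64-bit */
};

static bool sregs_consistent(const struct sregs_sketch *s)
{
	assert(s != NULL);
	if ((s->efer & EFER_LME) && (s->cr0 & CR0_PG)) {
		/* Long mode is active: PAE paging and EFER.LMA are required. */
		return (s->cr4 & CR4_PAE) && (s->efer & EFER_LMA);
	}
	/* Long mode is not active: EFER.LMA and CS.L must both be clear. */
	return !(s->efer & EFER_LMA) && !s->cs_l;
}
```

Under these assumptions, a register set with CS.L set but paging disabled is rejected before it can reach the emulator.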
  12. 14 Dec, 2017 3 commits
    • kvm: x86: fix WARN due to uninitialized guest FPU state · 5663d8f9
      Peter Xu committed
      ------------[ cut here ]------------
       Bad FPU state detected at kvm_put_guest_fpu+0xd8/0x2d0 [kvm], reinitializing FPU registers.
       WARNING: CPU: 1 PID: 4594 at arch/x86/mm/extable.c:103 ex_handler_fprestore+0x88/0x90
       CPU: 1 PID: 4594 Comm: qemu-system-x86 Tainted: G    B      OE    4.15.0-rc2+ #10
       RIP: 0010:ex_handler_fprestore+0x88/0x90
       Call Trace:
        fixup_exception+0x4e/0x60
        do_general_protection+0xff/0x270
        general_protection+0x22/0x30
       RIP: 0010:kvm_put_guest_fpu+0xd8/0x2d0 [kvm]
       RSP: 0018:ffff8803d5627810 EFLAGS: 00010246
        kvm_vcpu_reset+0x3b4/0x3c0 [kvm]
        kvm_apic_accept_events+0x1c0/0x240 [kvm]
        kvm_arch_vcpu_ioctl_run+0x1658/0x2fb0 [kvm]
        kvm_vcpu_ioctl+0x479/0x880 [kvm]
        do_vfs_ioctl+0x142/0x9a0
        SyS_ioctl+0x74/0x80
        do_syscall_64+0x15f/0x600
      
      where kvm_put_guest_fpu is called without a prior kvm_load_guest_fpu.
      To fix it, move kvm_load_guest_fpu to the very beginning of
      kvm_arch_vcpu_ioctl_run.
      
      Cc: stable@vger.kernel.org
      Fixes: f775b13e
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      5663d8f9
    • KVM: X86: Fix load RFLAGS w/o the fixed bit · d73235d1
      Wanpeng Li committed
       *** Guest State ***
       CR0: actual=0x0000000000000030, shadow=0x0000000060000010, gh_mask=fffffffffffffff7
       CR4: actual=0x0000000000002050, shadow=0x0000000000000000, gh_mask=ffffffffffffe871
       CR3 = 0x00000000fffbc000
       RSP = 0x0000000000000000  RIP = 0x0000000000000000
       RFLAGS=0x00000000         DR7 = 0x0000000000000400
              ^^^^^^^^^^
      
      The failed vmentry is triggered by the following testcase when ept=Y:
      
          #include <unistd.h>
          #include <sys/syscall.h>
          #include <string.h>
          #include <stdint.h>
          #include <linux/kvm.h>
          #include <fcntl.h>
          #include <sys/ioctl.h>
      
          long r[5];
          int main()
          {
          	r[2] = open("/dev/kvm", O_RDONLY);
          	r[3] = ioctl(r[2], KVM_CREATE_VM, 0);
          	r[4] = ioctl(r[3], KVM_CREATE_VCPU, 7);
          	struct kvm_regs regs = {
          		.rflags = 0,
          	};
          	ioctl(r[4], KVM_SET_REGS, &regs);
          	ioctl(r[4], KVM_RUN, 0);
          }
      
      X86 RFLAGS bit 1 is fixed to 1. Userspace can simply clear bit 1
      of RFLAGS with the KVM_SET_REGS ioctl, which causes vmentry to fail.
      This patch fixes it by ORing in X86_EFLAGS_FIXED during the ioctl.
      
      Cc: stable@vger.kernel.org
      Suggested-by: Jim Mattson <jmattson@google.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Quan Xu <quan.xu0@gmail.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Jim Mattson <jmattson@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      d73235d1
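The RFLAGS fix amounts to forcing bit 1 on whatever value userspace hands in. A minimal sketch, assuming a hypothetical helper `sanitize_rflags` (the value 0x2 for the fixed bit is architectural):

```c
#include <assert.h>
#include <stdint.h>

#define X86_EFLAGS_FIXED 0x2ULL	/* RFLAGS bit 1 always reads as 1 */

/* Hypothetical helper: force the fixed bit on a userspace-supplied value. */
static uint64_t sanitize_rflags(uint64_t user_rflags)
{
	uint64_t rflags = user_rflags | X86_EFLAGS_FIXED;

	assert(rflags & X86_EFLAGS_FIXED);
	return rflags;
}
```

Applied to the testcase above, `.rflags = 0` would become 0x2, which satisfies the vmentry check on the reserved bit.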
    • KVM: MMU: Fix infinite loop when there is no available mmu page · ed52870f
      Wanpeng Li committed
      The test case below can cause an infinite loop in KVM when ept=0.
      
          #include <unistd.h>
          #include <sys/syscall.h>
          #include <string.h>
          #include <stdint.h>
          #include <linux/kvm.h>
          #include <fcntl.h>
          #include <sys/ioctl.h>
      
          long r[5];
          int main()
          {
          	r[2] = open("/dev/kvm", O_RDONLY);
          	r[3] = ioctl(r[2], KVM_CREATE_VM, 0);
          	r[4] = ioctl(r[3], KVM_CREATE_VCPU, 7);
          	ioctl(r[4], KVM_RUN, 0);
          }
      
      The test never sets up any memory regions. mmu_alloc_shadow/direct_roots()
      in KVM return 1 when KVM fails to allocate the root page table, which
      can result in the infinite loop below:

          vcpu_run() {
          	for (;;) {
          		r = vcpu_enter_guest()::kvm_mmu_reload() returns 1
          		if (r <= 0)
          			break;
          		if (need_resched())
          			cond_resched();
          	}
          }
      
      This patch fixes it by returning -ENOSPC when there is no available kvm mmu
      page for root page table.
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: stable@vger.kernel.org
      Fixes: 26eeb53c (KVM: MMU: Bail out immediately if there is no available mmu page)
      Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      ed52870f
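Why the return value matters can be shown with a toy model of the run loop. Everything here is hypothetical scaffolding (`alloc_roots_sketch`, `run_loop_sketch`, and the `legacy` switch are not kernel names); the only point is that a positive return means "go around again", whereas a negative errno ends the loop.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for root page-table allocation. `legacy`
 * selects the old behaviour of returning 1 on failure.
 */
static int alloc_roots_sketch(int pages_available, int legacy)
{
	if (pages_available > 0)
		return 0;
	return legacy ? 1 : -ENOSPC;
}

/*
 * Model of the loop: r > 0 retries, r <= 0 stops. The loop is capped
 * at max_iters so the sketch itself always terminates; *iters reports
 * how many iterations ran.
 */
static int run_loop_sketch(int pages_available, int legacy,
			   int max_iters, int *iters)
{
	int r = 1;

	assert(max_iters > 0 && iters != NULL);
	for (*iters = 0; *iters < max_iters; ) {
		r = alloc_roots_sketch(pages_available, legacy);
		++*iters;
		if (r <= 0)
			break;
	}
	return r;
}
```

In this model the legacy return value spins until the artificial cap is hit, while -ENOSPC leaves the loop after a single iteration.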
  13. 06 Dec, 2017 5 commits
    • KVM: x86: fix APIC page invalidation · b1394e74
      Radim Krčmář committed
      Implementation of the unpinned APIC page didn't update the VMCS address
      cache when invalidation was done through range mmu notifiers.
      This became a problem when the page notifier was removed.
      
      Re-introduce the arch-specific helper and call it from ...range_start.
      Reported-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
      Fixes: 38b99173 ("kvm: vmx: Implement set_apic_access_page_addr")
      Fixes: 369ea824 ("mm/rmap: update to new mmu_notifier semantic v2")
      Cc: <stable@vger.kernel.org>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
      Tested-by: Wanpeng Li <wanpeng.li@hotmail.com>
      Tested-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      b1394e74
    • KVM: VMX: fix page leak in hardware_setup() · 2895db67
      Jim Mattson committed
      vmx_io_bitmap_b should not be allocated twice.
      
      Fixes: 23611332 ("KVM: VMX: refactor setup of global page-sized bitmaps")
      Signed-off-by: Jim Mattson <jmattson@google.com>
      Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      2895db67
    • KVM: VMX: remove I/O port 0x80 bypass on Intel hosts · d59d51f0
      Andrew Honig committed
      This fixes CVE-2017-1000407.
      
      KVM allows guests to directly access I/O port 0x80 on Intel hosts.  If
      the guest floods this port with writes it generates exceptions and
      instability in the host kernel, leading to a crash.  With this change
      guest writes to port 0x80 on Intel will behave the same as they
      currently behave on AMD systems.
      
      Prevent the flooding by removing the code that sets port 0x80 as a
      passthrough port.  This is essentially the same as upstream patch
      99f85a28, except that patch was for AMD chipsets and this patch is
      for Intel.

      Signed-off-by: Andrew Honig <ahonig@google.com>
      Signed-off-by: Jim Mattson <jmattson@google.com>
      Fixes: fdef3ad1 ("KVM: VMX: Enable io bitmaps to avoid IO port 0x80 VMEXITs")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      d59d51f0
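How a passthrough port differs from an intercepted one can be illustrated with a small bitmap model. This is a sketch under assumptions, not the VMX code: `io_bitmap_init` and `io_port_intercepted` are hypothetical names, and the model only mirrors the convention that a set bit in an I/O bitmap causes a VM exit for that port.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define IO_BITMAP_BYTES 4096	/* one page, one bit per port 0x0000-0x7fff */

/*
 * Start from "intercept everything"; optionally clear the bit for
 * port 0x80, which is the passthrough the commit above removes.
 */
static void io_bitmap_init(unsigned char *bitmap, bool passthrough_port_80)
{
	assert(bitmap != NULL);
	memset(bitmap, 0xff, IO_BITMAP_BYTES);
	if (passthrough_port_80)
		bitmap[0x80 / 8] &= (unsigned char)~(1u << (0x80 % 8));
}

/* A set bit means guest access to the port exits to the hypervisor. */
static bool io_port_intercepted(const unsigned char *bitmap, unsigned int port)
{
	assert(bitmap != NULL && port < IO_BITMAP_BYTES * 8);
	return (bitmap[port / 8] & (1u << (port % 8))) != 0;
}
```

With the passthrough removed, a guest write to port 0x80 is intercepted like any other port instead of reaching the host hardware directly.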
    • x86,kvm: remove KVM emulator get_fpu / put_fpu · 6ab0b9fe
      Rik van Riel committed
      Now that get_fpu and put_fpu do nothing, because the scheduler will
      automatically load and restore the guest FPU context for us while we
      are in this code (deep inside the vcpu_run main loop), we can get rid
      of the get_fpu and put_fpu hooks.
      Signed-off-by: Rik van Riel <riel@redhat.com>
      Suggested-by: David Hildenbrand <david@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      6ab0b9fe
    • x86,kvm: move qemu/guest FPU switching out to vcpu_run · f775b13e
      Rik van Riel committed
      Currently, every time a VCPU is scheduled out, the host kernel will
      first save the guest FPU/xstate context, then load the qemu userspace
      FPU context, only to then immediately save the qemu userspace FPU
      context back to memory. When scheduling in a VCPU, the same extraneous
      FPU loads and saves are done.
      
      This could be avoided by moving from a model where the guest FPU is
      loaded and stored with preemption disabled, to a model where the
      qemu userspace FPU is swapped out for the guest FPU context for
      the duration of the KVM_RUN ioctl.
      
      This is done under the VCPU mutex, which is also taken when other
      tasks inspect the VCPU FPU context, so the code should already be
      safe for this change. That should come as no surprise, given that
      s390 already has this optimization.
      
      This can fix a bug where KVM calls get_user_pages while owning the
      FPU, and the file system ends up requesting the FPU again:
      
          [258270.527947]  __warn+0xcb/0xf0
          [258270.527948]  warn_slowpath_null+0x1d/0x20
          [258270.527951]  kernel_fpu_disable+0x3f/0x50
          [258270.527953]  __kernel_fpu_begin+0x49/0x100
          [258270.527955]  kernel_fpu_begin+0xe/0x10
          [258270.527958]  crc32c_pcl_intel_update+0x84/0xb0
          [258270.527961]  crypto_shash_update+0x3f/0x110
          [258270.527968]  crc32c+0x63/0x8a [libcrc32c]
          [258270.527975]  dm_bm_checksum+0x1b/0x20 [dm_persistent_data]
          [258270.527978]  node_prepare_for_write+0x44/0x70 [dm_persistent_data]
          [258270.527985]  dm_block_manager_write_callback+0x41/0x50 [dm_persistent_data]
          [258270.527988]  submit_io+0x170/0x1b0 [dm_bufio]
          [258270.527992]  __write_dirty_buffer+0x89/0x90 [dm_bufio]
          [258270.527994]  __make_buffer_clean+0x4f/0x80 [dm_bufio]
          [258270.527996]  __try_evict_buffer+0x42/0x60 [dm_bufio]
          [258270.527998]  dm_bufio_shrink_scan+0xc0/0x130 [dm_bufio]
          [258270.528002]  shrink_slab.part.40+0x1f5/0x420
          [258270.528004]  shrink_node+0x22c/0x320
          [258270.528006]  do_try_to_free_pages+0xf5/0x330
          [258270.528008]  try_to_free_pages+0xe9/0x190
          [258270.528009]  __alloc_pages_slowpath+0x40f/0xba0
          [258270.528011]  __alloc_pages_nodemask+0x209/0x260
          [258270.528014]  alloc_pages_vma+0x1f1/0x250
          [258270.528017]  do_huge_pmd_anonymous_page+0x123/0x660
          [258270.528021]  handle_mm_fault+0xfd3/0x1330
          [258270.528025]  __get_user_pages+0x113/0x640
          [258270.528027]  get_user_pages+0x4f/0x60
          [258270.528063]  __gfn_to_pfn_memslot+0x120/0x3f0 [kvm]
          [258270.528108]  try_async_pf+0x66/0x230 [kvm]
          [258270.528135]  tdp_page_fault+0x130/0x280 [kvm]
          [258270.528149]  kvm_mmu_page_fault+0x60/0x120 [kvm]
          [258270.528158]  handle_ept_violation+0x91/0x170 [kvm_intel]
          [258270.528162]  vmx_handle_exit+0x1ca/0x1400 [kvm_intel]
      
      No performance changes were detected in quick ping-pong tests on
      my 4 socket system, which is expected since an FPU+xstate load is
      on the order of 0.1us, while ping-ponging between CPUs is on the
      order of 20us, and somewhat noisy.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Rik van Riel <riel@redhat.com>
      Suggested-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      [Fixed a bug where reset_vcpu called put_fpu without a preceding load_fpu,
       which happened from inside the KVM_CREATE_VCPU ioctl. - Radim]
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      f775b13e
  14. 28 Nov, 2017 6 commits
    • KVM: Let KVM_SET_SIGNAL_MASK work as advertised · 20b7035c
      Jan H. Schönherr committed
      KVM API says for the signal mask you set via KVM_SET_SIGNAL_MASK, that
      "any unblocked signal received [...] will cause KVM_RUN to return with
      -EINTR" and that "the signal will only be delivered if not blocked by
      the original signal mask".
      
      This, however, is only true, when the calling task has a signal handler
      registered for a signal. If not, signal evaluation is short-circuited for
      SIG_IGN and SIG_DFL, and the signal is either ignored without KVM_RUN
      returning or the whole process is terminated.
      
      Make KVM_SET_SIGNAL_MASK behave as advertised by utilizing logic similar
      to that in do_sigtimedwait() to avoid short-circuiting of signals.
      Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      20b7035c
    • KVM: VMX: Fix vmx->nested freeing when no SMI handler · b7455825
      Wanpeng Li committed
      Reported by syzkaller:
      
         ------------[ cut here ]------------
         WARNING: CPU: 5 PID: 2939 at arch/x86/kvm/vmx.c:3844 free_loaded_vmcs+0x77/0x80 [kvm_intel]
         CPU: 5 PID: 2939 Comm: repro Not tainted 4.14.0+ #26
         RIP: 0010:free_loaded_vmcs+0x77/0x80 [kvm_intel]
         Call Trace:
          vmx_free_vcpu+0xda/0x130 [kvm_intel]
          kvm_arch_destroy_vm+0x192/0x290 [kvm]
          kvm_put_kvm+0x262/0x560 [kvm]
          kvm_vm_release+0x2c/0x30 [kvm]
          __fput+0x190/0x370
          task_work_run+0xa1/0xd0
          do_exit+0x4d2/0x13e0
          do_group_exit+0x89/0x140
          get_signal+0x318/0xb80
          do_signal+0x8c/0xb40
          exit_to_usermode_loop+0xe4/0x140
          syscall_return_slowpath+0x206/0x230
          entry_SYSCALL_64_fastpath+0x98/0x9a
      
      The syzkaller testcase executes VMXON/VMLAUNCH instructions, so the
      vmx->nested state is populated, and it also issues the KVM_SMI ioctl.
      However, the testcase is just a simple C program and is not launched
      by something like SeaBIOS, which implements an smi_handler. Commit
      05cade71 (KVM: nSVM: fix SMI injection in guest mode) leaves guest
      mode and sets nested.vmxon to false for the duration of SMM, following
      SDM 34.14.1 "leave VMX operation" upon entering SMM. We can't
      alloc/free the vmx->nested state each time SMM is entered or exited,
      since that would add overhead. So vmx_pre_enter_smm() marks
      nested.vmxon false even though the vmx->nested state is still
      populated, expecting em_rsm() to mark nested.vmxon true again.
      However, the smi_handler/rsm never executes, since there is nothing
      like SeaBIOS in this scenario. free_nested() then fails to free the
      vmx->nested state because vmx->nested.vmxon is false, which results
      in the above warning.

      This patch fixes it by also handling the case with no SMI handler.
      Luckily, vmx->nested.smm.vmxon is set according to the value of
      vmx->nested.vmxon in vmx_pre_enter_smm(), so we can use it to free
      the vmx->nested state when L1 goes down.
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Reviewed-by: Liran Alon <liran.alon@oracle.com>
      Fixes: 05cade71 (KVM: nSVM: fix SMI injection in guest mode)
      Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      b7455825
    • KVM: VMX: Fix rflags cache during vCPU reset · c37c2873
      Wanpeng Li committed
      Reported by syzkaller:
      
         *** Guest State ***
         CR0: actual=0x0000000080010031, shadow=0x0000000060000010, gh_mask=fffffffffffffff7
         CR4: actual=0x0000000000002061, shadow=0x0000000000000000, gh_mask=ffffffffffffe8f1
         CR3 = 0x000000002081e000
         RSP = 0x000000000000fffa  RIP = 0x0000000000000000
         RFLAGS=0x00023000         DR7 = 0x00000000000000
                ^^^^^^^^^^
         ------------[ cut here ]------------
         WARNING: CPU: 6 PID: 24431 at /home/kernel/linux/arch/x86/kvm//x86.c:7302 kvm_arch_vcpu_ioctl_run+0x651/0x2ea0 [kvm]
         CPU: 6 PID: 24431 Comm: reprotest Tainted: G        W  OE   4.14.0+ #26
         RIP: 0010:kvm_arch_vcpu_ioctl_run+0x651/0x2ea0 [kvm]
         RSP: 0018:ffff880291d179e0 EFLAGS: 00010202
         Call Trace:
          kvm_vcpu_ioctl+0x479/0x880 [kvm]
          do_vfs_ioctl+0x142/0x9a0
          SyS_ioctl+0x74/0x80
          entry_SYSCALL_64_fastpath+0x23/0x9a
      
      The failed vmentry is triggered by the following beautified testcase:
      
          #include <unistd.h>
          #include <sys/syscall.h>
          #include <string.h>
          #include <stdint.h>
          #include <linux/kvm.h>
          #include <fcntl.h>
          #include <sys/ioctl.h>
      
          long r[5];
          int main()
          {
              struct kvm_debugregs dr = { 0 };
      
              r[2] = open("/dev/kvm", O_RDONLY);
              r[3] = ioctl(r[2], KVM_CREATE_VM, 0);
              r[4] = ioctl(r[3], KVM_CREATE_VCPU, 7);
              struct kvm_guest_debug debug = {
                      .control = 0xf0403,
                      .arch = {
                              .debugreg[6] = 0x2,
                              .debugreg[7] = 0x2
                      }
              };
              ioctl(r[4], KVM_SET_GUEST_DEBUG, &debug);
              ioctl(r[4], KVM_RUN, 0);
          }
      
      This testcase tries to set up the processor-specific debug
      registers and configure the vCPU for handling guest debug events
      through KVM_SET_GUEST_DEBUG.  The KVM_SET_GUEST_DEBUG ioctl gets and
      sets rflags in order to set the TF bit if single-stepping is needed.
      During vCPU reset, all register caches are reset to avail and the
      GUEST_RFLAGS vmcs field is reset to 0x2. However, the rflags cache
      itself is not reset. Since the cache is marked avail,
      vmx_get_rflags() returns the stale cached value, which is 0 after
      boot. Vmentry fails if the rflags reserved bit 1 is 0.
      
      This patch fixes it by resetting both the GUEST_RFLAGS vmcs field and
      its cache to 0x2 during vCPU reset.
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Tested-by: Dmitry Vyukov <dvyukov@google.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      c37c2873
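The stale-cache problem in the rflags commit can be modelled with a cached register and an "avail" flag. This is a toy model with hypothetical names (`struct vcpu_sketch`, `get_rflags`, `reset_rflags`), not the VMX code; `hw_rflags` merely stands in for the GUEST_RFLAGS vmcs field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RFLAGS_RESET 0x2ULL	/* reset value: only the fixed bit 1 is set */

struct vcpu_sketch {
	uint64_t hw_rflags;	/* stands in for the GUEST_RFLAGS field */
	uint64_t cached_rflags;	/* software copy */
	bool rflags_avail;	/* true: the cached copy is trusted */
};

/* Reads go through the cache whenever it is marked available. */
static uint64_t get_rflags(struct vcpu_sketch *v)
{
	assert(v != NULL);
	if (!v->rflags_avail) {
		v->cached_rflags = v->hw_rflags;
		v->rflags_avail = true;
	}
	return v->cached_rflags;
}

/* Reset in the style of the fix: update the field and the cache together. */
static void reset_rflags(struct vcpu_sketch *v)
{
	assert(v != NULL);
	v->hw_rflags = RFLAGS_RESET;
	v->cached_rflags = RFLAGS_RESET;
	v->rflags_avail = true;
}
```

If the reset wrote only `hw_rflags` while leaving the cache marked available, `get_rflags` would keep returning the old cached 0, which is the failure mode the commit message describes.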
    • KVM: X86: Fix softlockup when get the current kvmclock · e70b57a6
      Wanpeng Li committed
       watchdog: BUG: soft lockup - CPU#6 stuck for 22s! [qemu-system-x86:10185]
       CPU: 6 PID: 10185 Comm: qemu-system-x86 Tainted: G           OE   4.14.0-rc4+ #4
       RIP: 0010:kvm_get_time_scale+0x4e/0xa0 [kvm]
       Call Trace:
        get_time_ref_counter+0x5a/0x80 [kvm]
        kvm_hv_process_stimers+0x120/0x5f0 [kvm]
        kvm_arch_vcpu_ioctl_run+0x4b4/0x1690 [kvm]
        kvm_vcpu_ioctl+0x33a/0x620 [kvm]
        do_vfs_ioctl+0xa1/0x5d0
        SyS_ioctl+0x79/0x90
        entry_SYSCALL_64_fastpath+0x1e/0xa9
      
      This can be reproduced by running kvm-unit-tests/hyperv_stimer.flat and
      a cpu-hotplug stress test simultaneously. __this_cpu_read(cpu_tsc_khz)
      returns 0 (set in kvmclock_cpu_down_prep()) when the pCPU is unplugged,
      which sends kvm_get_time_scale() into an infinite loop.

      This patch fixes it by treating the unplugged pCPU as not using the
      master clock.
      Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      e70b57a6
    • KVM: lapic: Fixup LDR on load in x2apic · 12806ba9
      Dr. David Alan Gilbert committed
      In x2apic mode the LDR is fixed based on the ID rather
      than separately loadable like it was before x2.
      When kvm_apic_set_state is called, the base is set, and if
      it has the X2APIC_ENABLE flag set then the LDR is calculated;
      however, that value is then overwritten by the memcpy a few
      lines below, which copies in the value that came from userland.
      
      The symptom is a lack of EOI after loading the state
      (e.g. after a QEMU migration) and is due to the EOI bitmap
      being wrong due to the incorrect LDR.  This was seen with
      a Win2016 guest under Qemu with irqchip=split whose USB mouse
      didn't work after a VM migration.
      
      This corresponds to RH bug:
        https://bugzilla.redhat.com/show_bug.cgi?id=1502591

      Reported-by: Yiqian Wei <yiwei@redhat.com>
      Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
      Cc: stable@vger.kernel.org
      [Applied fixup from Liran Alon. - Paolo]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      12806ba9
    • KVM: lapic: Split out x2apic ldr calculation · e872fa94
      Dr. David Alan Gilbert committed
      Split out the ldr calculation from kvm_apic_set_x2apic_id
      since we're about to reuse it in the following patch.
      Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      e872fa94
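The two lapic commits above can be read together through a small sketch. It assumes the architectural x2APIC rule that the logical ID is derived from the APIC ID (cluster in the upper 16 bits, a one-hot bit for the low 4 ID bits in the lower 16). The names `x2apic_ldr_from_id`, `struct apic_state_sketch` and `load_state_x2apic` are hypothetical and are not the kernel's helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Derive the x2APIC LDR from the APIC ID. */
static uint32_t x2apic_ldr_from_id(uint32_t id)
{
	uint32_t ldr = ((id >> 4) << 16) | (1u << (id & 0xf));

	assert((ldr & 0xffff) != 0);	/* exactly one low bit is always set */
	return ldr;
}

/* Hypothetical register block holding only the fields this sketch needs. */
struct apic_state_sketch {
	uint32_t id;
	uint32_t ldr;
};

/*
 * Load userspace-provided state in x2APIC mode: copy everything first,
 * then re-derive the LDR so the copied value cannot override it.
 */
static void load_state_x2apic(struct apic_state_sketch *dst,
			      const struct apic_state_sketch *src)
{
	assert(dst != NULL && src != NULL);
	*dst = *src;
	dst->ldr = x2apic_ldr_from_id(dst->id);
}
```

The ordering is the point of the sketch: computing the LDR before the copy, as in the bug described above, would let the userspace value win.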
  15. 17 Nov, 2017 8 commits