1. 07 4月, 2020 2 次提交
  2. 03 4月, 2020 6 次提交
  3. 31 3月, 2020 11 次提交
  4. 26 3月, 2020 3 次提交
  5. 25 3月, 2020 2 次提交
  6. 24 3月, 2020 6 次提交
    • W
      KVM: LAPIC: Also cancel preemption timer when disarm LAPIC timer · 94be4b85
      Wanpeng Li 提交于
      The timer is disarmed when switching between TSC deadline and other modes,
      we should set everything to disarmed state, however, LAPIC timer can be
      emulated by preemption timer, it still works if vmx->hv_deadline_timer is
      not -1. This patch also cancels preemption timer when disarm LAPIC timer.
      Signed-off-by: NWanpeng Li <wanpengli@tencent.com>
      Message-Id: <1585031530-19823-1-git-send-email-wanpengli@tencent.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      94be4b85
    • S
      KVM: VMX: Gracefully handle faults on VMXON · 4f6ea0a8
      Sean Christopherson 提交于
      Gracefully handle faults on VMXON, e.g. #GP due to VMX being disabled by
      BIOS, instead of letting the fault crash the system.  Now that KVM uses
      cpufeatures to query support instead of reading MSR_IA32_FEAT_CTL
      directly, it's possible for a bug in a different subsystem to cause KVM
      to incorrectly attempt VMXON[*].  Crashing the system is especially
      annoying if the system is configured such that hardware_enable() will
      be triggered during boot.
      
      Oppurtunistically rename @addr to @vmxon_pointer and use a named param
      to reference it in the inline assembly.
      
      Print 0xdeadbeef in the ultra-"rare" case that reading MSR_IA32_FEAT_CTL
      also faults.
      
      [*] https://lkml.kernel.org/r/20200226231615.13664-1-sean.j.christopherson@intel.comSigned-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
      Message-Id: <20200321193751.24985-4-sean.j.christopherson@intel.com>
      Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      4f6ea0a8
    • S
      KVM: VMX: Fold loaded_vmcs_init() into alloc_loaded_vmcs() · d260f9ef
      Sean Christopherson 提交于
      Subsume loaded_vmcs_init() into alloc_loaded_vmcs(), its only remaining
      caller, and drop the VMCLEAR on the shadow VMCS, which is guaranteed to
      be NULL.  loaded_vmcs_init() was previously used by loaded_vmcs_clear(),
      but loaded_vmcs_clear() also subsumed loaded_vmcs_init() to properly
      handle smp_wmb() with respect to VMCLEAR.
      Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
      Message-Id: <20200321193751.24985-3-sean.j.christopherson@intel.com>
      Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      d260f9ef
    • S
      KVM: VMX: Always VMCLEAR in-use VMCSes during crash with kexec support · 31603d4f
      Sean Christopherson 提交于
      VMCLEAR all in-use VMCSes during a crash, even if kdump's NMI shootdown
      interrupted a KVM update of the percpu in-use VMCS list.
      
      Because NMIs are not blocked by disabling IRQs, it's possible that
      crash_vmclear_local_loaded_vmcss() could be called while the percpu list
      of VMCSes is being modified, e.g. in the middle of list_add() in
      vmx_vcpu_load_vmcs().  This potential corner case was called out in the
      original commit[*], but the analysis of its impact was wrong.
      
      Skipping the VMCLEARs is wrong because it all but guarantees that a
      loaded, and therefore cached, VMCS will live across kexec and corrupt
      memory in the new kernel.  Corruption will occur because the CPU's VMCS
      cache is non-coherent, i.e. not snooped, and so the writeback of VMCS
      memory on its eviction will overwrite random memory in the new kernel.
      The VMCS will live because the NMI shootdown also disables VMX, i.e. the
      in-progress VMCLEAR will #UD, and existing Intel CPUs do not flush the
      VMCS cache on VMXOFF.
      
      Furthermore, interrupting list_add() and list_del() is safe due to
      crash_vmclear_local_loaded_vmcss() using forward iteration.  list_add()
      ensures the new entry is not visible to forward iteration unless the
      entire add completes, via WRITE_ONCE(prev->next, new).  A bad "prev"
      pointer could be observed if the NMI shootdown interrupted list_del() or
      list_add(), but list_for_each_entry() does not consume ->prev.
      
      In addition to removing the temporary disabling of VMCLEAR, open code
      loaded_vmcs_init() in __loaded_vmcs_clear() and reorder VMCLEAR so that
      the VMCS is deleted from the list only after it's been VMCLEAR'd.
      Deleting the VMCS before VMCLEAR would allow a race where the NMI
      shootdown could arrive between list_del() and vmcs_clear() and thus
      neither flow would execute a successful VMCLEAR.  Alternatively, more
      code could be moved into loaded_vmcs_init(), but that gets rather silly
      as the only other user, alloc_loaded_vmcs(), doesn't need the smp_wmb()
      and would need to work around the list_del().
      
      Update the smp_*() comments related to the list manipulation, and
      opportunistically reword them to improve clarity.
      
      [*] https://patchwork.kernel.org/patch/1675731/#3720461
      
      Fixes: 8f536b76 ("KVM: VMX: provide the vmclear function and a bitmap to support VMCLEAR in kdump")
      Cc: stable@vger.kernel.org
      Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
      Message-Id: <20200321193751.24985-2-sean.j.christopherson@intel.com>
      Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      31603d4f
    • Z
      KVM: x86: Expose fast short REP MOV for supported cpuid · e3747407
      Zhenyu Wang 提交于
      For CPU supporting fast short REP MOV (XF86_FEATURE_FSRM) e.g Icelake,
      Tigerlake, expose it in KVM supported cpuid as well.
      Signed-off-by: NZhenyu Wang <zhenyuw@linux.intel.com>
      Message-Id: <20200323092236.3703-1-zhenyuw@linux.intel.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      e3747407
    • N
      KVM: VMX: don't allow memory operands for inline asm that modifies SP · 428b8f1d
      Nick Desaulniers 提交于
      THUNK_TARGET defines [thunk_target] as having "rm" input constraints
      when CONFIG_RETPOLINE is not set, which isn't constrained enough for
      this specific case.
      
      For inline assembly that modifies the stack pointer before using this
      input, the underspecification of constraints is dangerous, and results
      in an indirect call to a previously pushed flags register.
      
      In this case `entry`'s stack slot is good enough to satisfy the "m"
      constraint in "rm", but the inline assembly in
      handle_external_interrupt_irqoff() modifies the stack pointer via
      push+pushf before using this input, which in this case results in
      calling what was the previous state of the flags register, rather than
      `entry`.
      
      Be more specific in the constraints by requiring `entry` be in a
      register, and not a memory operand.
      Reported-by: NDmitry Vyukov <dvyukov@google.com>
      Reported-by: syzbot+3f29ca2efb056a761e38@syzkaller.appspotmail.com
      Debugged-by: NAlexander Potapenko <glider@google.com>
      Debugged-by: NPaolo Bonzini <pbonzini@redhat.com>
      Debugged-by: NSean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: NNick Desaulniers <ndesaulniers@google.com>
      Message-Id: <20200323191243.30002-1-ndesaulniers@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      428b8f1d
  7. 23 3月, 2020 2 次提交
    • H
      KVM: LAPIC: Mark hrtimer for period or oneshot mode to expire in hard interrupt context · edec6e01
      He Zhe 提交于
      apic->lapic_timer.timer was initialized with HRTIMER_MODE_ABS_HARD but
      started later with HRTIMER_MODE_ABS, which may cause the following warning
      in PREEMPT_RT kernel.
      
      WARNING: CPU: 1 PID: 2957 at kernel/time/hrtimer.c:1129 hrtimer_start_range_ns+0x348/0x3f0
      CPU: 1 PID: 2957 Comm: qemu-system-x86 Not tainted 5.4.23-rt11 #1
      Hardware name: Supermicro SYS-E300-9A-8C/A2SDi-8C-HLN4F, BIOS 1.1a 09/18/2018
      RIP: 0010:hrtimer_start_range_ns+0x348/0x3f0
      Code: 4d b8 0f 94 c1 0f b6 c9 e8 35 f1 ff ff 4c 8b 45
            b0 e9 3b fd ff ff e8 d7 3f fa ff 48 98 4c 03 34
            c5 a0 26 bf 93 e9 a1 fd ff ff <0f> 0b e9 fd fc ff
            ff 65 8b 05 fa b7 90 6d 89 c0 48 0f a3 05 60 91
      RSP: 0018:ffffbc60026ffaf8 EFLAGS: 00010202
      RAX: 0000000000000001 RBX: ffff9d81657d4110 RCX: 0000000000000000
      RDX: 0000000000000000 RSI: 0000006cc7987bcf RDI: ffff9d81657d4110
      RBP: ffffbc60026ffb58 R08: 0000000000000001 R09: 0000000000000010
      R10: 0000000000000000 R11: 0000000000000000 R12: 0000006cc7987bcf
      R13: 0000000000000000 R14: 0000006cc7987bcf R15: ffffbc60026d6a00
      FS: 00007f401daed700(0000) GS:ffff9d81ffa40000(0000) knlGS:0000000000000000
      CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000ffffffff CR3: 0000000fa7574000 CR4: 00000000003426e0
      Call Trace:
      ? kvm_release_pfn_clean+0x22/0x60 [kvm]
      start_sw_timer+0x85/0x230 [kvm]
      ? vmx_vmexit+0x1b/0x30 [kvm_intel]
      kvm_lapic_switch_to_sw_timer+0x72/0x80 [kvm]
      vmx_pre_block+0x1cb/0x260 [kvm_intel]
      ? vmx_vmexit+0xf/0x30 [kvm_intel]
      ? vmx_vmexit+0x1b/0x30 [kvm_intel]
      ? vmx_vmexit+0xf/0x30 [kvm_intel]
      ? vmx_vmexit+0x1b/0x30 [kvm_intel]
      ? vmx_vmexit+0xf/0x30 [kvm_intel]
      ? vmx_vmexit+0x1b/0x30 [kvm_intel]
      ? vmx_vmexit+0xf/0x30 [kvm_intel]
      ? vmx_vmexit+0xf/0x30 [kvm_intel]
      ? vmx_vmexit+0x1b/0x30 [kvm_intel]
      ? vmx_vmexit+0xf/0x30 [kvm_intel]
      ? vmx_vmexit+0x1b/0x30 [kvm_intel]
      ? vmx_vmexit+0xf/0x30 [kvm_intel]
      ? vmx_vmexit+0x1b/0x30 [kvm_intel]
      ? vmx_vmexit+0xf/0x30 [kvm_intel]
      ? vmx_vmexit+0x1b/0x30 [kvm_intel]
      ? vmx_vmexit+0xf/0x30 [kvm_intel]
      ? vmx_sync_pir_to_irr+0x9e/0x100 [kvm_intel]
      ? kvm_apic_has_interrupt+0x46/0x80 [kvm]
      kvm_arch_vcpu_ioctl_run+0x85b/0x1fa0 [kvm]
      ? _raw_spin_unlock_irqrestore+0x18/0x50
      ? _copy_to_user+0x2c/0x30
      kvm_vcpu_ioctl+0x235/0x660 [kvm]
      ? rt_spin_unlock+0x2c/0x50
      do_vfs_ioctl+0x3e4/0x650
      ? __fget+0x7a/0xa0
      ksys_ioctl+0x67/0x90
      __x64_sys_ioctl+0x1a/0x20
      do_syscall_64+0x4d/0x120
      entry_SYSCALL_64_after_hwframe+0x44/0xa9
      RIP: 0033:0x7f4027cc54a7
      Code: 00 00 90 48 8b 05 e9 59 0c 00 64 c7 00 26 00 00
            00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00
            00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff
            73 01 c3 48 8b 0d b9 59 0c 00 f7 d8 64 89 01 48
      RSP: 002b:00007f401dae9858 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      RAX: ffffffffffffffda RBX: 00005558bd029690 RCX: 00007f4027cc54a7
      RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 000000000000000d
      RBP: 00007f4028b72000 R08: 00005558bc829ad0 R09: 00000000ffffffff
      R10: 00005558bcf90ca0 R11: 0000000000000246 R12: 0000000000000000
      R13: 0000000000000000 R14: 0000000000000000 R15: 00005558bce1c840
      --[ end trace 0000000000000002 ]--
      Signed-off-by: NHe Zhe <zhe.he@windriver.com>
      Message-Id: <1584687967-332859-1-git-send-email-zhe.he@windriver.com>
      Reviewed-by: NWanpeng Li <wanpengli@tencent.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      edec6e01
    • T
      KVM: SVM: Issue WBINVD after deactivating an SEV guest · 2e2409af
      Tom Lendacky 提交于
      Currently, CLFLUSH is used to flush SEV guest memory before the guest is
      terminated (or a memory hotplug region is removed). However, CLFLUSH is
      not enough to ensure that SEV guest tagged data is flushed from the cache.
      
      With 33af3a7e ("KVM: SVM: Reduce WBINVD/DF_FLUSH invocations"), the
      original WBINVD was removed. This then exposed crashes at random times
      because of a cache flush race with a page that had both a hypervisor and
      a guest tag in the cache.
      
      Restore the WBINVD when destroying an SEV guest and add a WBINVD to the
      svm_unregister_enc_region() function to ensure hotplug memory is flushed
      when removed. The DF_FLUSH can still be avoided at this point.
      
      Fixes: 33af3a7e ("KVM: SVM: Reduce WBINVD/DF_FLUSH invocations")
      Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <c8bf9087ca3711c5770bdeaafa3e45b717dc5ef4.1584720426.git.thomas.lendacky@amd.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      2e2409af
  8. 21 3月, 2020 2 次提交
    • P
      KVM: SVM: document KVM_MEM_ENCRYPT_OP, let userspace detect if SEV is available · 2da1ed62
      Paolo Bonzini 提交于
      Userspace has no way to query if SEV has been disabled with the
      sev module parameter of kvm-amd.ko.  Actually it has one, but it
      is a hack: do ioctl(KVM_MEM_ENCRYPT_OP, NULL) and check if it
      returns EFAULT.  Make it a little nicer by returning zero for
      SEV enabled and NULL argument, and while at it document the
      ioctl arguments.
      
      Cc: Brijesh Singh <brijesh.singh@amd.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      2da1ed62
    • P
      KVM: x86: remove bogus user-triggerable WARN_ON · d3329454
      Paolo Bonzini 提交于
      The WARN_ON is essentially comparing a user-provided value with 0.  It is
      trivial to trigger it just by passing garbage to KVM_SET_CLOCK.  Guests
      can break if you do so, but the same applies to every KVM_SET_* ioctl.
      So, if it hurts when you do like this, just do not do it.
      
      Reported-by: syzbot+00be5da1d75f1cc95f6b@syzkaller.appspotmail.com
      Fixes: 9446e6fc ("KVM: x86: fix WARN_ON check of an unsigned less than zero")
      Cc: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      d3329454
  9. 18 3月, 2020 6 次提交