1. 03 Nov, 2022 4 commits
    • KVM: VMX: Enable Notify VM exit · 0cbdfd9b
      Tao Xu committed
      mainline inclusion
      from mainline-v6.0-rc1
      commit 2f4073e0
      category: feature
      feature: Notify VM exit
      bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5PAJ5
      CVE: N/A
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
      commit/?id=2f4073e0
      
      Intel-SIG: commit 2f4073e0 ("KVM: VMX: Enable Notify VM exit")
      
      -------------------------------------
      
      KVM: VMX: Enable Notify VM exit
      
      A malicious virtual machine can cause a CPU to get stuck because no
      event window opens up, e.g., an infinite loop in microcode on nested
      #AC (CVE-2015-5307). Without an event window, no event (NMI, SMI or
      IRQ) can be delivered, which leaves the CPU unavailable to the host
      and to other VMs.

      A VMM can enable notify VM exit, which generates a VM exit if no event
      window occurs in VMX non-root mode for a specified amount of time (the
      notify window).
      
      Feature enabling:
      - The new vmcs field SECONDARY_EXEC_NOTIFY_VM_EXITING is introduced to
        enable this feature. VMM can set NOTIFY_WINDOW vmcs field to adjust
        the expected notify window.
      - Add a new KVM capability KVM_CAP_X86_NOTIFY_VMEXIT so that user space
        can query and enable this feature in per-VM scope. The argument is a
        64-bit value: bits 63:32 are used for the notify window, and bits 31:0
        are for flags. Currently supported flags:
        - KVM_X86_NOTIFY_VMEXIT_ENABLED: enable the feature with the notify
          window provided.
        - KVM_X86_NOTIFY_VMEXIT_USER: exit to userspace once the exits happen.
      - It's safe to even set notify window to zero since an internal hardware
        threshold is added to vmcs.notify_window.
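
The argument layout described above can be sketched as follows. This is a minimal illustration in Python, not KVM code: the helper names are hypothetical, and the flag values (bit 0 and bit 1) are assumptions taken to match the upstream uapi header.

```python
from typing import Tuple

# Assumed flag values; the commit message names the flags but not their bits.
KVM_X86_NOTIFY_VMEXIT_ENABLED = 1 << 0
KVM_X86_NOTIFY_VMEXIT_USER = 1 << 1


def pack_notify_vmexit_arg(window: int, flags: int) -> int:
    """Build the 64-bit argument: bits 63:32 = notify window, bits 31:0 = flags."""
    if not 0 <= window <= 0xFFFFFFFF:
        raise ValueError("notify window must fit in 32 bits")
    if not 0 <= flags <= 0xFFFFFFFF:
        raise ValueError("flags must fit in 32 bits")
    return (window << 32) | flags


def unpack_notify_vmexit_arg(arg: int) -> Tuple[int, int]:
    """Split the 64-bit argument back into (window, flags)."""
    return (arg >> 32) & 0xFFFFFFFF, arg & 0xFFFFFFFF
```

A window of zero with only the ENABLED flag set packs to the value 1, which matches the note that a zero notify window is acceptable.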
      
      VM exit handling:
      - Introduce a vCPU stat, notify_window_exits, to record the count of
        notify VM exits and expose it through debugfs.
      - A notify VM exit can happen incident to delivery of a vectored event.
        Allow it in KVM.
      - Exit to userspace unconditionally for handling when VM_CONTEXT_INVALID
        bit is set.
      
      Nested handling:
      - Nested notify VM exits are not supported yet. Keep the same notify
        window control in vmcs02 as vmcs01, so that L1 can't escape the
        restriction of notify VM exits through launching L2 VM.
      
      Notify VM exit is defined in the latest Intel Architecture Instruction
      Set Extensions Programming Reference, chapter 9.2.

      Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com>
      Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
      Signed-off-by: Tao Xu <tao3.xu@intel.com>
      Co-developed-by: Chenyi Qiang <chenyi.qiang@intel.com>
      Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
      Message-Id: <20220524135624.22988-5-chenyi.qiang@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Aichun Shi <aichun.shi@intel.com>
    • KVM: x86: Extend KVM_{G,S}ET_VCPU_EVENTS to support pending triple fault · af5a4488
      Chenyi Qiang committed
      mainline inclusion
      from mainline-v6.0-rc1
      commit ed235117
      category: feature
      feature: Notify VM exit
      bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5PAJ5
      CVE: N/A
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
      commit/?id=ed235117
      
      Intel-SIG: commit ed235117 ("KVM: x86: Extend KVM_{G,S}ET_VCPU_EVENTS to support pending triple fault")
      
      -------------------------------------
      
      KVM: x86: Extend KVM_{G,S}ET_VCPU_EVENTS to support pending triple fault
      
      For a triple fault synthesized by KVM, e.g. via the RSM path or
      nested_vmx_abort(), if KVM exits to userspace before the request is
      serviced, userspace could migrate the VM and lose the triple fault.
      
      Extend KVM_{G,S}ET_VCPU_EVENTS to support a pending triple fault with a
      new event flag, KVM_VCPUEVENT_VALID_TRIPLE_FAULT, so that userspace can
      save and restore the triple fault event. This extension is guarded by a
      new KVM capability, KVM_CAP_TRIPLE_FAULT_EVENT.
      
      Note that in the set_vcpu_events path, userspace is able to set or clear
      the triple fault request through the triple_fault.pending field.
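
The save and restore flow described above can be modelled roughly as follows. This is a toy sketch, not kernel code: `ToyVcpu` and both helper functions are hypothetical, the capability check is omitted, and the flag is written as KVM_VCPUEVENT_VALID_TRIPLE_FAULT with value 0x20 on the assumption that this matches the upstream uapi header.

```python
from typing import Tuple

# Assumed name and value of the "triple fault field is valid" flag.
KVM_VCPUEVENT_VALID_TRIPLE_FAULT = 0x00000020


class ToyVcpu:
    """Hypothetical stand-in for a vCPU that may have a triple fault queued."""

    def __init__(self) -> None:
        self.triple_fault_requested = False


def get_vcpu_events(vcpu: ToyVcpu) -> Tuple[int, bool]:
    """Report the pending triple fault and mark the field as valid."""
    return KVM_VCPUEVENT_VALID_TRIPLE_FAULT, vcpu.triple_fault_requested


def set_vcpu_events(vcpu: ToyVcpu, flags: int, pending: bool) -> None:
    """Set or clear the request, but only when userspace marks the field valid."""
    if flags & KVM_VCPUEVENT_VALID_TRIPLE_FAULT:
        vcpu.triple_fault_requested = pending
```

In this model, a migration reads the events from the source vCPU and writes them to the destination, so a triple fault that was queued but not yet serviced survives the move.
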

      Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
      Message-Id: <20220524135624.22988-2-chenyi.qiang@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Aichun Shi <aichun.shi@intel.com>
    • KVM: VMX: Enable bus lock VM exit · 26bba696
      Chenyi Qiang committed
      mainline inclusion
      from mainline-v5.12-rc1
      commit fe6b6bc8
      category: feature
      feature: KVM Bus Lock VM Exit
      bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5RJCB
      CVE: N/A
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
      commit/?id=fe6b6bc8
      
      Intel-SIG: commit fe6b6bc8 ("KVM: VMX: Enable bus lock VM exit")
      
      -------------------------------------
      
      KVM: VMX: Enable bus lock VM exit
      
      A virtual machine can exploit bus locks to degrade system performance.
      A bus lock can be caused by a split-locked access to writeback (WB)
      memory or by using locks on uncacheable (UC) memory. A bus lock is
      typically >1000 cycles slower than an atomic operation within a cache
      line. It also disrupts performance on other cores, which must wait for
      the bus lock to be released before their memory operations can
      complete.

      To address the threat, bus lock VM exit is introduced to notify the VMM
      when a bus lock was acquired, allowing it to enforce throttling or other
      policy-based mitigations.
      
      A VMM can enable VM exits due to bus locks by setting a new "Bus Lock
      Detection" VM-execution control (bit 30 of the Secondary Processor-based
      VM-execution controls). If delivery of this VM exit was preempted by a
      higher-priority VM exit (e.g. EPT misconfiguration, EPT violation, APIC
      access VM exit, APIC write VM exit, exception bitmap exiting), bit 26 of
      the exit reason field in the VMCS is set to 1.
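
As a rough illustration of the preempted case, the snippet below decodes a 32-bit exit reason value. It is a sketch rather than kernel code: the function name is hypothetical, bit 26 comes from the text above, and treating the low 16 bits as the basic exit reason (with 48 standing for an EPT violation) is an assumption based on the Intel SDM layout.

```python
from typing import Tuple

BUS_LOCK_DETECTED_BIT = 26   # from the commit message
BASIC_REASON_MASK = 0xFFFF   # assumed width of the basic exit reason
EXIT_REASON_EPT_VIOLATION = 48  # assumed SDM value, used only as sample input


def decode_exit_reason(exit_reason: int) -> Tuple[int, bool]:
    """Return (basic exit reason, whether a bus lock was also detected)."""
    basic = exit_reason & BASIC_REASON_MASK
    bus_lock_detected = bool(exit_reason & (1 << BUS_LOCK_DETECTED_BIT))
    return basic, bus_lock_detected
```

An EPT violation that preempted a bus lock exit would then decode as the EPT violation reason with the bus lock indication set, so both events can be handled.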
      
      In the current implementation, KVM exposes this capability through
      KVM_CAP_X86_BUS_LOCK_EXIT. Userspace can get the supported mode bitmap
      (i.e. off and exit) and enable it explicitly (it is disabled by default).
      If KVM detects a bus lock in the guest, it exits to user space even when
      the current exit reason is handled by KVM internally. A new flag,
      KVM_RUN_BUS_LOCK, is set in vcpu->run->flags to inform user space that
      a bus lock was detected in the guest.
      
      Documentation for Bus Lock VM exit is now available in the latest "Intel
      Architecture Instruction Set Extensions Programming Reference".
      
      Document Link:
      https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

      Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com>
      Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
      Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
      Message-Id: <20201106090315.18606-4-chenyi.qiang@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Aichun Shi <aichun.shi@intel.com>
    • KVM: X86: Reset the vcpu->run->flags at the beginning of vcpu_run · 580aa8e4
      Chenyi Qiang committed
      mainline inclusion
      from mainline-v5.12-rc1
      commit 15aad3be
      category: feature
      feature: KVM Bus Lock VM Exit
      bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5RJCB
      CVE: N/A
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
      commit/?id=15aad3be
      
      Intel-SIG: commit 15aad3be ("KVM: X86: Reset the vcpu->run->flags at the beginning of vcpu_run")
      
      -------------------------------------
      
      KVM: X86: Reset the vcpu->run->flags at the beginning of vcpu_run
      
      Reset vcpu->run->flags at the beginning of kvm_arch_vcpu_ioctl_run.
      This avoids requiring every chunk of code that sets a flag to also
      clear it, a pattern that increases the odds of missing a case and
      ending up with a flag in an undefined state.
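
The difference between resetting once at entry and relying on each path to clear its own flag can be shown with a toy model. This is not kernel code: `ToyRun` and `vcpu_run` are hypothetical, and the flag value (bit 1) is an assumption.

```python
# Assumed flag value, used only to have something to set in the model.
KVM_RUN_BUS_LOCK = 1 << 1


class ToyRun:
    """Hypothetical stand-in for the shared run structure seen by userspace."""

    def __init__(self) -> None:
        self.flags = 0


def vcpu_run(run: ToyRun, bus_lock_detected: bool, reset_at_entry: bool = True) -> int:
    """Model one run: optionally clear all flags first, then set what applies."""
    if reset_at_entry:
        run.flags = 0
    if bus_lock_detected:
        run.flags |= KVM_RUN_BUS_LOCK
    return run.flags
```

With `reset_at_entry=False`, a flag set during one run is still visible after the next run even though no bus lock occurred, which is the stale state the commit avoids.
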

      Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
      Message-Id: <20201106090315.18606-3-chenyi.qiang@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Aichun Shi <aichun.shi@intel.com>
  2. 02 Nov, 2022 1 commit
  3. 08 Oct, 2022 3 commits
  4. 20 Sep, 2022 2 commits
  5. 01 Sep, 2022 1 commit
  6. 19 Jul, 2022 1 commit
    • KVM: x86/mmu: Resolve nx_huge_pages when kvm.ko is loaded · 85fc72e3
      Sean Christopherson committed
      stable inclusion
      from stable-v5.10.112
      commit 342454231ee5f2c2782f5510cab2e7a968486fef
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5HL0X
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=342454231ee5f2c2782f5510cab2e7a968486fef
      
      --------------------------------
      
      commit 1d0e8480 upstream.
      
      Resolve nx_huge_pages to true/false when kvm.ko is loaded. Leaving it as
      -1 is technically undefined behavior when its value is read out by
      param_get_bool(), as boolean values are supposed to be '0' or '1'.
      
      Alternatively, KVM could define a custom getter for the param, but the
      auto value doesn't depend on the vendor module in any way, and printing
      "auto" would be unnecessarily unfriendly to the user.
      
      In addition to fixing the undefined behavior, resolving the auto value
      also fixes the scenario where the auto value resolves to N and no vendor
      module is loaded.  Previously, -1 would result in Y being printed even
      though KVM would ultimately disable the mitigation.
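
The effect of reading an unresolved -1 as a boolean can be reproduced with a short sketch. It is an illustration only: the function names are hypothetical, and it assumes a little-endian 32-bit int whose first byte is what a bool-sized read would observe.

```python
import struct


def first_byte_of_int(value: int) -> int:
    """Return the lowest-addressed byte of a little-endian 32-bit int."""
    return struct.pack("<i", value)[0]


def show_as_bool_param(value: int) -> str:
    """Mimic a bool getter that prints Y for any nonzero byte, N otherwise."""
    return "Y" if first_byte_of_int(value) else "N"


def resolve_auto(value: int, auto_mode: bool) -> int:
    """Replace the -1 'auto' placeholder with a concrete 0 or 1."""
    return int(auto_mode) if value == -1 else value
```

Under these assumptions an unresolved -1 reads back as the byte 255 and prints as Y, while resolving it first prints N whenever the auto mode evaluates to false.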
      
      Rename the existing MMU module init/exit helpers to clarify that they're
      invoked with respect to the vendor module, and add comments to document
      why KVM has two separate "module init" flows.
      
        =========================================================================
        UBSAN: invalid-load in kernel/params.c:320:33
        load of value 255 is not a valid value for type '_Bool'
        CPU: 6 PID: 892 Comm: tail Not tainted 5.17.0-rc3+ #799
        Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
        Call Trace:
         <TASK>
         dump_stack_lvl+0x34/0x44
         ubsan_epilogue+0x5/0x40
         __ubsan_handle_load_invalid_value.cold+0x43/0x48
         param_get_bool.cold+0xf/0x14
         param_attr_show+0x55/0x80
         module_attr_show+0x1c/0x30
         sysfs_kf_seq_show+0x93/0xc0
         seq_read_iter+0x11c/0x450
         new_sync_read+0x11b/0x1a0
         vfs_read+0xf0/0x190
         ksys_read+0x5f/0xe0
         do_syscall_64+0x3b/0xc0
         entry_SYSCALL_64_after_hwframe+0x44/0xae
         </TASK>
        =========================================================================
      
      Fixes: b8e8c830 ("kvm: mmu: ITLB_MULTIHIT mitigation")
      Cc: stable@vger.kernel.org
      Reported-by: Bruno Goncalves <bgoncalv@redhat.com>
      Reported-by: Jan Stancek <jstancek@redhat.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20220331221359.3912754-1-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
  7. 18 Jul, 2022 1 commit
  8. 08 Jul, 2022 2 commits
  9. 06 Jul, 2022 1 commit
  10. 28 Jun, 2022 1 commit
  11. 07 Jun, 2022 1 commit
  12. 10 May, 2022 2 commits
  13. 28 Apr, 2022 1 commit
  14. 19 Apr, 2022 2 commits
  15. 14 Jan, 2022 1 commit
  16. 07 Jan, 2022 1 commit
  17. 30 Dec, 2021 3 commits
  18. 15 Nov, 2021 2 commits
  19. 19 Oct, 2021 3 commits
  20. 15 Oct, 2021 4 commits
  21. 13 Oct, 2021 1 commit
  22. 06 Jul, 2021 2 commits
    • KVM: X86: Fix x86_emulator slab cache leak · f9a6de85
      Wanpeng Li committed
      stable inclusion
      from stable-5.10.46
      commit 3a9934d6b8dd8a91d61ed2d0d538fa27cb9192a3
      bugzilla: 168323
      CVE: NA
      
      --------------------------------
      
      commit dfdc0a71 upstream.
      
      Commit c9b8b07c ("KVM: x86: Dynamically allocate per-vCPU emulation
      context") allocates the per-vCPU emulation context dynamically. However,
      the x86_emulator slab cache still exists after destroying the VM and
      unloading the kvm module, as shown below.
      
      grep x86_emulator /proc/slabinfo
      x86_emulator          36     36   2672   12    8 : tunables    0    0    0 : slabdata      3      3      0
      
      This patch fixes this slab cache leak by destroying the x86_emulator slab cache
      when the kvm module is unloaded.
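
To check for a leftover cache programmatically, a /proc/slabinfo line like the one above can be parsed as sketched here. The function name is hypothetical, and the column names assume the slabinfo version 2.1 header order.

```python
from typing import Dict, Union

# Columns that follow the cache name, before the first colon (assumed order).
_FIELDS = ("active_objs", "num_objs", "objsize", "objperslab", "pagesperslab")


def parse_slabinfo_line(line: str) -> Dict[str, Union[str, int]]:
    """Parse the cache name and object statistics from one slabinfo line."""
    head = line.split(":", 1)[0].split()
    entry: Dict[str, Union[str, int]] = {"name": head[0]}
    for field, value in zip(_FIELDS, head[1:]):
        entry[field] = int(value)
    return entry
```

For the sample line, this yields the cache name x86_emulator with 36 objects of 2672 bytes each, which is the leftover cache the patch destroys on module unload.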
      
      Fixes: c9b8b07c (KVM: x86: Dynamically allocate per-vCPU emulation context)
      Cc: stable@vger.kernel.org
      Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
      Message-Id: <1623387573-5969-1-git-send-email-wanpengli@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Acked-by: Weilong Chen <chenweilong@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
    • KVM: x86: Immediately reset the MMU context when the SMM flag is cleared · a557ec4c
      Sean Christopherson committed
      stable inclusion
      from stable-5.10.46
      commit 669a8866e468fd020d34eb00e08cb41d3774b71b
      bugzilla: 168323
      CVE: NA
      
      --------------------------------
      
      commit 78fcb2c9 upstream.
      
      Immediately reset the MMU context when the vCPU's SMM flag is cleared so
      that the SMM flag in the MMU role is always synchronized with the vCPU's
      flag.  If RSM fails (which isn't correctly emulated), KVM will bail
      without calling post_leave_smm() and leave the MMU in a bad state.
      
      The bad MMU role can lead to a NULL pointer dereference when grabbing a
      shadow page's rmap for a page fault as the initial lookups for the gfn
      will happen with the vCPU's SMM flag (=0), whereas the rmap lookup will
      use the shadow page's SMM flag, which comes from the MMU (=1).  SMM has
      an entirely different set of memslots, and so the initial lookup can find
      a memslot (SMM=0) and then explode on the rmap memslot lookup (SMM=1).
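
The mismatch between the two lookups can be pictured with a toy model. This is not kernel code: the data structure, the slot name and both functions are hypothetical, and the two memslot sets are reduced to dictionaries keyed by guest frame number.

```python
from typing import Dict, Optional, Tuple

# Hypothetical memslot sets: one for SMM=0 and a separate, empty one for SMM=1.
MEMSLOTS: Dict[int, Dict[int, str]] = {
    0: {0x100: "normal-memslot"},
    1: {},
}


def lookup_memslot(smm: int, gfn: int) -> Optional[str]:
    """Return the memslot covering gfn in the given address space, if any."""
    return MEMSLOTS[smm].get(gfn)


def page_fault_lookups(vcpu_smm: int, mmu_role_smm: int, gfn: int) -> Tuple[Optional[str], Optional[str]]:
    """Initial lookup uses the vCPU flag; the rmap lookup uses the MMU role."""
    return lookup_memslot(vcpu_smm, gfn), lookup_memslot(mmu_role_smm, gfn)
```

With a stale MMU role (vCPU flag 0, role 1) the first lookup finds a slot and the second returns nothing, which corresponds to the NULL dereference; once the role is kept in sync, both lookups agree.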
      
        general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN
        KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
        CPU: 1 PID: 8410 Comm: syz-executor382 Not tainted 5.13.0-rc5-syzkaller #0
        Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
        RIP: 0010:__gfn_to_rmap arch/x86/kvm/mmu/mmu.c:935 [inline]
        RIP: 0010:gfn_to_rmap+0x2b0/0x4d0 arch/x86/kvm/mmu/mmu.c:947
        Code: <42> 80 3c 20 00 74 08 4c 89 ff e8 f1 79 a9 00 4c 89 fb 4d 8b 37 44
        RSP: 0018:ffffc90000ffef98 EFLAGS: 00010246
        RAX: 0000000000000000 RBX: ffff888015b9f414 RCX: ffff888019669c40
        RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001
        RBP: 0000000000000001 R08: ffffffff811d9cdb R09: ffffed10065a6002
        R10: ffffed10065a6002 R11: 0000000000000000 R12: dffffc0000000000
        R13: 0000000000000003 R14: 0000000000000001 R15: 0000000000000000
        FS:  000000000124b300(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 0000000000000000 CR3: 0000000028e31000 CR4: 00000000001526e0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        Call Trace:
         rmap_add arch/x86/kvm/mmu/mmu.c:965 [inline]
         mmu_set_spte+0x862/0xe60 arch/x86/kvm/mmu/mmu.c:2604
         __direct_map arch/x86/kvm/mmu/mmu.c:2862 [inline]
         direct_page_fault+0x1f74/0x2b70 arch/x86/kvm/mmu/mmu.c:3769
         kvm_mmu_do_page_fault arch/x86/kvm/mmu.h:124 [inline]
         kvm_mmu_page_fault+0x199/0x1440 arch/x86/kvm/mmu/mmu.c:5065
         vmx_handle_exit+0x26/0x160 arch/x86/kvm/vmx/vmx.c:6122
         vcpu_enter_guest+0x3bdd/0x9630 arch/x86/kvm/x86.c:9428
         vcpu_run+0x416/0xc20 arch/x86/kvm/x86.c:9494
         kvm_arch_vcpu_ioctl_run+0x4e8/0xa40 arch/x86/kvm/x86.c:9722
         kvm_vcpu_ioctl+0x70f/0xbb0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3460
         vfs_ioctl fs/ioctl.c:51 [inline]
         __do_sys_ioctl fs/ioctl.c:1069 [inline]
         __se_sys_ioctl+0xfb/0x170 fs/ioctl.c:1055
         do_syscall_64+0x3f/0xb0 arch/x86/entry/common.c:47
         entry_SYSCALL_64_after_hwframe+0x44/0xae
        RIP: 0033:0x440ce9
      
      Cc: stable@vger.kernel.org
      Reported-by: syzbot+fb0b6a7e8713aeb0319c@syzkaller.appspotmail.com
      Fixes: 9ec19493 ("KVM: x86: clear SMM flags before loading state while leaving SMM")
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20210609185619.992058-2-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Acked-by: Weilong Chen <chenweilong@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>