1. 20 Jun, 2022 14 commits
  2. 15 Jun, 2022 10 commits
  3. 10 Jun, 2022 8 commits
  4. 09 Jun, 2022 8 commits
    • P
      KVM: x86: SVM: fix nested PAUSE filtering when L0 intercepts PAUSE · e3cdaab5
      Paolo Bonzini committed
      Commit 74fd41ed ("KVM: x86: nSVM: support PAUSE filtering when L0
      doesn't intercept PAUSE") introduced passthrough support for nested pause
      filtering when the host doesn't intercept PAUSE (interception is disabled
      either with the kvm module param or with '-overcommit cpu-pm=on').
      
      Before this commit, L1 KVM didn't intercept PAUSE at all; afterwards,
      the feature was exposed as supported by KVM cpuid unconditionally, thus
      L1 could try to use it even when the L0 KVM can't really support it.
      
      In this case the fallback caused KVM to intercept each PAUSE instruction;
      in some cases, such intercept can slow down the nested guest so much
      that it can fail to boot.  Instead, before the problematic commit KVM
      was already setting both thresholds to 0 in vmcb02, but after the first
      userspace VM exit shrink_ple_window was called and would reset the
      pause_filter_count to the default value.
      
      To fix this, change the fallback strategy - ignore the guest threshold
      values, but use/update the host threshold values unless the guest
      specifically requests disabling PAUSE filtering (either simple or
      advanced).
      
      Also fix a minor bug: on nested VM exit, when the PAUSE filter counters
      were copied back to vmcb01, a dirty bit was not set.
      
      Thanks a lot to Suravee Suthikulpanit for debugging this!
      
      Fixes: 74fd41ed ("KVM: x86: nSVM: support PAUSE filtering when L0 doesn't intercept PAUSE")
      Reported-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Tested-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Co-developed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20220518072709.730031-1-mlevitsk@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      e3cdaab5
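
      The fallback strategy described in this commit message can be sketched
      in userspace C roughly as follows. This is only an illustration, not the
      kernel code: the struct, the function name and the boolean parameters
      are hypothetical stand-ins, and treating a zero value from L1 as
      "filtering disabled" is an assumption made for the sketch.

      ```c
      #include <stdbool.h>
      #include <stdint.h>

      /* Hypothetical, simplified stand-in for the two PAUSE filter fields of a VMCB. */
      struct pause_filter {
          uint16_t count;   /* models pause_filter_count */
          uint16_t thresh;  /* models pause_filter_thresh */
      };

      /*
       * Pick the values to program for the nested guest (vmcb02 in the commit).
       * host = values L0 currently uses (vmcb01), guest = values L1 asked for.
       */
      static struct pause_filter pick_nested_pause_filter(bool l0_intercepts_pause,
                                                          bool l1_intercepts_pause,
                                                          struct pause_filter host,
                                                          struct pause_filter guest)
      {
          struct pause_filter out;

          if (!l0_intercepts_pause)
              return guest;       /* passthrough: L1's values are used as-is */

          out = host;             /* fallback: start from the host thresholds */
          if (l1_intercepts_pause) {
              /* Assumption: a zero from L1 is a request to disable filtering. */
              if (guest.count == 0)
                  out.count = 0;
              if (guest.thresh == 0)
                  out.thresh = 0;
          }
          return out;
      }
      ```

      With host values {3000, 128} and guest values {50, 0}, the sketch keeps
      the host count of 3000 and zeroes only the threshold when both L0 and L1
      intercept PAUSE.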
    • M
      KVM: x86: SVM: drop preempt-safe wrappers for avic_vcpu_load/put · ba8ec273
      Maxim Levitsky committed
      Now that these functions are always called with preemption disabled,
      remove the preempt_disable()/preempt_enable() pair inside them.
      
      No functional change intended.
      Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20220606180829.102503-8-mlevitsk@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      ba8ec273
    • M
      KVM: x86: disable preemption while updating apicv inhibition · 66c768d3
      Maxim Levitsky committed
      Currently nothing prevents preemption in kvm_vcpu_update_apicv.
      
      On SVM, if preemption happens after we update
      vcpu->arch.apicv_active, the preemption itself will
      'update' the inhibition, since the AVIC will first be disabled
      on vCPU unload and then enabled again when the current task
      is loaded again.
      
      Then we will try to update it again, which will lead to a warning
      in __avic_vcpu_load, that the AVIC is already enabled.
      
      Fix this by disabling preemption in this code.
      Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20220606180829.102503-6-mlevitsk@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      66c768d3
    • M
      KVM: x86: SVM: fix avic_kick_target_vcpus_fast · 603ccef4
      Maxim Levitsky committed
      There are two issues in avic_kick_target_vcpus_fast:
      
      1. It is legal to issue an IPI request with APIC_DEST_NOSHORT
         and a physical destination of 0xFF (or 0xFFFFFFFF in case of x2apic),
         which must be treated as a broadcast destination.
      
         Fix this by explicitly checking for it.
         Also don’t use ‘index’ in this case as it gives no new information.
      
      2. It is legal to issue a logical IPI request to more than one target.
         Index field only provides index in physical id table of first
         such target and therefore can't be used before we are sure
         that only a single target was addressed.
      
         Instead, parse the ICRL/ICRH, double check that a unicast interrupt
         was requested, and use that info to figure out the physical id
         of the target vCPU.
         At that point there is no need to use the index field as well.
      
      In addition to fixing the above issues, also skip the call to
      kvm_apic_match_dest.
      
      It is possible to do this now, because now as long as AVIC is not
      inhibited, it is guaranteed that none of the vCPUs changed their
      apic id from its default value.
      
      This fixes booting Windows guests with AVIC enabled, because Windows
      uses an IPI with a 0xFF destination and no destination shorthand.
      
      Fixes: 7223fd2d ("KVM: SVM: Use target APIC ID to complete AVIC IRQs when possible")
      Cc: stable@vger.kernel.org
      Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20220606180829.102503-5-mlevitsk@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      603ccef4
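
      The two checks described in this commit message can be illustrated with
      a small standalone sketch. It is not the kernel function: the helper
      names are hypothetical, the macros are local definitions that follow the
      architectural ICR layout (shorthand in bits 19:18, destination mode in
      bit 11, xAPIC destination in ICRH bits 31:24), and the logical case only
      covers an 8-bit flat-mode destination bitmap.

      ```c
      #include <stdbool.h>
      #include <stdint.h>

      #define ICR_SHORTHAND_MASK 0x000C0000u  /* ICRL bits 19:18, 0 = no shorthand */
      #define ICR_DEST_LOGICAL   0x00000800u  /* ICRL bit 11, 0 = physical mode */
      #define XAPIC_BROADCAST    0xFFu
      #define X2APIC_BROADCAST   0xFFFFFFFFu

      /*
       * Hypothetical helper: returns true and stores the target's physical
       * APIC ID only for a physical-mode unicast without shorthand. A false
       * result means "not handled here, take the slow path".
       */
      static bool icr_physical_unicast_target(uint32_t icrl, uint32_t icrh,
                                              bool x2apic, uint32_t *phys_id)
      {
          uint32_t dest;

          if (icrl & ICR_SHORTHAND_MASK)
              return false;               /* a destination shorthand is in use */
          if (icrl & ICR_DEST_LOGICAL)
              return false;               /* logical mode is handled separately */

          dest = x2apic ? icrh : ((icrh >> 24) & 0xFFu);
          if (dest == (x2apic ? X2APIC_BROADCAST : XAPIC_BROADCAST))
              return false;               /* broadcast, not a single target */

          *phys_id = dest;
          return true;
      }

      /*
       * Hypothetical helper for a flat-mode logical destination: returns the
       * bit index of the only addressed target, or -1 if zero or several
       * targets are addressed.
       */
      static int logical_flat_single_target(uint8_t bitmap)
      {
          int bit;

          if (bitmap == 0 || (bitmap & (bitmap - 1)) != 0)
              return -1;
          for (bit = 0; !(bitmap & 1u); bit++)
              bitmap >>= 1;
          return bit;
      }
      ```

      In this sketch an xAPIC request with ICRH 0xFF000000 and no shorthand is
      rejected as a broadcast, which is the case the commit message attributes
      to Windows guests.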
    • M
      KVM: x86: SVM: remove avic's broken code that updated APIC ID · f5f9089f
      Maxim Levitsky committed
      AVIC is now inhibited if the guest changes the apic id,
      and therefore this code is no longer needed.
      
      There are several ways this code was broken, including:
      
      1. a vCPU was only allowed to change its apic id to an apic id
      of an existing vCPU.
      
      2. After such a change, the vCPU whose apic id entry was overwritten
      could not correctly change its own apic id, because its own
      entry was already overwritten.
      Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20220606180829.102503-4-mlevitsk@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      f5f9089f
    • M
      KVM: x86: inhibit APICv/AVIC on changes to APIC ID or APIC base · 3743c2f0
      Maxim Levitsky committed
      Neither of these settings should be changed by the guest, and it is
      a burden to support such changes in the acceleration code, so just
      inhibit APICv/AVIC instead.
      Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20220606180829.102503-3-mlevitsk@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      3743c2f0
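
      The idea of inhibiting acceleration on such changes can be modeled with
      a bitmask of reasons. The sketch below is a userspace illustration under
      stated assumptions: the enum, struct and function names are
      hypothetical, the default APIC ID is assumed to equal the vCPU ID, and
      the default APIC base is assumed to be the architectural 0xFEE00000.

      ```c
      #include <stdbool.h>
      #include <stdint.h>

      #define DEFAULT_APIC_BASE 0xFEE00000ull  /* architectural xAPIC default */

      /* Illustrative subset of inhibit reasons; bit positions are hypothetical. */
      enum inhibit_reason {
          INHIBIT_APIC_ID_MODIFIED,
          INHIBIT_APIC_BASE_MODIFIED,
      };

      struct vm_state {
          unsigned long inhibit_reasons;  /* one bit per active reason */
      };

      /* Acceleration is allowed only while no reason is set. */
      static bool apicv_activated(const struct vm_state *vm)
      {
          return vm->inhibit_reasons == 0;
      }

      static void set_inhibit(struct vm_state *vm, enum inhibit_reason reason)
      {
          vm->inhibit_reasons |= 1ul << reason;
      }

      /* Called when the guest writes its APIC ID. */
      static void on_apic_id_write(struct vm_state *vm, uint32_t vcpu_id,
                                   uint32_t new_apic_id)
      {
          if (new_apic_id != vcpu_id)
              set_inhibit(vm, INHIBIT_APIC_ID_MODIFIED);
      }

      /* Called when the guest relocates the APIC base address. */
      static void on_apic_base_write(struct vm_state *vm, uint64_t new_base)
      {
          if (new_base != DEFAULT_APIC_BASE)
              set_inhibit(vm, INHIBIT_APIC_BASE_MODIFIED);
      }
      ```

      The sketch never clears a reason once it is set, which keeps the model
      simple; whether and when a reason may be cleared is outside its scope.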
    • M
      KVM: x86: document AVIC/APICv inhibit reasons · a9603ae0
      Maxim Levitsky committed
      These days there are too many AVIC/APICv inhibit
      reasons, and it doesn't hurt to have some documentation
      for them.
      Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20220606180829.102503-2-mlevitsk@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      a9603ae0
    • Y
      KVM: x86/mmu: Set memory encryption "value", not "mask", in shadow PDPTRs · d2263de1
      Yuan Yao committed
      Assign shadow_me_value, not shadow_me_mask, to PAE root entries,
      a.k.a. shadow PDPTRs, when host memory encryption is supported.  The
      "mask" is the set of all possible memory encryption bits, e.g. MKTME
      KeyIDs, whereas "value" holds the actual value that needs to be
      stuffed into host page tables.
      
      Using shadow_me_mask results in a failed VM-Entry due to setting
      reserved PA bits in the PDPTRs, and ultimately causes an OOPS due to
      physical addresses with non-zero MKTME bits sending to_shadow_page()
      into the weeds:
      
      set kvm_intel.dump_invalid_vmcs=1 to dump internal KVM state.
      BUG: unable to handle page fault for address: ffd43f00063049e8
      PGD 86dfd8067 P4D 0
      Oops: 0000 [#1] PREEMPT SMP
      RIP: 0010:mmu_free_root_page+0x3c/0x90 [kvm]
       kvm_mmu_free_roots+0xd1/0x200 [kvm]
       __kvm_mmu_unload+0x29/0x70 [kvm]
       kvm_mmu_unload+0x13/0x20 [kvm]
       kvm_arch_destroy_vm+0x8a/0x190 [kvm]
       kvm_put_kvm+0x197/0x2d0 [kvm]
       kvm_vm_release+0x21/0x30 [kvm]
       __fput+0x8e/0x260
       ____fput+0xe/0x10
       task_work_run+0x6f/0xb0
       do_exit+0x327/0xa90
       do_group_exit+0x35/0xa0
       get_signal+0x911/0x930
       arch_do_signal_or_restart+0x37/0x720
       exit_to_user_mode_prepare+0xb2/0x140
       syscall_exit_to_user_mode+0x16/0x30
       do_syscall_64+0x4e/0x90
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Fixes: e54f1ff2 ("KVM: x86/mmu: Add shadow_me_value and repurpose shadow_me_mask")
      Signed-off-by: Yuan Yao <yuan.yao@intel.com>
      Reviewed-by: Kai Huang <kai.huang@intel.com>
      Message-Id: <20220608012015.19566-1-yuan.yao@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      d2263de1
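
      The difference between the "mask" and the "value" described in this
      commit message can be shown with plain integer arithmetic. The sketch
      below is illustrative only: the helper names are hypothetical, and the
      KeyID layout used in the usage note (six bits at positions 51:46, with a
      value of 0) is an assumed example rather than a statement about any
      particular CPU.

      ```c
      #include <stdbool.h>
      #include <stdint.h>

      #define PT_PRESENT_MASK 1ull  /* bit 0: entry is present */

      /*
       * Build a PAE root entry (shadow PDPTR) from the physical address of
       * the root page plus the memory encryption bits to stuff into it.
       */
      static uint64_t make_pae_root_entry(uint64_t root_pa, uint64_t me_bits)
      {
          return root_pa | PT_PRESENT_MASK | me_bits;
      }

      /* Report whether an entry carries any bit covered by me_mask. */
      static bool has_me_bits(uint64_t entry, uint64_t me_mask)
      {
          return (entry & me_mask) != 0;
      }
      ```

      With an assumed mask of 0x3F << 46 and a value of 0, passing the value
      yields an entry with no encryption bits set, while passing the mask sets
      every KeyID bit in the entry, which corresponds to the reserved-bit
      failure the commit message describes.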