- 26 January 2021, 9 commits
-
Committed by Paolo Bonzini
VMX also uses KVM_REQ_GET_NESTED_STATE_PAGES for the Hyper-V eVMCS, which may need to be loaded outside guest mode. Therefore we cannot WARN in that case. However, that part of nested_get_vmcs12_pages is _not_ needed at vmentry time. Split it out of KVM_REQ_GET_NESTED_STATE_PAGES handling, so that both vmentry and migration (and in the latter case, independent of is_guest_mode) do the parts that are needed.

Cc: <stable@vger.kernel.org> # 5.10.x: f2c7ef3b: KVM: nSVM: cancel KVM_REQ_GET_NESTED_STATE_PAGES
Cc: <stable@vger.kernel.org> # 5.10.x
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Sean Christopherson
Revert the dirty/available tracking of GPRs now that KVM copies the GPRs to the GHCB on any post-VMGEXIT VMRUN, even if a GPR is not dirty. Per commit de3cd117 ("KVM: x86: Omit caching logic for always-available GPRs"), tracking for GPRs noticeably impacts KVM's code footprint.

This reverts commit 1c04d8c9.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210122235049.3107620-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Sean Christopherson
Drop the per-GPR dirty checks when synchronizing GPRs to the GHCB: the GPRs' dirty bits are set from time zero and never cleared, i.e. will always be seen as dirty. The obvious alternative would be to clear the dirty bits when appropriate, but removing the dirty checks is desirable as it allows reverting GPR dirty+available tracking, which adds overhead to all flavors of x86 VMs.

Note, unconditionally writing the GPRs in the GHCB is tacitly allowed by the GHCB spec, which allows the hypervisor (or guest) to provide unnecessary info; it's the guest's responsibility to consume only what it needs (the hypervisor is untrusted after all). The guest and hypervisor can supply additional state if desired but must not rely on that additional state being provided.

Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Fixes: 291bd20d ("KVM: SVM: Add initial support for a VMGEXIT VMEXIT")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210122235049.3107620-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
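As a rough illustration of the unconditional sync (a minimal sketch; the accessor names follow the kernel's ghcb_set_*() helpers, but the function name and exact body are assumptions, not the patch itself):

  static void sync_gprs_to_ghcb(struct vcpu_svm *svm)
  {
      struct ghcb *ghcb = svm->ghcb;
      struct kvm_vcpu *vcpu = &svm->vcpu;

      /* Copy the GPRs unconditionally; no per-register dirty checks. */
      ghcb_set_rax(ghcb, vcpu->arch.regs[VCPU_REGS_RAX]);
      ghcb_set_rbx(ghcb, vcpu->arch.regs[VCPU_REGS_RBX]);
      ghcb_set_rcx(ghcb, vcpu->arch.regs[VCPU_REGS_RCX]);
      ghcb_set_rdx(ghcb, vcpu->arch.regs[VCPU_REGS_RDX]);
  }

Writing a few extra registers is cheap compared to tracking dirtiness for every GPR on every VM flavor, which is the trade-off the revert relies on.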
-
Committed by Maxim Levitsky
Even when we are outside the nested guest, some vmcs02 fields may not be in sync vs. vmcs12. This is intentional, even across nested VM-exit, because the sync can be delayed until the nested hypervisor performs a VMCLEAR or a VMREAD/VMWRITE that affects those rarely accessed fields.

However, during KVM_GET_NESTED_STATE, the vmcs12 has to be up to date to be able to restore it. To fix that, call copy_vmcs02_to_vmcs12_rare() before the vmcs12 contents are copied to userspace.

Fixes: 7952d769 ("KVM: nVMX: Sync rarely accessed guest fields only when needed")
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210114205449.8715-2-mlevitsk@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Lorenzo Brescia
On VMX, if we exit and then re-enter immediately without leaving the vmx_vcpu_run() function, the kvm_entry event is not logged. That means we will see one (or more) kvm_exit, without its (their) corresponding kvm_entry, as shown here:

  CPU-1979 [002] 89.871187: kvm_entry: vcpu 1
  CPU-1979 [002] 89.871218: kvm_exit: reason MSR_WRITE
  CPU-1979 [002] 89.871259: kvm_exit: reason MSR_WRITE

It also seems possible for a kvm_entry event to be logged, but then we leave vmx_vcpu_run() right away (if vmx->emulation_required is true). In this case, we will have a spurious kvm_entry event in the trace.

Fix these situations by moving trace_kvm_entry() inside vmx_vcpu_run() (where trace_kvm_exit() already is). A trace obtained with this patch applied looks like this:

  CPU-14295 [000] 8388.395387: kvm_entry: vcpu 0
  CPU-14295 [000] 8388.395392: kvm_exit: reason MSR_WRITE
  CPU-14295 [000] 8388.395393: kvm_entry: vcpu 0
  CPU-14295 [000] 8388.395503: kvm_exit: reason EXTERNAL_INTERRUPT

Of course, not calling trace_kvm_entry() in common x86 code any longer means that we need to adjust the SVM side of things too.

Signed-off-by: Lorenzo Brescia <lorenzo.brescia@edu.unito.it>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Message-Id: <160873470698.11652.13483635328769030605.stgit@Wayrath>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Jay Zhou
The injection process of an SMI has two steps:

  Step 1 (Qemu): cpu->interrupt_request &= ~CPU_INTERRUPT_SMI; then
                 kvm_vcpu_ioctl(cpu, KVM_SMI)
         (KVM):  kvm_vcpu_ioctl_smi() runs and does
                 kvm_make_request(KVM_REQ_SMI, vcpu);

  Step 2 (Qemu): kvm_vcpu_ioctl(cpu, KVM_RUN, 0)
         (KVM):  if kvm_check_request(KVM_REQ_SMI, vcpu) is true,
                 process_smi() marks vcpu->arch.smi_pending = true;

vcpu->arch.smi_pending is only set in step 2. Unfortunately, if the vcpu is paused between step 1 and step 2, kvm_run->immediate_exit will be set and the vcpu has to exit to Qemu immediately during step 2, before vcpu->arch.smi_pending is marked true. During VM migration, Qemu then fetches the SMI pending status from KVM with the KVM_GET_VCPU_EVENTS ioctl at downtime, and the pending SMI is lost.

Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com>
Signed-off-by: Shengen Zhuang <zhuangshengen@huawei.com>
Message-Id: <20210118084720.1585-1-jianjay.zhou@huawei.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
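A minimal sketch of one way to close this window, assuming the still-queued KVM_REQ_SMI is folded into the state reported to userspace (the wrapper name below is hypothetical and the upstream patch may take a different shape):

  static void get_vcpu_events(struct kvm_vcpu *vcpu,
                              struct kvm_vcpu_events *events)
  {
      /*
       * If an SMI was requested but KVM_RUN has not processed it yet
       * (e.g. the vcpu was paused right after KVM_SMI), fold it in here
       * so migration does not lose the pending SMI.
       */
      if (kvm_check_request(KVM_REQ_SMI, vcpu))
          process_smi(vcpu);

      events->smi.pending = vcpu->arch.smi_pending;
  }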
-
Committed by Like Xu
The HW_REF_CPU_CYCLES event on the fixed counter 2 is pseudo-encoded as 0x0300 in the intel_perfmon_event_map[]. Correct its usage.

Fixes: 62079d8a ("KVM: PMU: add proper support for fixed counter 2")
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20201230081916.63417-1-like.xu@linux.intel.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Like Xu
We know the vPMU will not work properly when (1) the guest bit_width(s) of the [gp|fixed] counters are greater than the host ones, or (2) the guest's requested architectural events exceed the range supported by the host. In those cases we can set up a smaller left shift value and refresh the guest cpuid entry, thus fixing the following UBSAN shift-out-of-bounds warning:

  shift exponent 197 is too large for 64-bit type 'long long unsigned int'

  Call Trace:
   __dump_stack lib/dump_stack.c:79 [inline]
   dump_stack+0x107/0x163 lib/dump_stack.c:120
   ubsan_epilogue+0xb/0x5a lib/ubsan.c:148
   __ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:395
   intel_pmu_refresh.cold+0x75/0x99 arch/x86/kvm/vmx/pmu_intel.c:348
   kvm_vcpu_after_set_cpuid+0x65a/0xf80 arch/x86/kvm/cpuid.c:177
   kvm_vcpu_ioctl_set_cpuid2+0x160/0x440 arch/x86/kvm/cpuid.c:308
   kvm_arch_vcpu_ioctl+0x11b6/0x2d70 arch/x86/kvm/x86.c:4709
   kvm_vcpu_ioctl+0x7b9/0xdb0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3386
   vfs_ioctl fs/ioctl.c:48 [inline]
   __do_sys_ioctl fs/ioctl.c:753 [inline]
   __se_sys_ioctl fs/ioctl.c:739 [inline]
   __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:739
   do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

Reported-by: syzbot+ae488dc136a4cc6ba32b@syzkaller.appspotmail.com
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20210118025800.34620-1-like.xu@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Sean Christopherson
Add compile-time asserts in rsvd_bits() to guard against KVM passing in garbage hardcoded values, and cap the upper bound at '63' for dynamic values to prevent generating a mask that would overflow a u64.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210113204515.3473079-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
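A sketch of what such a helper can look like with the described checks (an approximation of rsvd_bits(), not a verbatim copy of the patch):

  static __always_inline u64 rsvd_bits(int s, int e)
  {
      /* Reject obviously bogus hardcoded ranges at compile time. */
      BUILD_BUG_ON(__builtin_constant_p(e) &&
                   __builtin_constant_p(s) && e < s);

      if (__builtin_constant_p(e))
          BUILD_BUG_ON(e > 63);
      else
          e &= 63;    /* cap dynamic values so the mask fits in a u64 */

      if (e < s)
          return 0;

      return ((2ULL << (e - s)) - 1) << s;
  }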
-
- 21 January 2021, 2 commits
-
Committed by Marc Zyngier
When running on v8.0 HW, make sure we don't try to advertise events in the 0x4000-0x403f range.

Cc: stable@vger.kernel.org
Fixes: 88865bec ("KVM: arm64: Mask out filtered events in PCMEID{0,1}_EL1")
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210121105636.1478491-1-maz@kernel.org
-
Committed by Steven Price
KASAN in HW_TAGS mode will store MTE tags in the top byte of the pointer. When computing the offset for TPIDR_EL2 we don't want anything in the top byte, so remove the tag to ensure the computation is correct no matter what the tag is.

Fixes: 94ab5b61 ("kasan, arm64: enable CONFIG_KASAN_HW_TAGS")
Signed-off-by: Steven Price <steven.price@arm.com>
[maz: added comment]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210108161254.53674-1-steven.price@arm.com
-
- 14 January 2021, 4 commits
-
Committed by Alexandru Elisei
The reg_to_encoding() macro is a wrapper over sys_reg() and conveniently takes a sys_reg_desc or a sys_reg_params argument and returns the 32-bit register encoding. Use it instead of calling sys_reg() directly.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210106144218.110665-1-alexandru.elisei@arm.com
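For context, the wrapper is roughly of this shape (a sketch based on the description above; the authoritative definition lives in arch/arm64/kvm/sys_regs.c):

  /* Build the 32-bit encoding from the Op0/Op1/CRn/CRm/Op2 fields. */
  #define reg_to_encoding(x)                                          \
      sys_reg((x)->Op0, (x)->Op1, (x)->CRn, (x)->CRm, (x)->Op2)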
-
Committed by David Brazdil
The KVM/arm64 PSCI relay assumes that SYSTEM_OFF and SYSTEM_RESET should not return, as dictated by the PSCI spec. However, there is firmware out there which breaks this assumption, leading to a hyp panic. Make KVM more robust to broken firmware by allowing these to return.

Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201229160059.64135-1-dbrazdil@google.com
-
Committed by Marc Zyngier
Now that all PMU registers are gated behind a .visibility callback, remove the other checks against an absent PMU.

Signed-off-by: Marc Zyngier <maz@kernel.org>
-
Committed by Marc Zyngier
It appears that while we are now able to properly hide PMU registers from the guest when a PMU isn't available (either because none has been configured, the host doesn't have the PMU support compiled in, or the HW doesn't have one at all), we are still exposing more than we should to userspace.

Introduce a visibility callback gating all the PMU registers, which covers both userspace and guest.

Signed-off-by: Marc Zyngier <maz@kernel.org>
-
- 09 January 2021, 1 commit
-
Committed by Vineet Gupta
HSDK has hardware floating point and the common use case is with glibc+hf, so enable that as default.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
- 08 January 2021, 19 commits
-
Committed by Arnd Bergmann
dtc points out that the interrupts for some devices are not parsable:

  picoxcell-pc3x2.dtsi:45.19-49.5: Warning (interrupts_property): /paxi/gem@30000: Missing interrupt-parent
  picoxcell-pc3x2.dtsi:51.21-55.5: Warning (interrupts_property): /paxi/dmac@40000: Missing interrupt-parent
  picoxcell-pc3x2.dtsi:57.21-61.5: Warning (interrupts_property): /paxi/dmac@50000: Missing interrupt-parent
  picoxcell-pc3x2.dtsi:233.21-237.5: Warning (interrupts_property): /rwid-axi/axi2pico@c0000000: Missing interrupt-parent

There are two VIC instances, so it's not clear which one needs to be used. I found the BSP sources that reference VIC0, so use that:

  https://github.com/r1mikey/meta-picoxcell/blob/master/recipes-kernel/linux/linux-picochip-3.0/0001-picoxcell-support-for-Picochip-picoXcell-SoC.patch

Acked-by: Jamie Iles <jamie@jamieiles.com>
Link: https://lore.kernel.org/r/20201230152010.3914962-1-arnd@kernel.org'
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
-
Committed by Paolo Bonzini
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Fenghua Yu
Shakeel Butt reported in [1] that a user can request a task to be moved to a resource group even if the task is already in the group. Doing the move in that case just wastes time, and it can be costly because it may send an IPI to a different CPU.

Add a sanity check to ensure that the move operation only happens when the task is not already in the resource group.

[1] https://lore.kernel.org/lkml/CALvZod7E9zzHwenzf7objzGKsdBmVwTgEJ0nPgs0LUFU3SN5Pw@mail.gmail.com/

Fixes: e02737d5 ("x86/intel_rdt: Add tasks files")
Reported-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/962ede65d8e95be793cb61102cca37f7bb018e66.1608243147.git.reinette.chatre@intel.com
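A minimal sketch of such a check, assuming the task-move path compares the task's current closid/rmid with the target group's (field names follow the resctrl code from memory and the control/monitor group distinction is simplified):

  static bool task_in_rdtgroup(struct task_struct *tsk, struct rdtgroup *rdtgrp)
  {
      /* If both IDs already match, the "move" would be a no-op. */
      return tsk->closid == rdtgrp->closid &&
             tsk->rmid == rdtgrp->mon.rmid;
  }

The move path can then return early when this helper says the task is already where the user asked it to go, skipping the IPI entirely.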
-
Committed by Fenghua Yu
Currently, when moving a task to a resource group the PQR_ASSOC MSR is updated with the new closid and rmid in an added task callback. If the task is running, the work is run as soon as possible. If the task is not running, the work is executed later in the kernel exit path when the kernel returns to the task again.

Updating the PQR_ASSOC MSR as soon as possible on the CPU a moved task is running is the right thing to do. Queueing work for a task that is not running is unnecessary (the PQR_ASSOC MSR is already updated when the task is scheduled in) and causing system resource waste with the way in which it is implemented: work to update the PQR_ASSOC register is queued every time the user writes a task id to the "tasks" file, even if the task already belongs to the resource group.

This could result in multiple pending work items associated with a single task even if they are all identical and even though only a single update with most recent values is needed. Specifically, even if a task is moved between different resource groups while it is sleeping then it is only the last move that is relevant but yet a work item is queued during each move. This unnecessary queueing of work items could result in significant system resource waste, especially on tasks sleeping for a long time. For example, as demonstrated by Shakeel Butt in [1] writing the same task id to the "tasks" file can quickly consume significant memory. The same problem (wasted system resources) occurs when moving a task between different resource groups.

As pointed out by Valentin Schneider in [2] there is an additional issue with the way in which the queueing of work is done in that the task_struct update is currently done after the work is queued, resulting in a race with the register update possibly done before the data needed by the update is available.

To solve these issues, update the PQR_ASSOC MSR in a synchronous way right after the new closid and rmid are ready during the task movement, only if the task is running. If a moved task is not running nothing is done since the PQR_ASSOC MSR will be updated next time the task is scheduled. This is the same way used to update the register when tasks are moved as part of resource group removal.

[1] https://lore.kernel.org/lkml/CALvZod7E9zzHwenzf7objzGKsdBmVwTgEJ0nPgs0LUFU3SN5Pw@mail.gmail.com/
[2] https://lore.kernel.org/lkml/20201123022433.17905-1-valentin.schneider@arm.com

[ bp: Massage commit message and drop the two update_task_closid_rmid() variants. ]

Fixes: e02737d5 ("x86/intel_rdt: Add tasks files")
Reported-by: Shakeel Butt <shakeelb@google.com>
Reported-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: James Morse <james.morse@arm.com>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/17aa2fb38fc12ce7bb710106b3e7c7b45acb9e94.1608243147.git.reinette.chatre@intel.com
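A rough sketch of the synchronous approach (helper names are assumptions; the real resctrl code is more involved):

  static void _update_task_closid_rmid(void *task)
  {
      /* Reload PQR_ASSOC only if the task is current on this CPU. */
      if (task == current)
          resctrl_sched_in();
  }

  static void update_task_closid_rmid(struct task_struct *t)
  {
      /*
       * If the task is running on another CPU, poke that CPU with an IPI;
       * otherwise the MSR will be loaded when the task is next scheduled
       * in, so there is nothing to queue.
       */
      if (IS_ENABLED(CONFIG_SMP) && task_curr(t))
          smp_call_function_single(task_cpu(t),
                                   _update_task_closid_rmid, t, 1);
      else
          _update_task_closid_rmid(t);
  }

Because the update happens inline in the move path, the task_struct fields are written before the register update runs, which also removes the ordering race described above.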
-
Committed by Tom Lendacky
Typically under KVM, an AP is booted using the INIT-SIPI-SIPI sequence, where the guest vCPU register state is updated and then the vCPU is VMRUN to begin execution of the AP. For an SEV-ES guest, this won't work because the guest register state is encrypted.

Following the GHCB specification, the hypervisor must not alter the guest register state, so KVM must track an AP/vCPU boot. Should the guest want to park the AP, it must use the AP Reset Hold exit event in place of, for example, a HLT loop.

First AP boot (first INIT-SIPI-SIPI sequence): Execute the AP (vCPU) as it was initialized and measured by the SEV-ES support. It is up to the guest to transfer control of the AP to the proper location.

Subsequent AP boot: KVM will expect to receive an AP Reset Hold exit event indicating that the vCPU is being parked and will require an INIT-SIPI-SIPI sequence to awaken it. When the AP Reset Hold exit event is received, KVM will place the vCPU into a simulated HLT mode. Upon receiving the INIT-SIPI-SIPI sequence, KVM will make the vCPU runnable. It is again up to the guest to then transfer control of the AP to the proper location.

To differentiate between an actual HLT and an AP Reset Hold, a new MP state is introduced, KVM_MP_STATE_AP_RESET_HOLD, which the vCPU is placed in upon receiving the AP Reset Hold exit event. Additionally, to communicate the AP Reset Hold exit event up to userspace (if needed), a new exit reason is introduced, KVM_EXIT_AP_RESET_HOLD.

A new x86 ops function is introduced, vcpu_deliver_sipi_vector, in order to accomplish AP booting. For VMX, vcpu_deliver_sipi_vector is set to the original SIPI delivery function, kvm_vcpu_deliver_sipi_vector(). SVM adds a new function that, for non SEV-ES guests, invokes the original SIPI delivery function, kvm_vcpu_deliver_sipi_vector(), but for SEV-ES guests, implements the logic above.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <e8fbebe8eb161ceaabdad7c01a5859a78b424d5e.1609791600.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
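A simplified sketch of the SVM-side hook described above; it captures only the dispatch shape, and the function body for the SEV-ES case is an assumption rather than the exact patch:

  static void svm_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
  {
      /* Non-SEV-ES guests: register state is visible, do a normal SIPI. */
      if (!sev_es_guest(vcpu->kvm)) {
          kvm_vcpu_deliver_sipi_vector(vcpu, vector);
          return;
      }

      /*
       * SEV-ES guests: state is encrypted, so only un-park a vCPU that is
       * sitting in AP Reset Hold; the guest itself transfers control of
       * the AP to the right location.
       */
      if (vcpu->arch.mp_state == KVM_MP_STATE_AP_RESET_HOLD)
          vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
  }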
-
Committed by Maxim Levitsky
It is possible to exit nested guest mode, entered by svm_set_nested_state, prior to the first VM entry to it (e.g. due to a pending event), if the nested run was not pending during the migration. In this case we must not switch to the nested MSR permission bitmap. Also add a warning to catch similar cases in the future.

Fixes: a7d5c7ce ("KVM: nSVM: delay MSR permission processing to first nested VM run")
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210107093854.882483-2-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Maxim Levitsky
We overwrite most of the vmcb fields while doing so, so we must mark it as dirty.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210107093854.882483-5-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Maxim Levitsky
The code to store it on migration exists, but no code was restoring it.

One of the side effects of fixing this is that L1->L2 injected events are no longer lost when migration happens with a nested run pending.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210107093854.882483-3-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Ben Gardon
The tdp_mmu_roots and tdp_mmu_pages in struct kvm_arch should only contain pages with tdp_mmu_page set to true. tdp_mmu_pages should not contain any pages with a non-zero root_count and tdp_mmu_roots should only contain pages with a positive root_count, unless a thread holds the MMU lock and is in the process of modifying the list. Various functions expect these invariants to be maintained, but they are not explicitly documented. Add to the comments on both fields to document the above invariants.

Signed-off-by: Ben Gardon <bgardon@google.com>
Message-Id: <20210107001935.3732070-2-bgardon@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Ben Gardon
Many TDP MMU functions which need to perform some action on all TDP MMU roots hold a reference on that root so that they can safely drop the MMU lock in order to yield to other threads. However, when releasing the reference on the root, there is a bug: the root will not be freed even if its reference count (root_count) is reduced to 0.

To simplify acquiring and releasing references on TDP MMU root pages, and to ensure that these roots are properly freed, move the get/put operations into another TDP MMU root iterator macro. Moving the get/put operations into an iterator macro also helps simplify control flow when a root does need to be freed. Note that using the list_for_each_entry_safe macro would not have been appropriate in this situation because it could keep a pointer to the next root across an MMU lock release + reacquire, during which time that root could be freed.

Reported-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Fixes: faaf05b0 ("kvm: x86/mmu: Support zapping SPTEs in the TDP MMU")
Fixes: 063afacd ("kvm: x86/mmu: Support invalidate range MMU notifier for TDP MMU")
Fixes: a6a0b05d ("kvm: x86/mmu: Support dirty logging for the TDP MMU")
Fixes: 14881998 ("kvm: x86/mmu: Support disabling dirty logging for the tdp MMU")
Signed-off-by: Ben Gardon <bgardon@google.com>
Message-Id: <20210107001935.3732070-1-bgardon@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
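The shape such an iterator can take, as a sketch (the macro and helper names below are assumptions based on the description above):

  /*
   * tdp_mmu_next_root() is assumed to take a reference on the root that
   * follows @prev_root in the roots list (or the first root if @prev_root
   * is NULL), drop the reference on @prev_root, and free @prev_root if
   * that reference was the last one.
   */
  #define for_each_tdp_mmu_root_yield_safe(_kvm, _root)           \
      for (_root = tdp_mmu_next_root(_kvm, NULL);                 \
           _root;                                                 \
           _root = tdp_mmu_next_root(_kvm, _root))

Because the reference handoff happens inside the helper, each loop body can drop and reacquire the MMU lock while holding a reference on the current root, and the previous root gets freed exactly when its count reaches zero.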
-
Committed by Stephen Zhang
Signed-off-by: Stephen Zhang <stephenzhangzsd@gmail.com>
Message-Id: <1608277897-1932-1-git-send-email-stephenzhangzsd@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Paolo Bonzini
Since we know that e >= s, we can reassociate the left shift, changing the shifted number from 1 to 2 in exchange for decreasing the right hand side by 1.

Reported-by: syzbot+e87846c48bf72bc85311@syzkaller.appspotmail.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
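Concretely, for a mask covering bits s..e with e >= s, the two forms below compute the same value, but only the second keeps the shift count at or below 63 when e - s + 1 == 64 (an illustration of the identity, not the exact patched code):

  /* e >= s; both expressions build the mask for bits s..e. */
  u64 before = ((1ULL << (e - s + 1)) - 1) << s;  /* undefined when e - s + 1 == 64 */
  u64 after  = ((2ULL << (e - s)) - 1) << s;      /* shift count <= 63, always defined */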
-
Committed by Uros Bizjak
Commit 16809ecd moved the __svm_vcpu_run prototype to svm.h, but forgot to remove the original from svm.c.

Fixes: 16809ecd ("KVM: SVM: Provide an updated VMRUN invocation for SEV-ES guests")
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Message-Id: <20201220200339.65115-1-ubizjak@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Nathan Chancellor
When using LLVM's integrated assembler (LLVM_IAS=1) while building x86_64_defconfig + CONFIG_KVM=y + CONFIG_KVM_AMD=y, the following build error occurs:

  $ make LLVM=1 LLVM_IAS=1 arch/x86/kvm/svm/sev.o
  arch/x86/kvm/svm/sev.c:2004:15: error: too few operands for instruction
          asm volatile(__ex("vmsave") : : "a" (__sme_page_pa(sd->save_area)) : "memory");
  arch/x86/kvm/svm/sev.c:28:17: note: expanded from macro '__ex'
  #define __ex(x) __kvm_handle_fault_on_reboot(x)
  ./arch/x86/include/asm/kvm_host.h:1646:10: note: expanded from macro '__kvm_handle_fault_on_reboot'
          "666: \n\t" \
  <inline asm>:2:2: note: instantiated into assembly here
          vmsave
  1 error generated.

This happens because LLVM currently does not support calling vmsave without the fixed register operand (%rax for 64-bit and %eax for 32-bit). This will be fixed in LLVM 12 but the kernel currently supports LLVM 10.0.1 and newer so this needs to be handled. Add the proper register using the _ASM_AX macro, which matches the vmsave call in vmenter.S.

Fixes: 86137773 ("KVM: SVM: Provide support for SEV-ES vCPU loading")
Link: https://reviews.llvm.org/D93524
Link: https://github.com/ClangBuiltLinux/linux/issues/1216
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Message-Id: <20201219063711.3526947-1-natechancellor@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
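The fix described above plausibly takes this shape (a sketch; the exact formatting of the applied patch may differ):

  /* Spell out the fixed register so LLVM's integrated assembler accepts it. */
  asm volatile(__ex("vmsave %%" _ASM_AX)
               : : "a" (__sme_page_pa(sd->save_area)) : "memory");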
-
Committed by Sean Christopherson
Check only the terminal leaf for a "!PRESENT || MMIO" SPTE when looking for reserved bits on valid, non-MMIO SPTEs. The get_walk() helpers terminate their walks if a not-present or MMIO SPTE is encountered, i.e. the non-terminal SPTEs have already been verified to be regular SPTEs. This eliminates an extra check-and-branch in a relatively hot loop.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20201218003139.2167891-5-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Sean Christopherson
Bump the size of the sptes array by one and use the raw level of the SPTE to index into the sptes array. Using the SPTE level directly improves readability by eliminating the need to reason out why the level is being adjusted when indexing the array. The array is on the stack and is not explicitly initialized; bumping its size is nothing more than a superficial adjustment to the stack frame.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20201218003139.2167891-4-seanjc@google.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Sean Christopherson
Get the so called "root" level from the low level shadow page table walkers instead of manually attempting to calculate it higher up the stack, e.g. in get_mmio_spte(). When KVM is using PAE shadow paging, the starting level of the walk, from the caller's perspective, is not the CR3 root but rather the PDPTR "root". Checking for reserved bits from the CR3 root causes get_mmio_spte() to consume uninitialized stack data due to indexing into sptes[] for a level that was not filled by get_walk(). This can result in false positives and/or negatives depending on what garbage happens to be on the stack.

Opportunistically nuke a few extra newlines.

Fixes: 95fb5b02 ("kvm: x86/mmu: Support MMIO in the TDP MMU")
Reported-by: Richard Herbert <rherbert@sympatico.ca>
Cc: Ben Gardon <bgardon@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20201218003139.2167891-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Sean Christopherson
Return -1 from the get_walk() helpers if the shadow walk doesn't fill at least one spte, which can theoretically happen if the walk hits a not-present PDPTR. Returning the root level in such a case will cause get_mmio_spte() to return garbage (uninitialized stack data). In practice, such a scenario should be impossible as KVM shouldn't get a reserved-bit page fault with a not-present PDPTR.

Note, using mmu->root_level in get_walk() is wrong for other reasons, too, but that's now a moot point.

Fixes: 95fb5b02 ("kvm: x86/mmu: Support MMIO in the TDP MMU")
Cc: Ben Gardon <bgardon@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20201218003139.2167891-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
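A sketch of the resulting caller-side pattern (the wrapper function here is hypothetical and the surrounding logic of the real get_mmio_spte() is omitted):

  static bool mmio_spte_walk_ok(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
  {
      u64 sptes[PT64_ROOT_MAX_LEVEL + 1];
      int root_level, leaf;

      leaf = get_walk(vcpu, addr, sptes, &root_level);
      if (unlikely(leaf < 0)) {
          /* Nothing was filled in; don't read uninitialized slots. */
          *sptep = 0ull;
          return false;
      }

      *sptep = sptes[leaf];
      return true;
  }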
-
Committed by Vineet Gupta
Linux 5.11-rcX was failing to boot on the ARC HSDK board. It turns out we have a couple of issues, this being the first one, and I'm to blame as I didn't pay attention during review.

TIF_NOTIFY_SIGNAL support requires checking multiple TIF_* bits in the kernel return code path. The old code only needed to check a single bit, so BBIT0 <TIF_SIGPENDING> worked. The new code needs to check multiple bits, so it must use an AND <bit-mask> instruction and thus the bit mask variant _TIF_SIGPENDING.

Cc: Jens Axboe <axboe@kernel.dk>
Fixes: 53855e12 ("arc: add support for TIF_NOTIFY_SIGNAL")
Link: https://github.com/foss-for-synopsys-dwc-arc-processors/linux/issues/34
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
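As a C-level illustration of the difference (the bit positions below are made up for the example; the actual fix is in ARC assembly):

  #define TIF_SIGPENDING      2   /* illustrative values only */
  #define TIF_NOTIFY_SIGNAL   6
  #define _TIF_SIGPENDING     (1U << TIF_SIGPENDING)
  #define _TIF_NOTIFY_SIGNAL  (1U << TIF_NOTIFY_SIGNAL)

  static inline int signal_work_pending(unsigned int thread_flags)
  {
      /*
       * Old: a single-bit test (BBIT0 on the TIF_SIGPENDING position) was
       * enough. New: AND against a mask so every bit that requires work
       * on kernel return is observed.
       */
      return thread_flags & (_TIF_SIGPENDING | _TIF_NOTIFY_SIGNAL);
  }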
-
- 07 January 2021, 1 commit
-
Committed by Catalin Marinas
For consistency with __uaccess_{disable,enable}_hw_pan(), move the PSTATE.TCO setting into dedicated __uaccess_{disable,enable}_tco() functions.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
-
- 06 January 2021, 4 commits
-
Committed by Ying-Tsun Huang
In mtrr_type_lookup(), if the input memory address region is not in the MTRR, over 4GB, and not over the top of memory, a write-back attribute is returned. These condition checks are for ensuring the input memory address region is actually mapped to the physical memory.

However, if the end address is just aligned with the top of memory, the condition check treats the address as over the top of memory, and the write-back attribute is not returned.

This hits a real use case with NVDIMM: the nd_pmem module tries to map NVDIMMs as cacheable memories when NVDIMMs are connected. If a NVDIMM is the last of the DIMMs, the performance of this NVDIMM becomes very low since it is aligned with the top of memory and its memory type is uncached-minus.

Move the input end address change to inclusive up into mtrr_type_lookup(), before checking for the top of memory in either mtrr_type_lookup_{variable,fixed}() helpers.

[ bp: Massage commit message. ]

Fixes: 0cc705f5 ("x86/mm/mtrr: Clean up mtrr_type_lookup()")
Signed-off-by: Ying-Tsun Huang <ying-tsun.huang@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20201215070721.4349-1-ying-tsun.huang@amd.com
-
Committed by Nathan Chancellor
Commit eff8728f ("vmlinux.lds.h: Add PGO and AutoFDO input sections") added ".text.unlikely.*" and ".text.hot.*" due to an LLVM change [1]. After another LLVM change [2], these sections are seen in some PowerPC builds, where there is an orphan section warning and then a build failure:

  $ make -skj"$(nproc)" \
         ARCH=powerpc CROSS_COMPILE=powerpc64le-linux-gnu- LLVM=1 O=out \
         distclean powernv_defconfig zImage.epapr
  ld.lld: warning: kernel/built-in.a(panic.o):(.text.unlikely.) is being placed in '.text.unlikely.'
  ...
  ld.lld: warning: address (0xc000000000009314) of section .text is not a multiple of alignment (256)
  ...
  ERROR: start_text address is c000000000009400, should be c000000000008000
  ERROR: try to enable LD_HEAD_STUB_CATCH config option
  ERROR: see comments in arch/powerpc/tools/head_check.sh
  ...

Explicitly handle these sections like in the main linker script so there is no more build failure.

[1]: https://reviews.llvm.org/D79600
[2]: https://reviews.llvm.org/D92493

Fixes: 83a092cf ("powerpc: Link warning for orphan sections")
Cc: stable@vger.kernel.org
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://github.com/ClangBuiltLinux/linux/issues/1218
Link: https://lore.kernel.org/r/20210104205952.1399409-1-natechancellor@gmail.com
-
Committed by Randy Dunlap
fs/dax.c uses copy_user_page() but ARC does not provide that interface, resulting in a build error. Provide copy_user_page() in <asm/page.h>.

  ../fs/dax.c: In function 'copy_cow_page_dax':
  ../fs/dax.c:702:2: error: implicit declaration of function 'copy_user_page'; did you mean 'copy_to_user_page'? [-Werror=implicit-function-declaration]

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: linux-snps-arc@lists.infradead.org
Cc: Dan Williams <dan.j.williams@intel.com>
#Acked-by: Vineet Gupta <vgupta@synopsys.com> # v1
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Jan Kara <jack@suse.cz>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-nvdimm@lists.01.org
#Reviewed-by: Ira Weiny <ira.weiny@intel.com> # v2
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
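A plausible shape of the added definition (a sketch; many architectures define the helper exactly this way, but the actual ARC patch may differ):

  /* In arch/arc/include/asm/page.h: reuse copy_page() for user pages. */
  #define copy_user_page(to, from, vaddr, page)   copy_page(to, from)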
-
Committed by Peter Gonda
The IN and OUT instructions with port address as an immediate operand only use an 8-bit immediate (imm8). The current VC handler uses the entire 32-bit immediate value but these instructions only set the first bytes. Cast the operand to an u8 for that.

[ bp: Massage commit message. ]

Fixes: 25189d08 ("x86/sev-es: Add support for handling IOIO exceptions")
Signed-off-by: Peter Gonda <pgonda@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: David Rientjes <rientjes@google.com>
Link: https://lkml.kernel.org/r/20210105163311.221490-1-pgonda@google.com
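Illustratively, the exit-info construction changes along these lines (the helper name is a placeholder, not the exact code in the VC handler):

  static u64 ioio_imm_to_exitinfo(struct insn *insn)
  {
      /* Only the low byte of the decoded immediate is the port number. */
      return (u64)(u8)insn->immediate.value << 16;
  }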
-