- 17 9月, 2022 1 次提交
-
-
由 知远pimo 提交于
mainline inclusion from mainline-v6.0 commit de8c8e52 category: feature bugzilla: https://gitee.com/openeuler/open-source-summer/issues/I56FG1?from=project-issue CVE: N/A Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=de8c8e52836d0082188508548d0b939f49f7f0e6 ------------------------------------------------- Move ptep_clear() to the include/linux/pgtable.h and add page table check relate hooks to some helpers, it's prepare for support page table check feature on new architecture. Optimize the implementation of ptep_clear(), page table hooks added page table check stubs, the interface control should be at stubs, there is no rationale for doing a IS_ENABLED() check here. For architectures that do not enable CONFIG_PAGE_TABLE_CHECK, they will call a fallback page table check stubs[1] when getting their page table helpers[2] in include/linux/pgtable.h. [1] page table check stubs defined in include/linux/page_table_check.h [2] ptep_clear() ptep_get_and_clear() pmdp_huge_get_and_clear() pudp_huge_get_and_clear() Link: https://lkml.kernel.org/r/20220507110114.4128854-4-tongtiangen@huawei.comSigned-off-by: NTong Tiangen <tongtiangen@huawei.com> Acked-by: NPasha Tatashin <pasha.tatashin@soleen.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> (cherry picked from commit de8c8e52) Signed-off-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZeng Zhimin <im_zzm@126.com>
-
- 16 9月, 2022 2 次提交
-
-
由 知远pimo 提交于
mainline inclusion from mainline-v6.0 commit e5a55401 category: feature bugzilla: https://gitee.com/openeuler/open-source-summer/issues/I56FG1?from=project-issue CVE: N/A Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e5a554014618308f046af99ab9c950165ed6cb11 ------------------------------------------------- The pxx_user_accessible_page() checks the PTE bit, it's architecture-specific code, move them into x86's pgtable.h. These helpers are being moved out to make the page table check framework platform independent. Link: https://lkml.kernel.org/r/20220507110114.4128854-3-tongtiangen@huawei.comSigned-off-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NTong Tiangen <tongtiangen@huawei.com> Acked-by: NPasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: NAnshuman Khandual <anshuman.khandual@arm.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> (cherry picked from commit e5a55401) Signed-off-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZeng Zhimin <im_zzm@126.com>
-
由 知远pimo 提交于
mainline inclusion from mainline-v6.0 commit d283d422 category: feature bugzilla: https://gitee.com/openeuler/open-source-summer/issues/I56FG1?from=project-issue CVE: N/A Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d283d422c6c4f0264fe8ecf5ae80036bf73f4594 ------------------------------------------------- Add page table check hooks into routines that modify user page tables. Link: https://lkml.kernel.org/r/20211221154650.1047963-5-pasha.tatashin@soleen.comSigned-off-by: NPasha Tatashin <pasha.tatashin@soleen.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Rientjes <rientjes@google.com> Cc: Frederic Weisbecker <frederic@kernel.org> Cc: Greg Thelen <gthelen@google.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Slaby <jirislaby@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kees Cook <keescook@chromium.org> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sami Tolvanen <samitolvanen@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Xu <weixugc@google.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NZeng Zhimin <im_zzm@126.com>
-
- 01 9月, 2022 1 次提交
-
-
由 Wei Huang 提交于
mainline inclusion from mainline-5.15 commit cb0f722a category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5NGRU CVE: NA ------------------------------------------------- When the 5-level page table CPU flag is set in the host, but the guest has CR4.LA57=0 (including the case of a 32-bit guest), the top level of the shadow NPT page tables will be fixed, consisting of one pointer to a lower-level table and 511 non-present entries. Extend the existing code that creates the fixed PML4 or PDP table, to provide a fixed PML5 table if needed. This is not needed on EPT because the number of layers in the tables is specified in the EPTP instead of depending on the host CR4. Suggested-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NWei Huang <wei.huang2@amd.com> Message-Id: <20210818165549.3771014-3-wei.huang2@amd.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NXie Haocheng <haocheng.xie@amd.com>
-
- 31 8月, 2022 8 次提交
-
-
由 Suravee Suthikulpanit 提交于
mainline inclusion from mainline-5.18-rc1 commit 4a204f78 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5NGRU CVE: NA ------------------------------------------------- Expand KVM's mask for the AVIC host physical ID to the full 12 bits defined by the architecture. The number of bits consumed by hardware is model specific, e.g. early CPUs ignored bits 11:8, but there is no way for KVM to enumerate the "true" size. So, KVM must allow using all bits, else it risks rejecting completely legal x2APIC IDs on newer CPUs. This means KVM relies on hardware to not assign x2APIC IDs that exceed the "true" width of the field, but presumably hardware is smart enough to tie the width to the max x2APIC ID. KVM also relies on hardware to support at least 8 bits, as the legacy xAPIC ID is writable by software. But, those assumptions are unavoidable due to the lack of any way to enumerate the "true" width. Cc: stable@vger.kernel.org Cc: Maxim Levitsky <mlevitsk@redhat.com> Suggested-by: NSean Christopherson <seanjc@google.com> Reviewed-by: NSean Christopherson <seanjc@google.com> Fixes: 44a95dae ("KVM: x86: Detect and Initialize AVIC support") Signed-off-by: NSuravee Suthikulpanit <suravee.suthikulpanit@amd.com> Message-Id: <20220211000851.185799-1-suravee.suthikulpanit@amd.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NXie Haocheng <haocheng.xie@amd.com>
-
由 Maxim Levitsky 提交于
mainline inclusion from mainline-v5.17 commit 39150352 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5NGRU CVE: NA ------------------------------------------------- asm/svm.h is the correct place for all values that are defined in the SVM spec, and that includes AVIC. Also add some values from the spec that were not defined before and will be soon useful. Signed-off-by: NMaxim Levitsky <mlevitsk@redhat.com> Message-Id: <20220207155447.840194-10-mlevitsk@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NXie Haocheng <haocheng.xie@amd.com>
-
由 Sean Christopherson 提交于
mainline inclusion from mainline-v5.13 commit 03ca4589 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5NGRU CVE: NA ------------------------------------------------- Disallow loading KVM SVM if 5-level paging is supported. In theory, NPT for L1 should simply work, but there unknowns with respect to how the guest's MAXPHYADDR will be handled by hardware. Nested NPT is more problematic, as running an L1 VMM that is using 2-level page tables requires stacking single-entry PDP and PML4 tables in KVM's NPT for L2, as there are no equivalent entries in L1's NPT to shadow. Barring hardware magic, for 5-level paging, KVM would need stack another layer to handle PML5. Opportunistically rename the lm_root pointer, which is used for the aforementioned stacking when shadowing 2-level L1 NPT, to pml4_root to call out that it's specifically for PML4. Suggested-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210505204221.1934471-1-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NXie Haocheng <haocheng.xie@amd.com>
-
由 Yazen Ghannam 提交于
mainline inclusion from mainline-v5.17 commit 91f75eb4 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5NGRU CVE: NA ------------------------------------------------- AMD systems currently lay out MCA bank types such that the type of bank number "i" is either the same across all CPUs or is Reserved/Read-as-Zero. For example: Bank # | CPUx | CPUy 0 LS LS 1 RAZ UMC 2 CS CS 3 SMU RAZ Future AMD systems will lay out MCA bank types such that the type of bank number "i" may be different across CPUs. For example: Bank # | CPUx | CPUy 0 LS LS 1 RAZ UMC 2 CS NBIO 3 SMU RAZ Change the structures that cache MCA bank types to be per-CPU and update smca_get_bank_type() to handle this change. Move some SMCA-specific structures to amd.c from mce.h, since they no longer need to be global. Break out the "count" for bank types from struct smca_hwid, since this should provide a per-CPU count rather than a system-wide count. Apply the "const" qualifier to the struct smca_hwid_mcatypes array. The values in this array should not change at runtime. Signed-off-by: NYazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211216162905.4132657-3-yazen.ghannam@amd.comSigned-off-by: NXie Haocheng <haocheng.xie@amd.com>
-
由 Yazen Ghannam 提交于
mainline inclusion from mainline-v5.17 commit 0b746e8c category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5NGRU CVE: NA ------------------------------------------------- The address translation code used for current AMD systems is non-architectural. So move it to EDAC. Signed-off-by: NYazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211028175728.121452-2-yazen.ghannam@amd.comSigned-off-by: NXie Haocheng <haocheng.xie@amd.com>
-
由 Mukul Joshi 提交于
mainline inclusion from mainline-v5.16 commit f38ce910 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5NGRU CVE: NA ------------------------------------------------- Export smca_get_bank_type for use in the AMD GPU driver to determine MCA bank while handling correctable and uncorrectable errors in GPU UMC. Signed-off-by: NMukul Joshi <mukul.joshi@amd.com> Acked-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NXie Haocheng <haocheng.xie@amd.com>
-
由 Yazen Ghannam 提交于
mainline inclusion from mainline-v5.17 commit 5176a93a category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5NGRU CVE: NA ------------------------------------------------- Add HWID and McaType values for new SMCA bank types, and add their error descriptions to edac_mce_amd. The "PHY" bank types all have the same error descriptions, and the NBIF and SHUB bank types have the same error descriptions. So reuse the same arrays where appropriate. [ bp: Remove useless comments over hwid types. ] Signed-off-by: NYazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211216162905.4132657-2-yazen.ghannam@amd.comSigned-off-by: NXie Haocheng <haocheng.xie@amd.com>
-
由 Muralidhara M K 提交于
mainline inclusion from mainline-v5.14 commit 94a311ce category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5NGRU CVE: NA ------------------------------------------------- Add the (HWID, MCATYPE) tuples and names for new SMCA bank types. Also, add their respective error descriptions to the MCE decoding module edac_mce_amd. Also while at it, optimize the string names for some SMCA banks. [ bp: Drop repeated comments, explain why UMC_V2 is a separate entry. ] Signed-off-by: NMuralidhara M K <muralimk@amd.com> Signed-off-by: NNaveen Krishna Chatradhi <nchatrad@amd.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Reviewed-by: NYazen Ghannam <yazen.ghannam@amd.com> Link: https://lkml.kernel.org/r/20210526164601.66228-1-nchatrad@amd.comSigned-off-by: NXie Haocheng <haocheng.xie@amd.com>
-
- 29 8月, 2022 4 次提交
-
-
由 Chao Gao 提交于
mainline inclusion from mainline-v6.0-rc1 commit d588bb9b category: feature feature: IPI Virtualization bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5ODSC CVE: N/A Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d588bb9be1da6aa750aa64875fe57369db983d8b Intel-SIG: commit d588bb9b ("KVM: VMX: enable IPI virtualization") ------------------------------------- KVM: VMX: enable IPI virtualization With IPI virtualization enabled, the processor emulates writes to APIC registers that would send IPIs. The processor sets the bit corresponding to the vector in target vCPU's PIR and may send a notification (IPI) specified by NDST and NV fields in target vCPU's Posted-Interrupt Descriptor (PID). It is similar to what IOMMU engine does when dealing with posted interrupt from devices. A PID-pointer table is used by the processor to locate the PID of a vCPU with the vCPU's APIC ID. The table size depends on maximum APIC ID assigned for current VM session from userspace. Allocating memory for PID-pointer table is deferred to vCPU creation, because irqchip mode and VM-scope maximum APIC ID is settled at that point. KVM can skip PID-pointer table allocation if !irqchip_in_kernel(). Like VT-d PI, if a vCPU goes to blocked state, VMM needs to switch its notification vector to wakeup vector. This can ensure that when an IPI for blocked vCPUs arrives, VMM can get control and wake up blocked vCPUs. And if a VCPU is preempted, its posted interrupt notification is suppressed. Note that IPI virtualization can only virualize physical-addressing, flat mode, unicast IPIs. Sending other IPIs would still cause a trap-like APIC-write VM-exit and need to be handled by VMM. Signed-off-by: NChao Gao <chao.gao@intel.com> Signed-off-by: NZeng Guang <guang.zeng@intel.com> Message-Id: <20220419154510.11938-1-guang.zeng@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NJason Zeng <jason.zeng@intel.com>
-
由 Zeng Guang 提交于
mainline inclusion from mainline-v6.0-rc1 commit 35875316 category: feature feature: IPI Virtualization bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5ODSC CVE: N/A Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=35875316384b71d23dc2a45a969732fc8cab16af Intel-SIG: commit 35875316 ("KVM: x86: Allow userspace to set maximum VCPU id for VM") ------------------------------------- KVM: x86: Allow userspace to set maximum VCPU id for VM Introduce new max_vcpu_ids in KVM for x86 architecture. Userspace can assign maximum possible vcpu id for current VM session using KVM_CAP_MAX_VCPU_ID of KVM_ENABLE_CAP ioctl(). This is done for x86 only because the sole use case is to guide memory allocation for PID-pointer table, a structure needed to enable VMX IPI. By default, max_vcpu_ids set as KVM_MAX_VCPU_ID. Suggested-by: NSean Christopherson <seanjc@google.com> Reviewed-by: NMaxim Levitsky <mlevitsk@redhat.com> Signed-off-by: NZeng Guang <guang.zeng@intel.com> Message-Id: <20220419154444.11888-1-guang.zeng@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NJason Zeng <jason.zeng@intel.com> Signed-off-by: NJason Zeng <jason.zeng@intel.com>
-
由 Robert Hoo 提交于
mainline inclusion from mainline-v6.0-rc1 commit 1ad4e543 category: feature feature: IPI Virtualization bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5ODSC CVE: N/A Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1ad4e5438c67a01620ed67cea959de89f4430515 Intel-SIG: commit 1ad4e543 ("KVM: VMX: Detect Tertiary VM-Execution control when setup VMCS config") ------------------------------------- KVM: VMX: Detect Tertiary VM-Execution control when setup VMCS config Check VMX features on tertiary execution control in VMCS config setup. Sub-features in tertiary execution control to be enabled are adjusted according to hardware capabilities although no sub-feature is enabled in this patch. EVMCSv1 doesn't support tertiary VM-execution control, so disable it when EVMCSv1 is in use. And define the auxiliary functions for Tertiary control field here, using the new BUILD_CONTROLS_SHADOW(). Reviewed-by: NMaxim Levitsky <mlevitsk@redhat.com> Signed-off-by: NRobert Hoo <robert.hu@linux.intel.com> Signed-off-by: NZeng Guang <guang.zeng@intel.com> Message-Id: <20220419153400.11642-1-guang.zeng@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NJason Zeng <jason.zeng@intel.com>
-
由 Robert Hoo 提交于
mainline inclusion from mainline-v6.0-rc1 commit 465932db category: feature feature: IPI Virtualization bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5ODSC CVE: N/A Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=465932db25f3664893b66152c7b190afd28c32db Intel-SIG: commit 465932db ("x86/cpu: Add new VMX feature, Tertiary VM-Execution control") ------------------------------------- x86/cpu: Add new VMX feature, Tertiary VM-Execution control A new 64-bit control field "tertiary processor-based VM-execution controls", is defined [1]. It's controlled by bit 17 of the primary processor-based VM-execution controls. Different from its brother VM-execution fields, this tertiary VM- execution controls field is 64 bit. So it occupies 2 vmx_feature_leafs, TERTIARY_CTLS_LOW and TERTIARY_CTLS_HIGH. Its companion VMX capability reporting MSR,MSR_IA32_VMX_PROCBASED_CTLS3 (0x492), is also semantically different from its brothers, whose 64 bits consist of all allow-1, rather than 32-bit allow-0 and 32-bit allow-1 [1][2]. Therefore, its init_vmx_capabilities() is a little different from others. [1] ISE 6.2 "VMCS Changes" https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html [2] SDM Vol3. Appendix A.3 Reviewed-by: NSean Christopherson <seanjc@google.com> Reviewed-by: NMaxim Levitsky <mlevitsk@redhat.com> Signed-off-by: NRobert Hoo <robert.hu@linux.intel.com> Signed-off-by: NZeng Guang <guang.zeng@intel.com> Message-Id: <20220419153240.11549-1-guang.zeng@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NJason Zeng <jason.zeng@intel.com>
-
- 19 8月, 2022 2 次提交
-
-
由 Andi Kleen 提交于
mainline inclusion from mainline-v5.14-rc2 commit 28188cc4 category: feature feature: SPR PMU uncore support bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5BECO Intel-SIG: commit 28188cc4 x86/cpu: Fix core name for Sapphire Rapids This commit is backported as a dependency for SPR PMU uncore support. ------------------------------------- Sapphire Rapids uses Golden Cove, not Willow Cove. Fixes: 53375a5a ("x86/cpu: Resort and comment Intel models") Signed-off-by: NAndi Kleen <ak@linux.intel.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210513163904.3083274-1-ak@linux.intel.comSigned-off-by: NYunying Sun <yunying.sun@intel.com>
-
由 Peter Zijlstra 提交于
mainline inclusion from mainline-v5.13-rc1 commit 53375a5a category: feature feature: SPR PMU uncore support bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5BECO Intel-SIG: commit 53375a5a x86/cpu: Resort and comment Intel models This commit is backported as a dependency for SPR PMU uncore support. ------------------------------------- The INTEL_FAM6 list has become a mess again. Try and bring some sanity back into it. Where previously we had one microarch per year and a number of SKUs within that, this no longer seems to be the case. We now get different uarch names that share a 'core' design. Add the core name starting at skylake and reorder to keep the cores in chronological order. Furthermore, Intel marketed the names {Amber, Coffee, Whiskey} Lake, but those are in fact steppings of Kaby Lake, add comments for them. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/YE+HhS8i0gshHD3W@hirez.programming.kicks-ass.netSigned-off-by: NYunying Sun <yunying.sun@intel.com>
-
- 18 8月, 2022 1 次提交
-
-
由 Tong Tiangen 提交于
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5GB28 CVE: NA ------------------------------- x86/powerpc has it's implementation of copy_mc_to_user(), we add generic fallback in include/linux/uaccess.h prepare for other architechures to enable CONFIG_ARCH_HAS_COPY_MC. Signed-off-by: NTong Tiangen <tongtiangen@huawei.com>
-
- 15 8月, 2022 2 次提交
-
-
由 Fenghua Yu 提交于
mainline inclusion from mainline-v5.12-rc4 commit ebb1064e category: feature bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5G10C CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/ commit/?id=ebb1064e Intel-SIG: commit ebb1064e x86/traps: Handle #DB for bus lock -------------------------------- Bus locks degrade performance for the whole system, not just for the CPU that requested the bus lock. Two CPU features "#AC for split lock" and "#DB for bus lock" provide hooks so that the operating system may choose one of several mitigation strategies. bus lock feature to cover additional situations with new options to mitigate. split_lock_detect= #AC for split lock #DB for bus lock off Do nothing Do nothing warn Kernel OOPs Warn once per task and Warn once per task and and continues to run. disable future checking When both features are supported, warn in #AC fatal Kernel OOPs Send SIGBUS to user. Send SIGBUS to user When both features are supported, fatal in #AC ratelimit:N Do nothing Limit bus lock rate to N per second in the current non-root user. Default option is "warn". Hardware only generates #DB for bus lock detect when CPL>0 to avoid nested #DB from multiple bus locks while the first #DB is being handled. So no need to handle #DB for bus lock detected in the kernel. while #AC for split lock is enabled by split lock detection bit 29 in TEST_CTRL MSR. Both breakpoint and bus lock in the same instruction can trigger one #DB. The bus lock is handled before the breakpoint in the #DB handler. Delivery of #DB for bus lock in userspace clears DR6[11], which is set by the #DB handler right after reading DR6. Signed-off-by: NFenghua Yu <fenghua.yu@intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NTony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20210322135325.682257-3-fenghua.yu@intel.com (cherry picked from commit ebb1064e) Signed-off-by: NEthan Zhao <haifeng.zhao@linux.intel.com> Signed-off-by: NJason Zeng <jason.zeng@intel.com>
-
由 Fenghua Yu 提交于
mainline inclusion from mainline-v5.12-rc4 commit f21d4d3b category: feature bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5G10C CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/ commit/?id=f21d4d3b Intel-SIG: commit f21d4d3b x86/cpufeatures: Enumerate #DB for bus lock detection -------------------------------- A bus lock is acquired through either a split locked access to writeback (WB) memory or any locked access to non-WB memory. This is typically >1000 cycles slower than an atomic operation within a cache line. It also disrupts performance on other cores. Some CPUs have the ability to notify the kernel by a #DB trap after a user instruction acquires a bus lock and is executed. This allows the kernel to enforce user application throttling or mitigation. Both breakpoint and bus lock can trigger the #DB trap in the same instruction and the ordering of handling them is the kernel #DB handler's choice. The CPU feature flag to be shown in /proc/cpuinfo will be "bus_lock_detect". Signed-off-by: NFenghua Yu <fenghua.yu@intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NTony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20210322135325.682257-2-fenghua.yu@intel.com (cherry picked from commit f21d4d3b) Signed-off-by: NEthan Zhao <haifeng.zhao@linux.intel.com> Signed-off-by: NJason Zeng <jason.zeng@intel.com>
-
- 05 8月, 2022 2 次提交
-
-
由 Kan Liang 提交于
mainline inclusion from mainline-v5.12-rc1 commit 61b985e3 category: feature feature: SPR PMU core event enhancement bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I596BF Intel-SIG: commit 61b985e3 ("perf/x86/intel: Add perf core PMU support for Sapphire Rapids") ------------------------------------- Add perf core PMU support for the Intel Sapphire Rapids server, which is the successor of the Intel Ice Lake server. The enabling code is based on Ice Lake, but there are several new features introduced. The event encoding is changed and simplified, e.g., the event codes which are below 0x90 are restricted to counters 0-3. The event codes which above 0x90 are likely to have no restrictions. The event constraints, extra_regs(), and hardware cache events table are changed accordingly. A new Precise Distribution (PDist) facility is introduced, which further minimizes the skid when a precise event is programmed on the GP counter 0. Enable the Precise Distribution (PDist) facility with :ppp event. For this facility to work, the period must be initialized with a value larger than 127. Add spr_limit_period() to apply the limit for :ppp event. Two new data source fields, data block & address block, are added in the PEBS Memory Info Record for the load latency event. To enable the feature, - An auxiliary event has to be enabled together with the load latency event on Sapphire Rapids. A new flag PMU_FL_MEM_LOADS_AUX is introduced to indicate the case. A new event, mem-loads-aux, is exposed to sysfs for the user tool. Add a check in hw_config(). If the auxiliary event is not detected, return an unique error -ENODATA. - The union perf_mem_data_src is extended to support the new fields. - Ice Lake and earlier models do not support block information, but the fields may be set by HW on some machines. Add pebs_no_block to explicitly indicate the previous platforms which don't support the new block fields. Accessing the new block fields are ignored on those platforms. A new store Latency facility is introduced, which leverages the PEBS facility where it can provide additional information about sampled stores. The additional information includes the data address, memory auxiliary info (e.g. Data Source, STLB miss) and the latency of the store access. To enable the facility, the new event (0x02cd) has to be programed on the GP counter 0. A new flag PERF_X86_EVENT_PEBS_STLAT is introduced to indicate the event. The store_latency_data() is introduced to parse the memory auxiliary info. The layout of access latency field of PEBS Memory Info Record has been changed. Two latency, instruction latency (bit 15:0) and cache access latency (bit 47:32) are recorded. - The cache access latency is similar to previous memory access latency. For loads, the latency starts by the actual cache access until the data is returned by the memory subsystem. For stores, the latency starts when the demand write accesses the L1 data cache and lasts until the cacheline write is completed in the memory subsystem. The cache access latency is stored in low 32bits of the sample type PERF_SAMPLE_WEIGHT_STRUCT. - The instruction latency starts by the dispatch of the load operation for execution and lasts until completion of the instruction it belongs to. Add a new flag PMU_FL_INSTR_LATENCY to indicate the instruction latency support. The instruction latency is stored in the bit 47:32 of the sample type PERF_SAMPLE_WEIGHT_STRUCT. Extends the PERF_METRICS MSR to feature TMA method level 2 metrics. The lower half of the register is the TMA level 1 metrics (legacy). The upper half is also divided into four 8-bit fields for the new level 2 metrics. Expose all eight Topdown metrics events to user space. The full description for the SPR features can be found at Intel Architecture Instruction Set Extensions and Future Features Programming Reference, 319433-041. Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1611873611-156687-5-git-send-email-kan.liang@linux.intel.comSigned-off-by: NYunying Sun <yunying.sun@intel.com> Signed-off-by: NJun Tian <jun.j.tian@intel.com> Signed-off-by: NJason Zeng <jason.zeng@intel.com>
-
由 Kan Liang 提交于
mainline inclusion from mainline-v5.12-rc1 commit 1ab5f235 category: feature feature: SPR PMU core event enhancement bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I596BF Intel-SIG: commit 1ab5f235 ("perf/x86/intel: Filter unsupported Topdown metrics event") ------------------------------------- Intel Sapphire Rapids server will introduce 8 metrics events. Intel Ice Lake only supports 4 metrics events. A perf tool user may mistakenly use the unsupported events via RAW format on Ice Lake. The user can still get a value from the unsupported Topdown metrics event once the following Sapphire Rapids enabling patch is applied. To enable the 8 metrics events on Intel Sapphire Rapids, the INTEL_TD_METRIC_MAX has to be updated, which impacts the is_metric_event(). The is_metric_event() is a generic function. On Ice Lake, the newly added SPR metrics events will be mistakenly accepted as metric events on creation. At runtime, the unsupported Topdown metrics events will be updated. Add a variable num_topdown_events in x86_pmu to indicate the available number of the Topdown metrics event on the platform. Apply the number into is_metric_event(). Only the supported Topdown metrics events should be created as metrics events. Apply the num_topdown_events in icl_update_topdown_event() as well. The function can be reused by the following patch. Suggested-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NKan Liang <kan.liang@linux.intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1611873611-156687-4-git-send-email-kan.liang@linux.intel.comSigned-off-by: NYunying Sun <yunying.sun@intel.com> Signed-off-by: NJun Tian <jun.j.tian@intel.com> Signed-off-by: NJason Zeng <jason.zeng@intel.com>
-
- 04 8月, 2022 17 次提交
-
-
由 Borislav Petkov 提交于
stable inclusion from stable-v5.10.114 commit 2ab14625b879eec22854f1dbe61d51570b427513 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5IY1V Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=2ab14625b879eec22854f1dbe61d51570b427513 -------------------------------- commit f9e14dbb upstream. When resuming from system sleep state, restore_processor_state() restores the boot CPU MSRs. These MSRs could be emulated by microcode. If microcode is not loaded yet, writing to emulated MSRs leads to unchecked MSR access error: ... PM: Calling lapic_suspend+0x0/0x210 unchecked MSR access error: WRMSR to 0x10f (tried to write 0x0...0) at rIP: ... (native_write_msr) Call Trace: <TASK> ? restore_processor_state x86_acpi_suspend_lowlevel acpi_suspend_enter suspend_devices_and_enter pm_suspend.cold state_store kobj_attr_store sysfs_kf_write kernfs_fop_write_iter new_sync_write vfs_write ksys_write __x64_sys_write do_syscall_64 entry_SYSCALL_64_after_hwframe RIP: 0033:0x7fda13c260a7 To ensure microcode emulated MSRs are available for restoration, load the microcode on the boot CPU before restoring these MSRs. [ Pawan: write commit message and productize it. ] Fixes: e2a1256b ("x86/speculation: Restore speculation related MSRs during S3 resume") Reported-by: NKyle D. Pelton <kyle.d.pelton@intel.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NPawan Gupta <pawan.kumar.gupta@linux.intel.com> Tested-by: NKyle D. Pelton <kyle.d.pelton@intel.com> Cc: stable@vger.kernel.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=215841 Link: https://lore.kernel.org/r/4350dfbf785cd482d3fafa72b2b49c83102df3ce.1650386317.git.pawan.kumar.gupta@linux.intel.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
由 Jim Mattson 提交于
mainline inclusion from mainline-v5.18-rc1 commit fa31a4d6 category: feature bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I590ZC CVE: NA Intel-SIG: commit fa31a4d6 x86/cpufeatures: Put the AMX macros in the word 18 block. -------------------------------- These macros are for bits in CPUID.(EAX=7,ECX=0):EDX, not for bits in CPUID(EAX=7,ECX=1):EAX. Put them with their brethren. [ bp: Sort word 18 bits properly, as caught by Like Xu <like.xu.linux@gmail.com> ] Signed-off-by: NJim Mattson <jmattson@google.com> Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20220203194308.2469117-1-jmattson@google.comSigned-off-by: NLin Wang <lin.x.wang@intel.com>
-
由 Dave Hansen 提交于
mainline inclusion from mainline-v5.16-rc1 commit 30d02551 category: feature bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I590ZC CVE: NA Intel-SIG: commit 30d02551 x86/fpu: Optimize out sigframe xfeatures when in init state. -------------------------------- tl;dr: AMX state is ~8k. Signal frames can have space for this ~8k and each signal entry writes out all 8k even if it is zeros. Skip writing zeros for AMX to speed up signal delivery by about 4% overall when AMX is in its init state. This is a user-visible change to the sigframe ABI. == Hardware XSAVE Background == XSAVE state components may be tracked by the processor as being in their initial configuration. Software can detect which features are in this configuration by looking at the XSTATE_BV field in an XSAVE buffer or with the XGETBV(1) instruction. Both the XSAVE and XSAVEOPT instructions enumerate features s being in the initial configuration via the XSTATE_BV field in the XSAVE header, However, XSAVEOPT declines to actually write features in their initial configuration to the buffer. XSAVE writes the feature unconditionally, regardless of whether it is in the initial configuration or not. Basically, XSAVE users never need to inspect XSTATE_BV to determine if the feature has been written to the buffer. XSAVEOPT users *do* need to inspect XSTATE_BV. They might also need to clear out the buffer if they want to make an isolated change to the state, like modifying one register. == Software Signal / XSAVE Background == Signal frames have historically been written with XSAVE itself. Each state is written in its entirety, regardless of being in its initial configuration. In other words, the signal frame ABI uses the XSAVE behavior, not the XSAVEOPT behavior. == Problem == This means that any application which has acquired permission to use AMX via ARCH_REQ_XCOMP_PERM will write 8k of state to the signal frame. This 8k write will occur even when AMX was in its initial configuration and software *knows* this because of XSTATE_BV. This problem also exists to a lesser degree with AVX-512 and its 2k of state. However, AVX-512 use does not require ARCH_REQ_XCOMP_PERM and is more likely to have existing users which would be impacted by any change in behavior. == Solution == Stop writing out AMX xfeatures which are in their initial state to the signal frame. This effectively makes the signal frame XSAVE buffer look as if it were written with a combination of XSAVEOPT and XSAVE behavior. Userspace which handles XSAVEOPT- style buffers should be able to handle this naturally. For now, include only the AMX xfeatures: XTILE and XTILEDATA in this new behavior. These require new ABI to use anyway, which makes their users very unlikely to be broken. This XSAVEOPT-like behavior should be expected for all future dynamic xfeatures. It may also be extended to legacy features like AVX-512 in the future. Only attempt this optimization on systems with dynamic features. Disable dynamic feature support (XFD) if XGETBV1 is unavailable by adding a CPUID dependency. This has been measured to reduce the *overall* cycle cost of signal delivery by about 4%. Fixes: 2308ee57 ("x86/fpu/amx: Enable the AMX feature in 64-bit mode") Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Tested-by: N"Chang S. Bae" <chang.seok.bae@intel.com> Link: https://lore.kernel.org/r/20211102224750.FA412E26@davehans-spike.ostc.intel.comSigned-off-by: NLin Wang <lin.x.wang@intel.com>
-
由 Chang S. Bae 提交于
mainline inclusion from mainline-v5.16-rc1 commit 2308ee57 category: feature bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I590ZC CVE: NA Intel-SIG: commit 2308ee57 x86/fpu/amx: Enable the AMX feature in 64-bit mode. -------------------------------- Add the AMX state components in XFEATURE_MASK_USER_SUPPORTED and the TILE_DATA component to the dynamic states and update the permission check table accordingly. This is only effective on 64 bit kernels as for 32bit kernels XFEATURE_MASK_TILE is defined as 0. TILE_DATA is caller-saved state and the only dynamic state. Add build time sanity check to ensure the assumption that every dynamic feature is caller- saved. Make AMX state depend on XFD as it is dynamic feature. Signed-off-by: NChang S. Bae <chang.seok.bae@intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211021225527.10184-24-chang.seok.bae@intel.comSigned-off-by: NLin Wang <lin.x.wang@intel.com>
-
由 Chang S. Bae 提交于
mainline inclusion from mainline-v5.16-rc1 commit eec2113e category: feature bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I590ZC CVE: NA Intel-SIG: commit eec2113e x86/fpu/amx: Define AMX state components and have it used for boot-time checks. -------------------------------- The XSTATE initialization uses check_xstate_against_struct() to sanity check the size of XSTATE-enabled features. AMX is a XSAVE-enabled feature, and its size is not hard-coded but discoverable at run-time via CPUID. The AMX state is composed of state components 17 and 18, which are all user state components. The first component is the XTILECFG state of a 64-byte tile-related control register. The state component 18, called XTILEDATA, contains the actual tile data, and the state size varies on implementations. The architectural maximum, as defined in the CPUID(0x1d, 1): EAX[15:0], is a byte less than 64KB. The first implementation supports 8KB. Check the XTILEDATA state size dynamically. The feature introduces the new tile register, TMM. Define one register struct only and read the number of registers from CPUID. Cross-check the overall size with CPUID again. Signed-off-by: NChang S. Bae <chang.seok.bae@intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211021225527.10184-21-chang.seok.bae@intel.comSigned-off-by: NLin Wang <lin.x.wang@intel.com>
-
由 Chang S. Bae 提交于
mainline inclusion from mainline-v5.16-rc1 commit 500afbf6 category: feature bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I590ZC CVE: NA Intel-SIG: commit 500afbf6 x86/fpu/xstate: Add fpstate_realloc()/free(). -------------------------------- The fpstate embedded in struct fpu is the default state for storing the FPU registers. It's sized so that the default supported features can be stored. For dynamically enabled features the register buffer is too small. The #NM handler detects first use of a feature which is disabled in the XFD MSR. After handling permission checks it recalculates the size for kernel space and user space state and invokes fpstate_realloc() which tries to reallocate fpstate and install it. Provide the allocator function which checks whether the current buffer size is sufficient and if not allocates one. If allocation is successful the new fpstate is initialized with the new features and sizes and the now enabled features is removed from the task's XFD mask. realloc_fpstate() uses vzalloc(). If use of this mechanism grows to re-allocate buffers larger than 64KB, a more sophisticated allocation scheme that includes purpose-built reclaim capability might be justified. Signed-off-by: NChang S. Bae <chang.seok.bae@intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211021225527.10184-19-chang.seok.bae@intel.comSigned-off-by: NLin Wang <lin.x.wang@intel.com>
-
由 Chang S. Bae 提交于
mainline inclusion from mainline-v5.16-rc1 commit 783e87b4 category: feature bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I590ZC CVE: NA Intel-SIG: commit 783e87b4 x86/fpu/xstate: Add XFD #NM handler. -------------------------------- If the XFD MSR has feature bits set then #NM will be raised when user space attempts to use an instruction related to one of these features. When the task has no permissions to use that feature, raise SIGILL, which is the same behavior as #UD. If the task has permissions, calculate the new buffer size for the extended feature set and allocate a larger fpstate. In the unlikely case that vzalloc() fails, SIGSEGV is raised. The allocation function will be added in the next step. Provide a stub which fails for now. [ tglx: Updated serialization ] Signed-off-by: NChang S. Bae <chang.seok.bae@intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211021225527.10184-18-chang.seok.bae@intel.comSigned-off-by: NLin Wang <lin.x.wang@intel.com>
-
由 Chang S. Bae 提交于
mainline inclusion from mainline-v5.16-rc1 commit 8bf26758 category: feature bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I590ZC CVE: NA Intel-SIG: commit 8bf26758 x86/fpu: Add XFD state to fpstate. -------------------------------- Add storage for XFD register state to struct fpstate. This will be used to store the XFD MSR state. This will be used for switching the XFD MSR when FPU content is restored. Add a per-CPU variable to cache the current MSR value so the MSR has only to be written when the values are different. Signed-off-by: NChang S. Bae <chang.seok.bae@intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NChang S. Bae <chang.seok.bae@intel.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211021225527.10184-15-chang.seok.bae@intel.comSigned-off-by: NLin Wang <lin.x.wang@intel.com>
-
由 Chang S. Bae 提交于
mainline inclusion from mainline-v5.16-rc1 commit dae1bd58 category: feature bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I590ZC CVE: NA Intel-SIG: commit dae1bd58 x86/msr-index: Add MSRs for XFD. -------------------------------- XFD introduces two MSRs: - IA32_XFD to enable/disable a feature controlled by XFD - IA32_XFD_ERR to expose to the #NM trap handler which feature was tried to be used for the first time. Both use the same xstate-component bitmap format, used by XCR0. Signed-off-by: NChang S. Bae <chang.seok.bae@intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NChang S. Bae <chang.seok.bae@intel.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211021225527.10184-14-chang.seok.bae@intel.comSigned-off-by: NLin Wang <lin.x.wang@intel.com>
-
由 Chang S. Bae 提交于
mainline inclusion from mainline-v5.16-rc1 commit c3511016 category: feature bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I590ZC CVE: NA Intel-SIG: commit c3511016 x86/cpufeatures: Add eXtended Feature Disabling (XFD) feature bit. -------------------------------- Intel's eXtended Feature Disable (XFD) feature is an extension of the XSAVE architecture. XFD allows the kernel to enable a feature state in XCR0 and to receive a #NM trap when a task uses instructions accessing that state. This is going to be used to postpone the allocation of a larger XSTATE buffer for a task to the point where it is actually using a related instruction after the permission to use that facility has been granted. XFD is not used by the kernel, but only applied to userspace. This is a matter of policy as the kernel knows how a fpstate is reallocated and the XFD state. The compacted XSAVE format is adjustable for dynamic features. Make XFD depend on XSAVES. Signed-off-by: NChang S. Bae <chang.seok.bae@intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NChang S. Bae <chang.seok.bae@intel.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211021225527.10184-13-chang.seok.bae@intel.comSigned-off-by: NLin Wang <lin.x.wang@intel.com>
-
由 Thomas Gleixner 提交于
mainline inclusion from mainline-v5.16-rc1 commit 9e798e9a category: feature bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I590ZC CVE: NA Intel-SIG: commit 9e798e9a x86/fpu: Prepare fpu_clone() for dynamically enabled features. -------------------------------- The default portion of the parent's FPU state is saved in a child task. With dynamic features enabled, the non-default portion is not saved in a child's fpstate because these register states are defined to be caller-saved. The new task's fpstate is therefore the default buffer. Fork inherits the permission of the parent. Also, do not use memcpy() when TIF_NEED_FPU_LOAD is set because it is invalid when the parent has dynamic features. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NChang S. Bae <chang.seok.bae@intel.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211021225527.10184-11-chang.seok.bae@intel.comSigned-off-by: NLin Wang <lin.x.wang@intel.com>
-
由 Thomas Gleixner 提交于
mainline inclusion from mainline-v5.16-rc1 commit 23686ef2 category: feature bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I590ZC CVE: NA Intel-SIG: commit 23686ef2 x86/fpu: Add basic helpers for dynamically enabled features. -------------------------------- To allow building up the infrastructure required to support dynamically enabled FPU features, add: - XFEATURES_MASK_DYNAMIC This constant will hold xfeatures which can be dynamically enabled. - fpu_state_size_dynamic() A static branch for 64-bit and a simple 'return false' for 32-bit. This helper allows to add dynamic-feature-specific changes to common code which is shared between 32-bit and 64-bit without #ifdeffery. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NChang S. Bae <chang.seok.bae@intel.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211021225527.10184-8-chang.seok.bae@intel.comSigned-off-by: NLin Wang <lin.x.wang@intel.com>
-
由 Chang S. Bae 提交于
mainline inclusion from mainline-v5.16-rc1 commit db8268df category: feature bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I590ZC CVE: NA Intel-SIG: commit db8268df x86/arch_prctl: Add controls for dynamic XSTATE components. -------------------------------- Dynamically enabled XSTATE features are by default disabled for all processes. A process has to request permission to use such a feature. To support this implement a architecture specific prctl() with the options: - ARCH_GET_XCOMP_SUPP Copies the supported feature bitmap into the user space provided u64 storage. The pointer is handed in via arg2 - ARCH_GET_XCOMP_PERM Copies the process wide permitted feature bitmap into the user space provided u64 storage. The pointer is handed in via arg2 - ARCH_REQ_XCOMP_PERM Request permission for a feature set. A feature set can be mapped to a facility, e.g. AMX, and can require one or more XSTATE components to be enabled. The feature argument is the number of the highest XSTATE component which is required for a facility to work. The request argument is not a user supplied bitmap because that makes filtering harder (think seccomp) and even impossible because to support 32bit tasks the argument would have to be a pointer. The permission mechanism works this way: Task asks for permission for a facility and kernel checks whether that's supported. If supported it does: 1) Check whether permission has already been granted 2) Compute the size of the required kernel and user space buffer (sigframe) size. 3) Validate that no task has a sigaltstack installed which is smaller than the resulting sigframe size 4) Add the requested feature bit(s) to the permission bitmap of current->group_leader->fpu and store the sizes in the group leaders fpu struct as well. If that is successful then the feature is still not enabled for any of the tasks. The first usage of a related instruction will result in a #NM trap. The trap handler validates the permission bit of the tasks group leader and if permitted it installs a larger kernel buffer and transfers the permission and size info to the new fpstate container which makes all the FPU functions which require per task information aware of the extended feature set. [ tglx: Adopted to new base code, added missing serialization, massaged namings, comments and changelog ] Signed-off-by: NChang S. Bae <chang.seok.bae@intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NChang S. Bae <chang.seok.bae@intel.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211021225527.10184-7-chang.seok.bae@intel.comSigned-off-by: NLin Wang <lin.x.wang@intel.com>
-
由 Thomas Gleixner 提交于
mainline inclusion from mainline-v5.16-rc1 commit c33f0a81 category: feature bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I590ZC CVE: NA Intel-SIG: commit c33f0a81 x86/fpu: Add fpu_state_config::legacy_features. -------------------------------- The upcoming prctl() which is required to request the permission for a dynamically enabled feature will also provide an option to retrieve the supported features. If the CPU does not support XSAVE, the supported features would be 0 even when the CPU supports FP and SSE. Provide separate storage for the legacy feature set to avoid that and fill in the bits in the legacy init function. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NChang S. Bae <chang.seok.bae@intel.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211021225527.10184-6-chang.seok.bae@intel.comSigned-off-by: NLin Wang <lin.x.wang@intel.com>
-
由 Thomas Gleixner 提交于
mainline inclusion from mainline-v5.16-rc1 commit 6f6a7c09 category: feature bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I590ZC CVE: NA Intel-SIG: commit 6f6a7c09 x86/fpu: Add members to struct fpu to cache permission information. -------------------------------- Dynamically enabled features can be requested by any thread of a running process at any time. The request does neither enable the feature nor allocate larger buffers. It just stores the permission to use the feature by adding the features to the permission bitmap and by calculating the required sizes for kernel and user space. The reallocation of the kernel buffer happens when the feature is used for the first time which is caught by an exception. The permission bitmap is then checked and if the feature is permitted, then it becomes fully enabled. If not, the task dies similarly to a task which uses an undefined instruction. The size information is precomputed to allow proper sigaltstack size checks once the feature is permitted, but not yet in use because otherwise this would open race windows where too small stacks could be installed causing a later fail on signal delivery. Initialize them to the default feature set and sizes. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NChang S. Bae <chang.seok.bae@intel.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211021225527.10184-5-chang.seok.bae@intel.comSigned-off-by: NLin Wang <lin.x.wang@intel.com>
-
由 Thomas Gleixner 提交于
mainline inclusion from mainline-v5.16-rc1 commit 582b01b6 category: feature bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I590ZC CVE: NA Intel-SIG: commit 582b01b6 x86/fpu: Remove old KVM FPU interface. -------------------------------- No more users. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211022185313.074853631@linutronix.deSigned-off-by: NLin Wang <lin.x.wang@intel.com>
-
由 Thomas Gleixner 提交于
mainline inclusion from mainline-v5.16-rc1 commit d69c1382 category: feature bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I590ZC CVE: NA Intel-SIG: commit d69c1382 x86/kvm: Convert FPU handling to a single swap buffer. -------------------------------- For the upcoming AMX support it's necessary to do a proper integration with KVM. Currently KVM allocates two FPU structs which are used for saving the user state of the vCPU thread and restoring the guest state when entering vcpu_run() and doing the reverse operation before leaving vcpu_run(). With the new fpstate mechanism this can be reduced to one extra buffer by swapping the fpstate pointer in current::thread::fpu. This makes the upcoming support for AMX and XFD simpler because then fpstate information (features, sizes, xfd) are always consistent and it does not require any nasty workarounds. Convert the KVM FPU code over to this new scheme. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211022185313.019454292@linutronix.deSigned-off-by: NLin Wang <lin.x.wang@intel.com>
-