- 07 11月, 2022 1 次提交
-
-
由 LeoLiuoc 提交于
zhaoxin inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5SMFS CVE: NA -------------------------------------------- Set CONFIG_HW_RANDOM_ZHAOXIN to 'm' by default in openeuler_configs Signed-off-by: Nleoliuoc <leoliu-oc@zhaoxin.com>
-
- 04 11月, 2022 1 次提交
-
-
由 LeoLiu-oc 提交于
zhaoxin inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5NYQF CVE: NA -------------------------------------------- Add Zhaoxin feature bits on Zhaoxin CPUs. Signed-off-by: NLeoLiu-oc <LeoLiu-oc@zhaoxin.com>
-
- 03 11月, 2022 14 次提交
-
-
由 Chen Zhongjin 提交于
mainline inclusion from mainline-v6.0-rc3 commit fc2e426b category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5Q4RA CVE: NA -------------------------------- When meeting ftrace trampolines in ORC unwinding, unwinder uses address of ftrace_{regs_}call address to find the ORC entry, which gets next frame at sp+176. If there is an IRQ hitting at sub $0xa8,%rsp, the next frame should be sp+8 instead of 176. It makes unwinder skip correct frame and throw warnings such as "wrong direction" or "can't access registers", etc, depending on the content of the incorrect frame address. By adding the base address ftrace_{regs_}caller with the offset *ip - ops->trampoline*, we can get the correct address to find the ORC entry. Also change "caller" to "tramp_addr" to make variable name conform to its content. [ mingo: Clarified the changelog a bit. ] Fixes: 6be7fa3c ("ftrace, orc, x86: Handle ftrace dynamically allocated trampolines") Signed-off-by: NChen Zhongjin <chenzhongjin@huawei.com> Signed-off-by: NIngo Molnar <mingo@kernel.org> Reviewed-by: NSteven Rostedt (Google) <rostedt@goodmis.org> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20220819084334.244016-1-chenzhongjin@huawei.comSigned-off-by: NChen Zhongjin <chenzhongjin@huawei.com> Reviewed-by: NYang Jihong <yangjihong1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Chenyi Qiang 提交于
mainline inclusion from mainline-v6.0-rc1 commit ffa6482e category: feature feature: KVM Bus Lock Debug Exception bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5RHW7 CVE: N/A Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/ commit/?id=ffa6482e Intel-SIG: commit ffa6482e ("x86/bus_lock: Don't assume the init value of DEBUGCTLMSR.BUS_LOCK_DETECT to be zero") ------------------------------------- x86/bus_lock: Don't assume the init value of DEBUGCTLMSR.BUS_LOCK_DETECT to be zero It's possible that this kernel has been kexec'd from a kernel that enabled bus lock detection, or (hypothetically) BIOS/firmware has set DEBUGCTLMSR_BUS_LOCK_DETECT. Disable bus lock detection explicitly if not wanted. Fixes: ebb1064e ("x86/traps: Handle #DB for bus lock") Signed-off-by: NChenyi Qiang <chenyi.qiang@intel.com> Signed-off-by: NIngo Molnar <mingo@kernel.org> Reviewed-by: NTony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20220802033206.21333-1-chenyi.qiang@intel.comSigned-off-by: NAichun Shi <aichun.shi@intel.com>
-
由 Paolo Bonzini 提交于
mainline inclusion from mainline-v5.13-rc2 commit 76ea438b category: feature feature: KVM Bus Lock Debug Exception bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5RHW7 CVE: N/A Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/ commit/?id=76ea438b Intel-SIG: commit 76ea438b ("KVM: X86: Expose bus lock debug exception to guest") ------------------------------------- KVM: X86: Expose bus lock debug exception to guest Bus lock debug exception is an ability to notify the kernel by an #DB trap after the instruction acquires a bus lock and is executed when CPL>0. This allows the kernel to enforce user application throttling or mitigations. Existence of bus lock debug exception is enumerated via CPUID.(EAX=7,ECX=0).ECX[24]. Software can enable these exceptions by setting bit 2 of the MSR_IA32_DEBUGCTL. Expose the CPUID to guest and emulate the MSR handling when guest enables it. Support for this feature was originally developed by Xiaoyao Li and Chenyi Qiang, but code has since changed enough that this patch has nothing in common with theirs, except for this commit message. Co-developed-by: NXiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: NXiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: NChenyi Qiang <chenyi.qiang@intel.com> Message-Id: <20210202090433.13441-4-chenyi.qiang@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NAichun Shi <aichun.shi@intel.com>
-
由 Chenyi Qiang 提交于
mainline inclusion from mainline-v5.13-rc2 commit e8ea85fb category: feature feature: KVM Bus Lock Debug Exception bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5RHW7 CVE: N/A Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/ commit/?id=e8ea85fb Intel-SIG: commit e8ea85fb ("KVM: X86: Add support for the emulation of DR6_BUS_LOCK bit") ------------------------------------- KVM: X86: Add support for the emulation of DR6_BUS_LOCK bit Bus lock debug exception introduces a new bit DR6_BUS_LOCK (bit 11 of DR6) to indicate that bus lock #DB exception is generated. The set/clear of DR6_BUS_LOCK is similar to the DR6_RTM. The processor clears DR6_BUS_LOCK when the exception is generated. For all other #DB, the processor sets this bit to 1. Software #DB handler should set this bit before returning to the interrupted task. In VMM, to avoid breaking the CPUs without bus lock #DB exception support, activate the DR6_BUS_LOCK conditionally in DR6_FIXED_1 bits. When intercepting the #DB exception caused by bus locks, bit 11 of the exit qualification is set to identify it. The VMM should emulate the exception by clearing the bit 11 of the guest DR6. Co-developed-by: NXiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: NXiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: NChenyi Qiang <chenyi.qiang@intel.com> Message-Id: <20210202090433.13441-3-chenyi.qiang@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NAichun Shi <aichun.shi@intel.com>
-
由 Chenyi Qiang 提交于
mainline inclusion from mainline-v5.12-rc1 commit 9a3ecd5e category: feature feature: KVM Bus Lock Debug Exception bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5RHW7 CVE: N/A Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/ commit/?id=9a3ecd5e Intel-SIG: commit 9a3ecd5e ("KVM: X86: Rename DR6_INIT to DR6_ACTIVE_LOW") ------------------------------------- KVM: X86: Rename DR6_INIT to DR6_ACTIVE_LOW DR6_INIT contains the 1-reserved bits as well as the bit that is cleared to 0 when the condition (e.g. RTM) happens. The value can be used to initialize dr6 and also be the XOR mask between the #DB exit qualification (or payload) and DR6. Concerning that DR6_INIT is used as initial value only once, rename it to DR6_ACTIVE_LOW and apply it in other places, which would make the incoming changes for bus lock debug exception more simple. Signed-off-by: NChenyi Qiang <chenyi.qiang@intel.com> Message-Id: <20210202090433.13441-2-chenyi.qiang@intel.com> [Define DR6_FIXED_1 from DR6_ACTIVE_LOW and DR6_VOLATILE. - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NAichun Shi <aichun.shi@intel.com>
-
由 Paolo Bonzini 提交于
mainline inclusion from mainline-v5.11-rc1 commit 8cce12b3 category: feature feature: KVM bus lock debug exception bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5RHW7 CVE: N/A Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/ commit/?id=8cce12b3 Intel-SIG: commit 8cce12b3 ("KVM: nSVM: set fixed bits by hand") ------------------------------------- KVM: nSVM: set fixed bits by hand SVM generally ignores fixed-1 bits. Set them manually so that we do not end up by mistake without those bits set in struct kvm_vcpu; it is part of userspace API that KVM always returns value with the bits set. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NAichun Shi <aichun.shi@intel.com>
-
由 Tao Xu 提交于
mainline inclusion from mainline-v6.0-rc1 commit 2f4073e0 category: feature feature: Notify VM exit bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5PAJ5 CVE: N/A Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/ commit/?id=2f4073e0 Intel-SIG: commit 2f4073e0 ("KVM: VMX: Enable Notify VM exit") ------------------------------------- KVM: VMX: Enable Notify VM exit There are cases that malicious virtual machines can cause CPU stuck (due to event windows don't open up), e.g., infinite loop in microcode when nested #AC (CVE-2015-5307). No event window means no event (NMI, SMI and IRQ) can be delivered. It leads the CPU to be unavailable to host or other VMs. VMM can enable notify VM exit that a VM exit generated if no event window occurs in VM non-root mode for a specified amount of time (notify window). Feature enabling: - The new vmcs field SECONDARY_EXEC_NOTIFY_VM_EXITING is introduced to enable this feature. VMM can set NOTIFY_WINDOW vmcs field to adjust the expected notify window. - Add a new KVM capability KVM_CAP_X86_NOTIFY_VMEXIT so that user space can query and enable this feature in per-VM scope. The argument is a 64bit value: bits 63:32 are used for notify window, and bits 31:0 are for flags. Current supported flags: - KVM_X86_NOTIFY_VMEXIT_ENABLED: enable the feature with the notify window provided. - KVM_X86_NOTIFY_VMEXIT_USER: exit to userspace once the exits happen. - It's safe to even set notify window to zero since an internal hardware threshold is added to vmcs.notify_window. VM exit handling: - Introduce a vcpu state notify_window_exits to records the count of notify VM exits and expose it through the debugfs. - Notify VM exit can happen incident to delivery of a vector event. Allow it in KVM. - Exit to userspace unconditionally for handling when VM_CONTEXT_INVALID bit is set. Nested handling - Nested notify VM exits are not supported yet. Keep the same notify window control in vmcs02 as vmcs01, so that L1 can't escape the restriction of notify VM exits through launching L2 VM. Notify VM exit is defined in latest Intel Architecture Instruction Set Extensions Programming Reference, chapter 9.2. Co-developed-by: NXiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: NXiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: NTao Xu <tao3.xu@intel.com> Co-developed-by: NChenyi Qiang <chenyi.qiang@intel.com> Signed-off-by: NChenyi Qiang <chenyi.qiang@intel.com> Message-Id: <20220524135624.22988-5-chenyi.qiang@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NAichun Shi <aichun.shi@intel.com>
-
由 Chenyi Qiang 提交于
mainline inclusion from mainline-v6.0-rc1 commit ed235117 category: feature feature: Notify VM exit bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5PAJ5 CVE: N/A Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/ commit/?id=ed235117 Intel-SIG: commit ed235117 ("KVM: x86: Extend KVM_{G,S}ET_VCPU_EVENTS to support pending triple fault") ------------------------------------- KVM: x86: Extend KVM_{G,S}ET_VCPU_EVENTS to support pending triple fault For the triple fault sythesized by KVM, e.g. the RSM path or nested_vmx_abort(), if KVM exits to userspace before the request is serviced, userspace could migrate the VM and lose the triple fault. Extend KVM_{G,S}ET_VCPU_EVENTS to support pending triple fault with a new event KVM_VCPUEVENT_VALID_FAULT_FAULT so that userspace can save and restore the triple fault event. This extension is guarded by a new KVM capability KVM_CAP_TRIPLE_FAULT_EVENT. Note that in the set_vcpu_events path, userspace is able to set/clear the triple fault request through triple_fault.pending field. Signed-off-by: NChenyi Qiang <chenyi.qiang@intel.com> Message-Id: <20220524135624.22988-2-chenyi.qiang@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NAichun Shi <aichun.shi@intel.com>
-
由 Hao Xiang 提交于
mainline inclusion from mainline-v5.15-rc7 commit d61863c6 category: feature feature: KVM Bus Lock VM Exit bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5RJCB CVE: N/A Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/ commit/?id=d61863c6 Intel-SIG: commit d61863c6 ("KVM: VMX: Remove redundant handling of bus lock vmexit") ------------------------------------- KVM: VMX: Remove redundant handling of bus lock vmexit Hardware may or may not set exit_reason.bus_lock_detected on BUS_LOCK VM-Exits. Dealing with KVM_RUN_X86_BUS_LOCK in handle_bus_lock_vmexit could be redundant when exit_reason.basic is EXIT_REASON_BUS_LOCK. We can remove redundant handling of bus lock vmexit. Unconditionally Set exit_reason.bus_lock_detected in handle_bus_lock_vmexit(), and deal with KVM_RUN_X86_BUS_LOCK only in vmx_handle_exit(). Signed-off-by: NHao Xiang <hao.xiang@linux.alibaba.com> Message-Id: <1634299161-30101-1-git-send-email-hao.xiang@linux.alibaba.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NAichun Shi <aichun.shi@intel.com>
-
由 Chenyi Qiang 提交于
mainline inclusion from mainline-v5.15-rc4 commit 24a996ad category: feature feature: KVM Bus Lock VM Exit bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5RJCB CVE: N/A Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/ commit/?id=24a996ad Intel-SIG: commit 24a996ad ("KVM: nVMX: Fix nested bus lock VM exit") ------------------------------------- KVM: nVMX: Fix nested bus lock VM exit Nested bus lock VM exits are not supported yet. If L2 triggers bus lock VM exit, it will be directed to L1 VMM, which would cause unexpected behavior. Therefore, handle L2's bus lock VM exits in L0 directly. Fixes: fe6b6bc8 ("KVM: VMX: Enable bus lock VM exit") Signed-off-by: NChenyi Qiang <chenyi.qiang@intel.com> Reviewed-by: NSean Christopherson <seanjc@google.com> Reviewed-by: NXiaoyao Li <xiaoyao.li@intel.com> Message-Id: <20210914095041.29764-1-chenyi.qiang@intel.com> Cc: stable@vger.kernel.org Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NAichun Shi <aichun.shi@intel.com>
-
由 Chenyi Qiang 提交于
mainline inclusion from mainline-v5.12-rc1 commit fe6b6bc8 category: feature feature: KVM Bus Lock VM Exit bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5RJCB CVE: N/A Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/ commit/?id=fe6b6bc8 Intel-SIG: commit fe6b6bc8 ("KVM: VMX: Enable bus lock VM exit") ------------------------------------- KVM: VMX: Enable bus lock VM exit Virtual Machine can exploit bus locks to degrade the performance of system. Bus lock can be caused by split locked access to writeback(WB) memory or by using locks on uncacheable(UC) memory. The bus lock is typically >1000 cycles slower than an atomic operation within a cache line. It also disrupts performance on other cores (which must wait for the bus lock to be released before their memory operations can complete). To address the threat, bus lock VM exit is introduced to notify the VMM when a bus lock was acquired, allowing it to enforce throttling or other policy based mitigations. A VMM can enable VM exit due to bus locks by setting a new "Bus Lock Detection" VM-execution control(bit 30 of Secondary Processor-based VM execution controls). If delivery of this VM exit was preempted by a higher priority VM exit (e.g. EPT misconfiguration, EPT violation, APIC access VM exit, APIC write VM exit, exception bitmap exiting), bit 26 of exit reason in vmcs field is set to 1. In current implementation, the KVM exposes this capability through KVM_CAP_X86_BUS_LOCK_EXIT. The user can get the supported mode bitmap (i.e. off and exit) and enable it explicitly (disabled by default). If bus locks in guest are detected by KVM, exit to user space even when current exit reason is handled by KVM internally. Set a new field KVM_RUN_BUS_LOCK in vcpu->run->flags to inform the user space that there is a bus lock detected in guest. Document for Bus Lock VM exit is now available at the latest "Intel Architecture Instruction Set Extensions Programming Reference". Document Link: https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.htmlCo-developed-by: NXiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: NXiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: NChenyi Qiang <chenyi.qiang@intel.com> Message-Id: <20201106090315.18606-4-chenyi.qiang@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NAichun Shi <aichun.shi@intel.com>
-
由 Chenyi Qiang 提交于
mainline inclusion from mainline-v5.12-rc1 commit 15aad3be category: feature feature: KVM Bus Lock VM Exit bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5RJCB CVE: N/A Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/ commit/?id=15aad3be Intel-SIG: commit 15aad3be ("KVM: X86: Reset the vcpu->run->flags at the beginning of vcpu_run") ------------------------------------- KVM: X86: Reset the vcpu->run->flags at the beginning of vcpu_run Reset the vcpu->run->flags at the beginning of kvm_arch_vcpu_ioctl_run. It can avoid every thunk of code that needs to set the flag clear it, which increases the odds of missing a case and ending up with a flag in an undefined state. Signed-off-by: NChenyi Qiang <chenyi.qiang@intel.com> Message-Id: <20201106090315.18606-3-chenyi.qiang@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NAichun Shi <aichun.shi@intel.com>
-
由 Yang Zhong 提交于
mainline inclusion from mainline-v5.12-rc1 commit 1085a6b5 category: feature feature: SPR New Instructions Virtualization bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5O6WB CVE: N/A Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/ commit/?id=1085a6b5 Intel-SIG: commit 1085a6b5 ("KVM: Expose AVX_VNNI instruction to guset") ------------------------------------- KVM: Expose AVX_VNNI instruction to guset Expose AVX (VEX-encoded) versions of the Vector Neural Network Instructions to guest. The bit definition: CPUID.(EAX=7,ECX=1):EAX[bit 4] AVX_VNNI The following instructions are available when this feature is present in the guest. 1. VPDPBUS: Multiply and Add Unsigned and Signed Bytes 2. VPDPBUSDS: Multiply and Add Unsigned and Signed Bytes with Saturation 3. VPDPWSSD: Multiply and Add Signed Word Integers 4. VPDPWSSDS: Multiply and Add Signed Integers with Saturation This instruction is currently documented in the latest "extensions" manual (ISE). It will appear in the "main" manual (SDM) in the future. Signed-off-by: NYang Zhong <yang.zhong@intel.com> Reviewed-by: NTony Luck <tony.luck@intel.com> Message-Id: <20210105004909.42000-3-yang.zhong@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NAichun Shi <aichun.shi@intel.com>
-
由 Cathy Zhang 提交于
mainline inclusion from mainline-v5.11-rc1 commit 2224fc9e category: feature feature: SPR New Instructions Virtualization bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5O6WB CVE: N/A Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/ commit/?id=2224fc9e Intel-SIG: commit 2224fc9e ("KVM: x86: Expose AVX512_FP16 for supported CPUID") ------------------------------------- KVM: x86: Expose AVX512_FP16 for supported CPUID AVX512_FP16 is supported by Intel processors, like Sapphire Rapids. It could gain better performance for it's faster compared to FP32 if the precision or magnitude requirements are met. It's availability is indicated by CPUID.(EAX=7,ECX=0):EDX[bit 23]. Expose it in KVM supported CPUID, then guest could make use of it; no new registers are used, only new instructions. Signed-off-by: NCathy Zhang <cathy.zhang@intel.com> Signed-off-by: NKyung Min Park <kyung.min.park@intel.com> Acked-by: NDave Hansen <dave.hansen@intel.com> Reviewed-by: NTony Luck <tony.luck@intel.com> Message-Id: <20201208033441.28207-3-kyung.min.park@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NAichun Shi <aichun.shi@intel.com>
-
- 02 11月, 2022 9 次提交
-
-
由 Feng Tang 提交于
Commit b50db709 ("x86/tsc: Disable clocksource watchdog for TSC on qualified platorms") was introduced to solve problem that sometimes TSC clocksource is wrongly judged as unstable by watchdog like 'jiffies', HPET, etc. In it, the hardware socket number is a key factor for judging whether to disable the watchdog for TSC, and 'nr_online_nodes' was chosen as an estimation due to it is needed in early boot phase before registering 'tsc-early' clocksource, where all none-boot CPUs are not brought up yet. In recent patch review, Dave Hansen pointed out there are many cases that 'nr_online_nodes' could have issue, like: * numa emulation (numa=fake=4 etc.) * numa=off * platforms with CPU+DRAM nodes, CPU-less HBM nodes, CPU-less persistent memory nodes. Peter Zijlstra suggested to use logical package ids, but it is only usable after smp_init() and all CPUs are initialized. One solution is to skip the watchdog for 'tsc-early' clocksource, and move the check after smp_init(), while before 'tsc' clocksoure is registered, where topology_max_packages() could be used as a much more accurate socket number. Signed-off-by: NFeng Tang <feng.tang@intel.com>
-
由 Jiri Slaby 提交于
stable inclusion from stable-v5.10.133 commit ecc0d92a9f6cc3f74b67d2c9887d0c800018e661 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YVKO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=ecc0d92a9f6cc3f74b67d2c9887d0c800018e661 -------------------------------- commit 3131ef39 upstream. The build on x86_32 currently fails after commit 9bb2ec60 (objtool: Update Retpoline validation) with: arch/x86/kernel/../../x86/xen/xen-head.S:35: Error: no such instruction: `annotate_unret_safe' ANNOTATE_UNRET_SAFE is defined in nospec-branch.h. And head_32.S is missing this include. Fix this. Fixes: 9bb2ec60 ("objtool: Update Retpoline validation") Signed-off-by: NJiri Slaby <jslaby@suse.cz> Signed-off-by: NBorislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/63e23f80-033f-f64e-7522-2816debbc367@kernel.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
由 Peter Zijlstra 提交于
stable inclusion from stable-v5.10.133 commit 38a80a3ca2cb069dd5608703b015a206a672aae5 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YVKO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=38a80a3ca2cb069dd5608703b015a206a672aae5 -------------------------------- commit d4b5a5c9 upstream. Make sure we can see the text changes when booting with 'debug-alternative'. Example output: [ ] SMP alternatives: retpoline at: __traceiter_initcall_level+0x1f/0x30 (ffffffff8100066f) len: 5 to: __x86_indirect_thunk_rax+0x0/0x20 [ ] SMP alternatives: ffffffff82603e58: [2:5) optimized NOPs: ff d0 0f 1f 00 [ ] SMP alternatives: ffffffff8100066f: orig: e8 cc 30 00 01 [ ] SMP alternatives: ffffffff8100066f: repl: ff d0 0f 1f 00 Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NBorislav Petkov <bp@suse.de> Acked-by: NJosh Poimboeuf <jpoimboe@redhat.com> Tested-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120310.422273830@infradead.orgSigned-off-by: NThadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by: NBen Hutchings <ben@decadent.org.uk> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
由 Peter Zijlstra 提交于
stable inclusion from stable-v5.10.133 commit 3d13ee0d411a078ca1538d823c2c759b8b266fb1 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YVKO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=3d13ee0d411a078ca1538d823c2c759b8b266fb1 -------------------------------- commit bbe2df3f upstream. Try and replace retpoline thunk calls with: LFENCE CALL *%\reg for spectre_v2=retpoline,amd. Specifically, the sequence above is 5 bytes for the low 8 registers, but 6 bytes for the high 8 registers. This means that unless the compilers prefix stuff the call with higher registers this replacement will fail. Luckily GCC strongly favours RAX for the indirect calls and most (95%+ for defconfig-x86_64) will be converted. OTOH clang strongly favours R11 and almost nothing gets converted. Note: it will also generate a correct replacement for the Jcc.d32 case, except unless the compilers start to prefix stuff that, it'll never fit. Specifically: Jncc.d8 1f LFENCE JMP *%\reg 1: is 7-8 bytes long, where the original instruction in unpadded form is only 6 bytes. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NBorislav Petkov <bp@suse.de> Acked-by: NJosh Poimboeuf <jpoimboe@redhat.com> Tested-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120310.359986601@infradead.org [cascardo: RETPOLINE_AMD was renamed to RETPOLINE_LFENCE] Signed-off-by: NThadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by: NBen Hutchings <ben@decadent.org.uk> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
由 Peter Zijlstra 提交于
stable inclusion from stable-v5.10.133 commit b0e2dc950654162bc68cec530156251e7ad3f03a category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YVKO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=b0e2dc950654162bc68cec530156251e7ad3f03a -------------------------------- commit 2f0cbb2a upstream. Handle the rare cases where the compiler (clang) does an indirect conditional tail-call using: Jcc __x86_indirect_thunk_\reg For the !RETPOLINE case this can be rewritten to fit the original (6 byte) instruction like: Jncc.d8 1f JMP *%\reg NOP 1: Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NBorislav Petkov <bp@suse.de> Acked-by: NJosh Poimboeuf <jpoimboe@redhat.com> Tested-by: NAlexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120310.296470217@infradead.orgSigned-off-by: NThadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by: NBen Hutchings <ben@decadent.org.uk> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
由 Borislav Petkov 提交于
stable inclusion from stable-v5.10.133 commit e6f8dc86a1c15b862486a61abcb54b88e8c177e3 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YVKO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=e6f8dc86a1c15b862486a61abcb54b88e8c177e3 -------------------------------- commit 6e8c83d2 upstream. Now that the different instruction-inspecting functions return a value, test that and return early from callers if error has been encountered. While at it, do not call insn_get_modrm() when calling insn_get_displacement() because latter will make sure to call insn_get_modrm() if ModRM hasn't been parsed yet. Signed-off-by: NBorislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210304174237.31945-6-bp@alien8.deSigned-off-by: NBen Hutchings <ben@decadent.org.uk> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
由 Juergen Gross 提交于
stable inclusion from stable-v5.10.132 commit 06a5dc3911a3b29acefd53470bdeccb88deb155e category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YS3T Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=06a5dc3911a3b29acefd53470bdeccb88deb155e -------------------------------- commit 230ec83d upstream. x86_has_pat_wp() is using a wrong test, as it relies on the normal PAT configuration used by the kernel. In case the PAT MSR has been setup by another entity (e.g. Xen hypervisor) it might return false even if the PAT configuration is allowing WP mappings. This due to the fact that when running as Xen PV guest the PAT MSR is setup by the hypervisor and cannot be changed by the guest. This results in the WP related entry to be at a different position when running as Xen PV guest compared to the bare metal or fully virtualized case. The correct way to test for WP support is: 1. Get the PTE protection bits needed to select WP mode by reading __cachemode2pte_tbl[_PAGE_CACHE_MODE_WP] (depending on the PAT MSR setting this might return protection bits for a stronger mode, e.g. UC-) 2. Translate those bits back into the real cache mode selected by those PTE bits by reading __pte2cachemode_tbl[__pte2cm_idx(prot)] 3. Test for the cache mode to be _PAGE_CACHE_MODE_WP Fixes: f88a68fa ("x86/mm: Extend early_memremap() support with additional attrs") Signed-off-by: NJuergen Gross <jgross@suse.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> # 4.14 Link: https://lore.kernel.org/r/20220503132207.17234-1-jgross@suse.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
由 Vitaly Kuznetsov 提交于
stable inclusion from stable-v5.10.132 commit eb58fd350a851b5cda9f4c9a2cefb15c7ccf33f3 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YS3T Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=eb58fd350a851b5cda9f4c9a2cefb15c7ccf33f3 -------------------------------- [ Upstream commit 8a414f94 ] 'vector' and 'trig_mode' fields of 'struct kvm_lapic_irq' are left uninitialized in kvm_pv_kick_cpu_op(). While these fields are normally not needed for APIC_DM_REMRD, they're still referenced by __apic_accept_irq() for trace_kvm_apic_accept_irq(). Fully initialize the structure to avoid consuming random stack memory. Fixes: a183b638 ("KVM: x86: make apic_accept_irq tracepoint more generic") Reported-by: syzbot+d6caa905917d353f0d07@syzkaller.appspotmail.com Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: NSean Christopherson <seanjc@google.com> Message-Id: <20220708125147.593975-1-vkuznets@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
由 Coiby Xu 提交于
stable inclusion from stable-v5.10.132 commit eb360267e1e972475023d06546e18365a222698c category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YS3T Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=eb360267e1e972475023d06546e18365a222698c -------------------------------- [ Upstream commit af16df54 ] Currently, an unsigned kernel could be kexec'ed when IMA arch specific policy is configured unless lockdown is enabled. Enforce kernel signature verification check in the kexec_file_load syscall when IMA arch specific policy is configured. Fixes: 99d5cadf ("kexec_file: split KEXEC_VERIFY_SIG into KEXEC_SIG and KEXEC_SIG_FORCE") Reported-and-suggested-by: NMimi Zohar <zohar@linux.ibm.com> Signed-off-by: NCoiby Xu <coxu@redhat.com> Signed-off-by: NMimi Zohar <zohar@linux.ibm.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
- 27 10月, 2022 6 次提交
-
-
由 Peter Zijlstra 提交于
stable inclusion from stable-v5.10.144 commit 35371fd68807f41a4072c01c166de5425a2a47e5 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5WL0J CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-5.10.y&id=35371fd68807f41a4072c01c166de5425a2a47e5 -------------------------------- commit 1f001e9d upstream. Use the return thunk in ftrace trampolines, if needed. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NBorislav Petkov <bp@suse.de> Reviewed-by: NJosh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: NBorislav Petkov <bp@suse.de> [cascardo: use memcpy(text_gen_insn) as there is no __text_gen_insn] Signed-off-by: NThadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by: NOvidiu Panait <ovidiu.panait@windriver.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NLin Yujun <linyujun809@huawei.com> Reviewed-by: NZhang Jianhua <chris.zjh@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Peter Zijlstra 提交于
stable inclusion from stable-v5.10.144 commit 4586df06a02049f4315c25b947c6dde2627c0d18 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5WL0J CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-5.10.y&id=4586df06a02049f4315c25b947c6dde2627c0d18 -------------------------------- commit e52fc2cf upstream. Return trampoline must not use indirect branch to return; while this preserves the RSB, it is fundamentally incompatible with IBT. Instead use a retpoline like ROP gadget that defeats IBT while not unbalancing the RSB. And since ftrace_stub is no longer a plain RET, don't use it to copy from. Since RET is a trivial instruction, poke it directly. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Acked-by: NJosh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20220308154318.347296408@infradead.org [cascardo: remove ENDBR] Signed-off-by: NThadeu Lima de Souza Cascardo <cascardo@canonical.com> [OP: adjusted context for 5.10-stable] Signed-off-by: NOvidiu Panait <ovidiu.panait@windriver.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NLin Yujun <linyujun809@huawei.com> Reviewed-by: NZhang Jianhua <chris.zjh@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
stable inclusion from stable-v5.10.144 commit 33015556a943d6cbb18c555925a54b8c0e46f521 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5WL0J CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-5.10.y&id=33015556a943d6cbb18c555925a54b8c0e46f521 -------------------------------- This reverts commit 00b136bb6254e0abf6aaafe62c4da5f6c4fea4cb. This temporarily reverts the backport of upstream commit 1f001e9d. It was not correct to copy the ftrace stub as it would contain a relative jump to the return thunk which would not apply to the context where it was being copied to, leading to ftrace support to be broken. Signed-off-by: NThadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by: NOvidiu Panait <ovidiu.panait@windriver.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NLin Yujun <linyujun809@huawei.com> Reviewed-by: NZhang Jianhua <chris.zjh@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Jakub Sitnicki 提交于
stable inclusion from stable-v5.10.127 commit a51c199e4d2bbb8748c12e2ce846024dea012e57 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5XDDK Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=a51c199e4d2bbb8748c12e2ce846024dea012e57 -------------------------------- [ Upstream commit ff672c67 ] On x86-64 the tail call count is passed from one BPF function to another through %rax. Additionally, on function entry, the tail call count value is stored on stack right after the BPF program stack, due to register shortage. The stored count is later loaded from stack either when performing a tail call - to check if we have not reached the tail call limit - or before calling another BPF function call in order to pass it via %rax. In the latter case, we miscalculate the offset at which the tail call count was stored on function entry. The JIT does not take into account that the allocated BPF program stack is always a multiple of 8 on x86, while the actual stack depth does not have to be. This leads to a load from an offset that belongs to the BPF stack, as shown in the example below: SEC("tc") int entry(struct __sk_buff *skb) { /* Have data on stack which size is not a multiple of 8 */ volatile char arr[1] = {}; return subprog_tail(skb); } int entry(struct __sk_buff * skb): 0: (b4) w2 = 0 1: (73) *(u8 *)(r10 -1) = r2 2: (85) call pc+1#bpf_prog_ce2f79bb5f3e06dd_F 3: (95) exit int entry(struct __sk_buff * skb): 0xffffffffa0201788: nop DWORD PTR [rax+rax*1+0x0] 0xffffffffa020178d: xor eax,eax 0xffffffffa020178f: push rbp 0xffffffffa0201790: mov rbp,rsp 0xffffffffa0201793: sub rsp,0x8 0xffffffffa020179a: push rax 0xffffffffa020179b: xor esi,esi 0xffffffffa020179d: mov BYTE PTR [rbp-0x1],sil 0xffffffffa02017a1: mov rax,QWORD PTR [rbp-0x9] !!! tail call count 0xffffffffa02017a8: call 0xffffffffa02017d8 !!! is at rbp-0x10 0xffffffffa02017ad: leave 0xffffffffa02017ae: ret Fix it by rounding up the BPF stack depth to a multiple of 8, when calculating the tail call count offset on stack. Fixes: ebf7d1f5 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT") Signed-off-by: NJakub Sitnicki <jakub@cloudflare.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220616162037.535469-2-jakub@cloudflare.comSigned-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Reviewed-by: NWei Li <liwei391@huawei.com>
-
由 Ashish Kalra 提交于
stable inclusion from stable-v5.10.124 commit 401bef1f95de92c3a8c6eece46e02fa88d7285ee category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5L6E7 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=401bef1f95de92c3a8c6eece46e02fa88d7285ee -------------------------------- commit d22d2474 upstream. For some sev ioctl interfaces, the length parameter that is passed maybe less than or equal to SEV_FW_BLOB_MAX_SIZE, but larger than the data that PSP firmware returns. In this case, kmalloc will allocate memory that is the size of the input rather than the size of the data. Since PSP firmware doesn't fully overwrite the allocated buffer, these sev ioctl interface may return uninitialized kernel slab memory. Reported-by: NAndy Nguyen <theflow@google.com> Suggested-by: NDavid Rientjes <rientjes@google.com> Suggested-by: NPeter Gonda <pgonda@google.com> Cc: kvm@vger.kernel.org Cc: stable@vger.kernel.org Cc: linux-kernel@vger.kernel.org Fixes: eaf78265 ("KVM: SVM: Move SEV code to separate file") Fixes: 2c07ded0 ("KVM: SVM: add support for SEV attestation command") Fixes: 4cfdd47d ("KVM: SVM: Add KVM_SEV SEND_START command") Fixes: d3d1af85 ("KVM: SVM: Add KVM_SEND_UPDATE_DATA command") Fixes: eba04b20 ("KVM: x86: Account a variety of miscellaneous allocations") Signed-off-by: NAshish Kalra <ashish.kalra@amd.com> Reviewed-by: NPeter Gonda <pgonda@google.com> Message-Id: <20220516154310.3685678-1-Ashish.Kalra@amd.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> [sudip: adjust context] Signed-off-by: NSudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Reviewed-by: NWei Li <liwei391@huawei.com>
-
由 Sean Christopherson 提交于
stable inclusion from stable-v5.10.124 commit d6be031a2f5e27f27f3648bac98d2a35874eaddc category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5L6E7 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=d6be031a2f5e27f27f3648bac98d2a35874eaddc -------------------------------- commit eba04b20 upstream. Switch to GFP_KERNEL_ACCOUNT for a handful of allocations that are clearly associated with a single task/VM. Note, there are a several SEV allocations that aren't accounted, but those can (hopefully) be fixed by using the local stack for memory. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210331023025.2485960-3-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> [sudip: adjust context] Signed-off-by: NSudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Reviewed-by: NWei Li <liwei391@huawei.com>
-
- 19 10月, 2022 2 次提交
-
-
由 Li Lingfeng 提交于
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I53Q6M CVE: NA -------------------------------- openEuler need detect conflict of opening block device, so enable it as default. Signed-off-by: NLi Lingfeng <lilingfeng3@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Reviewed-by: NChao Liu <liuchao173@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Peter Zijlstra 提交于
stable inclusion from stable-v5.10.122 commit 320acaf84a6469492f3355b75562f7472b91aaf2 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5W6OE Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=320acaf84a6469492f3355b75562f7472b91aaf2 -------------------------------- [ Upstream commit a6a5eb26 ] As x86 uses the <asm-generic/bitops/instrumented-*.h> headers, the regular forms of all bitops are instrumented with explicit calls to KASAN and KCSAN checks. As these are explicit calls, these are not suppressed by the noinstr function attribute. This can result in calls to those check functions in noinstr code, which objtool warns about: vmlinux.o: warning: objtool: enter_from_user_mode+0x24: call to __kcsan_check_access() leaves .noinstr.text section vmlinux.o: warning: objtool: syscall_enter_from_user_mode+0x28: call to __kcsan_check_access() leaves .noinstr.text section vmlinux.o: warning: objtool: syscall_enter_from_user_mode_prepare+0x24: call to __kcsan_check_access() leaves .noinstr.text section vmlinux.o: warning: objtool: irqentry_enter_from_user_mode+0x24: call to __kcsan_check_access() leaves .noinstr.text section Prevent this by using the arch_*() bitops, which are the underlying bitops without explciit instrumentation. [null: Changelog] Reported-by: Nkernel test robot <lkp@intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20220502111216.290518605@infradead.orgSigned-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Reviewed-by: NWei Li <liwei391@huawei.com>
-
- 13 10月, 2022 1 次提交
-
-
由 Youquan Song 提交于
mainline inclusion from mainline-v6.1-rc1 commit 2738c69a category: feature bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5V3IO CVE: NA Intel-SIG: commit 2738c69a EDAC/i10nm: Add driver decoder for Ice Lake and Tremont CPUs. Backport to decode DDR error by MCA bank registers in replace of firmware. -------------------------------- Current i10nm_edac only supports firmware decoder (ACPI DSM methods). MCA bank registers of Ice Lake or Tremont CPUs contain the information to decode DDR memory errors. To get better decoding performance, add the driver decoder (decoding DDR memory errors via extracting error information from MCA bank registers) for Ice Lake and Tremont CPUs. Co-developed-by: NQiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: NQiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: NYouquan Song <youquan.song@intel.com> Signed-off-by: NTony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/all/20220901194310.115427-1-tony.luck@intel.com/Signed-off-by: NYouquan Song <youquan.song@intel.com>
-
- 10 10月, 2022 6 次提交
-
-
由 Jason Zeng 提交于
Intel inclusion category: feature feature: IPI Virtualization bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5ODSC CVE: N/A ------------------------------------------------- The introduction of VMX tertiary features like IPI virtualization causes the change of the size of struct cpu_info_x86. This patch tries to put the tertiary features on a separate data structure. Signed-off-by: NJason Zeng <jason.zeng@intel.com>
-
由 Sean Christopherson 提交于
mainline inclusion from mainline-6.0-rc1 commit e0a5915f category: feature bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5USAM CVE: NA Intel-SIG: commit e0a5915f x86/sgx: Drop 'page_index' from sgx_backing. Backport for SGX EDMM support. -------------------------------- Storing the 'page_index' value in the sgx_backing struct is dead code and no longer needed. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NKristen Carlson Accardi <kristen@linux.intel.com> Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com> Link: https://lkml.kernel.org/r/20220708162124.8442-1-kristen@linux.intel.comSigned-off-by: NZhiquan Li <zhiquan1.li@intel.com>
-
由 Kristen Carlson Accardi 提交于
mainline inclusion from mainline-5.19-rc1 commit 0c9782e2 category: feature bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5USAM CVE: NA Intel-SIG: commit 0c9782e2 x86/sgx: Set active memcg prior to shmem allocation. Backport for SGX EDMM support. -------------------------------- When the system runs out of enclave memory, SGX can reclaim EPC pages by swapping to normal RAM. These backing pages are allocated via a per-enclave shared memory area. Since SGX allows unlimited over commit on EPC memory, the reclaimer thread can allocate a large number of backing RAM pages in response to EPC memory pressure. When the shared memory backing RAM allocation occurs during the reclaimer thread context, the shared memory is charged to the root memory control group, and the shmem usage of the enclave is not properly accounted for, making cgroups ineffective at limiting the amount of RAM an enclave can consume. For example, when using a cgroup to launch a set of test enclaves, the kernel does not properly account for 50% - 75% of shmem page allocations on average. In the worst case, when nearly all allocations occur during the reclaimer thread, the kernel accounts less than a percent of the amount of shmem used by the enclave's cgroup to the correct cgroup. SGX stores a list of mm_structs that are associated with an enclave. Pick one of them during reclaim and charge that mm's memcg with the shmem allocation. The one that gets picked is arbitrary, but this list almost always only has one mm. The cases where there is more than one mm with different memcg's are not worth considering. Create a new function - sgx_encl_alloc_backing(). This function is used whenever a new backing storage page needs to be allocated. Previously the same function was used for page allocation as well as retrieving a previously allocated page. Prior to backing page allocation, if there is a mm_struct associated with the enclave that is requesting the allocation, it is set as the active memory control group. [ dhansen: - fix merge conflict with ELDU fixes - check against actual ksgxd_tsk, not ->mm ] Cc: stable@vger.kernel.org Signed-off-by: NKristen Carlson Accardi <kristen@linux.intel.com> Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com> Reviewed-by: NShakeel Butt <shakeelb@google.com> Acked-by: NRoman Gushchin <roman.gushchin@linux.dev> Link: https://lkml.kernel.org/r/20220520174248.4918-1-kristen@linux.intel.comSigned-off-by: NZhiquan Li <zhiquan1.li@intel.com>
-
由 Reinette Chatre 提交于
mainline inclusion from mainline-6.0-rc1 commit a0506b3b category: feature bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5USAM CVE: NA Intel-SIG: commit a0506b3b x86/sgx: Free up EPC pages directly to support large page ranges. Backport for SGX EDMM support. -------------------------------- The page reclaimer ensures availability of EPC pages across all enclaves. In support of this it runs independently from the individual enclaves in order to take locks from the different enclaves as it writes pages to swap. When needing to load a page from swap an EPC page needs to be available for its contents to be loaded into. Loading an existing enclave page from swap does not reclaim EPC pages directly if none are available, instead the reclaimer is woken when the available EPC pages are found to be below a watermark. When iterating over a large number of pages in an oversubscribed environment there is a race between the reclaimer woken up and EPC pages reclaimed fast enough for the page operations to proceed. Ensure there are EPC pages available before attempting to load a page that may potentially be pulled from swap into an available EPC page. Signed-off-by: NReinette Chatre <reinette.chatre@intel.com> Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com> Acked-by: NJarkko Sakkinen <jarkko@kernel.org> Link: https://lkml.kernel.org/r/a0d8f037c4a075d56bf79f432438412985f7ff7a.1652137848.git.reinette.chatre@intel.comSigned-off-by: NZhiquan Li <zhiquan1.li@intel.com>
-
由 Reinette Chatre 提交于
mainline inclusion from mainline-6.0-rc1 commit 9849bb27 category: feature bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5USAM CVE: NA Intel-SIG: commit 9849bb27 x86/sgx: Support complete page removal. Backport for SGX EDMM support. -------------------------------- The SGX2 page removal flow was introduced in previous patch and is as follows: 1) Change the type of the pages to be removed to SGX_PAGE_TYPE_TRIM using the ioctl() SGX_IOC_ENCLAVE_MODIFY_TYPES introduced in previous patch. 2) Approve the page removal by running ENCLU[EACCEPT] from within the enclave. 3) Initiate actual page removal using the ioctl() SGX_IOC_ENCLAVE_REMOVE_PAGES introduced here. Support the final step of the SGX2 page removal flow with ioctl() SGX_IOC_ENCLAVE_REMOVE_PAGES. With this ioctl() the user specifies a page range that should be removed. All pages in the provided range should have the SGX_PAGE_TYPE_TRIM page type and the request will fail with EPERM (Operation not permitted) if a page that does not have the correct type is encountered. Page removal can fail on any page within the provided range. Support partial success by returning the number of pages that were successfully removed. Since actual page removal will succeed even if ENCLU[EACCEPT] was not run from within the enclave the ENCLU[EMODPR] instruction with RWX permissions is used as a no-op mechanism to ensure ENCLU[EACCEPT] was successfully run from within the enclave before the enclave page is removed. If the user omits running SGX_IOC_ENCLAVE_REMOVE_PAGES the pages will still be removed when the enclave is unloaded. Signed-off-by: NReinette Chatre <reinette.chatre@intel.com> Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com> Reviewed-by: NJarkko Sakkinen <jarkko@kernel.org> Tested-by: NHaitao Huang <haitao.huang@intel.com> Tested-by: NVijay Dhanraj <vijay.dhanraj@intel.com> Tested-by: NJarkko Sakkinen <jarkko@kernel.org> Link: https://lkml.kernel.org/r/b75ee93e96774e38bb44a24b8e9bbfb67b08b51b.1652137848.git.reinette.chatre@intel.comSigned-off-by: NZhiquan Li <zhiquan1.li@intel.com>
-
由 Reinette Chatre 提交于
mainline inclusion from mainline-6.0-rc1 commit 45d546b8 category: feature bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5USAM CVE: NA Intel-SIG: commit 45d546b8 x86/sgx: Support modifying SGX page type. Backport for SGX EDMM support. -------------------------------- Every enclave contains one or more Thread Control Structures (TCS). The TCS contains meta-data used by the hardware to save and restore thread specific information when entering/exiting the enclave. With SGX1 an enclave needs to be created with enough TCSs to support the largest number of threads expecting to use the enclave and enough enclave pages to meet all its anticipated memory demands. In SGX1 all pages remain in the enclave until the enclave is unloaded. SGX2 introduces a new function, ENCLS[EMODT], that is used to change the type of an enclave page from a regular (SGX_PAGE_TYPE_REG) enclave page to a TCS (SGX_PAGE_TYPE_TCS) page or change the type from a regular (SGX_PAGE_TYPE_REG) or TCS (SGX_PAGE_TYPE_TCS) page to a trimmed (SGX_PAGE_TYPE_TRIM) page (setting it up for later removal). With the existing support of dynamically adding regular enclave pages to an initialized enclave and changing the page type to TCS it is possible to dynamically increase the number of threads supported by an enclave. Changing the enclave page type to SGX_PAGE_TYPE_TRIM is the first step of dynamically removing pages from an initialized enclave. The complete page removal flow is: 1) Change the type of the pages to be removed to SGX_PAGE_TYPE_TRIM using the SGX_IOC_ENCLAVE_MODIFY_TYPES ioctl() introduced here. 2) Approve the page removal by running ENCLU[EACCEPT] from within the enclave. 3) Initiate actual page removal using the ioctl() introduced in the following patch. Add ioctl() SGX_IOC_ENCLAVE_MODIFY_TYPES to support changing SGX enclave page types within an initialized enclave. With SGX_IOC_ENCLAVE_MODIFY_TYPES the user specifies a page range and the enclave page type to be applied to all pages in the provided range. The ioctl() itself can return an error code based on failures encountered by the kernel. It is also possible for SGX specific failures to be encountered. Add a result output parameter to communicate the SGX return code. It is possible for the enclave page type change request to fail on any page within the provided range. Support partial success by returning the number of pages that were successfully changed. After the page type is changed the page continues to be accessible from the kernel perspective with page table entries and internal state. The page may be moved to swap. Any access until ENCLU[EACCEPT] will encounter a page fault with SGX flag set in error code. Signed-off-by: NReinette Chatre <reinette.chatre@intel.com> Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com> Reviewed-by: NJarkko Sakkinen <jarkko@kernel.org> Tested-by: NJarkko Sakkinen <jarkko@kernel.org> Tested-by: NHaitao Huang <haitao.huang@intel.com> Tested-by: NVijay Dhanraj <vijay.dhanraj@intel.com> Link: https://lkml.kernel.org/r/babe39318c5bf16fc65fbfb38896cdee72161575.1652137848.git.reinette.chatre@intel.comSigned-off-by: NZhiquan Li <zhiquan1.li@intel.com>
-