- 10 May 2022, 14 commits
-
-
Committed by Sean Christopherson

stable inclusion
from stable-v5.10.97
commit 080dbe7e9b86a0392d8dffc00d9971792afc121f
bugzilla: https://gitee.com/openeuler/kernel/issues/I55O0O
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=080dbe7e9b86a0392d8dffc00d9971792afc121f

--------------------------------

commit f7e57078 upstream.

Forcibly leave nested virtualization operation if userspace toggles SMM state via KVM_SET_VCPU_EVENTS or KVM_SYNC_X86_EVENTS. If userspace forces the vCPU out of SMM while it's post-VMXON and then injects an SMI, vmx_enter_smm() will overwrite vmx->nested.smm.vmxon and end up with both vmxon=false and smm.vmxon=false, but with all other nVMX state allocated.

Don't attempt to gracefully handle the transition as (a) most transitions are nonsensical, e.g. forcing SMM while L2 is running, (b) there isn't sufficient information to handle all transitions, e.g. SVM wants access to the SMRAM save state, and (c) KVM_SET_VCPU_EVENTS must precede KVM_SET_NESTED_STATE during state restore as the latter disallows putting the vCPU into L2 if SMM is active, and disallows tagging the vCPU as being post-VMXON in SMM if SMM is not active.

Abuse of KVM_SET_VCPU_EVENTS manifests as a WARN and memory leak in nVMX due to failure to free vmcs01's shadow VMCS, but the bug goes far beyond just a memory leak, e.g. toggling SMM on while L2 is active puts the vCPU in an architecturally impossible state.

  WARNING: CPU: 0 PID: 3606 at free_loaded_vmcs arch/x86/kvm/vmx/vmx.c:2665 [inline]
  WARNING: CPU: 0 PID: 3606 at free_loaded_vmcs+0x158/0x1a0 arch/x86/kvm/vmx/vmx.c:2656
  Modules linked in:
  CPU: 1 PID: 3606 Comm: syz-executor725 Not tainted 5.17.0-rc1-syzkaller #0
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
  RIP: 0010:free_loaded_vmcs arch/x86/kvm/vmx/vmx.c:2665 [inline]
  RIP: 0010:free_loaded_vmcs+0x158/0x1a0 arch/x86/kvm/vmx/vmx.c:2656
  Code: <0f> 0b eb b3 e8 8f 4d 9f 00 e9 f7 fe ff ff 48 89 df e8 92 4d 9f 00
  Call Trace:
   <TASK>
   kvm_arch_vcpu_destroy+0x72/0x2f0 arch/x86/kvm/x86.c:11123
   kvm_vcpu_destroy arch/x86/kvm/../../../virt/kvm/kvm_main.c:441 [inline]
   kvm_destroy_vcpus+0x11f/0x290 arch/x86/kvm/../../../virt/kvm/kvm_main.c:460
   kvm_free_vcpus arch/x86/kvm/x86.c:11564 [inline]
   kvm_arch_destroy_vm+0x2e8/0x470 arch/x86/kvm/x86.c:11676
   kvm_destroy_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:1217 [inline]
   kvm_put_kvm+0x4fa/0xb00 arch/x86/kvm/../../../virt/kvm/kvm_main.c:1250
   kvm_vm_release+0x3f/0x50 arch/x86/kvm/../../../virt/kvm/kvm_main.c:1273
   __fput+0x286/0x9f0 fs/file_table.c:311
   task_work_run+0xdd/0x1a0 kernel/task_work.c:164
   exit_task_work include/linux/task_work.h:32 [inline]
   do_exit+0xb29/0x2a30 kernel/exit.c:806
   do_group_exit+0xd2/0x2f0 kernel/exit.c:935
   get_signal+0x4b0/0x28c0 kernel/signal.c:2862
   arch_do_signal_or_restart+0x2a9/0x1c40 arch/x86/kernel/signal.c:868
   handle_signal_work kernel/entry/common.c:148 [inline]
   exit_to_user_mode_loop kernel/entry/common.c:172 [inline]
   exit_to_user_mode_prepare+0x17d/0x290 kernel/entry/common.c:207
   __syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline]
   syscall_exit_to_user_mode+0x19/0x60 kernel/entry/common.c:300
   do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86
   entry_SYSCALL_64_after_hwframe+0x44/0xae
   </TASK>

Cc: stable@vger.kernel.org
Reported-by: syzbot+8112db3ab20e70d50c31@syzkaller.appspotmail.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220125220358.2091737-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Backported-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
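A minimal, stand-alone C model of the behaviour this patch describes — toggling SMM from userspace unconditionally tears nested state down rather than trying to translate it. Names (vcpu_state, leave_nested, set_smm_from_userspace) are illustrative, not KVM's actual code:

  #include <stdbool.h>
  #include <stdio.h>

  /* Illustrative stand-in for the vCPU's nested-virt bookkeeping. */
  struct vcpu_state {
      bool smm_active;   /* vCPU currently in SMM */
      bool nested_vmxon; /* post-VMXON outside of SMM */
      bool smm_vmxon;    /* post-VMXON state saved across SMM */
  };

  /* Tear down all nested state; mirrors "forcibly leave nested operation".
   * The real code also frees the shadow VMCS, cached pages, etc. */
  static void leave_nested(struct vcpu_state *v)
  {
      v->nested_vmxon = false;
      v->smm_vmxon = false;
  }

  /* Model of a userspace SET_VCPU_EVENTS that flips SMM state. */
  static void set_smm_from_userspace(struct vcpu_state *v, bool new_smm)
  {
      if (v->smm_active != new_smm) {
          /* Don't try to carry nested state across the toggle;
           * most such transitions are architecturally nonsensical. */
          leave_nested(v);
          v->smm_active = new_smm;
      }
  }

  int main(void)
  {
      struct vcpu_state v = { .smm_active = true, .smm_vmxon = true };
      set_smm_from_userspace(&v, false);
      printf("vmxon=%d smm.vmxon=%d\n", v.nested_vmxon, v.smm_vmxon);
      return 0;
  }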
-
Committed by Athira Rajeev

stable inclusion
from stable-v5.10.96
commit 55402a4618721f350a9ab660bb42717d8aa18e7c
bugzilla: https://gitee.com/openeuler/kernel/issues/I55NWB
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=55402a4618721f350a9ab660bb42717d8aa18e7c

--------------------------------

[ Upstream commit fb6433b4 ]

Running selftests with CONFIG_PPC_IRQ_SOFT_MASK_DEBUG enabled in the kernel triggered the below warning:

  [ 172.851380] ------------[ cut here ]------------
  [ 172.851391] WARNING: CPU: 8 PID: 2901 at arch/powerpc/include/asm/hw_irq.h:246 power_pmu_disable+0x270/0x280
  [ 172.851402] Modules linked in: dm_mod bonding nft_ct nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables rfkill nfnetlink sunrpc xfs libcrc32c pseries_rng xts vmx_crypto uio_pdrv_genirq uio sch_fq_codel ip_tables ext4 mbcache jbd2 sd_mod t10_pi sg ibmvscsi ibmveth scsi_transport_srp fuse
  [ 172.851442] CPU: 8 PID: 2901 Comm: lost_exception_ Not tainted 5.16.0-rc5-03218-g798527287598 #2
  [ 172.851451] NIP: c00000000013d600 LR: c00000000013d5a4 CTR: c00000000013b180
  [ 172.851458] REGS: c000000017687860 TRAP: 0700 Not tainted (5.16.0-rc5-03218-g798527287598)
  [ 172.851465] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE> CR: 48004884 XER: 20040000
  [ 172.851482] CFAR: c00000000013d5b4 IRQMASK: 1
  [ 172.851482] GPR00: c00000000013d5a4 c000000017687b00 c000000002a10600 0000000000000004
  [ 172.851482] GPR04: 0000000082004000 c0000008ba08f0a8 0000000000000000 00000008b7ed0000
  [ 172.851482] GPR08: 00000000446194f6 0000000000008000 c00000000013b118 c000000000d58e68
  [ 172.851482] GPR12: c00000000013d390 c00000001ec54a80 0000000000000000 0000000000000000
  [ 172.851482] GPR16: 0000000000000000 0000000000000000 c000000015d5c708 c0000000025396d0
  [ 172.851482] GPR20: 0000000000000000 0000000000000000 c00000000a3bbf40 0000000000000003
  [ 172.851482] GPR24: 0000000000000000 c0000008ba097400 c0000000161e0d00 c00000000a3bb600
  [ 172.851482] GPR28: c000000015d5c700 0000000000000001 0000000082384090 c0000008ba0020d8
  [ 172.851549] NIP [c00000000013d600] power_pmu_disable+0x270/0x280
  [ 172.851557] LR [c00000000013d5a4] power_pmu_disable+0x214/0x280
  [ 172.851565] Call Trace:
  [ 172.851568] [c000000017687b00] [c00000000013d5a4] power_pmu_disable+0x214/0x280 (unreliable)
  [ 172.851579] [c000000017687b40] [c0000000003403ac] perf_pmu_disable+0x4c/0x60
  [ 172.851588] [c000000017687b60] [c0000000003445e4] __perf_event_task_sched_out+0x1d4/0x660
  [ 172.851596] [c000000017687c50] [c000000000d1175c] __schedule+0xbcc/0x12a0
  [ 172.851602] [c000000017687d60] [c000000000d11ea8] schedule+0x78/0x140
  [ 172.851608] [c000000017687d90] [c0000000001a8080] sys_sched_yield+0x20/0x40
  [ 172.851615] [c000000017687db0] [c0000000000334dc] system_call_exception+0x18c/0x380
  [ 172.851622] [c000000017687e10] [c00000000000c74c] system_call_common+0xec/0x268

The warning indicates that MSR_EE was set (interrupts enabled) when an overflown PMC was detected. This can happen in power_pmu_disable since it runs under the interrupt soft-disable condition (local_irq_save) and not with interrupts hard disabled.

Commit 2c9ac51b ("powerpc/perf: Fix PMU callbacks to clear pending PMI before resetting an overflown PMC") intended to clear the pending-PMI bit in the PACA when disabling the PMU. However, a PMC can overflow while the code is in the power_pmu_disable callback function. Hence add a check to see whether the PMI pending bit is set in the PACA before clearing it via clear_pmi_pending.

Fixes: 2c9ac51b ("powerpc/perf: Fix PMU callbacks to clear pending PMI before resetting an overflown PMC")
Reported-by: Sachin Sant <sachinp@linux.ibm.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Sachin Sant <sachinp@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220122033429.25395-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
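A stand-alone sketch of the check-before-clear idea described above, assuming a simple boolean stands in for the per-CPU PACA state; names are illustrative, not the kernel's:

  #include <stdbool.h>
  #include <stdio.h>

  static bool pmi_pending;   /* a PMI was marked pending earlier */
  static bool pmc_overflown; /* a counter overflowed meanwhile   */

  /* Only clear the pending-PMI marker if it is actually set; an overflow
   * that arrives while the PMU is being disabled must not be discarded. */
  static void pmu_disable(void)
  {
      if (pmi_pending) {
          pmi_pending = false; /* clear_pmi_pending in the commit text */
          printf("cleared a genuinely pending PMI\n");
      } else if (pmc_overflown) {
          printf("overflow seen but no PMI marked pending - leave state alone\n");
      }
  }

  int main(void)
  {
      pmc_overflown = true; /* overflow raced with power_pmu_disable() */
      pmu_disable();
      return 0;
  }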
-
Committed by Naveen N. Rao

stable inclusion
from stable-v5.10.96
commit 129c71829d7f46423d95c19e8d87ce956d4c6e1c
bugzilla: https://gitee.com/openeuler/kernel/issues/I55NWB
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=129c71829d7f46423d95c19e8d87ce956d4c6e1c

--------------------------------

[ Upstream commit 3f5f766d ]

Johan reported the below crash with test_bpf on ppc64 e5500:

  test_bpf: #296 ALU_END_FROM_LE 64: 0x0123456789abcdef -> 0x67452301 jited:1
  Oops: Exception in kernel mode, sig: 4 [#1]
  BE PAGE_SIZE=4K SMP NR_CPUS=24 QEMU e500
  Modules linked in: test_bpf(+)
  CPU: 0 PID: 76 Comm: insmod Not tainted 5.14.0-03771-g98c2059e008a-dirty #1
  NIP: 8000000000061c3c LR: 80000000006dea64 CTR: 8000000000061c18
  REGS: c0000000032d3420 TRAP: 0700 Not tainted (5.14.0-03771-g98c2059e008a-dirty)
  MSR: 0000000080089000 <EE,ME> CR: 88002822 XER: 20000000 IRQMASK: 0
  <...>
  NIP [8000000000061c3c] 0x8000000000061c3c
  LR [80000000006dea64] .__run_one+0x104/0x17c [test_bpf]
  Call Trace:
   .__run_one+0x60/0x17c [test_bpf] (unreliable)
   .test_bpf_init+0x6a8/0xdc8 [test_bpf]
   .do_one_initcall+0x6c/0x28c
   .do_init_module+0x68/0x28c
   .load_module+0x2460/0x2abc
   .__do_sys_init_module+0x120/0x18c
   .system_call_exception+0x110/0x1b8
   system_call_common+0xf0/0x210
  --- interrupt: c00 at 0x101d0acc
  <...>
  ---[ end trace 47b2bf19090bb3d0 ]---

  Illegal instruction

The illegal instruction turned out to be 'ldbrx' emitted for BPF_FROM_[L|B]E, which was only introduced in ISA v2.06. Guard use of the same and implement an alternative approach for older processors.

Fixes: 156d0e29 ("powerpc/ebpf/jit: Implement JIT compiler for extended BPF")
Reported-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Tested-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Acked-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/d1e51c6fdf572062cf3009a751c3406bda01b832.1641468127.git.naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
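A sketch of the dispatch the fix describes — only emit ldbrx when the CPU implements ISA v2.06, otherwise fall back to an alternative byte-swap sequence. This is not the real JIT emitter; the feature flag and emit() helper are stand-ins:

  #include <stdbool.h>
  #include <stdio.h>

  static bool cpu_has_isa_v2_06; /* ldbrx exists only from ISA v2.06 */

  static void emit(const char *insn) /* pretend instruction emitter */
  {
      printf("emit: %s\n", insn);
  }

  /* Model of BPF_FROM_LE/BE 64-bit byte-swap code generation. */
  static void emit_bswap64(void)
  {
      if (cpu_has_isa_v2_06)
          emit("ldbrx"); /* single load-byte-reversed instruction */
      else
          /* older cores: build the swap from instructions they do have */
          emit("rotate/mask sequence (multiple instructions)");
  }

  int main(void)
  {
      cpu_has_isa_v2_06 = false; /* e.g. e5500 */
      emit_bswap64();
      return 0;
  }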
-
Committed by Christophe Leroy

stable inclusion
from stable-v5.10.96
commit b4c9b6afa3a737b5d02828d1f7183ebde282907c
bugzilla: https://gitee.com/openeuler/kernel/issues/I55NWB
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=b4c9b6afa3a737b5d02828d1f7183ebde282907c

--------------------------------

commit bba49665 upstream.

Boot fails with the GCC latent entropy plugin enabled. This is due to early boot functions trying to access the 'latent_entropy' global data while the kernel is not yet relocated to its final destination. As there is no way to tell GCC to use PTRRELOC() to access it, disable the latent entropy plugin for early_32.o, feature-fixups.o and code-patching.o.

Fixes: 38addce8 ("gcc-plugins: Add latent_entropy plugin")
Cc: stable@vger.kernel.org # v4.9+
Reported-by: Erhard Furtner <erhard_f@mailbox.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215217
Link: https://lore.kernel.org/r/2bac55483b8daf5b1caa163a45fa5f9cdbe18be4.1640178426.git.christophe.leroy@csgroup.eu
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Christophe Leroy

stable inclusion
from stable-v5.10.96
commit 50f5d0a8bd0ed41ac9477cfbcebe8d15e9efd35c
bugzilla: https://gitee.com/openeuler/kernel/issues/I55NWB
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=50f5d0a8bd0ed41ac9477cfbcebe8d15e9efd35c

--------------------------------

commit d37823c3 upstream.

Some configurations have been reported where the kernel doesn't boot with KASAN enabled. This is due to wrong BAT allocation for the KASAN area:

  ---[ Data Block Address Translation ]---
  0: 0xc0000000-0xcfffffff 0x00000000 256M Kernel rw m
  1: 0xd0000000-0xdfffffff 0x10000000 256M Kernel rw m
  2: 0xe0000000-0xefffffff 0x20000000 256M Kernel rw m
  3: 0xf8000000-0xf9ffffff 0x2a000000  32M Kernel rw m
  4: 0xfa000000-0xfdffffff 0x2c000000  64M Kernel rw m

A BAT must have both its virtual and physical addresses aligned to the size of the BAT. This is not the case for BAT 4 above.

Fix kasan_init_region() by using the block_size() function that is in book3s32/mmu.c. To be able to reuse it here, make it non-static and rename it to bat_block_size() in order to avoid a name conflict with block_size() defined in <linux/blkdev.h>.

Also reuse find_free_bat() to avoid an error message from setbat() when no BAT is available, and allocate memory outside of the linear memory mapping to avoid wasting that precious space.

With this change we get correct alignment for BATs and the KASAN shadow memory is allocated outside the linear memory space:

  ---[ Data Block Address Translation ]---
  0: 0xc0000000-0xcfffffff 0x00000000 256M Kernel rw
  1: 0xd0000000-0xdfffffff 0x10000000 256M Kernel rw
  2: 0xe0000000-0xefffffff 0x20000000 256M Kernel rw
  3: 0xf8000000-0xfbffffff 0x7c000000  64M Kernel rw
  4: 0xfc000000-0xfdffffff 0x7a000000  32M Kernel rw

Fixes: 7974c473 ("powerpc/32s: Implement dedicated kasan_init_region()")
Cc: stable@vger.kernel.org
Reported-by: Maxime Bizon <mbizon@freebox.fr>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Maxime Bizon <mbizon@freebox.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/7a50ef902494d1325227d47d33dada01e52e5518.1641818726.git.christophe.leroy@csgroup.eu
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
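The alignment rule this fix relies on ("a BAT must have both virtual and physical addresses aligned to the BAT size") can be checked in a couple of lines. This is a stand-alone illustration assuming power-of-two BAT sizes, not the kernel's bat_block_size()/setbat() code:

  #include <stdbool.h>
  #include <stdint.h>
  #include <stdio.h>

  /* A BAT of a (power-of-two) size is only valid when both its virtual
   * and its physical base are aligned to that size. */
  static bool bat_mapping_valid(uint32_t virt, uint32_t phys, uint32_t size)
  {
      return (virt & (size - 1)) == 0 && (phys & (size - 1)) == 0;
  }

  int main(void)
  {
      /* BAT 4 from the broken layout: 64M at virt 0xfa000000, phys 0x2c000000 */
      printf("%d\n", bat_mapping_valid(0xfa000000u, 0x2c000000u, 64u << 20)); /* 0 */
      /* BAT 3 from the fixed layout: 64M at virt 0xf8000000, phys 0x7c000000 */
      printf("%d\n", bat_mapping_valid(0xf8000000u, 0x7c000000u, 64u << 20)); /* 1 */
      return 0;
  }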
-
Committed by Christophe Leroy

stable inclusion
from stable-v5.10.96
commit 5d3af1dfdf0feb9bdcdebabf858842be808dd73f
bugzilla: https://gitee.com/openeuler/kernel/issues/I55NWB
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=5d3af1dfdf0feb9bdcdebabf858842be808dd73f

--------------------------------

commit 37eb7ca9 upstream.

Today we have the following IBATs allocated:

  ---[ Instruction Block Address Translation ]---
  0: 0xc0000000-0xc03fffff 0x00000000   4M Kernel x m
  1: 0xc0400000-0xc05fffff 0x00400000   2M Kernel x m
  2: 0xc0600000-0xc06fffff 0x00600000   1M Kernel x m
  3: 0xc0700000-0xc077ffff 0x00700000 512K Kernel x m
  4: 0xc0780000-0xc079ffff 0x00780000 128K Kernel x m
  5: 0xc07a0000-0xc07bffff 0x007a0000 128K Kernel x m
  6: -
  7: -

The two 128K entries should be a single 256K entry instead.

When _etext is not aligned to 128Kbytes, the system allocates all necessary BATs up to the lower 128Kbytes boundary, then allocates an additional 128Kbytes BAT for the remaining block.

Instead, align the top to 128Kbytes so that the function directly allocates a 256Kbytes last block:

  ---[ Instruction Block Address Translation ]---
  0: 0xc0000000-0xc03fffff 0x00000000   4M Kernel x m
  1: 0xc0400000-0xc05fffff 0x00400000   2M Kernel x m
  2: 0xc0600000-0xc06fffff 0x00600000   1M Kernel x m
  3: 0xc0700000-0xc077ffff 0x00700000 512K Kernel x m
  4: 0xc0780000-0xc07bffff 0x00780000 256K Kernel x m
  5: -
  6: -
  7: -

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/ab58b296832b0ec650e2203200e060adbcb2677d.1637930421.git.christophe.leroy@csgroup.eu
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
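A stand-alone sketch of the rounding described above; the _etext value used here is made up purely for illustration:

  #include <stdint.h>
  #include <stdio.h>

  #define SZ_128K (128u * 1024)
  #define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((a) - 1))

  int main(void)
  {
      /* Illustrative _etext somewhere inside the last 128K block above. */
      uint32_t etext = 0xc07a0000 + 0x1000;

      /* Before: covering text only up to the lower 128K boundary leaves a
       * 128K tail, so two 128K IBATs get used.  After: rounding the top up
       * to 128K lets a single 256K IBAT cover 0xc0780000-0xc07bffff. */
      uint32_t top = ALIGN_UP(etext, SZ_128K);
      printf("map executable text up to 0x%x\n", top);
      return 0;
  }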
-
Committed by Yazen Ghannam

stable inclusion
from stable-v5.10.96
commit 08f090bb9b6951a510437ef26ad78ffb3ee17142
bugzilla: https://gitee.com/openeuler/kernel/issues/I55NWB
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=08f090bb9b6951a510437ef26ad78ffb3ee17142

--------------------------------

commit 1f52b0ab upstream.

Changes to the AMD Thresholding sysfs code prevent sysfs writes from updating the underlying registers once CPU init is completed, i.e. once "threshold_banks" is set.

Allow the registers to be updated if the thresholding interface is already initialized or if we are in the init path. Use the "set_lvt_off" value to indicate running in the init path, since this value is only set during init.

Fixes: a037f3ca ("x86/mce/amd: Make threshold bank setting hotplug robust")
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220117161328.19148-1-yazen.ghannam@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
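A small model of the condition described above, with illustrative field names standing in for the kernel's state ("banks_initialized" stands in for "threshold_banks != NULL"); this is not the actual mce/amd code:

  #include <stdbool.h>
  #include <stdio.h>

  struct threshold_block_model {
      bool set_lvt_off;       /* true only while running the CPU init path */
      bool banks_initialized; /* thresholding interface already set up     */
  };

  /* A sysfs write may reprogram the threshold registers either on the
   * init path or once initialization has completed. */
  static bool may_update_registers(const struct threshold_block_model *b)
  {
      return b->set_lvt_off || b->banks_initialized;
  }

  int main(void)
  {
      struct threshold_block_model after_init = { .banks_initialized = true };
      printf("sysfs write updates hardware: %d\n", may_update_registers(&after_init));
      return 0;
  }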
-
Committed by D Scott Phillips

stable inclusion
from stable-v5.10.96
commit bf0d4ae5c6c28ac37655ea33926fa3cf1498169f
bugzilla: https://gitee.com/openeuler/kernel/issues/I55NWB
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=bf0d4ae5c6c28ac37655ea33926fa3cf1498169f

--------------------------------

commit 38e0257e upstream.

The erratum 1418040 workaround enables CNTVCT_EL1 access trapping in EL0 when executing compat threads. The workaround is applied when switching between tasks, but the need for the workaround can also change at an exec(), when a non-compat task execs a compat binary or vice versa. Apply the workaround in arch_setup_new_exec().

This leaves a small window of time between SET_PERSONALITY and arch_setup_new_exec where preemption could occur and confuse the old workaround logic that compares TIF_32BIT between prev and next. Instead, we can just read cntkctl to make sure it's in the state that the next task needs. I measured the cntkctl read time to be about the same as a mov from a general-purpose register on N1. Update the workaround logic to examine the current value of cntkctl instead of the previous task's compat state.

Fixes: d49f7d73 ("arm64: Move handling of erratum 1418040 into C code")
Cc: <stable@vger.kernel.org> # 5.9.x
Signed-off-by: D Scott Phillips <scott@os.amperecomputing.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211220234114.3926-1-scott@os.amperecomputing.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Conflicts:
	arch/arm64/kernel/process.c
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
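A stand-alone model of the reworked decision: derive "are we currently trapping?" from the current register value rather than from the previous task's compat flag. The register and bit are modelled with plain variables here (EL0VCTEN is used as an illustrative name for the EL0 virtual-counter enable bit); this is not the kernel's workaround code:

  #include <stdbool.h>
  #include <stdio.h>

  #define EL0VCTEN (1u << 1) /* illustrative: EL0 virtual counter access enable */

  static unsigned int cntkctl = EL0VCTEN; /* stand-in for the real register */

  static void erratum_1418040_apply(bool next_is_compat)
  {
      bool trapping = !(cntkctl & EL0VCTEN);

      if (next_is_compat && !trapping)
          cntkctl &= ~EL0VCTEN; /* start trapping CNTVCT_EL1 from EL0 */
      else if (!next_is_compat && trapping)
          cntkctl |= EL0VCTEN;  /* stop trapping for native tasks      */
  }

  int main(void)
  {
      erratum_1418040_apply(true);  /* exec()ing a compat binary */
      printf("cntkctl=0x%x\n", cntkctl);
      return 0;
  }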
-
Committed by Like Xu

stable inclusion
from stable-v5.10.96
commit e92cac1dd803aca5bc326ec22bdcd4f56855d7ce
bugzilla: https://gitee.com/openeuler/kernel/issues/I55NWB
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=e92cac1dd803aca5bc326ec22bdcd4f56855d7ce

--------------------------------

commit 4c282e51 upstream.

Do a runtime CPUID update for a vCPU if MSR_IA32_XSS is written, as the size in bytes of the XSAVE area is affected by the states enabled in XSS.

Fixes: 20300099 ("kvm: vmx: add MSR logic for XSAVES")
Cc: stable@vger.kernel.org
Signed-off-by: Like Xu <likexu@tencent.com>
[sean: split out as a separate patch, adjust Fixes tag]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220126172226.2298529-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
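A minimal sketch of the shape of the change: when a write to XSS changes the value, refresh the CPUID-derived state. The struct, helper names and the size arithmetic are placeholders, not KVM's implementation:

  #include <stdint.h>
  #include <stdio.h>

  struct vcpu_model {
      uint64_t ia32_xss;
      uint64_t cpuid_xsave_size; /* stand-in for the cached CPUID leaf */
  };

  static void update_cpuid_runtime(struct vcpu_model *v)
  {
      /* placeholder recomputation; the real code derives the XSAVE area
       * size from the states enabled in XCR0 and XSS */
      v->cpuid_xsave_size = 576 + (v->ia32_xss ? 256 : 0);
  }

  static void write_msr_xss(struct vcpu_model *v, uint64_t data)
  {
      if (v->ia32_xss != data) {
          v->ia32_xss = data;
          update_cpuid_runtime(v); /* the step the fix adds */
      }
  }

  int main(void)
  {
      struct vcpu_model v = { 0, 576 };
      write_msr_xss(&v, 1u << 8);
      printf("xsave size now %llu\n", (unsigned long long)v.cpuid_xsave_size);
      return 0;
  }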
-
Committed by Zhengjun Xing

stable inclusion
from stable-v5.10.96
commit 7a32d17fb73a607dcb0797cdd6edbccd76fa059a
bugzilla: https://gitee.com/openeuler/kernel/issues/I55NWB
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=7a32d17fb73a607dcb0797cdd6edbccd76fa059a

--------------------------------

commit 96fd2e89 upstream.

A user recently reported a perf issue on the ICX platform: when testing with the perf event "uncore_imc_x/cas_count_write", the reported write bandwidth is always very small (only 0.38MB/s). This is caused by the wrong "umask" for the "cas_count_write" event. Double-checking showed that "cas_count_read" is also wrong.

The public document for the ICX uncore, "3rd Gen Intel® Xeon® Processor Scalable Family, Codename Ice Lake, Uncore Performance Monitoring Reference Manual", Revision 1.00, May 2021, defines the Unit Masks for CAS_COUNT in section 2.4.7: RD b00001111, WR b00110000. So correct both "cas_count_read" and "cas_count_write" for ICX.

Old settings (hswep_uncore_imc_events):
  INTEL_UNCORE_EVENT_DESC(cas_count_read,  "event=0x04,umask=0x03")
  INTEL_UNCORE_EVENT_DESC(cas_count_write, "event=0x04,umask=0x0c")

New settings (snr_uncore_imc_events):
  INTEL_UNCORE_EVENT_DESC(cas_count_read,  "event=0x04,umask=0x0f")
  INTEL_UNCORE_EVENT_DESC(cas_count_write, "event=0x04,umask=0x30")

Fixes: 2b3b76b5 ("perf/x86/intel/uncore: Add Ice Lake server uncore support")
Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20211223144826.841267-1-zhengjun.xing@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Sean Christopherson

stable inclusion
from stable-v5.10.96
commit a2c8e1d9e41b7d916257653d3bbe36418c4e7b88
bugzilla: https://gitee.com/openeuler/kernel/issues/I55NWB
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=a2c8e1d9e41b7d916257653d3bbe36418c4e7b88

--------------------------------

commit 31c25585 upstream.

Revert a completely broken check on an "invalid" RIP in SVM's workaround for the DecodeAssists SMAP errata. kvm_vcpu_gfn_to_memslot() obviously expects a gfn, i.e. operates in the guest physical address space, whereas RIP is a virtual (not even linear) address. The "fix" worked for the problematic KVM selftest because the test identity mapped RIP.

Fully revert the hack instead of trying to translate RIP to a GPA, as the non-SEV case is now handled earlier, and KVM cannot access guest page tables to translate RIP.

This reverts commit e72436bc.

Fixes: e72436bc ("KVM: SVM: avoid infinite loop on NPF from bad address")
Reported-by: Liam Merwick <liam.merwick@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Message-Id: <20220120010719.711476-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Vasily Gorbik

stable inclusion
from stable-v5.10.96
commit 6520fedfcebb618bd3ff517222f9f0c72104728b
bugzilla: https://gitee.com/openeuler/kernel/issues/I55NWB
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=6520fedfcebb618bd3ff517222f9f0c72104728b

--------------------------------

commit 663d34c8 upstream.

Currently, if a z/VM guest is allowed to retrieve hypervisor performance data globally for all guests (privilege class B), the query is formed in a way that includes all guests, but the group name is left empty. This leads to z/VM guests which have an access control group set not being included in the results (not even the local VM).

Change the query group identifier from empty to "any" to retrieve information about all guests from any group (or without a group set).

Cc: stable@vger.kernel.org
Fixes: 31cb4bd3 ("[S390] Hypervisor filesystem (s390_hypfs) for z/VM")
Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Ilya Leoshkevich

stable inclusion
from stable-v5.10.96
commit c10e0627c71c13b6f491e4a193abc84d9f08727e
bugzilla: https://gitee.com/openeuler/kernel/issues/I55NWB
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=c10e0627c71c13b6f491e4a193abc84d9f08727e

--------------------------------

commit f3b7e73b upstream.

If the size of the PLT entries generated by apply_rela() exceeds 64KiB, the first ones can no longer reach __jump_r1 with brc. Fix this by using brcl. An alternative solution would be to add a __jump_r1 copy after every 64KiB; however, the space savings are quite small and do not justify the additional complexity.

Fixes: f19fbd5e ("s390: introduce execute-trampolines for branches")
Cc: stable@vger.kernel.org
Reported-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by David Matlack

stable inclusion
from stable-v5.10.95
commit a447d7f786ec925d1c23f6509255f43ffc2ddffe
bugzilla: https://gitee.com/openeuler/kernel/issues/I55EDV
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=a447d7f786ec925d1c23f6509255f43ffc2ddffe

--------------------------------

commit 7c8a4742 upstream.

When the TDP MMU is write-protecting GFNs for page table protection (as opposed to for dirty logging, or because the HVA is not writable), it checks whether the SPTE is already write-protected and, if so, skips modifying the SPTE and the TLB flush.

This behavior is incorrect because it fails to check whether the SPTE is write-protected for page table protection, i.e. fails to check that MMU-writable is '0'. If the SPTE was write-protected for dirty logging but not for page table protection, the SPTE could locklessly be made writable, and vCPUs could still be running with writable mappings cached in their TLB.

Fix this by only skipping setting the SPTE if the SPTE is already write-protected *and* MMU-writable is already clear. Technically, checking only MMU-writable would suffice; a SPTE cannot be writable without MMU-writable being set. But check both to be paranoid and because it arguably yields more readable code.

Fixes: 46044f72 ("kvm: x86/mmu: Support write protection for nesting in tdp MMU")
Cc: stable@vger.kernel.org
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20220113233020.3986005-2-dmatlack@google.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
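The corrected condition can be written down in a few lines. This is a self-contained model of the check described above, not the TDP MMU code itself ("mmu_writable" stands in for "may locklessly be made writable again"):

  #include <stdbool.h>
  #include <stdio.h>

  struct spte_model {
      bool writable;     /* hardware-writable bit                    */
      bool mmu_writable; /* can be re-made writable without MMU lock */
  };

  /* Only skip the SPTE update (and the TLB flush) when the SPTE is already
   * write-protected *and* MMU-writable is clear.  Checking the writable bit
   * alone is not enough: a SPTE write-protected only for dirty logging can
   * become writable again locklessly. */
  static bool can_skip_write_protect(const struct spte_model *s)
  {
      return !s->writable && !s->mmu_writable;
  }

  int main(void)
  {
      struct spte_model dirty_log_wp = { .writable = false, .mmu_writable = true };
      printf("skip? %d\n", can_skip_write_protect(&dirty_log_wp)); /* 0: must update */
      return 0;
  }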
-
- 28 April 2022, 1 commit
-
-
Committed by Zheng Zengkai

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I531X9
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=cacc6c30e3eb7c452132ee5b273e248d2f263323

--------------------------------

This reverts commit 270507d8.

Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
- 27 April 2022, 25 commits
-
-
Committed by Zhen Lei

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I545H8
CVE: NA

-------------------------------------------------------------------------

For the cases crashkernel=X@offset and crashkernel=X,high, we have explicitly used 'crashk_res' to mark the scope of the page-level mapping required, so NO_BLOCK_MAPPINGS should not be required for other areas. Otherwise, system performance will be affected. In fact, only the case crashkernel=X requires page-level mapping for all low memory under 4G, because it falls back to high memory after it fails to request low memory first, and we cannot predict its final location.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Zhen Lei

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I545H8
CVE: NA

-------------------------------------------------------------------------

If the crashkernel has both high memory above 4G and low memory under 4G, kexec always loads content such as the Image and the dtb into the high memory instead of the low memory. This means that only the high memory requires write protection based on page-level mapping. The allocation of high memory does not depend on the DMA boundary, so we can reserve the high memory first even if the crashkernel reservation is deferred.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Zhen Lei

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I545H8
CVE: NA

-------------------------------------------------------------------------

If the crashkernel reservation is deferred, its boundaries are not known when the linear mapping is created. But its upper limit is fixed and cannot be above 4G. Therefore, unless otherwise required, block mapping should be used for memory above 4G to improve performance.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Zhen Lei

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I545H8
CVE: NA

-------------------------------------------------------------------------

To be consistent with the style of other architectures such as x86, the kexec commit b5a34a20984c ("arm64: support more than one crash kernel regions") requires all crash regions to be named "Crash kernel". Update the name of crashk_low_res so that we can directly use the latest kexec tool without having to maintain a private version.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by liangtian

virt inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I53PTV?from=project-issue
CVE: NA

-----------------------------------------------------

Since the reset function is in the kvm_intel module instead of the kvm module, the weak-attribute function in kvm_main.c could not be found, which would cause st_max on x86 never to be refreshed. The solution is to define the reset function in x86.c under the kvm module.

Signed-off-by: liangtian <liangtian13@huawei.com>
Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Lakshmi Ramasubramanian

mainline inclusion
from mainline-v5.13-rc1
commit a45dd984
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I53YU3
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a45dd984dea9baa22b15fb692fe870ab5670a4a0

--------------------------------

There are a few "goto out;" statements before the local variable "fdt" is initialized through the call to of_kexec_alloc_and_setup_fdt() in elf64_load(). This results in an uninitialized "fdt" being passed to kvfree() if there is an error before the call to of_kexec_alloc_and_setup_fdt().

If there is any error after fdt is allocated, but before it is saved in the arch-specific kimage struct, free the fdt.

Fixes: 3c985d31 ("powerpc: Use common of_kexec_alloc_and_setup_fdt()")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20210421163610.23775-1-nramas@linux.microsoft.com
Signed-off-by: Lin Yujun <linyujun809@huawei.com>
Reviewed-by: Zhang Jianhua <chris.zjh@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
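A sketch of the error-handling shape described above, with hypothetical helper names: once the buffer has been allocated but before ownership is handed over to the kimage, every failure path must free it.

  #include <stdio.h>
  #include <stdlib.h>

  struct image { void *fdt; };

  static void *alloc_and_setup_fdt(void) { return malloc(64); }
  static int   finish_setup(void)        { return -1; /* pretend failure */ }

  static int example_load(struct image *img)
  {
      void *fdt;
      int ret;

      fdt = alloc_and_setup_fdt();
      if (!fdt)
          return -12;              /* -ENOMEM */

      ret = finish_setup();
      if (ret)
          goto err_free_fdt;       /* not yet owned by img: free it here */

      img->fdt = fdt;              /* ownership transferred */
      return 0;

  err_free_fdt:
      free(fdt);
      return ret;
  }

  int main(void)
  {
      struct image img = { 0 };
      printf("load: %d\n", example_load(&img));
      return 0;
  }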
-
Committed by Lakshmi Ramasubramanian

mainline inclusion
from mainline-v5.13-rc1
commit 031cc263
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I53YJF
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=031cc263c037a95e5d1249cbd3d55b77021f1eb8

--------------------------------

The uninitialized local variable "elf_info" would be passed to kexec_free_elf_info() if kexec_build_elf_info() returns an error in elf64_load(). If kexec_build_elf_info() returns an error, return the error immediately.

Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20210421163610.23775-2-nramas@linux.microsoft.com
Signed-off-by: Lin Yujun <linyujun809@huawei.com>
Reviewed-by: Zhang Jianhua <chris.zjh@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
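A small stand-alone illustration of the pattern: when the very first helper fails, return its error code directly instead of jumping to a common cleanup label that would touch still-uninitialized locals. Helper and struct names are illustrative, not the powerpc kexec code:

  #include <stdio.h>

  struct elf_info_model { int dummy; };

  static int build_elf_info(struct elf_info_model *info)
  {
      (void)info;
      return -22; /* -EINVAL, pretend failure */
  }

  static void free_elf_info(struct elf_info_model *info) { (void)info; }

  static int example_elf64_load(void)
  {
      struct elf_info_model elf_info;
      int ret;

      ret = build_elf_info(&elf_info);
      if (ret)
          return ret; /* nothing set up yet: plain return, no cleanup label */

      /* ... the rest of the load path may now use goto-based cleanup ... */
      free_elf_info(&elf_info);
      return 0;
  }

  int main(void)
  {
      printf("elf64_load: %d\n", example_elf64_load());
      return 0;
  }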
-
Committed by Arnd Bergmann

stable inclusion
from stable-v5.10.108
commit 2c010c61e614f3ae5d26bf0803797075cc649f0b
category: bugfix
bugzilla: 186460 https://gitee.com/src-openeuler/kernel/issues/I53MHA
CVE: CVE-2022-23960
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=2c010c61e614

--------------------------------

[ Upstream commit 7f34b43e ]

The newly introduced TRAMP_VALIAS definition causes a build warning with clang-14:

  arch/arm64/include/asm/vectors.h:66:31: error: arithmetic on a null pointer treated as a cast from integer to pointer is a GNU extension [-Werror,-Wnull-pointer-arithmetic]
          return (char *)TRAMP_VALIAS + SZ_2K * slot;

Change the addition to something clang does not complain about.

Fixes: bd09128d ("arm64: Add percpu vectors for EL1")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20220316183833.1563139-1-arnd@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Chen Jiahao <chenjiahao16@huawei.com>
Reviewed-by: Liao Chang <liaochang1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
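A sketch of the kind of change described: performing the addition in integer arithmetic first and casting the result avoids clang's -Wnull-pointer-arithmetic complaint. The exact expression used upstream is assumed here, and the TRAMP_VALIAS value of 0 is only for illustration:

  #define SZ_2K 0x800
  #define TRAMP_VALIAS 0UL   /* illustrative: 0 when the vector alias is unused */

  /* before (warns): return (char *)TRAMP_VALIAS + SZ_2K * slot;  */
  static inline const char *tramp_vector(int slot)
  {
      return (const char *)(TRAMP_VALIAS + SZ_2K * slot);
  }

  int main(void)
  {
      const char *v = tramp_vector(1);
      return v == (const char *)0 ? 1 : 0;
  }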
-
Committed by James Morse
stable inclusion from stable-v5.10.107 commit 7a0d13ef67a1084e1a77bf4d2334cc482699f861 category: bugfix bugzilla: 186460 https://gitee.com/src-openeuler/kernel/issues/I53MHA CVE: CVE-2022-23960 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=7a0d13ef67a1084e1a77bf4d2334cc482699f861 -------------------------------- KVM's infrastructure for spectre mitigations in the vectors in v5.10 and earlier is different, it uses templates which are used to build a set of vectors at runtime. There are two copy-and-paste errors in the templates: __spectre_bhb_loop_k24 should loop 24 times and __spectre_bhb_loop_k32 32. Fix these. Reported-by: NPavel Machek <pavel@denx.de> Link: https://lore.kernel.org/all/20220310234858.GB16308@amd/Signed-off-by: NJames Morse <james.morse@arm.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jiahao <chenjiahao16@huawei.com> Reviewed-by: NLiao Chang <liaochang1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
Committed by James Morse
stable inclusion from stable-v5.10.105 commit b65b87e718c33caa46d5246d8fbeda895aa9cf5b category: bugfix bugzilla: 186460 https://gitee.com/src-openeuler/kernel/issues/I53MHA CVE: CVE-2022-23960 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=b65b87e718c3 -------------------------------- commit 58c9a506 upstream. The mitigations for Spectre-BHB are only applied when an exception is taken from user-space. The mitigation status is reported via the spectre_v2 sysfs vulnerabilities file. When unprivileged eBPF is enabled the mitigation in the exception vectors can be avoided by an eBPF program. When unprivileged eBPF is enabled, print a warning and report vulnerable via the sysfs vulnerabilities file. Acked-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NJames Morse <james.morse@arm.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jiahao <chenjiahao16@huawei.com> Reviewed-by: NLiao Chang <liaochang1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
Committed by James Morse
stable inclusion from stable-v5.10.105 commit 551717cf3b58f11311d10f70eb027d4b275135de category: bugfix bugzilla: 186460 https://gitee.com/src-openeuler/kernel/issues/I53MHA CVE: CVE-2022-23960 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=551717cf3b58 -------------------------------- commit 228a26b9 upstream. Future CPUs may implement a clearbhb instruction that is sufficient to mitigate SpectreBHB. CPUs that implement this instruction, but not CSV2.3 must be affected by Spectre-BHB. Add support to use this instruction as the BHB mitigation on CPUs that support it. The instruction is in the hint space, so it will be treated by a NOP as older CPUs. Reviewed-by: NRussell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com> [ modified for stable: Use a KVM vector template instead of alternatives, removed bitmap of mitigations ] Signed-off-by: NJames Morse <james.morse@arm.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jiahao <chenjiahao16@huawei.com> Reviewed-by: NLiao Chang <liaochang1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
Committed by James Morse
stable inclusion from stable-v5.10.105 commit 38c26bdb3cc53f219d6ab75ac1a95436f393c60f category: bugfix bugzilla: 186460 https://gitee.com/src-openeuler/kernel/issues/I53MHA CVE: CVE-2022-23960 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=38c26bdb3cc5 -------------------------------- commit a5905d6a upstream. KVM allows the guest to discover whether the ARCH_WORKAROUND SMCCC are implemented, and to preserve that state during migration through its firmware register interface. Add the necessary boiler plate for SMCCC_ARCH_WORKAROUND_3. Reviewed-by: NRussell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NJames Morse <james.morse@arm.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jiahao <chenjiahao16@huawei.com> Reviewed-by: NLiao Chang <liaochang1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
Committed by James Morse
stable inclusion from stable-v5.10.105 commit e192c8baa69ac8a5585d61ac535aa1e5eb795e80 category: bugfix bugzilla: 186460 https://gitee.com/src-openeuler/kernel/issues/I53MHA CVE: CVE-2022-23960 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=e192c8baa69a -------------------------------- commit 558c303c upstream. Speculation attacks against some high-performance processors can make use of branch history to influence future speculation. When taking an exception from user-space, a sequence of branches or a firmware call overwrites or invalidates the branch history. The sequence of branches is added to the vectors, and should appear before the first indirect branch. For systems using KPTI the sequence is added to the kpti trampoline where it has a free register as the exit from the trampoline is via a 'ret'. For systems not using KPTI, the same register tricks are used to free up a register in the vectors. For the firmware call, arch-workaround-3 clobbers 4 registers, so there is no choice but to save them to the EL1 stack. This only happens for entry from EL0, so if we take an exception due to the stack access, it will not become re-entrant. For KVM, the existing branch-predictor-hardening vectors are used. When a spectre version of these vectors is in use, the firmware call is sufficient to mitigate against Spectre-BHB. For the non-spectre versions, the sequence of branches is added to the indirect vector. Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com> [ modified for stable, removed bitmap of mitigations, use kvm template infrastructure ] Signed-off-by: NJames Morse <james.morse@arm.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jiahao <chenjiahao16@huawei.com> Reviewed-by: NLiao Chang <liaochang1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
Committed by James Morse
stable inclusion from stable-v5.10.105 commit 192023e6baf7cce7fb76ff3a5c24c55968c774ff category: bugfix bugzilla: 186460 https://gitee.com/src-openeuler/kernel/issues/I53MHA CVE: CVE-2022-23960 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=192023e6baf7 -------------------------------- commit 5bdf3437 upstream. CPUs vulnerable to Spectre-BHB either need to make an SMC-CC firmware call from the vectors, or run a sequence of branches. This gets added to the hyp vectors. If there is no support for arch-workaround-1 in firmware, the indirect vector will be used. kvm_init_vector_slots() only initialises the two indirect slots if the platform is vulnerable to Spectre-v3a. pKVM's hyp_map_vectors() only initialises __hyp_bp_vect_base if the platform is vulnerable to Spectre-v3a. As there are about to more users of the indirect vectors, ensure their entries in hyp_spectre_vector_selector[] are always initialised, and __hyp_bp_vect_base defaults to the regular VA mapping. The Spectre-v3a check is moved to a helper kvm_system_needs_idmapped_vectors(), and merged with the code that creates the hyp mappings. Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NJames Morse <james.morse@arm.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jiahao <chenjiahao16@huawei.com> Reviewed-by: NLiao Chang <liaochang1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
Committed by James Morse
stable inclusion from stable-v5.10.105 commit 13a807a0a080383ceab6c40e53c0228108423e51 category: bugfix bugzilla: 186460 https://gitee.com/src-openeuler/kernel/issues/I53MHA CVE: CVE-2022-23960 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=13a807a0a080 -------------------------------- commit dee435be upstream. Speculation attacks against some high-performance processors can make use of branch history to influence future speculation as part of a spectre-v2 attack. This is not mitigated by CSV2, meaning CPUs that previously reported 'Not affected' are now moderately mitigated by CSV2. Update the value in /sys/devices/system/cpu/vulnerabilities/spectre_v2 to also show the state of the BHB mitigation. Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NJames Morse <james.morse@arm.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jiahao <chenjiahao16@huawei.com> Reviewed-by: NLiao Chang <liaochang1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
Committed by James Morse
stable inclusion from stable-v5.10.105 commit 1f63326a5211208e2c5868650e47f13a9072afde category: bugfix bugzilla: 186460 https://gitee.com/src-openeuler/kernel/issues/I53MHA CVE: CVE-2022-23960 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=1f63326a5211 -------------------------------- commit bd09128d upstream. The Spectre-BHB workaround adds a firmware call to the vectors. This is needed on some CPUs, but not others. To avoid the unaffected CPU in a big/little pair from making the firmware call, create per cpu vectors. The per-cpu vectors only apply when returning from EL0. Systems using KPTI can use the canonical 'full-fat' vectors directly at EL1, the trampoline exit code will switch to this_cpu_vector on exit to EL0. Systems not using KPTI should always use this_cpu_vector. this_cpu_vector will point at a vector in tramp_vecs or __bp_harden_el1_vectors, depending on whether KPTI is in use. Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NJames Morse <james.morse@arm.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jiahao <chenjiahao16@huawei.com> Reviewed-by: NLiao Chang <liaochang1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
Committed by James Morse
stable inclusion from stable-v5.10.105 commit 56cf5326bdf9c20de9a45e4a7a4c0ae16833e561 category: bugfix bugzilla: 186460 https://gitee.com/src-openeuler/kernel/issues/I53MHA CVE: CVE-2022-23960 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=56cf5326bdf9 -------------------------------- commit b28a8eeb upstream. The trampoline code needs to use the address of symbols in the wider kernel, e.g. vectors. PC-relative addressing wouldn't work as the trampoline code doesn't run at the address the linker expected. tramp_ventry uses a literal pool, unless CONFIG_RANDOMIZE_BASE is set, in which case it uses the data page as a literal pool because the data page can be unmapped when running in user-space, which is required for CPUs vulnerable to meltdown. Pull this logic out as a macro, instead of adding a third copy of it. Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NJames Morse <james.morse@arm.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jiahao <chenjiahao16@huawei.com> Reviewed-by: NLiao Chang <liaochang1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
Committed by James Morse
stable inclusion from stable-v5.10.105 commit 3f21b7e355237aa2f8196ad44c2b7456a739518d category: bugfix bugzilla: 186460 https://gitee.com/src-openeuler/kernel/issues/I53MHA CVE: CVE-2022-23960 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=3f21b7e35523 -------------------------------- commit ba268923 upstream. Some CPUs affected by Spectre-BHB need a sequence of branches, or a firmware call to be run before any indirect branch. This needs to go in the vectors. No CPU needs both. While this can be patched in, it would run on all CPUs as there is a single set of vectors. If only one part of a big/little combination is affected, the unaffected CPUs have to run the mitigation too. Create extra vectors that include the sequence. Subsequent patches will allow affected CPUs to select this set of vectors. Later patches will modify the loop count to match what the CPU requires. Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NJames Morse <james.morse@arm.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jiahao <chenjiahao16@huawei.com> Reviewed-by: NLiao Chang <liaochang1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
Committed by James Morse
stable inclusion from stable-v5.10.105 commit 49379552969acee3237387cc258848437e127d98 category: bugfix bugzilla: 186460 https://gitee.com/src-openeuler/kernel/issues/I53MHA CVE: CVE-2022-23960 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=49379552969a -------------------------------- commit aff65393 upstream. kpti is an optional feature, for systems not using kpti a set of vectors for the spectre-bhb mitigations is needed. Add another set of vectors, __bp_harden_el1_vectors, that will be used if a mitigation is needed and kpti is not in use. The EL1 ventries are repeated verbatim as there is no additional work needed for entry from EL1. Reviewed-by: NRussell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NJames Morse <james.morse@arm.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jiahao <chenjiahao16@huawei.com> Reviewed-by: NLiao Chang <liaochang1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
Committed by James Morse
stable inclusion from stable-v5.10.105 commit 26211252c1c104732a0fea6c37645f1b670587f5 category: bugfix bugzilla: 186460 https://gitee.com/src-openeuler/kernel/issues/I53MHA CVE: CVE-2022-23960 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=26211252c1c1 -------------------------------- commit a9c406e6 upstream. Adding a second set of vectors to .entry.tramp.text will make it larger than a single 4K page. Allow the trampoline text to occupy up to three pages by adding two more fixmap slots. Previous changes to tramp_valias allowed it to reach beyond a single page. Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NJames Morse <james.morse@arm.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jiahao <chenjiahao16@huawei.com> Reviewed-by: NLiao Chang <liaochang1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
Committed by James Morse
stable inclusion from stable-v5.10.105 commit 73ee716a1f6356ca86d16d4ffc97fcfc7961d3ef category: bugfix bugzilla: 186460 https://gitee.com/src-openeuler/kernel/issues/I53MHA CVE: CVE-2022-23960 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=73ee716a1f63 -------------------------------- commit c47e4d04 upstream. Spectre-BHB needs to add sequences to the vectors. Having one global set of vectors is a problem for big/little systems where the sequence is costly on cpus that are not vulnerable. Making the vectors per-cpu in the style of KVM's bh_harden_hyp_vecs requires the vectors to be generated by macros. Make the kpti re-mapping of the kernel optional, so the macros can be used without kpti. Reviewed-by: NRussell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NJames Morse <james.morse@arm.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jiahao <chenjiahao16@huawei.com> Reviewed-by: NLiao Chang <liaochang1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
Committed by James Morse
stable inclusion from stable-v5.10.105 commit 8c691e5308c531deede16bef4f2d933d5f859ce7 category: bugfix bugzilla: 186460 https://gitee.com/src-openeuler/kernel/issues/I53MHA CVE: CVE-2022-23960 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=8c691e5308c5 -------------------------------- commit 13d7a083 upstream. The macros for building the kpti trampoline are all behind CONFIG_UNMAP_KERNEL_AT_EL0, and in a region that outputs to the .entry.tramp.text section. Move the macros out so they can be used to generate other kinds of trampoline. Only the symbols need to be guarded by CONFIG_UNMAP_KERNEL_AT_EL0 and appear in the .entry.tramp.text section. Reviewed-by: NRussell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NJames Morse <james.morse@arm.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jiahao <chenjiahao16@huawei.com> Reviewed-by: NLiao Chang <liaochang1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
Committed by James Morse
stable inclusion from stable-v5.10.105 commit e55025063276fcf7b07e9340c38d70b04aa8a7b9 category: bugfix bugzilla: 186460 https://gitee.com/src-openeuler/kernel/issues/I53MHA CVE: CVE-2022-23960 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=e55025063276 -------------------------------- commit ed50da77 upstream. The tramp_ventry macro uses tramp_vectors as the address of the vectors when calculating which ventry in the 'full fat' vectors to branch to. While there is one set of tramp_vectors, this will be true. Adding multiple sets of vectors will break this assumption. Move the generation of the vectors to a macro, and pass the start of the vectors as an argument to tramp_ventry. Reviewed-by: NRussell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NJames Morse <james.morse@arm.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jiahao <chenjiahao16@huawei.com> Reviewed-by: NLiao Chang <liaochang1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
Committed by James Morse
stable inclusion from stable-v5.10.105 commit 5275fb5ea5f573ce1ecd2bf0bcd928abb916b43d category: bugfix bugzilla: 186460 https://gitee.com/src-openeuler/kernel/issues/I53MHA CVE: CVE-2022-23960 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=5275fb5ea5f5 -------------------------------- commit 6c5bf79b upstream. Systems using kpti enter and exit the kernel through a trampoline mapping that is always mapped, even when the kernel is not. tramp_valias is a macro to find the address of a symbol in the trampoline mapping. Adding extra sets of vectors will expand the size of the entry.tramp.text section to beyond 4K. tramp_valias will be unable to generate addresses for symbols beyond 4K as it uses the 12 bit immediate of the add instruction. As there are now two registers available when tramp_alias is called, use the extra register to avoid the 4K limit of the 12 bit immediate. Reviewed-by: NRussell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NJames Morse <james.morse@arm.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jiahao <chenjiahao16@huawei.com> Reviewed-by: NLiao Chang <liaochang1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
Committed by James Morse
stable inclusion from stable-v5.10.105 commit bda89602814c69e6f027878209b0b9453133ada2 category: bugfix bugzilla: 186460 https://gitee.com/src-openeuler/kernel/issues/I53MHA CVE: CVE-2022-23960 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=bda89602814c -------------------------------- commit c091fb6a upstream. The trampoline code has a data page that holds the address of the vectors, which is unmapped when running in user-space. This ensures that with CONFIG_RANDOMIZE_BASE, the randomised address of the kernel can't be discovered until after the kernel has been mapped. If the trampoline text page is extended to include multiple sets of vectors, it will be larger than a single page, making it tricky to find the data page without knowing the size of the trampoline text pages, which will vary with PAGE_SIZE. Move the data page to appear before the text page. This allows the data page to be found without knowing the size of the trampoline text pages. 'tramp_vectors' is used to refer to the beginning of the .entry.tramp.text section, do that explicitly. Reviewed-by: NRussell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NJames Morse <james.morse@arm.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jiahao <chenjiahao16@huawei.com> Reviewed-by: NLiao Chang <liaochang1@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-