- 18 November 2022, 5 commits
-
-
Committed by Jason A. Donenfeld

It's very unusual to have both a command line option and a compile time option, and apparently that's confusing to people. Also, basically everybody enables the compile time option now, which means people who want to disable this wind up having to use the command line option to ensure that anyway. So just reduce the number of moving pieces and nix the compile time option in favor of the more versatile command line option.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
-
Committed by Jason A. Donenfeld

The RNG always mixes in the Linux version extremely early in boot. It also always includes a cycle counter, not only during early boot, but each and every time it is invoked prior to being fully initialized. Together, this means that the use of additional xors inside of the various stackprotector.h files is superfluous and over-complicated. Instead, we can get exactly the same thing, but better, by just calling `get_random_canary()`.

Acked-by: Guo Ren <guoren@kernel.org> # for csky
Acked-by: Catalin Marinas <catalin.marinas@arm.com> # for arm64
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
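A hedged sketch of the simplification described above, in an x86 flavour; the function names are invented for illustration and the per-architecture details differ:

```c
#include <linux/random.h>
#include <linux/sched.h>

/* Before (sketch): the arch mixed the TSC into the canary by hand. */
static __always_inline void stack_canary_before(void)
{
	u64 canary;
	u64 tsc = rdtsc();

	get_random_bytes(&canary, sizeof(canary));
	canary += tsc + (tsc << 32UL);
	canary &= CANARY_MASK;
	current->stack_canary = canary;
}

/* After: the RNG already credits a cycle counter, so this suffices. */
static __always_inline void stack_canary_after(void)
{
	current->stack_canary = get_random_canary();
}
```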
-
Committed by Jason A. Donenfeld

This has nothing to do with random.c and everything to do with stack protectors. Yes, it uses randomness. But many things use randomness. random.h and random.c are concerned with the generation of randomness, not with each and every use. So move this function into the more specific stackprotector.h file where it belongs.

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
-
Committed by Jason A. Donenfeld

These cases were done with this Coccinelle:

@@
expression H;
expression L;
@@
- (get_random_u32_below(H) + L)
+ get_random_u32_inclusive(L, H + L - 1)

@@
expression H;
expression L;
expression E;
@@
  get_random_u32_inclusive(L, H
- + E
- - E
  )

@@
expression H;
expression L;
expression E;
@@
  get_random_u32_inclusive(L, H
- - E
- + E
  )

@@
expression H;
expression L;
expression E;
expression F;
@@
  get_random_u32_inclusive(L, H
- - E
  + F
- + E
  )

@@
expression H;
expression L;
expression E;
expression F;
@@
  get_random_u32_inclusive(L, H
- + E
  + F
- - E
  )

And then subsequently cleaned up by hand, with several automatic cases rejected if it didn't make sense contextually.

Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> # for infiniband
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
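To make the transformation concrete, a hypothetical call site before and after (the function is invented for illustration):

```c
#include <linux/random.h>

/* Hypothetical call site: pick a retry delay in the inclusive range [1, 10]. */
static u32 pick_delay(void)
{
	/* Before: exclusive upper bound, with the floor added by hand. */
	/* return get_random_u32_below(10) + 1; */

	/* After: the inclusive helper states the intended range directly. */
	return get_random_u32_inclusive(1, 10);
}
```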
-
Committed by Jason A. Donenfeld

This is a simple mechanical transformation done by:

@@
expression E;
@@
- prandom_u32_max
+ get_random_u32_below
  (E)

Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs
Reviewed-by: SeongJae Park <sj@kernel.org> # for damon
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> # for infiniband
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> # for arm
Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
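A hypothetical call site (names invented for illustration) showing the rename has identical semantics:

```c
#include <linux/kernel.h>
#include <linux/random.h>

static const int table[16];

/* Hypothetical call site: a uniformly distributed index into a table. */
static size_t pick_index(void)
{
	/* Before: prandom_u32_max(n) returned a value in [0, n). */
	/* return prandom_u32_max(ARRAY_SIZE(table)); */

	/* After: identical semantics, now served by the real RNG. */
	return get_random_u32_below(ARRAY_SIZE(table));
}
```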
-
- 11 November 2022, 12 commits
-
-
Committed by Jason A. Donenfeld

get_port_from_cmdline() returns an int, yet is assigned to a char, which is wrong in its own right, but also, with char becoming unsigned, this poses problems, because -1 is used as an error value. Further complicating things, fw_init_early_console() is only ever called with a -1 argument. Fix this up by removing the unused argument from fw_init_early_console() and treating port as a proper signed integer.

Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
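A minimal sketch of the bug pattern, assuming the shape described above (the wrapper function is hypothetical):

```c
/* Stand-in declaration for the real firmware helper. */
int get_port_from_cmdline(char *arcs_cmdline);	/* returns port, or -1 */

static void broken_early_console(char *arcs_cmdline)
{
	/*
	 * BUG: once char is unsigned, the -1 stored here becomes 255,
	 * so the error check below can never fire.
	 */
	char port = get_port_from_cmdline(arcs_cmdline);

	if (port == -1)		/* always false with unsigned char */
		port = 0;

	/* The fix: declare 'port' as a plain int so -1 survives. */
}
```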
-
Committed by Jiaxun Yang

Cast the upper bound of the branch range to long to do a signed compare, avoiding a negative offset triggering this warning.

Fixes: 9b6584e3 ("MIPS: jump_label: Use compact branches for >= r6")
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
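A hedged sketch of the comparison pattern being fixed, not the exact MIPS code; J_RANGE stands in for the real unsigned branch-range constant:

```c
#include <linux/errno.h>

/* Stand-in for the actual unsigned branch-range constant. */
#define J_RANGE		0x0fffffffUL

static int check_branch_offset(long offset)
{
	/*
	 * Before: 'offset > J_RANGE' promoted the signed offset to
	 * unsigned, so a negative offset compared as a huge positive
	 * value. After: the bound is cast, so the compare stays signed.
	 */
	if (offset > (long)J_RANGE)
		return -EINVAL;
	return 0;
}
```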
-
Committed by Linus Walleij

The local GPIO driver in the MIPS Alchemy is including the legacy <linux/gpio.h> header, but what it wants is to implement a GPIO driver, so include <linux/gpio/driver.h> instead.

Cc: Bartosz Golaszewski <brgl@bgdev.pl>
Cc: linux-gpio@vger.kernel.org
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
-
Committed by Liao Chang

Add a WARN_ON when kexec-related kmalloc calls fail, to avoid passing a NULL pointer to the subsequent memcpy() and loongson_kexec_prepare().

Fixes: 6ce48897 ("MIPS: Loongson64: Add kexec/kdump support")
Signed-off-by: Liao Chang <liaochang1@huawei.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
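A hedged sketch of the hardening; the helper name and size are illustrative, not the in-tree code:

```c
#include <linux/bug.h>
#include <linux/slab.h>

static void *kexec_buf_alloc(size_t size)
{
	void *buf = kmalloc(size, GFP_KERNEL);

	/*
	 * Fail loudly here instead of letting memcpy() or
	 * loongson_kexec_prepare() dereference NULL later.
	 */
	if (WARN_ON(!buf))
		return NULL;
	return buf;
}
```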
-
Committed by Rongwei Zhang

Building with clang-14 fails with:

  AS      arch/mips/kernel/relocate_kernel.o
  <unknown>:0: error: symbol 'kexec_args' is already defined
  <unknown>:0: error: symbol 'secondary_kexec_args' is already defined
  <unknown>:0: error: symbol 'kexec_start_address' is already defined
  <unknown>:0: error: symbol 'kexec_indirection_page' is already defined
  <unknown>:0: error: symbol 'relocate_new_kernel_size' is already defined

It turns out EXPORT, defined in asm/asm.h, expands to a symbol definition, so there is no need to define these symbols again. Remove the duplicated symbol definitions.

Fixes: 7aa1c8f4 ("MIPS: kdump: Add support")
Signed-off-by: Rongwei Zhang <pudh4418@gmail.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
-
Committed by John Thomson

In the MIPS CONFIG_SYS_SUPPORTS_ZBOOT kernel, fix the compile error when using CONFIG_FORTIFY_SOURCE=y:

  LD      vmlinuz
  mipsel-openwrt-linux-musl-ld: arch/mips/boot/compressed/decompress.o: in function `decompress_kernel':
  ./include/linux/decompress/mm.h:(.text.decompress_kernel+0x177c): undefined reference to `warn_slowpath_fmt'

The kernel test robot helped identify this as related to fortify. The error appeared with commit 54d9469b ("fortify: Add run-time WARN for cross-field memcpy()").

Link: https://lore.kernel.org/r/202209161144.x9xSqNQZ-lkp@intel.com/

Resolve this in the same style as commit cfecea6e ("lib/string: Move helper functions out of string.c").

Reported-by: kernel test robot <lkp@intel.com>
Fixes: 54d9469b ("fortify: Add run-time WARN for cross-field memcpy()")
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: John Thomson <git@johnthomson.fastmail.com.au>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
-
Committed by Sean Christopherson

When zapping a GFN range, pass 0 => ALL_ONES for the to-be-invalidated range to effectively block all page faults while the zap is in progress. The invalidation helpers take a host virtual address, whereas zapping a GFN obviously provides a guest physical address, and with the wrong unit of measurement (frame vs. byte). Alternatively, KVM could walk all memslots to get the associated HVAs, but thanks to SMM, that would require multiple lookups. And practically speaking, kvm_zap_gfn_range() usage is quite rare and not a hot path, e.g. MTRR and CR0.CD are almost guaranteed to be done only on vCPU0 during boot, and APICv inhibits are similarly infrequent operations.

Fixes: edb298c6 ("KVM: x86/mmu: bump mmu notifier count in kvm_zap_gfn_range")
Reported-by: Chao Peng <chao.p.peng@linux.intel.com>
Cc: stable@vger.kernel.org
Cc: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221111001841.2412598-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Nathan Chancellor

Recently, ld.lld moved from '--undefined-version' to '--no-undefined-version' as the default, which breaks the compat vDSO build:

  ld.lld: error: version script assignment of 'LINUX_4.15' to symbol '__vdso_gettimeofday' failed: symbol not defined
  ld.lld: error: version script assignment of 'LINUX_4.15' to symbol '__vdso_clock_gettime' failed: symbol not defined
  ld.lld: error: version script assignment of 'LINUX_4.15' to symbol '__vdso_clock_getres' failed: symbol not defined

These symbols are not present in the compat vDSO or the regular vDSO for 32-bit, but they are unconditionally included in the version section of the linker script, which is prohibited with '--no-undefined-version'. Fix this issue by only including the symbols that are actually exported in the version section of the linker script.

Link: https://github.com/ClangBuiltLinux/linux/issues/1756
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20221108171324.3377226-1-nathan@kernel.org/
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
Committed by Conor Dooley

Currently, RISC-V sets up reserved memory using the "early" copy of the device tree. As a result, when trying to get a reserved memory region using of_reserved_mem_lookup(), the pointer to reserved memory regions is using the early, pre-virtual-memory address, which causes a kernel panic when trying to use the buffer's name:

  Unable to handle kernel paging request at virtual address 00000000401c31ac
  Oops [#1]
  Modules linked in:
  CPU: 0 PID: 0 Comm: swapper Not tainted 6.0.0-rc1-00001-g0d9d6953d834 #1
  Hardware name: Microchip PolarFire-SoC Icicle Kit (DT)
  epc : string+0x4a/0xea
   ra : vsnprintf+0x1e4/0x336
  epc : ffffffff80335ea0 ra : ffffffff80338936 sp : ffffffff81203be0
   gp : ffffffff812e0a98 tp : ffffffff8120de40 t0 : 0000000000000000
   t1 : ffffffff81203e28 t2 : 7265736572203a46 s0 : ffffffff81203c20
   s1 : ffffffff81203e28 a0 : ffffffff81203d22 a1 : 0000000000000000
   a2 : ffffffff81203d08 a3 : 0000000081203d21 a4 : ffffffffffffffff
   a5 : 00000000401c31ac a6 : ffff0a00ffffff04 a7 : ffffffffffffffff
   s2 : ffffffff81203d08 s3 : ffffffff81203d00 s4 : 0000000000000008
   s5 : ffffffff000000ff s6 : 0000000000ffffff s7 : 00000000ffffff00
   s8 : ffffffff80d9821a s9 : ffffffff81203d22 s10: 0000000000000002
   s11: ffffffff80d9821c t3 : ffffffff812f3617 t4 : ffffffff812f3617
   t5 : ffffffff812f3618 t6 : ffffffff81203d08
  status: 0000000200000100 badaddr: 00000000401c31ac cause: 000000000000000d
  [<ffffffff80338936>] vsnprintf+0x1e4/0x336
  [<ffffffff80055ae2>] vprintk_store+0xf6/0x344
  [<ffffffff80055d86>] vprintk_emit+0x56/0x192
  [<ffffffff80055ed8>] vprintk_default+0x16/0x1e
  [<ffffffff800563d2>] vprintk+0x72/0x80
  [<ffffffff806813b2>] _printk+0x36/0x50
  [<ffffffff8068af48>] print_reserved_mem+0x1c/0x24
  [<ffffffff808057ec>] paging_init+0x528/0x5bc
  [<ffffffff808031ae>] setup_arch+0xd0/0x592
  [<ffffffff8080070e>] start_kernel+0x82/0x73c

early_init_fdt_scan_reserved_mem() takes no arguments as it operates on initial_boot_params, which is populated by early_init_dt_verify(). On RISC-V, early_init_dt_verify() is called twice. Once, directly, in setup_arch() if CONFIG_BUILTIN_DTB is not enabled, and once indirectly, very early in the boot process, by parse_dtb() when it calls early_init_dt_scan_nodes(). This first call uses dtb_early_va to set initial_boot_params, which is not usable later in the boot process when early_init_fdt_scan_reserved_mem() is called. On arm64, for example, the corresponding call to early_init_dt_scan_nodes() uses fixmap addresses and doesn't suffer the same fate. Move early_init_fdt_scan_reserved_mem() further along the boot sequence, after the direct call to early_init_dt_verify() in setup_arch(), so that the names use the correct virtual memory addresses. The above supposed that CONFIG_BUILTIN_DTB was not set, but it should work equally in the case where it is - unflatten_and_copy_device_tree() also updates initial_boot_params.

Reported-by: Valentina Fernandez <valentina.fernandezalanis@microchip.com>
Reported-by: Evgenii Shatokhin <e.shatokhin@yadro.com>
Link: https://lore.kernel.org/linux-riscv/f8e67f82-103d-156c-deb0-d6d6e2756f5e@microchip.com/
Fixes: 922b0375 ("riscv: Fix memblock reservation for device tree blob")
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Tested-by: Evgenii Shatokhin <e.shatokhin@yadro.com>
Link: https://lore.kernel.org/r/20221107151524.3941467-1-conor.dooley@microchip.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
Committed by Ard Biesheuvel

Currently, when mapping the EFI runtime regions in the EFI page tables, we complain about misaligned regions in a rather noisy way, using WARN(). Not only does this produce a lot of irrelevant clutter in the log, it is factually incorrect, as misaligned runtime regions are actually allowed by the EFI spec as long as they don't require conflicting memory types within the same 64k page. So let's drop the warning, and tweak the code so that we:

- take both the start and end of the region into account when checking for misalignment;
- only revert to RWX mappings for non-code regions if misaligned code regions are also known to exist.

Cc: <stable@vger.kernel.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Committed by Jisheng Zhang

Even after commit 89fd4a1d ("riscv: jump_label: mark arguments as const to satisfy asm constraints"), building with CC_OPTIMIZE_FOR_SIZE + LLVM=1 can reproduce the build error below:

  CC      arch/riscv/kernel/vdso/vgettimeofday.o
  In file included from <built-in>:4:
  In file included from lib/vdso/gettimeofday.c:5:
  In file included from include/vdso/datapage.h:17:
  In file included from include/vdso/processor.h:10:
  In file included from arch/riscv/include/asm/vdso/processor.h:7:
  In file included from include/linux/jump_label.h:112:
  arch/riscv/include/asm/jump_label.h:42:3: error: invalid operand for inline asm constraint 'i'
          "       .option push                            \n\t"
          ^
  1 error generated.

I think the problem is that when "-Os" is passed in CFLAGS, it's removed by "CFLAGS_REMOVE_vgettimeofday.o = $(CC_FLAGS_FTRACE) -Os", which was introduced in commit e05d57dc ("riscv: Fixup __vdso_gettimeofday broke dynamic ftrace"), thus there is no optimization at all for vgettimeofday.c. arm64 does remove "-Os" as well, but it forces "-O2" after removing "-Os". I compared the vgettimeofday.o generated with "-O2" and "-Os", and I think there is no big performance difference. So let's tell kbuild not to remove "-Os" rather than follow the arm64 style. vdso-related performance can be improved a lot when building the kernel with CC_OPTIMIZE_FOR_SIZE after this commit ("-Os" vs. no optimization).

Fixes: e05d57dc ("riscv: Fixup __vdso_gettimeofday broke dynamic ftrace")
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20221031182943.2453-1-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
Committed by Jisheng Zhang

thread_struct's s[12] may contain random kernel memory content, which may finally be leaked to userspace. This is a security hole. Fix it by clearing the s[12] array in thread_struct on fork. As for the kthread case, it's better to clear the s[12] array as well.

Fixes: 7db91e57 ("RISC-V: Task implementation")
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Tested-by: Guo Ren <guoren@kernel.org>
Link: https://lore.kernel.org/r/20221029113450.4027-1-jszhang@kernel.org
Reviewed-by: Guo Ren <guoren@kernel.org>
Link: https://lore.kernel.org/r/CAJF2gTSdVyAaM12T%2B7kXAdRPGS4VyuO08X1c7paE-n4Fr8OtRA@mail.gmail.com/
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
- 10 November 2022, 15 commits
-
-
Committed by Like Xu

The AMD PerfMonV2 specification allows for a maximum of 16 GP counters, but currently only 6 pairs of MSRs are accepted by KVM. While AMD64_NUM_COUNTERS_CORE is already equal to 6, increasing it without adjusting msrs_to_save_all[] could result in out-of-bounds accesses. Therefore introduce a macro (named KVM_AMD_PMC_MAX_GENERIC) to refer to the number of counters supported by KVM.

Signed-off-by: Like Xu <likexu@tencent.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Message-Id: <20220919091008.60695-3-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Like Xu

The Intel Architectural IA32_PMCx MSRs address range allows for a maximum of 8 GP counters, and KVM cannot address any more. Introduce a local macro (named KVM_INTEL_PMC_MAX_GENERIC) and use it consistently to refer to the number of counters supported by KVM, thus avoiding possible out-of-bounds accesses.

Suggested-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Message-Id: <20220919091008.60695-2-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Like Xu

The SDM lists an architectural MSR IA32_CORE_CAPABILITIES (0xCF) that limits the theoretical maximum value of the Intel GP PMC MSRs allocated at 0xC1 to 14; likewise the Intel April 2022 SDM adds IA32_OVERCLOCKING_STATUS at 0x195, which limits the number of event selection MSRs to 15 (0x186-0x194). Limiting the maximum number of counters to 14 or 18 based on the currently allocated MSRs is clearly fragile, and it seems likely that Intel will even place PMCs 8-15 at a completely different range of MSR indices. So stop at the maximum number of GP PMCs supported today on Intel processors. There are some machines, like Intel P4 with a non-architectural PMU, that may indeed have 18 counters, but those counters are in a completely different MSR address range and are not supported by KVM.

Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: stable@vger.kernel.org
Fixes: cf05a67b ("KVM: x86: omit "impossible" pmu MSRs from MSR list")
Suggested-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Message-Id: <20220919091008.60695-1-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Peter Gonda

Explicitly print the VMSA dump at the KERN_DEBUG log level. KERN_CONT uses KERN_DEFAULT if the previous log line ends with a newline, i.e. if there's nothing to continue, and as a result the VMSA gets dumped when it shouldn't be. The KERN_CONT documentation says it defaults back to KERN_DEFAULT if the previous log line has a newline. So switch from KERN_CONT to print_hex_dump_debug(). Jarkko pointed this out in reference to the original patch; see https://lore.kernel.org/all/YuPMeWX4uuR1Tz3M@kernel.org/ - print_hex_dump(KERN_DEBUG, ...) was pointed out there, but print_hex_dump_debug() should be similar.

Fixes: 6fac42f1 ("KVM: SVM: Dump Virtual Machine Save Area (VMSA) to klog")
Signed-off-by: Peter Gonda <pgonda@google.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Harald Hoyer <harald@profian.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: x86@kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
Message-Id: <20221104142220.469452-1-pgonda@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Paolo Bonzini

x86_virt_spec_ctrl only deals with the paravirtualized MSR_IA32_VIRT_SPEC_CTRL now and does not handle MSR_IA32_SPEC_CTRL anymore; remove the corresponding, unused argument.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Paolo Bonzini

Restoration of the host IA32_SPEC_CTRL value is probably too late with respect to the return thunk training sequence. With respect to the user/kernel boundary, AMD says, "If software chooses to toggle STIBP (e.g., set STIBP on kernel entry, and clear it on kernel exit), software should set STIBP to 1 before executing the return thunk training sequence." I assume the same requirements apply to the guest/host boundary. The return thunk training sequence is in vmenter.S, quite close to the VM-exit. On hosts without V_SPEC_CTRL, however, the host's IA32_SPEC_CTRL value is not restored until much later. To avoid this, move the restoration of the host SPEC_CTRL to assembly and, for consistency, move the restoration of the guest SPEC_CTRL as well. This is not particularly difficult, apart from some care to cover both 32- and 64-bit, and to share code between SEV-ES and normal vmentry.

Cc: stable@vger.kernel.org
Fixes: a149180f ("x86: Add magic AMD return-thunk")
Suggested-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Paolo Bonzini

Allow access to the percpu area via the GS segment base, which is needed in order to access the saved host spec_ctrl value. In linux-next, FILL_RETURN_BUFFER also needs to access percpu data. For simplicity, the physical address of the save area is added to struct svm_cpu_data.

Cc: stable@vger.kernel.org
Fixes: a149180f ("x86: Add magic AMD return-thunk")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Analyzed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Paolo Bonzini

It is error-prone that code after vmexit cannot access percpu data because GSBASE has not been restored yet. It forces MSR_IA32_SPEC_CTRL save/restore to happen very late, after the predictor untraining sequence, and it gets in the way of return stack depth tracking (a retbleed mitigation that is in linux-next as of 2022-11-09). As a first step towards fixing that, move the VMCB VMSAVE/VMLOAD to assembly, essentially undoing commit fb0c4a4f ("KVM: SVM: move VMLOAD/VMSAVE to C code", 2021-03-15). The reason for that commit was that it made it simpler to use a different VMCB for VMLOAD/VMSAVE versus VMRUN; but that is not a big hassle anymore thanks to the kvm-asm-offsets machinery and other related cleanups. The idea of how to number the exception tables is stolen from a prototype patch by Peter Zijlstra.

Cc: stable@vger.kernel.org
Fixes: a149180f ("x86: Add magic AMD return-thunk")
Link: <https://lore.kernel.org/all/f571e404-e625-bae1-10e9-449b2eb4cbd8@citrix.com/>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Paolo Bonzini

The svm_data percpu variable is a pointer, but it is allocated via svm_hardware_setup() when KVM is loaded. Unlike hardware_enable(), this means that it is never NULL for the whole lifetime of KVM, and static allocation does not waste any memory compared to the status quo. It is also more efficient and more easily handled from assembly code, so do it and don't look back.

Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Paolo Bonzini

The "cpu" field of struct svm_cpu_data has been write-only since commit 4b656b12 ("KVM: SVM: force new asid on vcpu migration", 2009-08-05). Remove it.

Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Paolo Bonzini

The pointer to svm_cpu_data in struct vcpu_svm looks interesting from the point of view of accessing it after vmexit, when GSBASE still contains the guest value. However, despite existing since the very first commit of drivers/kvm/svm.c (commit 6aa8b732, "[PATCH] kvm: userspace interface", 2006-12-10), it was never set to anything. Ignore the opportunity to fix a 16-year-old "bug" and delete it; doing things the "harder" way makes it possible to remove more old cruft.

Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Paolo Bonzini

Continue moving accesses to struct vcpu_svm to vmenter.S. Reducing the number of arguments limits the chance of mistakes due to different registers used for argument passing in the 32- and 64-bit ABIs; pushing the VMCB argument and almost immediately popping it into a different register looks pretty weird. The 32-bit ABI is not a concern for __svm_sev_es_vcpu_run(), which is 64-bit only; however, it will soon need @svm to save/restore SPEC_CTRL, so stay consistent with __svm_vcpu_run() and let them share the same prototype. No functional change intended.

Cc: stable@vger.kernel.org
Fixes: a149180f ("x86: Add magic AMD return-thunk")
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Paolo Bonzini

The 32-bit ABI uses RAX/RCX/RDX as its argument registers, so they are in the way of instructions that hardcode their operands such as RDMSR/WRMSR or VMLOAD/VMRUN/VMSAVE. In preparation for moving vmload/vmsave to __svm_vcpu_run(), keep the pointer to the struct vcpu_svm in %rdi. In particular, it is now possible to load svm->vmcb01.pa in %rax without clobbering the struct vcpu_svm pointer. No functional change intended.

Cc: stable@vger.kernel.org
Fixes: a149180f ("x86: Add magic AMD return-thunk")
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Paolo Bonzini

Since registers are reachable through vcpu_svm, and we will need to access more fields of that struct, pass it instead of the regs[] array. No functional change intended.

Cc: stable@vger.kernel.org
Fixes: a149180f ("x86: Add magic AMD return-thunk")
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Paolo Bonzini

This already removes an ugly #include "" from asm-offsets.c, but especially it avoids a future error when trying to define asm-offsets for KVM's svm/svm.h header. This would not work for kernel/asm-offsets.c, because svm/svm.h includes kvm_cache_regs.h, which is not in the include path when compiling asm-offsets.c. The problem is not there if the .c file is in arch/x86/kvm.

Suggested-by: Sean Christopherson <seanjc@google.com>
Cc: stable@vger.kernel.org
Fixes: a149180f ("x86: Add magic AMD return-thunk")
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 09 November 2022, 8 commits
-
-
Committed by Kuniyuki Iwashima

Add the same change for ARM64 as done in commit 9440c429 ("x86/syscall: Include asm/ptrace.h in syscall_wrapper header") to make sure all syscalls see the 'struct pt_regs' definition, so that the resulting BTF for '__arm64_sys_*(struct pt_regs *regs)' functions points to the actual struct. Without this patch, the BPF verifier refuses to load a tracing prog which accesses pt_regs:

  bpf(BPF_PROG_LOAD, {prog_type=0x1a, ...}, 128) = -1 EACCES

With this patch, we can see the correct error, which saves us time in debugging the prog:

  bpf(BPF_PROG_LOAD, {prog_type=0x1a, ...}, 128) = 4
  bpf(BPF_RAW_TRACEPOINT_OPEN, {raw_tracepoint={name=NULL, prog_fd=4}}, 128) = -1 ENOTSUPP

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20221031215728.50389-1-kuniyu@amazon.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
-
Committed by D Scott Phillips

CONFIG_UBSAN_SHIFT with gcc-5 complains that the shifting of ARM_CPU_IMP_AMPERE (0xC0) into bits [31:24] by MIDR_CPU_MODEL() is undefined behavior. Well, sort of - it actually spells the error as:

  arch/arm64/kernel/proton-pack.c: In function 'spectre_bhb_loop_affected':
  arch/arm64/include/asm/cputype.h:44:2: error: initializer element is not constant
    (((imp) << MIDR_IMPLEMENTOR_SHIFT) | \
    ^

This isn't an issue for other implementor codes, as all the other codes have zero in the top bit and so are representable as a signed int. Cast the implementor code to unsigned in MIDR_CPU_MODEL to remove the undefined behavior.

Fixes: 0e5d5ae8 ("arm64: Add AMPERE1 to the Spectre-BHB affected list")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: D Scott Phillips <scott@os.amperecomputing.com>
Link: https://lore.kernel.org/r/20221102160106.1096948-1-scott@os.amperecomputing.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
-
Committed by Naoya Horiguchi

The following bug is reported to be triggered when starting X on an x86-32 system with i915:

  [  225.777375] kernel BUG at mm/memory.c:2664!
  [  225.777391] invalid opcode: 0000 [#1] PREEMPT SMP
  [  225.777405] CPU: 0 PID: 2402 Comm: Xorg Not tainted 6.1.0-rc3-bdg+ #86
  [  225.777415] Hardware name:  /8I865G775-G, BIOS F1 08/29/2006
  [  225.777421] EIP: __apply_to_page_range+0x24d/0x31c
  [  225.777437] Code: ff ff 8b 55 e8 8b 45 cc e8 0a 11 ec ff 89 d8 83 c4 28 5b 5e 5f 5d c3 81 7d e0 a0 ef 96 c1 74 ad 8b 45 d0 e8 2d 83 49 00 eb a3 <0f> 0b 25 00 f0 ff ff 81 eb 00 00 00 40 01 c3 8b 45 ec 8b 00 e8 76
  [  225.777446] EAX: 00000001 EBX: c53a3b58 ECX: b5c00000 EDX: c258aa00
  [  225.777454] ESI: b5c00000 EDI: b5900000 EBP: c4b0fdb4 ESP: c4b0fd80
  [  225.777462] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00010202
  [  225.777470] CR0: 80050033 CR2: b5900000 CR3: 053a3000 CR4: 000006d0
  [  225.777479] Call Trace:
  [  225.777486]  ? i915_memcpy_init_early+0x63/0x63 [i915]
  [  225.777684]  apply_to_page_range+0x21/0x27
  [  225.777694]  ? i915_memcpy_init_early+0x63/0x63 [i915]
  [  225.777870]  remap_io_mapping+0x49/0x75 [i915]
  [  225.778046]  ? i915_memcpy_init_early+0x63/0x63 [i915]
  [  225.778220]  ? mutex_unlock+0xb/0xd
  [  225.778231]  ? i915_vma_pin_fence+0x6d/0xf7 [i915]
  [  225.778420]  vm_fault_gtt+0x2a9/0x8f1 [i915]
  [  225.778644]  ? lock_is_held_type+0x56/0xe7
  [  225.778655]  ? lock_is_held_type+0x7a/0xe7
  [  225.778663]  ? 0xc1000000
  [  225.778670]  __do_fault+0x21/0x6a
  [  225.778679]  handle_mm_fault+0x708/0xb21
  [  225.778686]  ? mt_find+0x21e/0x5ae
  [  225.778696]  exc_page_fault+0x185/0x705
  [  225.778704]  ? doublefault_shim+0x127/0x127
  [  225.778715]  handle_exception+0x130/0x130
  [  225.778723] EIP: 0xb700468a

Recently, pud_huge() became aware of non-present entries via commit 3a194f3f ("mm/hugetlb: make pud_huge() and follow_huge_pud() aware of non-present pud entry") in order to handle some special states of gigantic pages. However, it was overlooked that pud_none() always returns false when running with 2-level paging, and as a result pud_huge() can return true pointlessly. Introduce "#if CONFIG_PGTABLE_LEVELS > 2" to pud_huge() to deal with this.

Link: https://lkml.kernel.org/r/20221107021010.2449306-1-naoya.horiguchi@linux.dev
Fixes: 3a194f3f ("mm/hugetlb: make pud_huge() and follow_huge_pud() aware of non-present pud entry")
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Liu Shixin <liushixin2@huawei.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Committed by Alexander Potapenko

There is a case in the exc_invalid_op handler that is executed outside the irqentry_enter()/irqentry_exit() region, when a UD2 instruction is used to encode a call to __warn(). In that case the `struct pt_regs` passed to the interrupt handler is never unpoisoned by KMSAN (this is normally done in irqentry_enter()), which leads to false positives inside handle_bug(). Use kmsan_unpoison_entry_regs() to explicitly unpoison those registers before using them.

Link: https://lkml.kernel.org/r/20221102110611.1085175-5-glider@google.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Marco Elver <elver@google.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Committed by Alexander Potapenko

Make sure the usercopy hooks from linux/instrumented.h are invoked for copy_from_user_nmi(). This fixes KMSAN false positives reported when dumping opcodes for a stack trace.

Link: https://lkml.kernel.org/r/20221102110611.1085175-2-glider@google.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Marco Elver <elver@google.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Committed by Heiko Carstens

Nathan Chancellor reported several link errors on s390 with CONFIG_RELOCATABLE disabled, after binutils commit 906f69cf65da ("IBM zSystems: Issue error for *DBL relocs on misaligned symbols"). The binutils commit reveals potential miscompiles that might have happened already before with linker-script-defined symbols at odd addresses. A similar bug was recently fixed in the kernel with commit c9305b6c ("s390: fix nospec table alignments"). See https://github.com/ClangBuiltLinux/linux/issues/1747 for an analysis from Ulrich Weigand. Therefore always build a relocatable kernel to avoid this problem. There is hardly any use-case for non-relocatable kernels, so this shouldn't be controversial.

Link: https://github.com/ClangBuiltLinux/linux/issues/1747
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Reported-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20221030182202.2062705-1-hca@linux.ibm.com
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
-
Committed by Heiko Carstens

Add a kasan.config addon config file, which makes it easy to enable KASAN in the current kernel config.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
-
Committed by Heiko Carstens

CONFIG_DEBUG_INFO_BTF significantly increases compile time for the kernel. E.g. when changing a single C file, compile time for a new bzImage is increased by ~50% if BTF debug info is generated. Therefore remove CONFIG_DEBUG_INFO_BTF from all defconfigs and introduce a btf.config addon config file. Quickly enabling CONFIG_DEBUG_INFO_BTF in the current kernel config can be done by simply invoking "make btf.config".

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
-