Commits · d6ee6529436a15a0541aff6e1697989ee7dc2c44 · openeuler / Kernel

22 May, 2020 1 commit

x86/boot: Discard .discard.unreachable for arch/x86/boot/compressed/vmlinux · d6ee6529

Authored by Fangrui Song on May 20, 2020

With commit

  ce5e3f90 ("efi/printf: Add 64-bit and 8-bit integer support")

arch/x86/boot/compressed/vmlinux may have an undesired .discard.unreachable
section coming from drivers/firmware/efi/libstub/vsprintf.stub.o. That section
gets generated from unreachable() annotations when CONFIG_STACK_VALIDATION is
enabled.

.discard.unreachable contains an R_X86_64_PC32 relocation, which LLD
warns about: a non-SHF_ALLOC section (.discard.unreachable) is not part
of the memory image, so conceptually the distance between a
non-SHF_ALLOC section and a SHF_ALLOC section is not a constant that can
be resolved at link time:

  % ld.lld -m elf_x86_64 -T arch/x86/boot/compressed/vmlinux.lds ... -o arch/x86/boot/compressed/vmlinux
  ld.lld: warning: vsprintf.c:(.discard.unreachable+0x0): has non-ABS relocation R_X86_64_PC32 against symbol ''

Reuse the DISCARDS macro which includes .discard.* to drop
.discard.unreachable.

 [ bp: Massage and complete the commit message. ]

Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Fangrui Song <maskray@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Arvind Sankar <nivedita@alum.mit.edu>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Link: https://lkml.kernel.org/r/20200520182010.242489-1-maskray@google.com

22 Apr, 2020 2 commits

x86/boot/build: Add phony targets in arch/x86/boot/Makefile to PHONY · 675a59b7

Authored by Masahiro Yamada on Feb 15, 2020

These targets are correctly added to PHONY in arch/x86/Makefile, but
not in arch/x86/boot/Makefile. Thus, with a file 'install' in the top
directory, 'make install' does nothing:

  $ touch install
  $ make install
  make[1]: 'install' is up to date.

Add them to the PHONY targets in the boot Makefile too.

 [ bp: Massage. ]

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200215063852.8298-2-masahiroy@kernel.org

x86/boot/build: Make 'make bzlilo' not depend on vmlinux or $(obj)/bzImage · 30ce434e

Authored by Masahiro Yamada on Feb 15, 2020

bzlilo is an installation target because it copies files to
$(INSTALL_PATH)/, then runs 'lilo'. However, arch/x86/Makefile and
arch/x86/boot/Makefile have it depend on vmlinux and $(obj)/bzImage,
respectively.

'make bzlilo' may update some build artifacts in the source tree.

As commit

  19514fc6 ("arm, kbuild: make "make install" not depend on vmlinux")

explained, this should not happen.

Make 'bzlilo' not depend on any build artifact.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200215063852.8298-1-masahiroy@kernel.org

21 Apr, 2020 1 commit

x86/boot/build: Add cpustr.h to targets and remove clean-files · e3c7c105

Authored by Masahiro Yamada on Feb 15, 2020

Files in $(targets) are always cleaned up. Move the 'targets' assignment
out of the ifdef and remove 'clean-files'.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200215063241.7437-1-masahiroy@kernel.org

18 Apr, 2020 3 commits

x86/split_lock: Add Tremont family CPU models · 8b9a18a9

Authored by Tony Luck on Apr 16, 2020

Tremont CPUs support IA32_CORE_CAPABILITIES bits to indicate whether
specific SKUs have support for split lock detection.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200416205754.21177-4-tony.luck@intel.com

x86/split_lock: Bits in IA32_CORE_CAPABILITIES are not architectural · 48fd5b5e

Authored by Tony Luck on Apr 16, 2020

The Intel Software Developers' Manual erroneously listed bit 5 of the
IA32_CORE_CAPABILITIES register as an architectural feature. It is not.

Features enumerated by IA32_CORE_CAPABILITIES are model specific and
implementation details may vary in different cpu models. Thus it is only
safe to trust features after checking the CPU model.

Icelake client and server models are known to implement the split lock
detect feature even though they don't enumerate IA32_CORE_CAPABILITIES.
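
The "check the model first, then the bit" rule described above can be modelled roughly as follows. This is a simplified Python sketch, not kernel code: the model names in the table and the function name are hypothetical, and only the bit position (bit 5) is taken from the commit message.

```python
# Bit 5 of IA32_CORE_CAPABILITIES, as named in the commit message above.
SPLIT_LOCK_DETECT_BIT = 1 << 5

# Hypothetical model table (illustrative names, not the kernel's match table).
# True  -> the model is known to implement split lock detection unconditionally.
# False -> support varies per SKU, so the MSR bit has to be consulted.
KNOWN_MODELS = {
    "ICELAKE_CLIENT": True,
    "ICELAKE_SERVER": True,
    "TREMONT": False,
}


def has_split_lock_detect(model, core_caps=None):
    """Return True if split lock detection may be trusted on this CPU.

    core_caps is the raw IA32_CORE_CAPABILITIES value, or None when the
    CPU does not enumerate that MSR at all.
    """
    if model not in KNOWN_MODELS:
        # The bit is not architectural, so it means nothing on unknown models.
        return False
    if KNOWN_MODELS[model]:
        return True
    return core_caps is not None and bool(core_caps & SPLIT_LOCK_DETECT_BIT)
```

The point of the sketch is the ordering: the MSR bit is only interpreted after the model has been recognised, and an unrecognised model never enables the feature, whatever the bit says.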

[ tglx: Use switch() for readability and massage comments ]

Fixes: 6650cdd9 ("x86/split_lock: Enable split lock detection by kernel")
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200416205754.21177-3-tony.luck@intel.com

x86/resctrl: Preserve CDP enable over CPU hotplug · 9fe04507

Authored by James Morse on Feb 21, 2020

Resctrl assumes that all CPUs are online when the filesystem is mounted,
and that CPUs remember their CDP-enabled state over CPU hotplug.

This goes wrong when resctrl's CDP-enabled state changes while all the
CPUs in a domain are offline.

When a domain comes online, enable (or disable!) CDP to match resctrl's
current setting.

Fixes: 5ff193fb ("x86/intel_rdt: Add basic resctrl filesystem support")
Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20200221162105.154163-1-james.morse@arm.com

17 Apr, 2020 2 commits

x86/resctrl: Fix invalid attempt at removing the default resource group · b0151da5

Authored by Reinette Chatre on Mar 17, 2020

The default resource group ("rdtgroup_default") is associated with the
root of the resctrl filesystem and should never be removed. New resource
groups can be created as subdirectories of the resctrl filesystem and
they can be removed from user space.

The directory removal code (rdtgroup_rmdir()) contains a safeguard that
ensures that only subdirectories can be removed, by checking that the
directory to be removed is a child of the root directory.

A possible deadlock was recently fixed with

  334b0f4e ("x86/resctrl: Fix a deadlock due to inaccurate reference").

This fix involved associating the private data of the "mon_groups"
and "mon_data" directories to the resource group to which they belong
instead of NULL as before. A consequence of this change was that
the original safeguard code preventing removal of "mon_groups" and
"mon_data" found in the root directory failed, resulting in attempts to
remove the default resource group that end in a BUG:

  kernel BUG at mm/slub.c:3969!
  invalid opcode: 0000 [#1] SMP PTI

  Call Trace:
  rdtgroup_rmdir+0x16b/0x2c0
  kernfs_iop_rmdir+0x5c/0x90
  vfs_rmdir+0x7a/0x160
  do_rmdir+0x17d/0x1e0
  do_syscall_64+0x55/0x1d0
  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fix this by improving the directory removal safeguard to ensure that
subdirectories of the resctrl root directory can only be removed if they
are a child of the resctrl filesystem's root _and_ not associated with
the default resource group.
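
The strengthened safeguard amounts to a two-part condition. A minimal Python model of it follows; the Group and Dir classes and the may_rmdir() name are hypothetical stand-ins for the kernfs nodes and their private data, and the sketch covers only the check described in this message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Group:
    name: str


@dataclass
class Dir:
    name: str
    parent: Optional["Dir"]
    group: Optional[Group]  # private data: the resource group the dir belongs to


def may_rmdir(d, root, default_group):
    """Allow removal only for children of the root that do not belong
    to the default resource group."""
    return d.parent is root and d.group is not default_group
```

With only the first half of the condition, "mon_groups" and "mon_data" in the root directory would pass, because they are children of the root whose private data now points at the default group.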

Fixes: 334b0f4e ("x86/resctrl: Fix a deadlock due to inaccurate reference")
Reported-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/884cbe1773496b5dbec1b6bd11bb50cffa83603d.1584461853.git.reinette.chatre@intel.com

x86/split_lock: Update to use X86_MATCH_INTEL_FAM6_MODEL() · 3ab0762d

Authored by Tony Luck on Apr 16, 2020

The SPLIT_LOCK_CPU() macro escaped the tree-wide sweep for old-style
initialization. Update to use X86_MATCH_INTEL_FAM6_MODEL().

Fixes: 6650cdd9 ("x86/split_lock: Enable split lock detection by kernel")
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200416205754.21177-2-tony.luck@intel.com

15 Apr, 2020 1 commit

x86/umip: Make umip_insns static · b0e387c3

Authored by Jason Yan on Apr 13, 2020

Fix the following sparse warning:
  arch/x86/kernel/umip.c:84:12: warning: symbol 'umip_insns' was not declared.
  Should it be static?

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Link: https://lkml.kernel.org/r/20200413082213.22934-1-yanaijie@huawei.com

14 Apr, 2020 4 commits

x86/microcode/AMD: Increase microcode PATCH_MAX_SIZE · bdf89df3

Authored by John Allen on Apr 09, 2020

Future AMD CPUs will have microcode patches that exceed the default 4K
patch size. Raise our limit.

Signed-off-by: John Allen <john.allen@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: stable@vger.kernel.org # v4.14..
Link: https://lkml.kernel.org/r/20200409152931.GA685273@mojo.amd.com

efi/x86: Revert struct layout change to fix kexec boot regression · a088b858

Authored by Ard Biesheuvel on Apr 10, 2020

Commit

  0a67361d ("efi/x86: Remove runtime table address from kexec EFI setup data")

removed the code that retrieves the non-remapped UEFI runtime services
pointer from the data structure provided by kexec, as it was never really
needed on the kexec boot path: mapping the runtime services table at its
non-remapped address is only needed when calling SetVirtualAddressMap(),
which never happens during a kexec boot in the first place.

However, dropping the 'runtime' member from struct efi_setup_data was a
mistake. That struct is shared ABI between the kernel and the kexec tooling
for x86, and so we cannot simply change its layout. So let's put back the
removed field, but call it 'unused' to reflect the fact that we never look
at its contents. While at it, add a comment to remind our future selves
that the layout is external ABI.
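
Why dropping a member breaks the ABI can be seen by comparing field offsets. The sketch below uses Python's ctypes to model three layouts. Only the 'runtime' and 'unused' member names come from this message; the other field names and the reserved array are assumptions made for illustration and may not match the real struct efi_setup_data.

```python
import ctypes

u64 = ctypes.c_uint64


class Original(ctypes.Structure):
    # Layout that the kexec tooling was built against (assumed fields).
    _fields_ = [("fw_vendor", u64), ("runtime", u64), ("tables", u64),
                ("smbios", u64), ("reserved", u64 * 8)]


class Shrunk(ctypes.Structure):
    # 'runtime' removed: every later member moves down by 8 bytes.
    _fields_ = [("fw_vendor", u64), ("tables", u64),
                ("smbios", u64), ("reserved", u64 * 8)]


class Restored(ctypes.Structure):
    # Member put back under the name 'unused': offsets match Original again.
    _fields_ = [("fw_vendor", u64), ("unused", u64), ("tables", u64),
                ("smbios", u64), ("reserved", u64 * 8)]


def offsets(cls):
    """Map each member name to its byte offset within the structure."""
    return {name: getattr(cls, name).offset for name, _ in cls._fields_}
```

In this model a producer that fills in Original and a consumer that reads Shrunk disagree about where 'tables' lives (offset 16 versus 8), which is the kind of mismatch a shared layout has to rule out.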

Fixes: 0a67361d ("efi/x86: Remove runtime table address from kexec EFI setup data")
Reported-by: Theodore Ts'o <tytso@mit.edu>
Tested-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>

efi/x86: Don't remap text<->rodata gap read-only for mixed mode · f6103162

Authored by Ard Biesheuvel on Apr 09, 2020

Commit

  d9e3d2c4 ("efi/x86: Don't map the entire kernel text RW for mixed mode")

updated the code that creates the 1:1 memory mapping to use read-only
attributes for the 1:1 alias of the kernel's text and rodata sections, to
protect it from inadvertent modification. However, it failed to take into
account that the unused gap between text and rodata is given to the page
allocator for general use.

If the vmap'ed stack happens to be allocated from this region, any by-ref
output arguments passed to EFI runtime services that are allocated on the
stack (such as the 'datasize' argument taken by GetVariable() when invoked
from efivar_entry_size()) will be referenced via a read-only mapping,
resulting in a page fault if the EFI code tries to write to it:

  BUG: unable to handle page fault for address: 00000000386aae88
  #PF: supervisor write access in kernel mode
  #PF: error_code(0x0003) - permissions violation
  PGD fd61063 P4D fd61063 PUD fd62063 PMD 386000e1
  Oops: 0003 [#1] SMP PTI
  CPU: 2 PID: 255 Comm: systemd-sysv-ge Not tainted 5.6.0-rc4-default+ #22
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0008:0x3eaeed95
  Code: ...  <89> 03 be 05 00 00 80 a1 74 63 b1 3e 83 c0 48 e8 44 d2 ff ff eb 05
  RSP: 0018:000000000fd73fa0 EFLAGS: 00010002
  RAX: 0000000000000001 RBX: 00000000386aae88 RCX: 000000003e9f1120
  RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000001
  RBP: 000000000fd73fd8 R08: 00000000386aae88 R09: 0000000000000000
  R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000000000
  R13: ffffc0f040220000 R14: 0000000000000000 R15: 0000000000000000
  FS:  00007f21160ac940(0000) GS:ffff9cf23d500000(0000) knlGS:0000000000000000
  CS:  0008 DS: 0018 ES: 0018 CR0: 0000000080050033
  CR2: 00000000386aae88 CR3: 000000000fd6c004 CR4: 00000000003606e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
  Modules linked in:
  CR2: 00000000386aae88
  ---[ end trace a8bfbd202e712834 ]---

Let's fix this by remapping text and rodata individually, and leave the
gaps mapped read-write.

Fixes: d9e3d2c4 ("efi/x86: Don't map the entire kernel text RW for mixed mode")
Reported-by: Jiri Slaby <jslaby@suse.cz>
Tested-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200409130434.6736-10-ardb@kernel.org

efi/x86: Fix the deletion of variables in mixed mode · a4b81ccf

Authored by Gary Lin on Apr 09, 2020

efi_thunk_set_variable() treated the NULL "data" pointer as an invalid
parameter, and this broke the deletion of variables in mixed mode.
This commit fixes the check of data so that the userspace program can
delete a variable in mixed mode.
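
The difference between the two checks can be modelled in a few lines of Python. This is an illustration only: the function names, the string return values and the data_translated parameter (a stand-in for "the data buffer could be prepared for the firmware call") are hypothetical and do not mirror the kernel's code.

```python
def check_before_fix(name, data):
    """Old behaviour: a missing data buffer is always rejected."""
    if name is None or data is None:
        return "invalid parameter"
    return "ok"


def check_after_fix(name, data, data_translated=True):
    """Fixed behaviour: data may be None (as when deleting a variable);
    only a buffer that was supplied but could not be prepared is rejected."""
    if name is None:
        return "invalid parameter"
    if data is not None and not data_translated:
        return "invalid parameter"
    return "ok"
```

A delete request, which carries no data, is rejected by the first function and accepted by the second.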

Fixes: 8319e9d5 ("efi/x86: Handle by-ref arguments covering multiple pages in mixed mode")
Signed-off-by: Gary Lin <glin@suse.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200408081606.1504-1-glin@suse.com
Link: https://lore.kernel.org/r/20200409130434.6736-9-ardb@kernel.org

12 Apr, 2020 2 commits

x86/Hyper-V: Report crash data in die() when panic_on_oops is set · f3a99e76

Authored by Tianyu Lan on Apr 06, 2020

When an oops happens with panic_on_oops unset, die() kills the oopsing
thread and the system continues to run. In that case the guest should
not report crash register data to the host, because the system is still
running. Check panic_on_oops in hyperv_report_panic() and return
directly when the function is called from die() and panic_on_oops is
unset.
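
The resulting gate is a single condition, shown here as a small Python truth function. The name should_report_crash() and its boolean parameters are hypothetical; the real function takes different arguments.

```python
def should_report_crash(in_die, panic_on_oops):
    """Decide whether crash register data is reported to the host.

    in_die:        the report was requested from the die() path
    panic_on_oops: the system will panic on this oops
    """
    if in_die and not panic_on_oops:
        # Only the oopsing thread is killed; the guest keeps running.
        return False
    return True
```

Reports from the panic path are unaffected; only the die() path with panic_on_oops unset is suppressed.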

Fixes: 7ed4325a ("Drivers: hv: vmbus: Make panic reporting to be more useful")
Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20200406155331.2105-7-Tianyu.Lan@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>

x86/Hyper-V: Report crash register data or kmsg before running crash kernel · a1158956

Authored by Tianyu Lan on Apr 06, 2020

We want to notify Hyper-V when a Linux guest VM crash occurs, so
there is a record of the crash even when kdump is enabled. But
crash_kexec_post_notifiers defaults to "false", so the kdump kernel
runs before the notifiers and Hyper-V never gets notified. Fix this by
always setting crash_kexec_post_notifiers to be true for Hyper-V VMs.

Fixes: 81b18bce ("Drivers: HV: Send one page worth of kmsg dump over Hyper-V during panic")
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
Link: https://lore.kernel.org/r/20200406155331.2105-5-Tianyu.Lan@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>

11 Apr, 2020 11 commits

KVM: VMX: Extend VMXs #AC interceptor to handle split lock #AC in guest · e6f8b6c1

Authored by Xiaoyao Li on Apr 10, 2020

Two types of #AC can be generated in Intel CPUs:
 1. legacy alignment check #AC
 2. split lock #AC

Reflect #AC back into the guest if the guest has legacy alignment checks
enabled or if split lock detection is disabled.

If the #AC is not a legacy one and split lock detection is enabled, then
invoke handle_guest_split_lock() which will either warn and disable split
lock detection for this task or force SIGBUS on it.
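
The decision described above can be written down as a small Python model. It is a sketch rather than the KVM code: the function names and string results are hypothetical. It relies on the architectural rule that legacy alignment checking is active only at CPL 3 with both CR0.AM (bit 18) and EFLAGS.AC (bit 18) set.

```python
X86_CR0_AM = 1 << 18     # CR0 alignment mask bit
X86_EFLAGS_AC = 1 << 18  # EFLAGS alignment check bit


def legacy_alignment_check_enabled(cpl, cr0, rflags):
    """True if the guest has legacy alignment checks enabled."""
    return cpl == 3 and bool(cr0 & X86_CR0_AM) and bool(rflags & X86_EFLAGS_AC)


def handle_guest_ac(host_sld_enabled, cpl, cr0, rflags):
    """Pick the action for an intercepted #AC."""
    if not host_sld_enabled or legacy_alignment_check_enabled(cpl, cr0, rflags):
        # The #AC belongs to the guest: reflect it back.
        return "inject_ac"
    # Not a legacy #AC and split lock detection is on: treat as a split lock.
    return "handle_guest_split_lock"
```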

[ tglx: Switch it to handle_guest_split_lock() and rename the misnamed
  helper function. ]

Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lkml.kernel.org/r/20200410115517.176308876@linutronix.de

KVM: x86: Emulate split-lock access as a write in emulator · 9de6fe3c

Authored by Xiaoyao Li on Apr 10, 2020

Emulate split-lock accesses as writes if split lock detection is on
to avoid #AC during emulation, which will result in a panic(). This
should never occur for a well-behaved guest, but a malicious guest can
manipulate the TLB to trigger emulation of a locked instruction[1].

More discussion can be found at [2][3].

[1] https://lkml.kernel.org/r/8c5b11c9-58df-38e7-a514-dc12d687b198@redhat.com
[2] https://lkml.kernel.org/r/20200131200134.GD18946@linux.intel.com
[3] https://lkml.kernel.org/r/20200227001117.GX9940@linux.intel.com

Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lkml.kernel.org/r/20200410115517.084300242@linutronix.de

x86/split_lock: Provide handle_guest_split_lock() · d7e94dbd

Authored by Thomas Gleixner on Apr 10, 2020

Without at least minimal handling for split lock detection induced #AC,
VMX will just run into the same problem as the VMWare hypervisor, which
was reported by Kenneth.

It will inject the #AC blindly into the guest whether the guest is
prepared or not.

Provide a function for guest mode which acts depending on the host
SLD mode. If mode == sld_warn, treat it like user space, i.e. emit a
warning, disable SLD and mark the task accordingly. Otherwise force
SIGBUS.
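
A rough Python model of that mode-dependent behaviour is shown below. Only the sld_warn mode is named in this message; the other enum members, the dictionary used to represent a task and its keys are assumptions made for illustration.

```python
from enum import Enum


class SldMode(Enum):
    OFF = "off"      # assumed name
    WARN = "warn"    # corresponds to sld_warn above
    FATAL = "fatal"  # assumed name


def handle_guest_split_lock(mode, task):
    """Return True if the split lock was handled and the guest may continue."""
    if mode is SldMode.WARN:
        task["warned"] = True        # emit a warning
        task["sld_disabled"] = True  # disable SLD and mark the task
        return True
    task["signal"] = "SIGBUS"        # otherwise force SIGBUS
    return False
```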

 [ bp: Add a !CPU_SUP_INTEL stub for handle_guest_split_lock(). ]

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lkml.kernel.org/r/20200410115516.978037132@linutronix.de
Link: https://lkml.kernel.org/r/20200402123258.895628824@linutronix.de

mm/memory_hotplug: add pgprot_t to mhp_params · bfeb022f

Authored by Logan Gunthorpe on Apr 10, 2020

devm_memremap_pages() is currently used by the PCI P2PDMA code to create
struct page mappings for IO memory.  At present, these mappings are
created with PAGE_KERNEL which implies setting the PAT bits to be WB.
However, on x86, an MTRR register will typically override this and force
the cache type to be UC-.  In the case firmware doesn't set this
register it is effectively WB and will typically result in a machine
check exception when it's accessed.

Other arches are not currently likely to function correctly seeing they
don't have any MTRR registers to fall back on.

To solve this, provide a way to specify the pgprot value explicitly to
arch_add_memory().

Of the arches that support MEMORY_HOTPLUG, x86_64 and arm64 need a
simple change to pass the pgprot_t down to their respective functions
which set up the page tables.  For x86_32, set the page tables
explicitly using __set_memory_prot() (seeing they are already mapped).

For ia64, s390 and sh, reject anything but PAGE_KERNEL settings -- this
should be fine, for now, seeing these architectures don't support
ZONE_DEVICE.

A check in __add_pages() is also added to ensure the pgprot parameter
was set for all arches.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Link: http://lkml.kernel.org/r/20200306170846.9333-7-logang@deltatee.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

x86/mm: introduce __set_memory_prot() · 30796e18

Authored by Logan Gunthorpe on Apr 10, 2020

For use in the 32bit arch_add_memory() to set the pgprot type of the
memory to add.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Will Deacon <will@kernel.org>
Link: http://lkml.kernel.org/r/20200306170846.9333-5-logang@deltatee.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

x86/mm: thread pgprot_t through init_memory_mapping() · c164fbb4

Authored by Logan Gunthorpe on Apr 10, 2020

In preparation to support a pgprot_t argument for arch_add_memory().

It's required to move the prototype of init_memory_mapping() seeing the
original location came before the definition of pgprot_t.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Will Deacon <will@kernel.org>
Link: http://lkml.kernel.org/r/20200306170846.9333-4-logang@deltatee.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/memory_hotplug: rename mhp_restrictions to mhp_params · f5637d3b

Authored by Logan Gunthorpe on Apr 10, 2020

The mhp_restrictions struct really doesn't specify anything resembling a
restriction anymore, so rename it to mhp_params, as it is a list of
extended parameters.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Link: http://lkml.kernel.org/r/20200306170846.9333-3-logang@deltatee.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/vma: introduce VM_ACCESS_FLAGS · 6cb4d9a2

Authored by Anshuman Khandual on Apr 10, 2020

There are many places where all basic VMA access flags (read, write,
exec) are initialized or checked against as a group.  One such example
is during page fault.  The existing vma_is_accessible() wrapper already
treats VMA accessibility as a group of access permissions.

Hence let's just create VM_ACCESS_FLAGS (VM_READ|VM_WRITE|VM_EXEC), which
will not only reduce code duplication but also extend the VMA
accessibility concept in general.
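
The grouping is plain bitmask arithmetic, sketched here in Python rather than C. The flag values are the low bits used for these flags in include/linux/mm.h; VM_SHARED is included only as an example of a flag that is not an access flag, and the helper mirrors the idea of vma_is_accessible() on a bare flags word instead of a VMA.

```python
VM_READ = 0x1
VM_WRITE = 0x2
VM_EXEC = 0x4
VM_SHARED = 0x8  # not an access flag; used below as a counter-example

# The grouped mask introduced by this commit.
VM_ACCESS_FLAGS = VM_READ | VM_WRITE | VM_EXEC


def vma_is_accessible(vm_flags):
    """True if any of the read/write/exec bits is set."""
    return bool(vm_flags & VM_ACCESS_FLAGS)
```

A mapping whose flags word contains none of the three access bits is reported as inaccessible, whatever other flags it carries.
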
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Springer <rspringer@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Link: http://lkml.kernel.org/r/1583391014-8170-3-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/vma: define a default value for VM_DATA_DEFAULT_FLAGS · c62da0c3

Authored by Anshuman Khandual on Apr 10, 2020

There are many platforms with the exact same value for
VM_DATA_DEFAULT_FLAGS.  This creates a default value for
VM_DATA_DEFAULT_FLAGS in line with the existing VM_STACK_DEFAULT_FLAGS.
While here, also define some more macros with standard VMA access flag
combinations that are used frequently across many platforms.  Apart from
simplification, this reduces code duplication as well.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paulburton@kernel.org>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Rich Felker <dalias@libc.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Chris Zankel <chris@zankel.net>
Link: http://lkml.kernel.org/r/1583391014-8170-2-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: define pte_index as macro for x86 · c97078bd

Authored by Arjun Roy on Apr 10, 2020

pte_index() is either defined as a macro (e.g.  sparc64) or as an
inlined function (e.g.  x86).  vm_insert_pages() depends on pte_index
but it is not defined on all platforms (e.g.  m68k).

To fix compilation of vm_insert_pages() on architectures not providing
pte_index(), we perform the following fix:

0. For platforms where it is meaningful, and defined as a macro, no
    change is needed.
1. For platforms where it is meaningful and defined as an inlined
    function, and we want to use it with vm_insert_pages(), we define
    a degenerate macro of the form:  #define pte_index pte_index
2. vm_insert_pages() checks for the existence of a pte_index macro
   definition. If found, it implements a batched insert. If not found,
   it devolves to calling vm_insert_page() in a loop.

This patch implements step 1 for x86.
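
The kernel makes this choice at compile time by testing whether a pte_index macro exists. Python has no preprocessor, so the sketch below is only a run-time analogue of step 2: an optional callable stands in for "the architecture provides pte_index". All names in it are hypothetical.

```python
def insert_pages(pages, insert_one, insert_batch=None):
    """Insert pages in one batch when a batched primitive is available,
    otherwise fall back to inserting them one at a time.

    insert_batch plays the role of the "pte_index is defined" case;
    leaving it as None models an architecture without pte_index.
    """
    if insert_batch is not None:
        insert_batch(list(pages))
        return "batched"
    for page in pages:
        insert_one(page)
    return "looped"
```

Callers see the same interface either way, which matches the intent described above: architectures without pte_index still build and work, just without the batching.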

v3 of this patch fixes a compilation warning for an unused method.
v2 of this patch moved a macro definition to a more readable location.

Signed-off-by: Arjun Roy <arjunroy@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Link: http://lkml.kernel.org/r/20200228054714.204424-1-arjunroy.kdev@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: hugetlb: optionally allocate gigantic hugepages using cma · cf11e85f

Authored by Roman Gushchin on Apr 10, 2020

Commit 944d9fec ("hugetlb: add support for gigantic page allocation
at runtime") added run-time allocation of gigantic pages.

However, in practice it works only in the early stages after boot, when
most memory is free.  After some time the memory gets fragmented by
non-movable pages, so the chances of finding a contiguous 1 GB block
approach zero.  Even dropping caches manually doesn't help much.

At large scale, rebooting servers in order to allocate gigantic hugepages
is quite expensive and complex.  At the same time, keeping some constant
percentage of memory in reserved hugepages even if the workload isn't
using it is a big waste: not all workloads can benefit from using 1 GB
pages.

The following solution can solve the problem:
1) At boot time a dedicated cma area* is reserved. The size is passed
   as a kernel argument.
2) Run-time allocations of gigantic hugepages are performed using the
   cma allocator and the dedicated cma area.

In this case gigantic hugepages can be allocated successfully with a
high probability, yet the memory isn't completely wasted if nobody
is using 1 GB hugepages: it can be used for pagecache, anon memory, THPs,
etc.

* On a multi-node machine a per-node cma area is allocated on each node.
  Subsequent gigantic hugetlb allocations use the first available
  NUMA node if the mask isn't specified by the user.

Usage:
1) configure the kernel to allocate a cma area for hugetlb allocations:
   pass hugetlb_cma=10G as a kernel argument

2) allocate hugetlb pages as usual, e.g.
   echo 10 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages

If the option isn't enabled or the allocation of the cma area failed,
the current behavior of the system is preserved.
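
As a quick sanity check on the numbers in the usage example, the following Python sketch computes how many 1 GB pages fit into a given hugetlb_cma size. The parser handles only a plain integer with an optional K/M/G/T suffix, the helper names are hypothetical, and the calculation ignores the per-node split and any alignment the kernel applies.

```python
UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}

# hugepages-1048576kB: 1048576 kB per page, i.e. 1 GiB.
GIGANTIC_PAGE_BYTES = 1048576 * 1024


def parse_size(text):
    """Parse a size such as '10G' or '512M' into bytes."""
    text = text.strip()
    suffix = text[-1].upper()
    if suffix in UNITS:
        return int(text[:-1]) * UNITS[suffix]
    return int(text)


def max_gigantic_pages(hugetlb_cma):
    """Upper bound on the 1 GB pages that fit in the reserved area."""
    return parse_size(hugetlb_cma) // GIGANTIC_PAGE_BYTES
```

With hugetlb_cma=10G the bound is 10, which is why the example writes 10 to nr_hugepages.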

x86 and arm64 are covered by this patch; other architectures can be
trivially added later.

The patch contains clean-ups and fixes proposed and implemented by Aslan
Bakirov and Randy Dunlap.  It also contains ideas and suggestions
proposed by Rik van Riel, Michal Hocko and Mike Kravetz.  Thanks!

Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Andreas Schaufler <andreas.schaufler@gmx.de>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Michal Hocko <mhocko@kernel.org>
Cc: Aslan Bakirov <aslan@fb.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Link: http://lkml.kernel.org/r/20200407163840.92263-3-guro@fb.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

10 Apr, 2020 1 commit

x86: hyperv: report value of misc_features · 97d9f1c4

Authored by Olaf Hering on Apr 07, 2020

A few kernel features depend on ms_hyperv.misc_features, but unlike its
siblings ->features and ->hints, the value was never reported during boot.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Link: https://lore.kernel.org/r/20200407172739.31371-1-olaf@aepfle.de
Signed-off-by: Wei Liu <wei.liu@kernel.org>

97d9f1c4
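
The misc_features commit above describes reporting a third value next to ->features and ->hints. A rough userspace sketch of such a boot-time report is shown below. The struct hv_info, the function format_hv_info and the exact message format are assumptions made for illustration; the commit message does not show the real log line or the kernel structure layout.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the three fields named in the commit message. */
struct hv_info {
    uint32_t features;
    uint32_t hints;
    uint32_t misc_features;
};

/*
 * Format all three values into buf. The message layout is an assumed
 * example, not necessarily the text the kernel prints.
 * Returns the number of characters snprintf() would have written.
 */
static int format_hv_info(char *buf, size_t len, const struct hv_info *hv)
{
    assert(buf != NULL && hv != NULL);
    return snprintf(buf, len,
                    "Hyper-V: features 0x%" PRIx32
                    ", hints 0x%" PRIx32
                    ", misc 0x%" PRIx32,
                    hv->features, hv->hints, hv->misc_features);
}
```

Printing the third field alongside its siblings makes it possible to tell from a boot log which of the dependent kernel features can be expected to work.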

09 Apr, 2020 1 commit

x86/xen: fix booting 32-bit pv guest · d6f34f4c

Committed by Juergen Gross on Apr 09, 2020

Commit 2f62f36e ("x86/xen: Make the boot CPU idle task reliable")
introduced a regression for booting 32 bit Xen PV guests: the address
of the initial stack needs to be a virtual one.

Fixes: 2f62f36e ("x86/xen: Make the boot CPU idle task reliable")
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20200409070001.16675-1-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>

d6f34f4c

08 Apr, 2020 11 commits

x86: update AS_* macros to binutils >=2.23, supporting ADX and AVX2 · e6abef61

Committed by Jason A. Donenfeld on Mar 26, 2020

Now that the kernel specifies binutils 2.23 as the minimum version, we
can remove ifdefs for AVX2 and ADX throughout.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

e6abef61

crypto: x86 - clean up poly1305-x86_64-cryptogams.S by 'make clean' · d7e40ea8

Committed by Masahiro Yamada on Mar 26, 2020

poly1305-x86_64-cryptogams.S is a generated file, so it should be
cleaned up by 'make clean'.

Assigning it to the variable 'targets' teaches Kbuild that it is a
generated file. However, this line is not evaluated when cleaning
because scripts/Makefile.clean does not include include/config/auto.conf.

Remove the ifneq-conditional, so this file is correctly cleaned up.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Ingo Molnar <mingo@kernel.org>

d7e40ea8

crypto: x86 - rework configuration based on Kconfig · 4dcbfc35

Committed by Jason A. Donenfeld on Mar 26, 2020

Now that assembler capabilities are probed inside of Kconfig, we can set
up proper Kconfig-based dependencies. We also take this opportunity to
reorder the Makefile, so that items are grouped logically by primitive.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

4dcbfc35

x86: add comments about the binutils version to support code in as-instr · e9e070cf

Committed by Masahiro Yamada on Mar 26, 2020

We raise the minimal supported binutils version from time to time.
The last bump was commit 1fb12b35 ("kbuild: Raise the minimum
required binutils version to 2.21").

We have these as-instr tests because binutils 2.21 does not support
them.

When we bump the binutils version next time, this will be a good
hint to find out which one can be dropped.

As for the Clang/LLVM builds, we require a very new LLVM version,
so the LLVM integrated assembler supports all of them.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Nick Desaulniers <ndesaulniers@google.com>

e9e070cf

x86: probe assembler capabilities via kconfig instead of makefile · 5e8ebd84

Committed by Jason A. Donenfeld on Mar 26, 2020

Doing this probing inside of the Makefiles means we have a maze of
ifdefs inside the source code and child Makefiles that need to make
proper decisions on this too. Instead, we do it at Kconfig time, like
many other compiler and assembler options, which allows us to set up the
dependencies normally for full compilation units. In the process, the
ADX test changes to use %eax instead of %r10 so that it's valid in both
32-bit and 64-bit mode.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

5e8ebd84

x86: remove always-defined CONFIG_AS_AVX · 42251572

Committed by Masahiro Yamada on Mar 26, 2020

CONFIG_AS_AVX was introduced by commit ea4d26ae ("raid5: add AVX
optimized RAID5 checksumming").

We raise the minimal supported binutils version from time to time.
The last bump was commit 1fb12b35 ("kbuild: Raise the minimum
required binutils version to 2.21").

I confirmed the code in $(call as-instr,...) can be assembled by the
binutils 2.21 assembler and also by the LLVM integrated assembler.

Remove CONFIG_AS_AVX, which is always defined.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
42251572

x86: remove always-defined CONFIG_AS_SSSE3 · 92203b02

Committed by Masahiro Yamada on Mar 26, 2020

CONFIG_AS_SSSE3 was introduced by commit 75aaf4c3 ("x86/raid6:
correctly check for assembler capabilities").

We raise the minimal supported binutils version from time to time.
The last bump was commit 1fb12b35 ("kbuild: Raise the minimum
required binutils version to 2.21").

I confirmed the code in $(call as-instr,...) can be assembled by the
binutils 2.21 assembler and also by the LLVM integrated assembler.

Remove CONFIG_AS_SSSE3, which is always defined.

I added ifdef CONFIG_X86 to lib/raid6/algos.c to avoid link errors
on non-x86 architectures.

lib/raid6/algos.c is built not only for the kernel but also for
testing the library code from userspace. I added -DCONFIG_X86 to
lib/raid6/test/Makefile to cater to this use case.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Ingo Molnar <mingo@kernel.org>

92203b02
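
The CONFIG_AS_SSSE3 commit above mentions guarding x86-only references in lib/raid6/algos.c with ifdef CONFIG_X86, and passing -DCONFIG_X86 for the userspace test build. The pattern can be sketched as follows. This is a simplified, self-contained illustration: struct algo, algo_generic, algo_ssse3 and algo_count are hypothetical stand-ins, not the actual contents of lib/raid6/algos.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an entry in an algorithm table. */
struct algo {
    const char *name;
};

/* Portable fallback that exists on every architecture. */
static const struct algo algo_generic = { "generic" };

#ifdef CONFIG_X86
/* Stands in for an implementation that only exists in x86 builds. */
static const struct algo algo_ssse3 = { "ssse3" };
#endif

/*
 * NULL-terminated table. The x86-only entry is referenced only under
 * CONFIG_X86, so builds for other architectures never refer to a
 * symbol they do not provide.
 */
static const struct algo *const algos[] = {
#ifdef CONFIG_X86
    &algo_ssse3,
#endif
    &algo_generic,
    NULL,
};

/* Count the entries that are available in this build. */
static size_t algo_count(void)
{
    size_t n = 0;

    while (algos[n] != NULL)
        n++;
    assert(n >= 1); /* the generic fallback is always present */
    return n;
}
```

Compiled without any flags the table holds only the generic entry; compiled with -DCONFIG_X86, as the test Makefile change described in the commit does, the x86 entry is included as well.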

x86: remove always-defined CONFIG_AS_CFI_SECTIONS · 48e24723

Committed by Masahiro Yamada on Mar 26, 2020

CONFIG_AS_CFI_SECTIONS was introduced by commit 9e565292 ("x86:
Use .cfi_sections for assembly code").

We raise the minimal supported binutils version from time to time.
The last bump was commit 1fb12b35 ("kbuild: Raise the minimum
required binutils version to 2.21").

I confirmed the code in $(call as-instr,...) can be assembled by the
binutils 2.21 assembler and also by the LLVM integrated assembler.

Remove CONFIG_AS_CFI_SECTIONS, which is always defined.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Ingo Molnar <mingo@kernel.org>

48e24723

x86: remove unneeded (CONFIG_AS_)CFI_SIGNAL_FRAME · 46427f65

Committed by Masahiro Yamada on Mar 26, 2020

Commit 131484c8 ("x86/debug: Remove perpetually broken,
unmaintainable dwarf annotations") removes all the users of
CFI_SIGNAL_FRAME.

Remove CFI_SIGNAL_FRAME and CONFIG_AS_CFI_SIGNAL_FRAME.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Ingo Molnar <mingo@kernel.org>

46427f65

x86: remove always-defined CONFIG_AS_CFI · 0f2661c4

Committed by Masahiro Yamada on Mar 26, 2020

CONFIG_AS_CFI was introduced by commit e2414910 ("[PATCH] x86:
Detect CFI support in the assembler at runtime"), and extended by
commit f0f12d85 ("x86_64: Check for .cfi_rel_offset in CFI probe").

We raise the minimal supported binutils version from time to time.
The last bump was commit 1fb12b35 ("kbuild: Raise the minimum
required binutils version to 2.21").

I confirmed the code in $(call as-instr,...) can be assembled by the
binutils 2.21 assembler and also by the LLVM integrated assembler.

Remove CONFIG_AS_CFI, which is always defined.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Ingo Molnar <mingo@kernel.org>

0f2661c4

x86: remove unneeded defined(__ASSEMBLY__) check from asm/dwarf2.h · 418d6e29

Committed by Masahiro Yamada on Mar 26, 2020

This header file has the following check at the top:

  #ifndef __ASSEMBLY__
  #warning "asm/dwarf2.h should be only included in pure assembly files"
  #endif

So we expect defined(__ASSEMBLY__) to always be true.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Ingo Molnar <mingo@kernel.org>

418d6e29
