- 14 January 2020, 3 commits
-
-
Submitted by Dmitry Safonov
VDSO support for time namespaces needs to set up a page with the same layout as VVAR. That timens page will be placed at the position of the VVAR page inside the namespace. The page has vdso_data->seq set to 1 to enforce the slow path and vdso_data->clock_mode set to VCLOCK_TIMENS to enforce the time namespace handling path. To prepare the time namespace page the kernel needs to know the vdso_data offset. Provide an arch_get_vdso_data() helper for locating vdso_data on the VVAR page.

Co-developed-by: Andrei Vagin <avagin@openvz.org>
Signed-off-by: Andrei Vagin <avagin@openvz.org>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20191112012724.250792-22-dima@arista.com
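For illustration, the helper can be as small as returning the fixed offset of vdso_data within the VVAR page; this is only a sketch, and the offset symbol name is a placeholder rather than a quote of the patch:

  struct vdso_data *arch_get_vdso_data(void *vvar_page)
  {
          /* vdso_data sits at a link-time-constant offset inside the VVAR page */
          return (struct vdso_data *)(vvar_page + _vdso_data_offset);
  }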
-
Submitted by Dmitry Safonov
Forbid splitting the VVAR VMA, resulting in a stricter ABI and reducing the number of corner cases to consider while working further on VDSO time namespace support. As the offset from the timens page to the VVAR page is computed at compile time, the pages in VVAR should stay together and not be partially mremap()'ed.

Co-developed-by: Andrei Vagin <avagin@openvz.org>
Signed-off-by: Andrei Vagin <avagin@openvz.org>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20191112012724.250792-20-dima@arista.com
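A minimal sketch of how such a restriction can be enforced, assuming the VVAR area is a special mapping with an .mremap hook (the real patch may differ in detail):

  static int vvar_mremap(const struct vm_special_mapping *sm,
                         struct vm_area_struct *new_vma)
  {
          const struct vdso_image *image = current->mm->context.vdso_image;
          unsigned long new_size = new_vma->vm_end - new_vma->vm_start;

          /* refuse anything that is not the full VVAR area, i.e. a split */
          if (new_size != -image->sym_vvar_start)
                  return -EINVAL;

          return 0;
  }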
-
Submitted by Vincenzo Frascino
VDSO_HAS_32BIT_FALLBACK has been removed from the core since the architectures that support the generic vDSO library have been converted to support the 32-bit fallbacks. Remove the unused VDSO_HAS_32BIT_FALLBACK from the x86 vDSO.

Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20190830135902.20861-9-vincenzo.frascino@arm.com
-
- 05 January 2020, 1 commit
-
-
Submitted by David Hildenbrand
We currently try to shrink a single zone when removing memory. We use the zone of the first page of the memory we are removing. If that memmap was never initialized (e.g., the memory was never onlined), we will read garbage and can trigger kernel BUGs (due to a stale pointer):

  BUG: unable to handle page fault for address: 000000000000353d
  #PF: supervisor write access in kernel mode
  #PF: error_code(0x0002) - not-present page
  PGD 0 P4D 0
  Oops: 0002 [#1] SMP PTI
  CPU: 1 PID: 7 Comm: kworker/u8:0 Not tainted 5.3.0-rc5-next-20190820+ #317
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.4
  Workqueue: kacpi_hotplug acpi_hotplug_work_fn
  RIP: 0010:clear_zone_contiguous+0x5/0x10
  Code: 48 89 c6 48 89 c3 e8 2a fe ff ff 48 85 c0 75 cf 5b 5d c3 c6 85 fd 05 00 00 01 5b 5d c3 0f 1f 840
  RSP: 0018:ffffad2400043c98 EFLAGS: 00010246
  RAX: 0000000000000000 RBX: 0000000200000000 RCX: 0000000000000000
  RDX: 0000000000200000 RSI: 0000000000140000 RDI: 0000000000002f40
  RBP: 0000000140000000 R08: 0000000000000000 R09: 0000000000000001
  R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000140000
  R13: 0000000000140000 R14: 0000000000002f40 R15: ffff9e3e7aff3680
  FS:  0000000000000000(0000) GS:ffff9e3e7bb00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 000000000000353d CR3: 0000000058610000 CR4: 00000000000006e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   __remove_pages+0x4b/0x640
   arch_remove_memory+0x63/0x8d
   try_remove_memory+0xdb/0x130
   __remove_memory+0xa/0x11
   acpi_memory_device_remove+0x70/0x100
   acpi_bus_trim+0x55/0x90
   acpi_device_hotplug+0x227/0x3a0
   acpi_hotplug_work_fn+0x1a/0x30
   process_one_work+0x221/0x550
   worker_thread+0x50/0x3b0
   kthread+0x105/0x140
   ret_from_fork+0x3a/0x50
  Modules linked in:
  CR2: 000000000000353d

Instead, shrink the zones when offlining memory or when onlining failed. Introduce and use remove_pfn_range_from_zone() for that. We now properly shrink the zones, even if we have DIMMs whereby

 - Some memory blocks fall into no zone (never onlined)
 - Some memory blocks fall into multiple zones (offlined + re-onlined)
 - Multiple memory blocks fall into different zones

Drop the zone parameter (with a potential dubious value) from __remove_pages() and __remove_section().

Link: http://lkml.kernel.org/r/20191006085646.5768-6-david@redhat.com
Fixes: f1dd2cd1 ("mm, memory_hotplug: do not associate hotadded memory to zones until online") [visible after d0dc12e8]
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: <stable@vger.kernel.org> [5.0+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 19 December 2019, 2 commits
-
-
Submitted by Jim Mattson
The host reports support for the synthetic feature X86_FEATURE_SSBD when any of the three following hardware features are set:

  CPUID.(EAX=7,ECX=0):EDX.SSBD[bit 31]
  CPUID.80000008H:EBX.AMD_SSBD[bit 24]
  CPUID.80000008H:EBX.VIRT_SSBD[bit 25]

Either of the first two hardware features implies the existence of the IA32_SPEC_CTRL MSR, but CPUID.80000008H:EBX.VIRT_SSBD[bit 25] does not. Therefore, CPUID.80000008H:EBX.AMD_SSBD[bit 24] should only be set in the guest if CPUID.(EAX=7,ECX=0):EDX.SSBD[bit 31] or CPUID.80000008H:EBX.AMD_SSBD[bit 24] is set on the host.

Fixes: 4c6903a0 ("KVM: x86: fix reporting of AMD speculation bug CPUID leaf")
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Jacob Xu <jacobhxu@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Reported-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
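Expressed in host cpufeature terms, the intended condition is roughly the following sketch (the feature and macro names follow KVM's CPUID code but are illustrative, not a quote of the patch):

  /* AMD_SSBD implies IA32_SPEC_CTRL; VIRT_SSBD alone does not */
  if (boot_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD) || boot_cpu_has(X86_FEATURE_AMD_SSBD))
          entry->ebx |= F(AMD_SSBD);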
-
Submitted by Jim Mattson
The host reports support for the synthetic feature X86_FEATURE_SSBD when any of the three following hardware features are set:

  CPUID.(EAX=7,ECX=0):EDX.SSBD[bit 31]
  CPUID.80000008H:EBX.AMD_SSBD[bit 24]
  CPUID.80000008H:EBX.VIRT_SSBD[bit 25]

Either of the first two hardware features implies the existence of the IA32_SPEC_CTRL MSR, but CPUID.80000008H:EBX.VIRT_SSBD[bit 25] does not. Therefore, CPUID.(EAX=7,ECX=0):EDX.SSBD[bit 31] should only be set in the guest if CPUID.(EAX=7,ECX=0):EDX.SSBD[bit 31] or CPUID.80000008H:EBX.AMD_SSBD[bit 24] is set on the host.

Fixes: 0c54914d ("KVM: x86: use Intel speculation bugs and features as derived in generic x86 code")
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Jacob Xu <jacobhxu@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Reported-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 17 December 2019, 6 commits
-
-
Submitted by Alexander Shishkin
Commit: ccbebba4 ("perf/x86/intel/pt: Bypass PT vs. LBR exclusivity if the core supports it") skips the PT/LBR exclusivity check on CPUs where PT and LBRs coexist, but also inadvertently skips the active_events bump for PT in that case, which is a bug. If there aren't any hardware events at the same time as PT, the PMI handler will ignore PT PMIs, as active_events reads zero in that case, resulting in the "Uhhuh" spurious NMI warning and PT data loss. Fix this by always increasing active_events for PT events.

Fixes: ccbebba4 ("perf/x86/intel/pt: Bypass PT vs. LBR exclusivity if the core supports it")
Reported-by: Vitaly Slobodskoy <vitaly.slobodskoy@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lkml.kernel.org/r/20191210105101.77210-1-alexander.shishkin@linux.intel.com
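A hedged sketch of the shape of the fix in x86_add_exclusive(): bypass only the exclusivity check, never the accounting (the real patch may differ in detail):

  static int x86_add_exclusive(unsigned int what)
  {
          if (x86_pmu.lbr_pt_coexist && what == x86_lbr_exclusive_pt)
                  goto out;

          /* mutual-exclusion checks elided in this sketch */
  out:
          atomic_inc(&active_events);
          return 0;
  }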
-
Submitted by Alexander Shishkin
Commit 8062382c ("perf/x86/intel/bts: Add BTS PMU driver") brought in a warning with the BTS buffer initialization that is easily tripped with (assuming KPTI is disabled):

instantly throwing:

 > ------------[ cut here ]------------
 > WARNING: CPU: 2 PID: 326 at arch/x86/events/intel/bts.c:86 bts_buffer_setup_aux+0x117/0x3d0
 > Modules linked in:
 > CPU: 2 PID: 326 Comm: perf Not tainted 5.4.0-rc8-00291-gceb9e773 #904
 > RIP: 0010:bts_buffer_setup_aux+0x117/0x3d0
 > Call Trace:
 >  rb_alloc_aux+0x339/0x550
 >  perf_mmap+0x607/0xc70
 >  mmap_region+0x76b/0xbd0
 ...

It appears to assume (for lost raisins) that PagePrivate() is set, while later it actually tests for PagePrivate() before using page_private(). Make it consistent and always check PagePrivate() before using page_private().

Fixes: 8062382c ("perf/x86/intel/bts: Add BTS PMU driver")
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lkml.kernel.org/r/20191205142853.28894-2-alexander.shishkin@linux.intel.com
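The consistent pattern looks roughly like this sketch (illustrative only): derive the page order from page_private() only when PagePrivate() is actually set:

  /* high-order AUX pages carry their order in page_private() */
  nr_pages = PagePrivate(page) ? 1 << page_private(page) : 1;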
-
Submitted by Peter Zijlstra
UBSAN reported out-of-bound accesses for x86_pmu.event_map(); its arguments should be < x86_pmu.max_events. Make sure all users observe this constraint.

Reported-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Meelis Roos <mroos@linux.ee>
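The constraint amounts to a bounds check before every lookup; a minimal sketch, not the literal patch:

  u64 config = 0;

  if (idx < x86_pmu.max_events)        /* only then is the map defined */
          config = x86_pmu.event_map(idx);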
-
Submitted by Jan H. Schönherr
The function mce_severity_amd_smca() requires m->bank to be initialized for correct operation. Fix the one case where mce_severity() is called without doing so.

Fixes: 6bda529e ("x86/mce: Grade uncorrected errors for SMCA-enabled systems")
Fixes: d28af26f ("x86/MCE: Initialize mce.bank in the case of a fatal error in mce_no_way_out()")
Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: <stable@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Link: https://lkml.kernel.org/r/20191210000733.17979-4-jschoenh@amazon.de
-
Submitted by Yazen Ghannam
Each logical CPU in Scalable MCA systems controls a unique set of MCA banks in the system. These banks are not shared between CPUs. The bank types and ordering will be the same across CPUs on currently available systems.

However, some CPUs may see a bank as Reserved/Read-as-Zero (RAZ) while other CPUs do not. In this case, the bank seen as Reserved on one CPU is assumed to be the same type as the bank seen as a known type on another CPU.

In general, this occurs when the hardware represented by the MCA bank is disabled, e.g. disabled memory controllers on certain models, etc. The MCA bank is disabled in the hardware, so there is no possibility of getting an MCA/MCE from it even if it is assumed to have a known type.

For example:

  Full system:
  Bank | Type seen on CPU0 | Type seen on CPU1
  ------------------------------------------------
   0   | LS                | LS
   1   | UMC               | UMC
   2   | CS                | CS

  System with hardware disabled:
  Bank | Type seen on CPU0 | Type seen on CPU1
  ------------------------------------------------
   0   | LS                | LS
   1   | UMC               | RAZ
   2   | CS                | CS

For this reason, there is a single, global struct smca_banks[] that is initialized at boot time. This array is initialized on each CPU as it comes online. However, the array will not be updated if an entry already exists.

This works as expected when the first CPU (usually CPU0) has all possible MCA banks enabled. But if the first CPU has a subset, then it will save a "Reserved" type in smca_banks[]. Successive CPUs will then not be able to update smca_banks[] even if they encounter a known bank type. This may result in unexpected behavior.

Depending on the system configuration, a user may observe issues enumerating the MCA thresholding sysfs interface. The issues may be as trivial as sysfs entries not being available, or as severe as system hangs.

For example:

  Bank | Type seen on CPU0 | Type seen on CPU1
  ------------------------------------------------
   0   | LS                | LS
   1   | RAZ               | UMC
   2   | CS                | CS

Extend the smca_banks[] entry check to return if the entry is a non-reserved type. Otherwise, continue so that CPUs that encounter a known bank type can update smca_banks[].

Fixes: 68627a69 ("x86/mce/AMD, EDAC/mce_amd: Enumerate Reserved SMCA bank type")
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: <stable@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20191121141508.141273-1-Yazen.Ghannam@amd.com
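A sketch of the extended check in the per-CPU init path (names follow the SMCA code, but this is illustrative rather than the literal diff):

  /* Skip the update only if a non-reserved type is already recorded,
   * so a later CPU can upgrade a Reserved entry to the real type. */
  if (smca_banks[bank].hwid && smca_banks[bank].hwid->bank_type != SMCA_RESERVED)
          return;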
-
Submitted by Konstantin Khlebnikov
... because interrupts are disabled that early and sending IPIs can deadlock:

  BUG: sleeping function called from invalid context at kernel/sched/completion.c:99
  in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/1
  no locks held by swapper/1/0.
  irq event stamp: 0
  hardirqs last  enabled at (0): [<0000000000000000>] 0x0
  hardirqs last disabled at (0): [<ffffffff8106dda9>] copy_process+0x8b9/0x1ca0
  softirqs last  enabled at (0): [<ffffffff8106dda9>] copy_process+0x8b9/0x1ca0
  softirqs last disabled at (0): [<0000000000000000>] 0x0
  Preemption disabled at:
  [<ffffffff8104703b>] start_secondary+0x3b/0x190
  CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.5.0-rc2+ #1
  Hardware name: GIGABYTE MZ01-CE1-00/MZ01-CE1-00, BIOS F02 08/29/2018
  Call Trace:
   dump_stack
   ___might_sleep.cold.92
   wait_for_completion
   ? generic_exec_single
   rdmsr_safe_on_cpu
   ? wrmsr_on_cpus
   mce_amd_feature_init
   mcheck_cpu_init
   identify_cpu
   identify_secondary_cpu
   smp_store_cpu_info
   start_secondary
   secondary_startup_64

The function smca_configure() is called only on the current CPU anyway, therefore replace rdmsr_safe_on_cpu() with atomic rdmsr_safe() and avoid the IPI.

[ bp: Update commit message. ]

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: <stable@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/157252708836.3876.4604398213417262402.stgit@buzz
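The change boils down to using the local MSR accessor; a sketch, assuming the SMCA config MSR used by smca_configure():

  /* before: cross-CPU read, needs an IPI and may sleep */
  rdmsr_safe_on_cpu(cpu, MSR_AMD64_SMCA_MCx_CONFIG(bank), &low, &high);

  /* after: we already run on that CPU, so read locally and atomically */
  rdmsr_safe(MSR_AMD64_SMCA_MCx_CONFIG(bank), &low, &high);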
-
- 14 December 2019, 1 commit
-
-
Submitted by Dmitry Golovin
Introduce a new READELF variable to the top-level Makefile, so the name of the readelf binary can be specified. Before this change the name of the binary was hardcoded to "$(CROSS_COMPILE)readelf", which might not be present for every toolchain. This allows building with the LLVM object reader by using the make parameter READELF=llvm-readelf.

Link: https://github.com/ClangBuiltLinux/linux/issues/771
Signed-off-by: Dmitry Golovin <dima@golovin.in>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
-
- 11 December 2019, 1 commit
-
-
Submitted by Alexei Starovoitov
Depending on the type of BPF programs served by a BPF trampoline, it can call the original function. In such a case the trampoline will skip one stack frame while returning. That will confuse the function_graph tracer and will cause crashes with a bad RIP. Teach the graph tracer to skip functions that have a BPF trampoline attached.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
-
- 10 December 2019, 1 commit
-
-
Submitted by Pankaj Bharadiya
Replace all the occurrences of FIELD_SIZEOF() with sizeof_field() except at places where these are defined. Later patches will remove the unused definition of FIELD_SIZEOF().

This patch is generated using the following script:

  EXCLUDE_FILES="include/linux/stddef.h|include/linux/kernel.h"
  git grep -l -e "\bFIELD_SIZEOF\b" | while read file;
  do
      if [[ "$file" =~ $EXCLUDE_FILES ]]; then
          continue
      fi
      sed -i -e 's/\bFIELD_SIZEOF\b/sizeof_field/g' $file;
  done

Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com>
Link: https://lore.kernel.org/r/20190924105839.110713-3-pankaj.laxminarayan.bharadiya@intel.com
Co-developed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: David Miller <davem@davemloft.net> # for net
-
- 05 December 2019, 2 commits
-
-
Submitted by Masahiro Yamada
Userspace cannot compile <asm/sembuf.h> due to some missing type definitions. For example, building it for x86 fails as follows:

  CC      usr/include/asm/sembuf.h.s
  In file included from <command-line>:32:0:
  usr/include/asm/sembuf.h:17:20: error: field `sem_perm' has incomplete type
    struct ipc64_perm sem_perm; /* permissions .. see ipc.h */
                      ^~~~~~~~
  usr/include/asm/sembuf.h:24:2: error: unknown type name `__kernel_time_t'
    __kernel_time_t sem_otime; /* last semop time */
    ^~~~~~~~~~~~~~~
  usr/include/asm/sembuf.h:25:2: error: unknown type name `__kernel_ulong_t'
    __kernel_ulong_t __unused1;
    ^~~~~~~~~~~~~~~~
  usr/include/asm/sembuf.h:26:2: error: unknown type name `__kernel_time_t'
    __kernel_time_t sem_ctime; /* last change time */
    ^~~~~~~~~~~~~~~
  usr/include/asm/sembuf.h:27:2: error: unknown type name `__kernel_ulong_t'
    __kernel_ulong_t __unused2;
    ^~~~~~~~~~~~~~~~
  usr/include/asm/sembuf.h:29:2: error: unknown type name `__kernel_ulong_t'
    __kernel_ulong_t sem_nsems; /* no. of semaphores in array */
    ^~~~~~~~~~~~~~~~
  usr/include/asm/sembuf.h:30:2: error: unknown type name `__kernel_ulong_t'
    __kernel_ulong_t __unused3;
    ^~~~~~~~~~~~~~~~
  usr/include/asm/sembuf.h:31:2: error: unknown type name `__kernel_ulong_t'
    __kernel_ulong_t __unused4;
    ^~~~~~~~~~~~~~~~

It is just a matter of a missing include directive. Include <asm/ipcbuf.h> to make it self-contained, and add it to the compile-test coverage.

Link: http://lkml.kernel.org/r/20191030063855.9989-3-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Submitted by Masahiro Yamada
Userspace cannot compile <asm/msgbuf.h> due to some missing type definitions. For example, building it for x86 fails as follows:

  CC      usr/include/asm/msgbuf.h.s
  In file included from usr/include/asm/msgbuf.h:6:0,
                   from <command-line>:32:
  usr/include/asm-generic/msgbuf.h:25:20: error: field `msg_perm' has incomplete type
    struct ipc64_perm msg_perm;
                      ^~~~~~~~
  usr/include/asm-generic/msgbuf.h:27:2: error: unknown type name `__kernel_time_t'
    __kernel_time_t msg_stime; /* last msgsnd time */
    ^~~~~~~~~~~~~~~
  usr/include/asm-generic/msgbuf.h:28:2: error: unknown type name `__kernel_time_t'
    __kernel_time_t msg_rtime; /* last msgrcv time */
    ^~~~~~~~~~~~~~~
  usr/include/asm-generic/msgbuf.h:29:2: error: unknown type name `__kernel_time_t'
    __kernel_time_t msg_ctime; /* last change time */
    ^~~~~~~~~~~~~~~
  usr/include/asm-generic/msgbuf.h:41:2: error: unknown type name `__kernel_pid_t'
    __kernel_pid_t msg_lspid; /* pid of last msgsnd */
    ^~~~~~~~~~~~~~
  usr/include/asm-generic/msgbuf.h:42:2: error: unknown type name `__kernel_pid_t'
    __kernel_pid_t msg_lrpid; /* last receive pid */
    ^~~~~~~~~~~~~~

It is just a matter of a missing include directive. Include <asm/ipcbuf.h> to make it self-contained, and add it to the compile-test coverage.

Link: http://lkml.kernel.org/r/20191030063855.9989-2-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
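For both this and the preceding sembuf fix, the change is essentially one include directive in the UAPI header, e.g.:

  #include <asm/ipcbuf.h>   /* struct ipc64_perm and, indirectly, the __kernel_* types */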
-
- 04 December 2019, 3 commits
-
-
Submitted by Jim Mattson
We will never need more guest_msrs than there are indices in vmx_msr_index. Thus, at present, the guest_msrs array will not exceed 168 bytes.

Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Paolo Bonzini
The bounds check was present in KVM_GET_SUPPORTED_CPUID but not KVM_GET_EMULATED_CPUID.

Reported-by: syzbot+e3f4897236c4eeb8af4f@syzkaller.appspotmail.com
Fixes: 84cffe49 ("kvm: Emulate MOVBE", 2013-10-29)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Dave Young
Michael Weiser reported that he got this error during a kexec reboot:

  esrt: Unsupported ESRT version 2904149718861218184.

The ESRT memory stays in EFI boot services data, and it was reserved in the kernel via efi_mem_reserve(). The initial purpose of the reservation is to reuse the EFI boot services data across kexec reboot, for example the BGRT image data and some ESRT memory like Michael reported.

But although the memory is reserved, it is not updated in the x86 E820 table, and kexec_file_load() iterates system RAM in the IO resource list to find places for the kernel, initramfs and other stuff. In Michael's case the kexec-loaded initramfs overwrote the ESRT memory and then the failure happened.

Since kexec_file_load() depends on the E820 table being updated, just fix this by updating the reserved EFI boot services memory as reserved type in E820.

Originally any memory descriptors with the EFI_MEMORY_RUNTIME attribute were bypassed in the reservation code path because they are assumed to be reserved. But the reservation is still needed for multiple kexec reboots, and it is the only possible case we come here, thus just drop the code chunk; then everything works without side effects.

On my machine the ESRT memory sits in an EFI runtime data range, so it does not trigger the problem, but I successfully tested with BGRT instead. Both kexec_load() and kexec_file_load() work, and kdump works as well.

[ mingo: Edited the changelog. ]

Reported-by: Michael Weiser <michael@weiser.dinsnail.net>
Tested-by: Michael Weiser <michael@weiser.dinsnail.net>
Signed-off-by: Dave Young <dyoung@redhat.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kexec@lists.infradead.org
Cc: linux-efi@vger.kernel.org
Link: https://lkml.kernel.org/r/20191204075233.GA10520@dhcp-128-65.nay.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
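A sketch of the E820 side of the fix (the exact call site sits in the EFI reservation path; treat this as illustrative rather than the literal diff):

  /* make the efi_mem_reserve()d boot services range visible to
   * kexec_file_load() by marking it reserved in the E820 table */
  e820__range_update(addr, size, E820_TYPE_RAM, E820_TYPE_RESERVED);
  e820__update_table(e820_table);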
-
- 02 December 2019, 2 commits
-
-
Submitted by Daniel Axtens
In the case where KASAN directly allocates memory to back vmalloc space, don't map the early shadow page over it. We prepopulate pgds/p4ds for the range that would otherwise be empty. This is required to get it synced to hardware on boot, allowing the lower levels of the page tables to be filled dynamically.

Link: http://lkml.kernel.org/r/20191031093909.9228-5-dja@axtens.net
Signed-off-by: Daniel Axtens <dja@axtens.net>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Submitted by Ingo Molnar
There's a bug in the new PAT code, the conversion of memtype_check_conflict() is buggy:

  8d04a5f9: ("x86/mm/pat: Convert the PAT tree to a generic interval tree")

        dprintk("Overlap at 0x%Lx-0x%Lx\n", match->start, match->end);
        found_type = match->type;

 -      node = rb_next(&match->rb);
 -      while (node) {
 -              match = rb_entry(node, struct memtype, rb);
 -
 -              if (match->start >= end) /* Checked all possible matches */
 -                      goto success;
 -
 -              if (is_node_overlap(match, start, end) &&
 -                  match->type != found_type) {
 +      match = memtype_interval_iter_next(match, start, end);
 +      while (match) {
 +              if (match->type != found_type)
                        goto failure;
 -              }
 -              node = rb_next(&match->rb);
 +              match = memtype_interval_iter_next(match, start, end);
        }

Note how the '>= end' condition to end the interval check got converted into:

 +      match = memtype_interval_iter_next(match, start, end);

This is subtly off by one, because the interval trees interfaces require closed interval parameters:

  include/linux/interval_tree_generic.h

  /*                                                      \
   * Iterate over intervals intersecting [start;last]     \
   *                                                      \
   * Note that a node's interval intersects [start;last] iff:  \
   *   Cond1: ITSTART(node) <= last                       \
   * and                                                  \
   *   Cond2: start <= ITLAST(node)                       \
   */                                                     \
  ...
        if (ITSTART(node) <= last) {            /* Cond1 */    \
                if (start <= ITLAST(node))      /* Cond2 */    \
                        return node;    /* node is leftmost match */    \

[start;last] is a closed interval (note that '<= last' check) - while the PAT 'end' parameter is 1 byte beyond the end of the range, because ioremap() and the other mapping APIs usually use the [start,end) half-open interval, derived from 'size'.

This is what ioremap() does for example:

        /*
         * Mappings have to be page-aligned
         */
        offset = phys_addr & ~PAGE_MASK;
        phys_addr &= PHYSICAL_PAGE_MASK;
        size = PAGE_ALIGN(last_addr+1) - phys_addr;

        retval = reserve_memtype(phys_addr, (u64)phys_addr + size,
                                 pcm, &new_pcm);

phys_addr+size will be on a page boundary, after the last byte of the mapped interval.

So the correct parameter to use in the interval tree searches is not 'end' but 'end-1'.

This could have relevance if conflicting PAT ranges are exactly adjacent, for example a future WC region is followed immediately by an already mapped UC- region - in this case memtype_check_conflict() would incorrectly deny the WC memtype region and downgrade the memtype to UC-.

BTW., rather annoyingly this downgrading is done silently in memtype_check_insert():

  int memtype_check_insert(struct memtype *new,
                           enum page_cache_mode *ret_type)
  {
        int err = 0;

        err = memtype_check_conflict(new->start, new->end, new->type, ret_type);
        if (err)
                return err;

        if (ret_type)
                new->type = *ret_type;

        memtype_interval_insert(new, &memtype_rbroot);
        return 0;
  }

So on such a conflict we'd just silently get UC- in *ret_type, and write it into the new region, never the wiser ...

So assuming that the patch below fixes the primary bug, the diagnostics side of ioremap() cache attribute downgrades would be another thing to fix.

Anyway, I checked all the interval-tree iterations, and most of them are off by one - but I think the one related to memtype_check_conflict() is the one causing this particular performance regression.

The only correct interval-tree searches were these two:

  arch/x86/mm/pat_interval.c:     match = memtype_interval_iter_first(&memtype_rbroot, 0, ULONG_MAX);
  arch/x86/mm/pat_interval.c:     match = memtype_interval_iter_next(match, 0, ULONG_MAX);

The ULONG_MAX was hiding the off-by-one in plain sight. :-)

Note that the bug was probably benign in the sense of implementing a too strict cache attribute conflict policy and downgrading cache attributes, so AFAICS the worst outcome of this bug would be a performance regression, not any instabilities.

Reported-by: kernel test robot <rong.a.chen@intel.com>
Reported-by: Kenneth R. Crudup <kenny@panix.com>
Reported-by: Mariusz Ceier <mceier+kernel@gmail.com>
Tested-by: Mariusz Ceier <mceier@gmail.com>
Tested-by: Kenneth R. Crudup <kenny@panix.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20191201144947.GA4167@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
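Following the analysis above, the corrected searches pass a closed upper bound; a sketch:

  /* the interval tree takes [start, last]; PAT's 'end' is exclusive */
  match = memtype_interval_iter_first(&memtype_rbroot, start, end - 1);
  while (match) {
          /* conflict checks elided in this sketch */
          match = memtype_interval_iter_next(match, start, end - 1);
  }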
-
- 01 December 2019, 1 commit
-
-
Submitted by Borislav Petkov
... for better readability. No functional changes.

[ Minor edit. ]

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 29 November 2019, 3 commits
-
-
Submitted by Kai-Heng Feng
Like CFL and CFL-H, the ICL SoC has a skewed HPET timer once it hits PC10. So let's disable HPET on ICL.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bp@alien8.de
Cc: feng.tang@intel.com
Cc: harry.pan@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/20191129062303.18982-2-kai.heng.feng@canonical.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Submitted by Kai-Heng Feng
The Coffee Lake H SoC has similar behavior to Coffee Lake: a skewed HPET timer once the SoC enters PC10. So let's disable HPET on CFL-H platforms.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bp@alien8.de
Cc: feng.tang@intel.com
Cc: harry.pan@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/20191129062303.18982-1-kai.heng.feng@canonical.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Submitted by Srinivas Pandruvada
While writing to the MSRs IA32_THERM_STATUS/IA32_PKG_THERM_STATUS, avoid writing 1 to read-only and reserved fields, because updating some fields generates an exception.

[ bp: Vertically align for better readability. ]

Fixes: f6656208 ("x86/mce/therm_throt: Optimize notifications of thermal throttle")
Reported-by: Dominik Brodowski <linux@dominikbrodowski.net>
Tested-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20191128150824.22413-1-srinivas.pandruvada@linux.intel.com
-
- 28 November 2019, 1 commit
-
-
The state/owner of the FPU is saved to fpu_fpregs_owner_ctx by pointing to the context that is currently loaded. It never changed during the lifetime of a task - it remained stable/constant.

After deferred FPU registers loading until return to userland was implemented, the content of fpu_fpregs_owner_ctx may change during preemption and must not be cached. This went unnoticed for some time and was now noticed, in particular since gcc 9 is caching that load in copy_fpstate_to_sigframe() and reusing it in the retry loop:

  copy_fpstate_to_sigframe()
    load fpu_fpregs_owner_ctx and save on stack
    fpregs_lock()
    copy_fpregs_to_sigframe() /* failed */
    fpregs_unlock()
         *** PREEMPTION, another uses FPU, changes fpu_fpregs_owner_ctx ***

    fault_in_pages_writeable() /* succeed, retry */

    fpregs_lock()
      __fpregs_load_activate()
        fpregs_state_valid() /* uses fpu_fpregs_owner_ctx from stack */
    copy_fpregs_to_sigframe() /* succeeds, random FPU content */

This is a comparison of the assembly produced by gcc 9, without vs with this patch:

| # arch/x86/kernel/fpu/signal.c:173: if (!access_ok(buf, size))
|        cmpq    %rdx, %rax      # tmp183, _4
|        jb      .L190   #,
|-# arch/x86/include/asm/fpu/internal.h:512: return fpu == this_cpu_read_stable(fpu_fpregs_owner_ctx) && cpu == fpu->last_cpu;
|-#APP
|-# 512 "arch/x86/include/asm/fpu/internal.h" 1
|-       movq %gs:fpu_fpregs_owner_ctx,%rax      #, pfo_ret__
|-# 0 "" 2
|-#NO_APP
|-       movq    %rax, -88(%rbp) # pfo_ret__, %sfp
…
|-# arch/x86/include/asm/fpu/internal.h:512: return fpu == this_cpu_read_stable(fpu_fpregs_owner_ctx) && cpu == fpu->last_cpu;
|-       movq    -88(%rbp), %rcx # %sfp, pfo_ret__
|-       cmpq    %rcx, -64(%rbp) # pfo_ret__, %sfp
|+# arch/x86/include/asm/fpu/internal.h:512: return fpu == this_cpu_read(fpu_fpregs_owner_ctx) && cpu == fpu->last_cpu;
|+#APP
|+# 512 "arch/x86/include/asm/fpu/internal.h" 1
|+       movq %gs:fpu_fpregs_owner_ctx(%rip),%rax        # fpu_fpregs_owner_ctx, pfo_ret__
|+# 0 "" 2
|+# arch/x86/include/asm/fpu/internal.h:512: return fpu == this_cpu_read(fpu_fpregs_owner_ctx) && cpu == fpu->last_cpu;
|+#NO_APP
|+       cmpq    %rax, -64(%rbp) # pfo_ret__, %sfp

Use this_cpu_read() instead of this_cpu_read_stable() to avoid caching of fpu_fpregs_owner_ctx during preemption points.

The Fixes: tag points to the commit where deferred FPU loading was added. Since this commit, the compiler is no longer allowed to move the load of fpu_fpregs_owner_ctx somewhere else / outside of the locked section. A task preemption will change its value and stale content will be observed.

[ bp: Massage. ]

Debugged-by: Austin Clements <austin@google.com>
Debugged-by: David Chase <drchase@golang.org>
Debugged-by: Ian Lance Taylor <ian@airs.com>
Fixes: 5f409e20 ("x86/fpu: Defer FPU state load until return to userspace")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Rik van Riel <riel@surriel.com>
Tested-by: Borislav Petkov <bp@suse.de>
Cc: Aubrey Li <aubrey.li@intel.com>
Cc: Austin Clements <austin@google.com>
Cc: Barret Rhoden <brho@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Chase <drchase@golang.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: ian@airs.com
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Josh Bleecher Snyder <josharian@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20191128085306.hxfa2o3knqtu4wfn@linutronix.de
Link: https://bugzilla.kernel.org/show_bug.cgi?id=205663
-
- 27 November 2019, 13 commits
-
-
Submitted by Peter Gonda
Memory encryption support does not have module parameter dependencies and can be moved into the general x86 cpuid __do_cpuid_ent function. This change maintains the current behavior of passing through all of CPUID.8000001F.

Suggested-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Peter Gonda <pgonda@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Borislav Petkov
Signed-off-by: Borislav Petkov <bp@alien8.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Submitted by Anthony Steinhauser
When you successfully write 0 to /sys/devices/cpu/rdpmc, the RDPMC instruction should, according to the documentation, be disabled unconditionally and immediately (after you close the SYSFS file). Instead, in the current implementation the PMU must be reloaded, which happens only eventually, some time in the future. Only after that does the RDPMC instruction become disabled (on ring 3) on the respective core.

This change makes the treatment of the 0 value as blocking and as unconditional as the current treatment of the 2 value, only the CR4.PCE bit is naturally set to false instead of true.

Signed-off-by: Anthony Steinhauser <asteinhauser@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Link: https://lkml.kernel.org/r/20191125054838.137615-1-asteinhauser@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Submitted by Jason A. Donenfeld
For glue code that's used by Zinc, the actual Crypto API functions might not necessarily exist, and don't need to exist either. Before this patch, there are valid build configurations that lead to an unbuildable kernel. This fixes it by conditionalizing those symbols on the existence of the proper config entry.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Submitted by Andy Lutomirski
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Submitted by Andy Lutomirski
seg_segment_reg() should be unreachable with task == current. Rather than confusingly trying to make it work, just explicitly disable this case. (regset->get is used for current in the coredump code, but the ->set interface is only used for ptrace, and you can't ptrace yourself.)

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Submitted by Andy Lutomirski
A double fault has a decent chance of being recoverable by killing the offending thread. Use die() so that we at least try to recover.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Submitted by Andy Lutomirski
The x86_32 doublefault_fn() was old and crufty, and it did not even try to recover. do_double_fault() is much nicer. Rewrite the 32-bit double fault code to sanitize CPU state and call do_double_fault(). This is mostly an exercise in i386 archaeology. With this patch applied, 32-bit double faults get a real stack trace, just like 64-bit double faults.

[ mingo: merged the patch to a later kernel base. ]

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Submitted by Andy Lutomirski
There are three problems with the current layout of the doublefault stack and TSS. First, the TSS is only cacheline-aligned, which is not enough -- if the hardware portion of the TSS (struct x86_hw_tss) crosses a page boundary, horrible things happen [0]. Second, the stack and TSS are global, so simultaneous double faults on different CPUs will cause massive corruption. Third, the whole mechanism won't work if user CR3 is loaded, resulting in a triple fault [1].

Let the doublefault stack and TSS share a page (which prevents the TSS from spanning a page boundary), make it percpu, and move it into cpu_entry_area. Teach the stack dump code about the doublefault stack.

[0] Real hardware will read past the end of the page onto the next *physical* page if a task switch happens. Virtual machines may have any number of bugs, and I would consider it reasonable for a VM to summarily kill the guest if it tries to task-switch to a page-spanning TSS.

[1] Real hardware triple faults. At least some VMs seem to hang. I'm not sure what's going on.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
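The shared page can be described by a structure like the following sketch (modelled on the changelog's description; the real layout may differ slightly):

  struct doublefault_stack {
          /* the stack fills the page up to the hardware TSS at the end */
          unsigned long stack[(PAGE_SIZE - sizeof(struct x86_hw_tss)) / sizeof(unsigned long)];
          struct x86_hw_tss tss;
  } __aligned(PAGE_SIZE);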
-
Submitted by Andy Lutomirski
doublefault.c now only contains 32-bit code. Rename it to doublefault_32.c.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Submitted by Andy Lutomirski
The 64-bit doublefault handler is much nicer than the 32-bit one. As a first step toward unifying them, make the 64-bit handler self-contained. This should have no functional effect except in the odd case of x86_64 with CONFIG_DOUBLEFAULT=n, in which case it will change the logging a bit.

This also gets rid of CONFIG_DOUBLEFAULT configurability on 64-bit kernels. It didn't do anything useful -- CONFIG_DOUBLEFAULT=n didn't actually disable doublefault handling on x86_64.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Submitted by Joerg Roedel
The job of vmalloc_sync_all() is to help the lazy freeing of vmalloc() ranges: before such vmap ranges are reused we make sure that they are unmapped from every task's page tables.

This is really easy on pagetable setups where the kernel page tables are shared between all tasks - this is the case on 32-bit kernels with SHARED_KERNEL_PMD = 1.

But on !SHARED_KERNEL_PMD 32-bit kernels this involves iterating over the pgd_list and clearing all pmd entries in the pgds that are cleared in the init_mm.pgd, which is the reference pagetable that the vmalloc() code uses.

In that context the current practice of vmalloc_sync_all() iterating until FIX_ADDR_TOP is buggy:

        for (address = VMALLOC_START & PMD_MASK;
             address >= TASK_SIZE_MAX && address < FIXADDR_TOP;
             address += PMD_SIZE) {
                struct page *page;

Because iterating up to FIXADDR_TOP will involve a lot of non-vmalloc address ranges:

        VMALLOC -> PKMAP -> LDT -> CPU_ENTRY_AREA -> FIX_ADDR

This is mostly harmless for the FIX_ADDR and CPU_ENTRY_AREA ranges that don't clear their pmds, but it's lethal for the LDT range, which relies on having different mappings in different processes, and 'synchronizing' them in the vmalloc sense corrupts those pagetable entries (clearing them).

This got particularly prominent with PTI, which turns SHARED_KERNEL_PMD off and makes this the dominant mapping mode on 32-bit.

To make the LDT work again, vmalloc_sync_all() must only iterate over the volatile parts of the kernel address range that are identical between all processes.

So the correct check in vmalloc_sync_all() is "address < VMALLOC_END" to make sure the VMALLOC areas are synchronized and the LDT mapping is not falsely overwritten.

The CPU_ENTRY_AREA and the FIXMAP area are no longer synced either, but this is not really a problem since their PMDs get established during bootup and never change.

This change fixes the ldt_gdt selftest in my setup.

[ mingo: Fixed up the changelog to explain the logic and modified the copying to only happen up until VMALLOC_END. ]

Reported-by: Borislav Petkov <bp@suse.de>
Tested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Cc: <stable@vger.kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: hpa@zytor.com
Fixes: 7757d607: ("x86/pti: Allow CONFIG_PAGE_TABLE_ISOLATION for x86_32")
Link: https://lkml.kernel.org/r/20191126111119.GA110513@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
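With the fix described above, the loop bound in the quoted snippet becomes VMALLOC_END; roughly:

        for (address = VMALLOC_START & PMD_MASK;
             address >= TASK_SIZE_MAX && address < VMALLOC_END;
             address += PMD_SIZE) {
                struct page *page;
                /* sync this pmd into every pgd on pgd_list */
        }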
-
Submitted by Ingo Molnar
After the following commit:

  05b042a1: ("x86/pti/32: Calculate the various PTI cpu_entry_area sizes correctly, make the CPU_ENTRY_AREA_PAGES assert precise")

'struct cpu_entry_area' has to be Kconfig invariant, so that we always have a matching CPU_ENTRY_AREA_PAGES size.

This commit added a CONFIG_X86_IOPL_IOPERM dependency to tss_struct:

  111e7b15: ("x86/ioperm: Extend IOPL config to control ioperm() as well")

Which, if CONFIG_X86_IOPL_IOPERM is turned off, reduces the size of cpu_entry_area by two pages, triggering the assert:

  ./include/linux/compiler.h:391:38: error: call to ‘__compiletime_assert_202’ declared with attribute error: BUILD_BUG_ON failed: (CPU_ENTRY_AREA_PAGES+1)*PAGE_SIZE != CPU_ENTRY_AREA_MAP_SIZE

Simplify the Kconfig dependencies and make cpu_entry_area constant size on 32-bit kernels again.

Fixes: 05b042a1: ("x86/pti/32: Calculate the various PTI cpu_entry_area sizes correctly, make the CPU_ENTRY_AREA_PAGES assert precise")
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-