提交 · d22fff81418edc92be534cad8d59da914049bf69 · openanolis / cloud-kernel

31 3月, 2018 4 次提交

perf/x86/intel: Enable C-state residency events for Cannon Lake · 1159e094

由 Harry Pan 提交于 3月 09, 2018

Cannon Lake supports C1/C3/C6/C7, PC2/PC3/PC6/PC7/PC8/PC9/PC10
state residency counters, this patch enables those counters.

( The MSR information is based on Intel Software Developers' Manual,
  Vol. 4, Order No. 335592. )
Tested-by: NPuthikorn Voravootivat <puthik@chromium.org>
Signed-off-by: NHarry Pan <harry.pan@intel.com>
Reviewed-by: NBenson Leung <bleung@chromium.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan.liang@intel.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: gs0622@gmail.com
Link: http://lkml.kernel.org/r/20180309121549.630-3-harry.pan@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

1159e094

perf/x86/intel: Add Cannon Lake support for RAPL profiling · 490d03e8

由 Harry Pan 提交于 3月 09, 2018

This patch enables RAPL counters (energy consumption counters)
support for Cannon Lake processors.

( ESU and power domains refer to Intel Software Developers' Manual,
  Vol. 4, Order No. 335592. )

Usage example:

  $ perf list
  $ perf stat -a -e power/energy-cores/,power/energy-pkg/ sleep 10
Tested-by: NPuthikorn Voravootivat <puthik@chromium.org>
Signed-off-by: NHarry Pan <harry.pan@intel.com>
Reviewed-by: NBenson Leung <bleung@chromium.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: colin.king@canonical.com
Cc: gs0622@gmail.com
Cc: kan.liang@linux.intel.com
Link: http://lkml.kernel.org/r/20180309121549.630-2-harry.pan@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

490d03e8

x86/cpu/tme: Fix spelling: "configuation" -> "configuration" · eaeb8e76

由 Colin Ian King 提交于 3月 13, 2018

Trivial fix to spelling mistake in the pr_err_once() error message text.
Signed-off-by: NColin Ian King <colin.king@canonical.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel-janitors@vger.kernel.org
Link: http://lkml.kernel.org/r/20180313154709.1015-1-colin.king@canonical.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

eaeb8e76

x86/build: Don't pass in -D__KERNEL__ multiple times · d6289f36

由 Cao jin 提交于 3月 16, 2018

Some .<target>.cmd files under arch/x86 are showing two instances of
-D__KERNEL__, like arch/x86/boot/ and arch/x86/realmode/rm/.

__KERNEL__ is already defined in KBUILD_CPPFLAGS in the top Makefile,
so it can be dropped safely.
Signed-off-by: NCao jin <caoj.fnst@cn.fujitsu.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kbuild@vger.kernel.org
Link: http://lkml.kernel.org/r/20180316084944.3997-1-caoj.fnst@cn.fujitsu.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

d6289f36

29 3月, 2018 3 次提交

perf/x86/pt, coresight: Clean up address filter structure · 6ed70cf3

由 Alexander Shishkin 提交于 3月 29, 2018

This is a cosmetic patch that deals with the address filter structure's
ambiguous fields 'filter' and 'range'. The former stands to mean that the
filter's *action* should be to filter the traces to its address range if
it's set or stop tracing if it's unset. This is confusing and hard on the
eyes, so this patch replaces it with 'action' enum. The 'range' field is
completely redundant (meaning that the filter is an address range as
opposed to a single address trigger), as we can use zero size to mean the
same thing.
Signed-off-by: NAlexander Shishkin <alexander.shishkin@linux.intel.com>
Acked-by: NMathieu Poirier <mathieu.poirier@linaro.org>
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/20180329120648.11902-1-alexander.shishkin@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

6ed70cf3

Revert "x86/mce/AMD: Collect error info even if valid bits are not set" · e2efacb6

由 Yazen Ghannam 提交于 3月 26, 2018

This reverts commit 4b1e8427.

Software uses the valid bits to decide if the values can be used for
further processing or other actions. So setting the valid bits will have
software act on values that it shouldn't be acting on.

The recommendation to save all the register values does not mean that
the values are always valid.
Signed-off-by: NYazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: tony.luck@intel.com
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: bp@suse.de
Cc: linux-edac@vger.kernel.org
Link: https://lkml.kernel.org/r/20180326191526.64314-1-Yazen.Ghannam@amd.com

e2efacb6

x86/platform/UV: Fix critical UV MMR address error · bd47a85a

由 mike.travis@hpe.com 提交于 3月 28, 2018

A critical error was found testing the fixed UV4 HUB in that an MMR address
was found to be incorrect.  This causes the virtual address space for
accessing the MMIOH1 region to be allocated with the incorrect size.

Fixes: 673aa20c ("x86/platform/UV: Update uv_mmrs.h to prepare for UV4A fixes")
Signed-off-by: NMike Travis <travis@sgi.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com>
Cc: Russ Anderson <russ.anderson@hpe.com>
Cc: Andrew Banman <andrew.banman@hpe.com>
Link: https://lkml.kernel.org/r/20180328174011.041801248@stormcage.americas.sgi.com

bd47a85a

28 3月, 2018 4 次提交

KVM: x86: Fix pv tlb flush dependencies · 17a1079d

由 Wanpeng Li 提交于 3月 24, 2018

PV TLB FLUSH can only be turned on when steal time is enabled.
The condition got reversed during conflict resolution.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NWanpeng Li <wanpengli@tencent.com>
Fixes: 4f2f61fc ("KVM: X86: Avoid traversing all the cpus for pv tlb flush when steal time is disabled")
[Rebased on top of kvm/master and reworded the commit message. - Radim]
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

17a1079d

x86/boot: Fix SEV boot failure from change to __PHYSICAL_MASK_SHIFT · 07344b15

由 Tom Lendacky 提交于 3月 27, 2018

In arch/x86/boot/compressed/kaslr_64.c, CONFIG_AMD_MEM_ENCRYPT support was
initially #undef'd to support SME with minimal effort. When support for
SEV was added, the #undef remained and some minimal support for setting the
encryption bit was added for building identity mapped pagetable entries.

Commit b83ce5ee ("x86/mm/64: Make __PHYSICAL_MASK_SHIFT always 52")
changed __PHYSICAL_MASK_SHIFT from 46 to 52 in support of 5-level paging.
This change resulted in SEV guests failing to boot because the encryption
bit was no longer being automatically masked out. The compressed boot
path now requires sme_me_mask to be defined in order for the pagetable
functions, such as pud_present(), to properly mask out the encryption bit
(currently bit 47) when evaluating pagetable entries.

Add an sme_me_mask variable in arch/x86/boot/compressed/mem_encrypt.S,
which is set when SEV is active, delete the #undef CONFIG_AMD_MEM_ENCRYPT
from arch/x86/boot/compressed/kaslr_64.c and use sme_me_mask when building
the identify mapped pagetable entries.

Fixes: b83ce5ee ("x86/mm/64: Make __PHYSICAL_MASK_SHIFT always 52")
Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://lkml.kernel.org/r/20180327220711.8702.55842.stgit@tlendack-t1.amdoffice.net

07344b15

x86/platform/uv/BAU: Add APIC idt entry · 151ad17f

由 Andrew Banman 提交于 3月 27, 2018

BAU uses the old alloc_initr_gate90 method to setup its interrupt. This
fails silently as the BAU vector is in the range of APIC vectors that are
registered to the spurious interrupt handler. As a consequence BAU
broadcasts are not handled, and the broadcast source CPU hangs.

Update BAU to use new idt structure.

Fixes: dc20b2d5 ("x86/idt: Move interrupt gate initialization to IDT code")
Signed-off-by: NAndrew Banman <abanman@hpe.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Acked-by: NMike Travis <mike.travis@hpe.com>
Cc: Dimitri Sivanich <sivanich@hpe.com>
Cc: Russ Anderson <rja@hpe.com>
Cc: stable@vger.kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://lkml.kernel.org/r/1522188546-196177-1-git-send-email-abanman@hpe.com

151ad17f

x86/msr: Make rdmsrl_safe_on_cpu() scheduling safe as well · 9b9a5135

由 Eric Dumazet 提交于 3月 27, 2018

When changing rdmsr_safe_on_cpu() to schedule, it was missed that
__rdmsr_safe_on_cpu() was also used by rdmsrl_safe_on_cpu()

Make rdmsrl_safe_on_cpu() a wrapper instead of copy/pasting the code which
was added for the completion handling.

Fixes: 07cde313 ("x86/msr: Allow rdmsr_safe_on_cpu() to schedule")
Reported-by: Nkbuild test robot <fengguang.wu@intel.com>
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://lkml.kernel.org/r/20180328032233.153055-1-edumazet@google.com

9b9a5135

27 3月, 2018 7 次提交

x86/cpuid: Allow cpuid_read() to schedule · 67bbd7a8

由 Eric Dumazet 提交于 3月 23, 2018

High latencies can be observed caused by a daemon periodically reading
CPUID on all cpus. On KASAN enabled kernels ~10ms latencies can be
observed. Even without KASAN, sending an IPI to a CPU, which is in a deep
sleep state or in a long hard IRQ disabled section, waiting for the answer
can consume hundreds of microseconds.

cpuid_read() is invoked in preemptible context, so it can be converted to
sleep instead of busy wait.

Switching to smp_call_function_single_async() and a completion allows to
reschedule and reduces CPU usage and latencies.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Acked-by: NIngo Molnar <mingo@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Link: https://lkml.kernel.org/r/20180323215818.127774-2-edumazet@google.com

67bbd7a8

x86/msr: Allow rdmsr_safe_on_cpu() to schedule · 07cde313

由 Eric Dumazet 提交于 3月 23, 2018

High latencies can be observed caused by a daemon periodically reading
various MSR on all cpus. On KASAN enabled kernels ~10ms latencies can be
observed simply reading one MSR. Even without KASAN, sending an IPI to a
CPU, which is in a deep sleep state or in a long hard IRQ disabled section,
waiting for the answer can consume hundreds of microseconds.

All usage sites are in preemptible context, convert rdmsr_safe_on_cpu() to
use a completion instead of busy polling.

Overall daemon cpu usage was reduced by 35 %, and latencies caused by
msr_read() disappeared.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Acked-by: NIngo Molnar <mingo@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Link: https://lkml.kernel.org/r/20180323215818.127774-1-edumazet@google.com

07cde313

x86/mm: Update comment in detect_tme() regarding x86_phys_bits · 547edaca

由 Kirill A. Shutemov 提交于 3月 15, 2018

As Kai pointed out, the primary reason for adjusting x86_phys_bits is to
reflect that the the address space is reduced and not the ability to
communicate the available physical address space to virtual machines.
Suggested-by: NKai Huang <kai.huang@linux.intel.com>
Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: linux-mm@kvack.org
Link: https://lkml.kernel.org/r/20180315134907.9311-2-kirill.shutemov@linux.intel.com

547edaca

x86/alternatives: Fixup alternative_call_2 · bd627103

由 Alexey Dobriyan 提交于 1月 14, 2018

The following pattern fails to compile while the same pattern
with alternative_call() does:

	if (...)
		alternative_call_2(...);
	else
		alternative_call_2(...);

as it expands into

	if (...)
	{
	};	<===
	else
	{
	};
Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Acked-by: NBorislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20180114120504.GA11368@avx2

bd627103

x86/mm/32: Remove unused node_memmap_size_bytes() & CONFIG_NEED_NODE_MEMMAP_SIZE logic · fc5d1073

由 David Rientjes 提交于 3月 26, 2018

node_memmap_size_bytes() has been unused since the v3.9 kernel, so remove it.
Signed-off-by: NDavid Rientjes <rientjes@google.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Fixes: f03574f2 ("x86-32, mm: Rip out x86_32 NUMA remapping code")
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1803262325540.256524@chino.kir.corp.google.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

fc5d1073

perf/x86/intel: Fix linear IP of PEBS real_ip on Haswell and later CPUs · 71eb9ee9

由 Stephane Eranian 提交于 3月 23, 2018

this patch fix a bug in how the pebs->real_ip is handled in the PEBS
handler. real_ip only exists in Haswell and later processor. It is
actually the eventing IP, i.e., where the event occurred. As opposed
to the pebs->ip which is the PEBS interrupt IP which is always off
by one.

The problem is that the real_ip just like the IP needs to be fixed up
because PEBS does not record all the machine state registers, and
in particular the code segement (cs). This is why we have the set_linear_ip()
function. The problem was that set_linear_ip() was only used on the pebs->ip
and not the pebs->real_ip.

We have profiles which ran into invalid callstacks because of this.
Here is an example:

 .....  0: ffffffffffffff80 recent entry, marker kernel v
 .....  1: 000000000040044d <= user address in kernel space!
 .....  2: fffffffffffffe00 marker enter user v
 .....  3: 000000000040044d
 .....  4: 00000000004004b6 oldest entry

Debugging output in get_perf_callchain():

 [  857.769909] CALLCHAIN: CPU8 ip=40044d regs->cs=10 user_mode(regs)=0

The problem is that the kernel entry in 1: points to a user level
address. How can that be?

The reason is that with PEBS sampling the instruction that caused the event
to occur and the instruction where the CPU was when the interrupt was posted
may be far apart. And sometime during that time window, the privilege level may
change. This happens, for instance, when the PEBS sample is taken close to a
kernel entry point. Here PEBS, eventing IP (real_ip) captured a user level
instruction. But by the time the PMU interrupt fired, the processor had already
entered kernel space. This is why the debug output shows a user address with
user_mode() false.

The problem comes from PEBS not recording the code segment (cs) register.
The register is used in x86_64 to determine if executing in kernel vs user
space. This is okay because the kernel has a software workaround called
set_linear_ip(). But the issue in setup_pebs_sample_data() is that
set_linear_ip() is never called on the real_ip value when it is available
(Haswell and later) and precise_ip > 1.

This patch fixes this problem and eliminates the callchain discrepancy.

The patch restructures the code around set_linear_ip() to minimize the number
of times the IP has to be set.
Signed-off-by: NStephane Eranian <eranian@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: kan.liang@intel.com
Link: http://lkml.kernel.org/r/1521788507-10231-1-git-send-email-eranian@google.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

71eb9ee9

perf/x86: Update rdpmc_always_available static key to the modern API · 631fe154

由 Davidlohr Bueso 提交于 3月 26, 2018

No changes in refcount semantics -- use DEFINE_STATIC_KEY_FALSE()
for initialization and replace:

  static_key_slow_inc|dec()   =>   static_branch_inc|dec()
  static_key_false()          =>   static_branch_unlikely()

Added a '_key' suffix to rdpmc_always_available, for better self-documentation.
Signed-off-by: NDavidlohr Bueso <dbueso@suse.de>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: akpm@linux-foundation.org
Link: http://lkml.kernel.org/r/20180326210929.5244-5-dave@stgolabs.netSigned-off-by: NIngo Molnar <mingo@kernel.org>

631fe154

26 3月, 2018 2 次提交

x86/apic: Finish removing unused callbacks · e25283bf

由 David Rientjes 提交于 3月 25, 2018

The ->cpu_mask_to_apicid() and ->vector_allocation_domain() callbacks are
now unused, so remove them.
Signed-off-by: NDavid Rientjes <rientjes@google.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: baab1e84 ("x86/apic: Remove unused callbacks")
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1803251403540.80485@chino.kir.corp.google.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

e25283bf

powerpc/64s: Fix i-side SLB miss bad address handler saving nonvolatile GPRs · 52396500

由 Nicholas Piggin 提交于 3月 23, 2018

The SLB bad address handler's trap number fixup does not preserve the
low bit that indicates nonvolatile GPRs have not been saved. This
leads save_nvgprs to skip saving them, and subsequent functions and
return from interrupt will think they are saved.

This causes kernel branch-to-garbage debugging to not have correct
registers, can also cause userspace to have its registers clobbered
after a segfault.

Fixes: f0f558b1 ("powerpc/mm: Preserve CFAR value on SLB miss caused by access to bogus address")
Cc: stable@vger.kernel.org # v4.9+
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

52396500

25 3月, 2018 1 次提交

x86/purgatory: Avoid creating stray .<pid>.d files, remove -MD from KBUILD_CFLAGS · e847f6aa

由 Sven Wegener 提交于 3月 24, 2018

The kernel build system already takes care of generating the dependency
files. Having the additional -MD in KBUILD_CFLAGS leads to stray
.<pid>.d files in the build directory when we call the cc-option macro.
Signed-off-by: NSven Wegener <sven.wegener@stealer.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthias Kaehlcke <mka@chromium.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vivek Goyal <vgoyal@redhat.com>
Link: http://lkml.kernel.org/r/alpine.LNX.2.21.1803242219380.30139@titan.int.lan.stealer.netSigned-off-by: NIngo Molnar <mingo@kernel.org>

e847f6aa

24 3月, 2018 7 次提交

ARM: 8750/1: deflate_xip_data.sh: minor fixes · 1b8837b6

由 Nicolas Pitre 提交于 3月 12, 2018

Send nm complaints about broken pipe (when sed exits early) to /dev/null.
All errors should be printed to stderr.
Don't trap on normal exit so the trap can return an error code.
Signed-off-by: NNicolas Pitre <nico@linaro.org>
Tested-by: NArnd Bergmann <arnd@arndb.de>
Signed-off-by: NRussell King <rmk+kernel@armlinux.org.uk>

1b8837b6

ARM: 8748/1: mm: Define vdso_start, vdso_end as array · 73b9160d

由 Jinbum Park 提交于 3月 06, 2018

Define vdso_start, vdso_end as array to avoid compile-time analysis error
for the case of built with CONFIG_FORTIFY_SOURCE.

and, since vdso_start, vdso_end are used in vdso.c only,
move extern-declaration from vdso.h to vdso.c.

If kernel is built with CONFIG_FORTIFY_SOURCE,
compile-time error happens at this code.
- if (memcmp(&vdso_start, "177ELF", 4))

The size of "&vdso_start" is recognized as 1 byte, but n is 4,
So that compile-time error is reported.
Acked-by: NKees Cook <keescook@chromium.org>
Signed-off-by: NJinbum Park <jinb.park7@gmail.com>
Signed-off-by: NRussell King <rmk+kernel@armlinux.org.uk>

73b9160d

ARM: 8747/1: make CONFIG_DEBUG_WX depend on MMU · ad43fc9a

由 Arnd Bergmann 提交于 2月 28, 2018

Without CONFIG_MMU, this results in a build failure:

./arch/arm/include/asm/memory.h:92:23: error: initializer element is not constant
 #define VECTORS_BASE  vectors_base
arch/arm/mm/dump.c:32:4: note: in expansion of macro 'VECTORS_BASE'
  { VECTORS_BASE, "Vectors" },

arch/arm/mm/dump.c:71:11: error: 'L_PTE_USER' undeclared here (not in a function); did you mean 'VTIME_USER'?
   .mask = L_PTE_USER,
           ^~~~~~~~~~

Obviously the feature only makes sense with an MMU, so let's add the
dependency here.

Fixes: a8e53c15 ("ARM: 8737/1: mm: dump: add checking for writable and executable")
Acked-by: NLaura Abbott <labbott@redhat.com>
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Signed-off-by: NRussell King <rmk+kernel@armlinux.org.uk>

ad43fc9a

ARM: 8746/1: vfp: Go back to clearing vfp_current_hw_state[] · 1328f020

由 Fabio Estevam 提交于 1月 22, 2018

Commit 384b38b6 ("ARM: 7873/1: vfp: clear vfp_current_hw_state
for dying cpu") fixed the cpu dying notifier by clearing
vfp_current_hw_state[]. However commit e5b61baf ("arm: Convert VFP
hotplug notifiers to state machine") incorrectly used the original
vfp_force_reload() function in the cpu dying notifier.

Fix it by going back to clearing vfp_current_hw_state[].

Fixes: e5b61baf ("arm: Convert VFP hotplug notifiers to state machine")
Cc: linux-stable <stable@vger.kernel.org>
Reported-by: NKohji Okuno <okuno.kohji@jp.panasonic.com>
Signed-off-by: NFabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: NRussell King <rmk+kernel@armlinux.org.uk>

1328f020

x86/entry/64: Don't use IST entry for #BP stack · d8ba61ba

由 Andy Lutomirski 提交于 7月 23, 2015

There's nothing IST-worthy about #BP/int3.  We don't allow kprobes
in the small handful of places in the kernel that run at CPL0 with
an invalid stack, and 32-bit kernels have used normal interrupt
gates for #BP forever.

Furthermore, we don't allow kprobes in places that have usergs while
in kernel mode, so "paranoid" is also unnecessary.
Signed-off-by: NAndy Lutomirski <luto@kernel.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org

d8ba61ba

x86/efi: Free efi_pgd with free_pages() · 06ace26f

由 Waiman Long 提交于 3月 22, 2018

The efi_pgd is allocated as PGD_ALLOCATION_ORDER pages and therefore must
also be freed as PGD_ALLOCATION_ORDER pages with free_pages().

Fixes: d9e9a641 ("x86/mm/pti: Allocate a separate user PGD")
Signed-off-by: NWaiman Long <longman@redhat.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1521746333-19593-1-git-send-email-longman@redhat.com

06ace26f

KVM: nVMX: sync vmcs02 segment regs prior to vmx_set_cr0 · 9d1887ef

由 Sean Christopherson 提交于 3月 05, 2018

Segment registers must be synchronized prior to any code that may
trigger a call to emulation_required()/guest_state_valid(), e.g.
vmx_set_cr0().  Because preparing vmcs02 writes segmentation fields
directly, i.e. doesn't use vmx_set_segment(), emulation_required
will not be re-evaluated when synchronizing the segment registers,
which can result in L0 incorrectly starting emulation of L2.

Fixes: 8665c3f9 ("KVM: nVMX: initialize descriptor cache fields in prepare_vmcs02_full")
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
[Move all of prepare_vmcs02_full earlier, not just segment registers. - Paolo]
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

9d1887ef

23 3月, 2018 10 次提交

powerpc/mm: Fixup tlbie vs store ordering issue on POWER9 · a5d4b589

由 Aneesh Kumar K.V 提交于 3月 23, 2018

On POWER9, under some circumstances, a broadcast TLB invalidation
might complete before all previous stores have drained, potentially
allowing stale stores from becoming visible after the invalidation.
This works around it by doubling up those TLB invalidations which was
verified by HW to be sufficient to close the risk window.

This will be documented in a yet-to-be-published errata.

Fixes: 1a472c9d ("powerpc/mm/radix: Add tlbflush routines")
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[mpe: Enable the feature in the DT CPU features code for all Power9,
      rename the feature to CPU_FTR_P9_TLBIE_BUG per benh.]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

a5d4b589

powerpc/mm/radix: Move the functions that does the actual tlbie closer · 243fee32

由 Aneesh Kumar K.V 提交于 3月 23, 2018

No functionality change. Just code movement to ease code changes later
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

243fee32

powerpc/mm/radix: Remove unused code · 99491e2d

由 Aneesh Kumar K.V 提交于 3月 23, 2018

These function are not used in the code. Remove them.
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

99491e2d

powerpc/mm: Workaround Nest MMU bug with TLB invalidations · 80a4ae20

由 Benjamin Herrenschmidt 提交于 3月 23, 2018

On POWER9 the Nest MMU may fail to invalidate some translations when
doing a tlbie "by PID" or "by LPID" that is targeted at the TLB only
and not the page walk cache.

This works around it by forcing such invalidations to escalate to
RIC=2 (full invalidation of TLB *and* PWC) when a coprocessor is in
use for the context.

Fixes: 03b8abed ("cxl: Enable global TLBIs for cxl contexts")
Cc: stable@vger.kernel.org # v4.15+
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NBalbir Singh <bsingharora@gmail.com>
[balbirs: fixed spelling and coding style to quiesce checkpatch.pl]
Tested-by: NBalbir Singh <bsingharora@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

80a4ae20

powerpc/mm: Add tracking of the number of coprocessors using a context · aff6f8cb

由 Benjamin Herrenschmidt 提交于 3月 23, 2018

Currently, when using coprocessors (which use the Nest MMU), we
simply increment the active_cpu count to force all TLB invalidations
to be come broadcast.

Unfortunately, due to an errata in POWER9, we will need to know
more specifically that coprocessors are in use.

This maintains a separate copros counter in the MMU context for
that purpose.

NB. The commit mentioned in the fixes tag below is not at fault for
the bug we're fixing in this commit and the next, but this fix applies
on top the infrastructure it introduced.

Fixes: 03b8abed ("cxl: Enable global TLBIs for cxl contexts")
Cc: stable@vger.kernel.org # v4.15+
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: NBalbir Singh <bsingharora@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

aff6f8cb

KVM: PPC: Book3S HV: Fix duplication of host SLB entries · cda4a147

由 Paul Mackerras 提交于 3月 22, 2018

Since commit 6964e6a4 ("KVM: PPC: Book3S HV: Do SLB load/unload
with guest LPCR value loaded", 2018-01-11), we have been seeing
occasional machine check interrupts on POWER8 systems when running
KVM guests, due to SLB multihit errors.

This turns out to be due to the guest exit code reloading the host
SLB entries from the SLB shadow buffer when the SLB was not previously
cleared in the guest entry path. This can happen because the path
which skips from the guest entry code to the guest exit code without
entering the guest now does the skip before the SLB is cleared and
loaded with guest values, but the host values are loaded after the
point in the guest exit path that we skip to.

To fix this, we move the code that reloads the host SLB values up
so that it occurs just before the point in the guest exit code (the
label guest_bypass:) where we skip to from the guest entry path.
Reported-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Fixes: 6964e6a4 ("KVM: PPC: Book3S HV: Do SLB load/unload with guest LPCR value loaded")
Tested-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>

cda4a147

x86/mm: implement free pmd/pte page interfaces · 28ee90fe

由 Toshi Kani 提交于 3月 22, 2018

Implement pud_free_pmd_page() and pmd_free_pte_page() on x86, which
clear a given pud/pmd entry and free up lower level page table(s).

The address range associated with the pud/pmd entry must have been
purged by INVLPG.

Link: http://lkml.kernel.org/r/20180314180155.19492-3-toshi.kani@hpe.com
Fixes: e61ce6ad ("mm: change ioremap to set up huge I/O mappings")
Signed-off-by: NToshi Kani <toshi.kani@hpe.com>
Reported-by: NLei Li <lious.lilei@hisilicon.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

28ee90fe

mm/vmalloc: add interfaces to free unmapped page table · b6bdb751

由 Toshi Kani 提交于 3月 22, 2018

On architectures with CONFIG_HAVE_ARCH_HUGE_VMAP set, ioremap() may
create pud/pmd mappings.  A kernel panic was observed on arm64 systems
with Cortex-A75 in the following steps as described by Hanjun Guo.

 1. ioremap a 4K size, valid page table will build,
 2. iounmap it, pte0 will set to 0;
 3. ioremap the same address with 2M size, pgd/pmd is unchanged,
    then set the a new value for pmd;
 4. pte0 is leaked;
 5. CPU may meet exception because the old pmd is still in TLB,
    which will lead to kernel panic.

This panic is not reproducible on x86.  INVLPG, called from iounmap,
purges all levels of entries associated with purged address on x86.  x86
still has memory leak.

The patch changes the ioremap path to free unmapped page table(s) since
doing so in the unmap path has the following issues:

 - The iounmap() path is shared with vunmap(). Since vmap() only
   supports pte mappings, making vunmap() to free a pte page is an
   overhead for regular vmap users as they do not need a pte page freed
   up.

 - Checking if all entries in a pte page are cleared in the unmap path
   is racy, and serializing this check is expensive.

 - The unmap path calls free_vmap_area_noflush() to do lazy TLB purges.
   Clearing a pud/pmd entry before the lazy TLB purges needs extra TLB
   purge.

Add two interfaces, pud_free_pmd_page() and pmd_free_pte_page(), which
clear a given pud/pmd entry and free up a page for the lower level
entries.

This patch implements their stub functions on x86 and arm64, which work
as workaround.

[akpm@linux-foundation.org: fix typo in pmd_free_pte_page() stub]
Link: http://lkml.kernel.org/r/20180314180155.19492-2-toshi.kani@hpe.com
Fixes: e61ce6ad ("mm: change ioremap to set up huge I/O mappings")
Reported-by: NLei Li <lious.lilei@hisilicon.com>
Signed-off-by: NToshi Kani <toshi.kani@hpe.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Wang Xuefeng <wxf.wang@hisilicon.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Chintan Pandya <cpandya@codeaurora.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b6bdb751

h8300: remove extraneous __BIG_ENDIAN definition · 1705f7c5

由 Arnd Bergmann 提交于 3月 22, 2018

A bugfix I did earlier caused a build regression on h8300, which defines
the __BIG_ENDIAN macro in a slightly different way than the generic
code:

  arch/h8300/include/asm/byteorder.h:5:0: warning: "__BIG_ENDIAN" redefined

We don't need to define it here, as the same macro is already provided
by the linux/byteorder/big_endian.h, and that version does not conflict.

While this is a v4.16 regression, my earlier patch also got backported
to the 4.14 and 4.15 stable kernels, so we need the fixup there as well.

Link: http://lkml.kernel.org/r/20180313120752.2645129-1-arnd@arndb.de
Fixes: 101110f6 ("Kbuild: always define endianess in kconfig.h")
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: <stable@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

1705f7c5

powerpc/64s: Fix lost pending interrupt due to race causing lost update to irq_happened · ff6781fd

由 Nicholas Piggin 提交于 3月 21, 2018

force_external_irq_replay() can be called in the do_IRQ path with
interrupts hard enabled and soft disabled if may_hard_irq_enable() set
MSR[EE]=1. It updates local_paca->irq_happened with a load, modify,
store sequence. If a maskable interrupt hits during this sequence, it
will go to the masked handler to be marked pending in irq_happened.
This update will be lost when the interrupt returns and the store
instruction executes. This can result in unpredictable latencies,
timeouts, lockups, etc.

Fix this by ensuring hard interrupts are disabled before modifying
irq_happened.

This could cause any maskable asynchronous interrupt to get lost, but
it was noticed on P9 SMP system doing RDMA NVMe target over 100GbE,
so very high external interrupt rate and high IPI rate. The hang was
bisected down to enabling doorbell interrupts for IPIs. These provided
an interrupt type that could run at high rates in the do_IRQ path,
stressing the race.

Fixes: 1d607bb3 ("powerpc/irq: Add mechanism to force a replay of interrupts")
Cc: stable@vger.kernel.org # v4.8+
Reported-by: NCarol L. Soto <clsoto@us.ibm.com>
Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

ff6781fd

22 3月, 2018 2 次提交

MIPS: ralink: Fix booting on MT7621 · a63d706e

由 NeilBrown 提交于 3月 21, 2018

Since commit 3af5a67c ("MIPS: Fix early CM probing") the MT7621 has
not been able to boot.

This commit caused mips_cm_probe() to be called before
mt7621.c::proc_soc_init().

prom_soc_init() has a comment explaining that mips_cm_probe() "wipes out
the bootloader config" and means that configuration registers are no
longer available. It has some code to re-enable this config.

Before this re-enable code is run, the sysc register cannot be read, so
when SYSC_REG_CHIP_NAME0 is read, a garbage value is returned and
panic() is called.

If we move the config-repair code to the top of prom_soc_init(), the
registers can be read and boot can proceed.

Very occasionally, the first register read after the reconfiguration
returns garbage, so add a call to __sync().

Fixes: 3af5a67c ("MIPS: Fix early CM probing")
Signed-off-by: NNeilBrown <neil@brown.name>
Reviewed-by: NMatt Redfearn <matt.redfearn@mips.com>
Cc: John Crispin <john@phrozen.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 4.5+
Patchwork: https://patchwork.linux-mips.org/patch/18859/Signed-off-by: NJames Hogan <jhogan@kernel.org>

a63d706e

MIPS: ralink: Remove ralink_halt() · 891731f6

由 NeilBrown 提交于 3月 20, 2018

ralink_halt() does nothing that machine_halt() doesn't already do, so it
adds no value.

It actually causes incorrect behaviour due to the "unreachable()" at the
end. This tells the compiler that the end of the function will never be
reached, which isn't true. The compiler responds by not adding a
'return' instruction, so control simply moves on to whatever bytes come
afterwards in memory. In my tested, that was the ralink_restart()
function. This means that an attempt to 'halt' the machine would
actually cause a reboot.

So remove ralink_halt() so that a 'halt' really does halt.

Fixes: c06e836a ("MIPS: ralink: adds reset code")
Signed-off-by: NNeilBrown <neil@brown.name>
Cc: John Crispin <john@phrozen.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 3.9+
Patchwork: https://patchwork.linux-mips.org/patch/18851/Signed-off-by: NJames Hogan <jhogan@kernel.org>

891731f6

openanolis / cloud-kernel 接近 2 年 前同步成功

openanolis / cloud-kernel
接近 2 年前同步成功