Commits · 92faa7bea3e7592673109e32c75d50f8ce6d5ec6 · openanolis / cloud-kernel

16 May, 2018 2 commits

arm64: Remove duplicate include · 92faa7be

Authored by Vincenzo Frascino on Apr 13, 2018

"make includecheck" detected a few duplicated includes in arch/arm64.

This patch removes the double inclusions.
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

92faa7be

arm64: remove no-op macro VMLINUX_SYMBOL() · 5c636aa0

Authored by Masahiro Yamada on May 09, 2018

VMLINUX_SYMBOL() is a no-op unless CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX
is defined. It has only ever been selected by BLACKFIN and METAG.
VMLINUX_SYMBOL() is unneeded for ARM64-specific code.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

5c636aa0

15 May, 2018 1 commit

arm64: Increase ARCH_DMA_MINALIGN to 128 · ebc7e21e

Authored by Catalin Marinas on May 11, 2018

This patch increases the ARCH_DMA_MINALIGN to 128 so that it covers the
currently known Cache Writeback Granule (CTR_EL0.CWG) on arm64 and moves
the fallback in cache_line_size() from L1_CACHE_BYTES to this constant.
In addition, it warns (and taints) if the CWG is larger than
ARCH_DMA_MINALIGN as this is not safe with non-coherent DMA.

Cc: Will Deacon <will.deacon@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

ebc7e21e
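
As a rough illustration of the fallback described in the ARCH_DMA_MINALIGN commit above, the sketch below decodes the CWG field from a CTR_EL0 value and falls back to the 128-byte constant when the field reads as zero. It is a user-space model, not the kernel code: the helper names cwg_from_ctr(), model_cache_line_size() and cwg_exceeds_minalign() are made up for illustration, and the field layout (bits 27:24, granule of 4 << CWG bytes) is assumed from the Arm definition of CTR_EL0.

```c
#include <stdint.h>

#define ARCH_DMA_MINALIGN 128u  /* value introduced by the commit */
#define CTR_CWG_SHIFT     24    /* assumed: CTR_EL0.CWG occupies bits [27:24] */
#define CTR_CWG_MASK      0xfu

/* Extract the Cache Writeback Granule field from a CTR_EL0 value. */
static unsigned int cwg_from_ctr(uint64_t ctr)
{
    return (unsigned int)((ctr >> CTR_CWG_SHIFT) & CTR_CWG_MASK);
}

/* CWG encodes log2 of the granule in 4-byte words; 0 means "not provided",
 * in which case the model falls back to ARCH_DMA_MINALIGN. */
static unsigned int model_cache_line_size(uint64_t ctr)
{
    unsigned int cwg = cwg_from_ctr(ctr);

    return cwg ? 4u << cwg : ARCH_DMA_MINALIGN;
}

/* Non-zero when the reported granule is larger than ARCH_DMA_MINALIGN,
 * the situation in which the commit warns and taints. */
static int cwg_exceeds_minalign(uint64_t ctr)
{
    return model_cache_line_size(ctr) > ARCH_DMA_MINALIGN;
}
```

With this model, a CWG field of 4 gives 64 bytes, 5 gives 128 bytes, and 6 gives 256 bytes, which is the case flagged as unsafe for non-coherent DMA.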

11 May, 2018 1 commit

Revert "arm64: Increase the max granular size" · d93277b9

Authored by Catalin Marinas on May 11, 2018

This reverts commit 97303480.

Commit 97303480 ("arm64: Increase the max granular size") increased
the cache line size to 128 to match Cavium ThunderX, apparently for some
performance benefit which could not be confirmed. This change, however,
has an impact on the network packet allocation in certain circumstances,
requiring slightly over a 4K page with a significant performance
degradation. The patch reverts L1_CACHE_SHIFT back to 6 (64-byte cache
line).

Cc: Will Deacon <will.deacon@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

d93277b9

06 May, 2018 1 commit

KVM: x86: remove APIC Timer periodic/oneshot spikes · ecf08dad

Authored by Anthoine Bourgeois on Apr 29, 2018

Since the commit "8003c9ae: add APIC Timer periodic/oneshot mode VMX
preemption timer support", a Windows 10 guest has some erratic timer
spikes.

Here are the results for a 1ms timer fired 150000 times without any load:
            | Before 8003c9ae | After 8003c9ae
Max         |          1834us |        86000us
Mean        |          1100us |         1021us
Deviation   |            59us |          149us
Here are the results for a 1ms timer fired 150000 times with a cpu-z stress test:
            | Before 8003c9ae | After 8003c9ae
Max         |         32000us |       140000us
Mean        |          1006us |         1997us
Deviation   |           140us |        11095us

The root cause of the problem is that starting an hrtimer with an expiry
time already in the past can take more than 20 milliseconds to trigger the
timer function.  It can be solved by forwarding such past timers
immediately, rather than submitting them to hrtimer_start().
In case the timer is periodic, update the target expiration and call
hrtimer_start() with it.

v2: Check if the tsc deadline is already expired. Thank you Mika.
v3: Execute the past timers immediately rather than submitting them to
hrtimer_start().
v4: Rearm the periodic timer with advance_periodic_target_expiration(), a
simpler version of set_target_expiration(). Thank you Paolo.

Cc: Mika Penttilä <mika.penttila@nextfour.com>
Cc: Wanpeng Li <kernellwp@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Anthoine Bourgeois <anthoine.bourgeois@blade-group.com>
Fixes: 8003c9ae ("KVM: LAPIC: add APIC Timer periodic/oneshot mode VMX preemption timer support")
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

ecf08dad
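
The approach described in the APIC timer commit above (run an already-expired timer immediately, and for periodic timers advance the target before arming) can be modelled roughly as follows. This is a simplified user-space sketch, not the KVM code: struct sim_timer and sim_start_timer() are hypothetical names, and "arming" is represented only by the return value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the emulated APIC timer state. */
struct sim_timer {
    uint64_t target_ns;   /* absolute expiration time */
    uint64_t period_ns;   /* period, used only when periodic */
    bool periodic;
    unsigned int fired;   /* how many times the expiry path ran directly */
};

/*
 * Returns true when the timer should be handed to the real timer
 * facility (hrtimer_start() in the kernel), false when a one-shot
 * timer was already handled and nothing is left to arm.
 */
static bool sim_start_timer(struct sim_timer *t, uint64_t now_ns)
{
    if (now_ns > t->target_ns) {
        /* Expiry is in the past: run the expiry path right away. */
        t->fired++;
        if (!t->periodic)
            return false;
        /* Periodic: move the target forward by one period, then arm. */
        t->target_ns += t->period_ns;
    }
    return true;
}
```

In this model a one-shot timer that is already late never reaches the timer queue, which is the case the commit message identifies as taking more than 20 milliseconds.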

04 May, 2018 2 commits

arm64: vgic-v2: Fix proxying of cpuif access · b220244d

Authored by James Morse on May 04, 2018

Proxying the cpuif accesses at EL2 makes use of vcpu_data_guest_to_host
and co, which check the endianness, which call into vcpu_read_sys_reg...
which isn't mapped at EL2 (it was inlined before, and got moved OoL
with the VHE optimizations).

The result is of course a nice panic. Let's add some specialized
cruft to keep the broken platforms that require this hack alive.

But, this code used vcpu_data_guest_to_host(), which expected us to
write the value to host memory, instead we have trapped the guest's
read or write to an mmio-device, and are about to replay it using the
host's readl()/writel() which also perform swabbing based on the host
endianness. This goes wrong when both host and guest are big-endian,
as readl()/writel() will undo the guest's swabbing, causing the
big-endian value to be written to device-memory.

What needs doing?
A big-endian guest will have pre-swabbed data before storing, undo this.
If it's necessary for the host, writel() will re-swab it.

For a read, a big-endian guest expects to swab the data after the load.
The host's readl() will correct for host endianness, giving us the
device-memory's value in the register. For a big-endian guest, swab it
as if we'd only done the load.

For a little-endian guest, nothing needs doing as readl()/writel() leave
the correct device-memory value in registers.

Tested on Juno with that rarest of things: a big-endian 64K host.
Based on a patch from Marc Zyngier.
Reported-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Fixes: bf8feb39 ("arm64: KVM: vgic-v2: Add GICV access from HYP")
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

b220244d
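
The "What needs doing?" rules in the vgic-v2 commit above reduce to a conditional 32-bit byte swap on each side of the host accessor. A minimal sketch, assuming a hypothetical guest_is_be flag and made-up helper names (the kernel's own swab32(), readl() and writel() are not used here):

```c
#include <stdbool.h>
#include <stdint.h>

/* Plain 32-bit byte swap, standing in for the kernel's swab32(). */
static uint32_t swab32_sketch(uint32_t v)
{
    return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
           ((v & 0x00ff0000u) >> 8)  | ((v & 0xff000000u) >> 24);
}

/*
 * Trapped guest store: a big-endian guest pre-swabbed the register
 * value, so undo that before passing it to the host's writel().
 */
static uint32_t proxy_store_value(uint32_t guest_reg, bool guest_is_be)
{
    return guest_is_be ? swab32_sketch(guest_reg) : guest_reg;
}

/*
 * Trapped guest load: the host's readl() returns the device value;
 * a big-endian guest expects to swab after the load, so hand it the
 * value as if only the raw load had happened.
 */
static uint32_t proxy_load_value(uint32_t readl_result, bool guest_is_be)
{
    return guest_is_be ? swab32_sketch(readl_result) : readl_result;
}
```

For a little-endian guest both helpers are the identity, matching the last case in the message.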

KVM: arm64: Fix order of vcpu_write_sys_reg() arguments · 1975fa56

Authored by James Morse on May 02, 2018

A typo in kvm_vcpu_set_be()'s call:
| vcpu_write_sys_reg(vcpu, SCTLR_EL1, sctlr)
causes us to use the 32bit register value as an index into the sys_reg[]
array, and sail off the end of the linear map when we try to bring up
big-endian secondaries.

| Unable to handle kernel paging request at virtual address ffff80098b982c00
| Mem abort info:
|  ESR = 0x96000045
|  Exception class = DABT (current EL), IL = 32 bits
|   SET = 0, FnV = 0
|   EA = 0, S1PTW = 0
| Data abort info:
|   ISV = 0, ISS = 0x00000045
|   CM = 0, WnR = 1
| swapper pgtable: 4k pages, 48-bit VAs, pgdp = 000000002ea0571a
| [ffff80098b982c00] pgd=00000009ffff8803, pud=0000000000000000
| Internal error: Oops: 96000045 [#1] PREEMPT SMP
| Modules linked in:
| CPU: 2 PID: 1561 Comm: kvm-vcpu-0 Not tainted 4.17.0-rc3-00001-ga912e2261ca6-dirty #1323
| Hardware name: ARM Juno development board (r1) (DT)
| pstate: 60000005 (nZCv daif -PAN -UAO)
| pc : vcpu_write_sys_reg+0x50/0x134
| lr : vcpu_write_sys_reg+0x50/0x134

| Process kvm-vcpu-0 (pid: 1561, stack limit = 0x000000006df4728b)
| Call trace:
|  vcpu_write_sys_reg+0x50/0x134
|  kvm_psci_vcpu_on+0x14c/0x150
|  kvm_psci_0_2_call+0x244/0x2a4
|  kvm_hvc_call_handler+0x1cc/0x258
|  handle_hvc+0x20/0x3c
|  handle_exit+0x130/0x1ec
|  kvm_arch_vcpu_ioctl_run+0x340/0x614
|  kvm_vcpu_ioctl+0x4d0/0x840
|  do_vfs_ioctl+0xc8/0x8d0
|  ksys_ioctl+0x78/0xa8
|  sys_ioctl+0xc/0x18
|  el0_svc_naked+0x30/0x34
| Code: 73620291 604d00b0 00201891 1ab10194 (957a33f8)
|---[ end trace 4b4a4f9628596602 ]---

Fix the order of the arguments.

Fixes: 8d404c4c ("KVM: arm64: Rewrite system register accessors to read/write functions")
Cc: Christoffer Dall <cdall@cs.columbia.edu>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

1975fa56
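
The class of bug fixed in the vcpu_write_sys_reg() commit above is easy to reproduce in miniature: when the value and the index are both integer types, swapping them compiles without complaint. The sketch below is a hypothetical model (the struct, the constants and the bounds check are illustrative additions; the message implies the real accessor indexes the array directly, which is why the swapped call ran off the end of it).

```c
#include <stdint.h>

#define NR_SYS_REGS_SKETCH 8
#define SCTLR_EL1_SKETCH   3   /* hypothetical register index */

struct vcpu_sketch {
    uint64_t sys_regs[NR_SYS_REGS_SKETCH];
};

/*
 * Argument order is (vcpu, value, register index), as implied by the
 * commit message. The bounds check exists only in this model, to make
 * the out-of-range index visible instead of corrupting memory.
 */
static int write_sys_reg_sketch(struct vcpu_sketch *vcpu, uint64_t val, int reg)
{
    if (reg < 0 || reg >= NR_SYS_REGS_SKETCH)
        return -1;
    vcpu->sys_regs[reg] = val;
    return 0;
}
```

Calling write_sys_reg_sketch(&v, SCTLR_EL1_SKETCH, (int)sctlr) mirrors the typo: the register contents become the index and the call is rejected, whereas write_sys_reg_sketch(&v, sctlr, SCTLR_EL1_SKETCH) stores the value where intended.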

03 May, 2018 4 commits

parisc: Fix section mismatches · 8d73b180

Authored by Helge Deller on Apr 20, 2018

Fix three section mismatches:
1) Section mismatch in reference from the function ioread8() to the
   function .init.text:pcibios_init_bridge()
2) Section mismatch in reference from the function free_initmem() to the
   function .init.text:map_pages()
3) Section mismatch in reference from the function ccio_ioc_init() to
   the function .init.text:count_parisc_driver()
Signed-off-by: Helge Deller <deller@gmx.de>

8d73b180

parisc: drivers.c: Fix section mismatches · b819439f

Authored by Helge Deller on Apr 20, 2018

Fix two section mismatches in drivers.c:
1) Section mismatch in reference from the function alloc_tree_node() to
   the function .init.text:create_tree_node().
2) Section mismatch in reference from the function walk_native_bus() to
   the function .init.text:alloc_pa_dev().
Signed-off-by: Helge Deller <deller@gmx.de>

b819439f

bpf, x64: fix memleak when not converging on calls · 39f56ca9

Authored by Daniel Borkmann on May 02, 2018

The JIT logic in jit_subprogs() is as follows: for all subprogs we
allocate a bpf_prog_alloc(), populate it (prog->is_func = 1 here),
and pass it to bpf_int_jit_compile(). If a failure occurred during
JIT and prog->jited is not set, then we bail out from attempting to
JIT the whole program, and punt to the interpreter instead. If
JITing was successful, we fix up BPF call offsets and do another
pass to bpf_int_jit_compile() (extra_pass is true at that point) to
complete JITing calls. Given that this requires passing the JIT context
around, addrs and jit_data from the x86 JIT are freed in the extra_pass
in bpf_int_jit_compile() when calls are involved (if not, they can
be freed immediately). However, if in the original pass, the JIT
image didn't converge then we leak addrs and jit_data since image
itself is NULL, the prog->is_func is set and extra_pass is false
in that case, meaning both will become unreachable and are never
cleaned up, therefore we need to free as well on !image. Only x64
JIT is affected.

Fixes: 1c2a088a ("bpf: x64: add JIT support for multi-function programs")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

39f56ca9
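
The cleanup rule spelled out in the "not converging on calls" commit above can be written as a single predicate. This is a sketch of the condition as the message describes it, with a made-up function name; it is not a copy of the x86 JIT source.

```c
#include <stdbool.h>

/*
 * Decide whether addrs and jit_data must be freed at the end of a
 * JIT pass, following the commit message:
 *  - no image was produced (the JIT did not converge): free, otherwise
 *    both become unreachable;
 *  - the program is not a subprogram (no calls involved): free at once;
 *  - this is the extra pass that completes the calls: free now.
 * Only a successful first pass over a subprogram keeps them alive.
 */
static bool must_free_jit_state(bool have_image, bool is_func, bool extra_pass)
{
    return !have_image || !is_func || extra_pass;
}
```

The leak fixed by the commit corresponds to the first input combination tested below: no image, is_func set, extra_pass false.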

bpf, x64: fix memleak when not converging after image · 3aab8884

Authored by Daniel Borkmann on May 02, 2018

While reviewing x64 JIT code, I noticed that we leak the prior allocated
JIT image in the case where proglen != oldproglen during the JIT passes.
Prior to the commit e0ee9c12 ("x86: bpf_jit: fix two bugs in eBPF JIT
compiler") we would just break out of the loop and use the image as the
JITed prog since it could only shrink in size anyway. After e0ee9c12,
we would bail out to out_addrs label where we free addrs and jit_data but
not the image coming from bpf_jit_binary_alloc().

Fixes: e0ee9c12 ("x86: bpf_jit: fix two bugs in eBPF JIT compiler")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

3aab8884

02 May, 2018 5 commits

x86/cpu: Restore CPUID_8000_0008_EBX reload · c65732e4

Authored by Thomas Gleixner on Apr 30, 2018

The recent commit which addresses the x86_phys_bits corruption with
encrypted memory on CPUID reload after a microcode update lost the reload
of CPUID_8000_0008_EBX as well.

As a consequence IBRS and IBRS_FW are no longer detected.

Restore the behaviour by bringing the reload of CPUID_8000_0008_EBX
back. This restore has a twist due to the convoluted way the cpuid analysis
works:

CPUID_8000_0008_EBX is used by AMD to enumerate IBRB, IBRS, STIBP. On Intel
EBX is not used. But the speculation control code sets the AMD bits when
running on Intel depending on the Intel specific speculation control
bits. This was done to use the same bits for alternatives.

The change which moved the 8000_0008 evaluation out of get_cpu_cap() broke
this nasty scheme due to ordering, so that on Intel the store to
CPUID_8000_0008_EBX clears the IBRB, IBRS, STIBP bits which had been set
before by software.

So the actual CPUID_8000_0008_EBX needs to go back to the place where it
was and the phys/virt address space calculation cannot touch it.

In hindsight this should have used completely synthetic bits for IBRB,
IBRS, STIBP instead of reusing the AMD bits, but that's for 4.18.

/me needs to find time to cleanup that steaming pile of ...

Fixes: d94a155c ("x86/cpu: Prevent cpuinfo_x86::x86_phys_bits adjustment corruption")
Reported-by: Jörg Otte <jrg.otte@gmail.com>
Reported-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jörg Otte <jrg.otte@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: kirill.shutemov@linux.intel.com
Cc: Borislav Petkov <bp@alien8.de>
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1805021043510.1668@nanos.tec.linutronix.de

c65732e4

x86/tsc: Fix mark_tsc_unstable() · e3b4f790

Authored by Peter Zijlstra on Apr 30, 2018

mark_tsc_unstable() also needs to affect tsc_early. Now that
clocksource_mark_unstable() can be used on a clocksource irrespective of
its registration state, use it on both tsc_early and tsc.

This does however require cs->list to be initialized empty, otherwise it
cannot tell the registration state before registration.

Fixes: aa83c457 ("x86/tsc: Introduce early tsc clocksource")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Diego Viola <diego.viola@gmail.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: len.brown@intel.com
Cc: rjw@rjwysocki.net
Cc: rui.zhang@intel.com
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180430100344.533326547@infradead.org

e3b4f790

x86/tsc: Always unregister clocksource_tsc_early · e9088add

Authored by Peter Zijlstra on Apr 30, 2018

Don't leave the tsc-early clocksource registered if it errors out
early.

This was reported by Diego, who on his Core2 era machine got TSC
invalidated while it was running with tsc-early (due to C-states).
This results in keeping tsc-early with very bad effects.
Reported-and-Tested-by: Diego Viola <diego.viola@gmail.com>
Fixes: aa83c457 ("x86/tsc: Introduce early tsc clocksource")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: len.brown@intel.com
Cc: rjw@rjwysocki.net
Cc: diego.viola@gmail.com
Cc: rui.zhang@intel.com
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180430100344.350507853@infradead.org

e9088add

hexagon: export csum_partial_copy_nocheck · 330e261c

Authored by Arnd Bergmann on Apr 06, 2018

This is needed to link ipv6 as a loadable module, which in turn happens
in allmodconfig.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Richard Kuo <rkuo@codeaurora.org>

330e261c

hexagon: add memset_io() helper · a57ab96e

Authored by Arnd Bergmann on Apr 06, 2018

We already have memcpy_toio(), but not memset_io(), so let's
add the obvious version to allow building an allmodconfig kernel
without errors like

drivers/gpu/drm/ttm/ttm_bo_util.c: In function 'ttm_bo_move_memcpy':
drivers/gpu/drm/ttm/ttm_bo_util.c:390:3: error: implicit declaration of function 'memset_io' [-Werror=implicit-function-declaration]
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Richard Kuo <rkuo@codeaurora.org>

a57ab96e

01 May, 2018 2 commits

sparc: vio: use put_device() instead of kfree() · 00ad691a

Authored by Arvind Yadav on Apr 25, 2018

Never directly free @dev after calling device_register(), even
if it returned an error. Always use put_device() to give up the
reference initialized.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

00ad691a
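
The rule in the sparc vio commit above (drop the reference rather than freeing the object directly once registration has been attempted) follows from reference counting. The sketch below is a toy user-space model with hypothetical names; it only assumes, as the message states, that registration initializes a reference even when it fails.

```c
/* Toy reference-counted object standing in for struct device. */
struct dev_sketch {
    int refcount;
    void (*release)(struct dev_sketch *dev);
};

static int release_calls;   /* how often the release callback ran */

static void release_sketch(struct dev_sketch *dev)
{
    (void)dev;              /* a real release callback would free dev here */
    release_calls++;
}

/* Registration takes the initial reference before anything can fail. */
static int register_sketch(struct dev_sketch *dev, int simulate_failure)
{
    dev->refcount = 1;
    return simulate_failure ? -1 : 0;
}

/* Dropping the last reference is what triggers the release callback. */
static void put_sketch(struct dev_sketch *dev)
{
    if (--dev->refcount == 0 && dev->release)
        dev->release(dev);
}
```

On the error path the caller invokes put_sketch() and lets the release callback do the cleanup; freeing the object by hand would bypass both the count and the callback.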

sparc64: Fix mistake in oradax license text · d3c68d0b

Authored by Rob Gardner on Apr 20, 2018

The license text in both oradax files mistakenly specifies "version 3" of
the GNU General Public License. This is corrected to specify "version 2".
Signed-off-by: Rob Gardner <rob.gardner@oracle.com>
Signed-off-by: Jonathan Helman <jonathan.helman@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d3c68d0b

28 Apr, 2018 1 commit

x86/headers/UAPI: Move DISABLE_EXITS KVM capability bits to the UAPI · 5e62493f

Authored by KarimAllah Ahmed on Apr 17, 2018

Move DISABLE_EXITS KVM capability bits to the UAPI just like the rest of
capabilities.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

5e62493f

27 Apr, 2018 8 commits

kvm: apic: Flush TLB after APIC mode/address change if VPIDs are in use · a468f2db

Authored by Junaid Shahid on Apr 26, 2018

Currently, KVM flushes the TLB after a change to the APIC access page
address or the APIC mode when EPT mode is enabled. However, even in
shadow paging mode, a TLB flush is needed if VPIDs are being used, as
specified in the Intel SDM Section 29.4.5.

So replace vmx_flush_tlb_ept_only() with vmx_flush_tlb(), which will
flush if either EPT or VPIDs are in use.
Signed-off-by: Junaid Shahid <junaids@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

a468f2db

x86/entry/64/compat: Preserve r8-r11 in int $0x80 · 8bb2610b

Authored by Andy Lutomirski on Apr 17, 2018

32-bit user code that uses int $0x80 doesn't care about r8-r11.  There is,
however, some 64-bit user code that intentionally uses int $0x80 to invoke
32-bit system calls.  From what I've seen, basically all such code assumes
that r8-r15 are all preserved, but the kernel clobbers r8-r11.  Since I
doubt that there's any code that depends on int $0x80 zeroing r8-r11,
change the kernel to preserve them.

I suspect that very little user code is broken by the old clobber, since
r8-r11 are only rarely allocated by gcc, and they're clobbered by function
calls, so the only way we'd see a problem is if the same function that
invokes int $0x80 also spills something important to one of these
registers.

The current behavior seems to date back to the historical commit
"[PATCH] x86-64 merge for 2.6.4".  Before that, all regs were
preserved.  I can't find any explanation of why this change was made.

Update the test_syscall_vdso_32 testcase as well to verify the new
behavior, and strengthen the test to make sure that the kernel doesn't
accidentally permute r8..r15.
Suggested-by: Denys Vlasenko <dvlasenk@redhat.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Link: https://lkml.kernel.org/r/d4c4d9985fbe64f8c9e19291886453914b48caee.1523975710.git.luto@kernel.org

8bb2610b

x86/ipc: Fix x32 version of shmid64_ds and msqid64_ds · 1a512c08

Authored by Arnd Bergmann on Apr 24, 2018

A bugfix broke the x32 shmid64_ds and msqid64_ds data structure layout
(as seen from user space) a few years ago: Originally, __BITS_PER_LONG
was defined as 64 on x32, so we did not have padding after the 64-bit
__kernel_time_t fields. After __BITS_PER_LONG got changed to 32,
applications would observe extra padding.

In other parts of the uapi headers we seem to have a mix of those
expecting either 32 or 64 on x32 applications, so we can't easily revert
the patch that broke these two structures.

Instead, this patch decouples x32 from the other architectures and moves
it back into arch specific headers, partially reverting the even older
commit 73a2d096 ("x86: remove all now-duplicate header files").

It's not clear whether this ever made any difference, since at least
glibc carries its own (correct) copy of both of these header files,
so possibly no application has ever observed the definitions here.

Based on a suggestion from H.J. Lu, I tried out the tool from
https://github.com/hjl-tools/linux-header to find other such
bugs, which pointed out the same bug in statfs(), which also has
a separate (correct) copy in glibc.

Fixes: f4b4aae1 ("x86/headers/uapi: Fix __BITS_PER_LONG value for x32 builds")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "H . J . Lu" <hjl.tools@gmail.com>
Cc: Jeffrey Walton <noloader@gmail.com>
Cc: stable@vger.kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://lkml.kernel.org/r/20180424212013.3967461-1-arnd@arndb.de

1a512c08

x86/setup: Do not reserve a crash kernel region if booted on Xen PV · 3db3eb28

Authored by Petr Tesarik on Apr 25, 2018

Xen PV domains cannot shut down and start a crash kernel. Instead,
the crashing kernel makes a SCHEDOP_shutdown hypercall with the
reason code SHUTDOWN_crash, cf. xen_crash_shutdown() machine op in
arch/x86/xen/enlighten_pv.c.

A crash kernel reservation is merely a waste of RAM in this case. It
may also confuse users of kexec_load(2) and/or kexec_file_load(2).
When flags include KEXEC_ON_CRASH or KEXEC_FILE_ON_CRASH,
respectively, these syscalls return success, which is technically
correct, but the crash kexec image will never be actually used.
Signed-off-by: Petr Tesarik <ptesarik@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: xen-devel@lists.xenproject.org
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jean Delvare <jdelvare@suse.de>
Link: https://lkml.kernel.org/r/20180425120835.23cef60c@ezekiel.suse.cz

3db3eb28

arm64: avoid instrumenting atomic_ll_sc.o · 3789c122

Authored by Mark Rutland on Apr 27, 2018

Our out-of-line atomics are built with a special calling convention,
preventing pointless stack spilling, and allowing us to patch call sites
with ARMv8.1 atomic instructions.

Instrumentation inserted by the compiler may result in calls to
functions not following this special calling convention, resulting in
registers being unexpectedly clobbered, and various problems resulting
from this.

For example, if a kernel is built with KCOV and ARM64_LSE_ATOMICS, the
compiler inserts calls to __sanitizer_cov_trace_pc in the prologues of
the atomic functions. This has been observed to result in spurious
cmpxchg failures, leading to a hang early on in the boot process.

This patch avoids such issues by preventing instrumentation of our
out-of-line atomics.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

3789c122

powerpc/kvm/booke: Fix altivec related build break · b2d7ecbe

Authored by Laurentiu Tudor on Apr 26, 2018

Add missing "altivec unavailable" interrupt injection helper
thus fixing the linker error below:

arch/powerpc/kvm/emulate_loadstore.o: In function `kvmppc_check_altivec_disabled':
arch/powerpc/kvm/emulate_loadstore.c: undefined reference to `.kvmppc_core_queue_vec_unavail'

Fixes: 09f98496 ("KVM: PPC: Book3S: Add MMIO emulation for VMX instructions")
Signed-off-by: Laurentiu Tudor <laurentiu.tudor@nxp.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

b2d7ecbe

powerpc: Fix deadlock with multiple calls to smp_send_stop · 6029755e

Authored by Nicholas Piggin on Apr 27, 2018

smp_send_stop can lock up the IPI path for any subsequent calls,
because the receiving CPUs spin in their handler function. This
started becoming a problem with the addition of an smp_send_stop
call in the reboot path, because panics can reboot after doing
their own smp_send_stop.

The NMI IPI variant was fixed with ac61c115 ("powerpc: Fix
smp_send_stop NMI IPI handling"), which leaves the smp_call_function
variant.

This is fixed by having smp_send_stop only ever do the
smp_call_function once. This is a bit less robust than the NMI IPI
fix, because any other call to smp_call_function after smp_send_stop
could deadlock, but that has always been the case, and it has not
been a problem before.

Fixes: f2748bdf ("powerpc/powernv: Always stop secondaries before reboot/shutdown")
Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

6029755e
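
The "only ever do the smp_call_function once" fix from the powerpc commit above is a one-shot guard. A minimal sketch, using hypothetical names and a counter in place of the real cross-CPU call:

```c
#include <stdbool.h>

static int stop_ipis_sent;   /* counts simulated stop broadcasts */

/* Stand-in for the smp_call_function() broadcast that stops other CPUs. */
static void broadcast_stop_sketch(void)
{
    stop_ipis_sent++;
}

/*
 * One-shot wrapper: later callers return early instead of issuing a
 * second broadcast that the already-spinning CPUs could never answer.
 */
static void send_stop_once_sketch(void)
{
    static bool stopped;

    if (stopped)
        return;
    stopped = true;
    broadcast_stop_sketch();
}
```

As the message notes, this guards only repeated stop requests; any other cross-CPU call issued afterwards could still block in the same way.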

x86/cpu/intel: Add missing TLB cpuid values · b837913f

Authored by jacek.tomaka@poczta.fm on Apr 24, 2018

Make the kernel print the correct number of TLB entries on Intel Xeon Phi 7210
(and others).

Before:
[ 0.320005] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0, 1GB 0
After:
[ 0.320005] Last level dTLB entries: 4KB 256, 2MB 128, 4MB 128, 1GB 16

The entries do exist in the official Intel SDM, but the type column there is
incorrect (it states "Cache" where it should read "TLB"), although the entries
for the values 0x6B, 0x6C and 0x6D are correctly described as 'Data TLB'.
Signed-off-by: Jacek Tomaka <jacek.tomaka@poczta.fm>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20180423161425.24366-1-jacekt@dugeo.com

b837913f

26 Apr, 2018 10 commits

arm64: fix possible spectre-v1 in ptrace_hbp_get_event() · 19791a7c

Authored by Mark Rutland on Apr 25, 2018

It's possible for userspace to control idx. Sanitize idx when using it
as an array index.

Found by smatch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

19791a7c
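
Sanitizing a user-controlled index, as in the spectre-v1 commit above, is commonly done with a branch-free mask so that an out-of-range value collapses to 0 even under speculation. The sketch below imitates the generic mask computation behind the kernel's array_index_nospec() helper; the function names are hypothetical, and a plain C expression like this gives no hardware-level guarantee on its own (architectures may use dedicated instruction sequences instead).

```c
#include <limits.h>

#define BITS_PER_LONG_SKETCH (sizeof(long) * CHAR_BIT)

/*
 * All-ones when index < size, zero otherwise, computed without a
 * branch. Relies on arithmetic right shift of a negative signed long,
 * which gcc and clang provide; size must not exceed LONG_MAX.
 */
static unsigned long index_mask_nospec_sketch(unsigned long index,
                                              unsigned long size)
{
    return (unsigned long)(~(long)(index | (size - 1UL - index))
                           >> (BITS_PER_LONG_SKETCH - 1));
}

/* Clamp an untrusted index: in-range values pass through, others become 0. */
static unsigned long clamp_index_sketch(unsigned long index, unsigned long size)
{
    return index & index_mask_nospec_sketch(index, size);
}
```

The clamp is applied after the ordinary bounds check, so a correct index is unchanged and only a mispredicted path sees the zeroed value.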

ARM: defconfig: Update Gemini defconfig · c12d7e9f

Authored by Linus Walleij on Apr 20, 2018

This updates the Gemini defconfig with a config that will bring
up most of the recently merged and updated devices to some
functional level:

- We enable high resolution timers (the right thing to do)
- Enable CMA for the framebuffer, and the new TVE200
  framebuffer driver and the Ilitek ILI9322 driver for
  graphics on the D-Link DIR-685. HIGHMEM support comes in
  as part of this.
- Enable networking and the new Cortina Gemini ethernet
  driver.
- Enable MDIO over GPIO and the Realtek PHY devices used on
  several of these systems.
- Enable I2C over GPIO and SPI over GPIO which is used on
  several of these devices.
- Enable the Thermal framework, GPIO fan control and LM75 sensor
  adding cooling on the D-Link DNS-313 with no userspace
  involved even if only the kernel is working, rock solid
  thermal for this platform.
- Enable JEDEC flash probing to support the Eon flash chip in
  D-Link DNS-313.
- Enable LED disk triggers for the NAS type devices.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

c12d7e9f

ARM: s3c24xx: jive: Fix some GPIO names · ef740508

Authored by Linus Walleij on Apr 23, 2018

One of the bitbanged SPI hosts had wrongly named GPIO lines due to
sloppiness by yours truly.

Cc: arm@kernel.org
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

ef740508

ARM: dts: Fix NAS4220B pin config · 1c3bc8fb

Authored by Linus Walleij on Apr 17, 2018

The DTS file for the NAS4220B had the pin config for the
ethernet interface set to the pins in the SL3512 SoC while
this system is using SL3516. Fix it by referencing the
right SL3516 pins instead of the SL3512 pins.

Cc: stable@vger.kernel.org
Cc: Hans Ulli Kroll <ulli.kroll@googlemail.com>
Reported-by: Andreas Fiedler <andreas.fiedler@gmx.net>
Reported-by: Roman Yeryomin <roman@advem.lv>
Tested-by: Roman Yeryomin <roman@advem.lv>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

1c3bc8fb

x86/smpboot: Don't use mwait_play_dead() on AMD systems · da6fa7ef

Authored by Yazen Ghannam on Apr 03, 2018

Recent AMD systems support using MWAIT for C1 state. However, MWAIT will
not allow deeper C-states than C1 on current systems.

play_dead() expects to use the deepest state available.  The deepest state
available on AMD systems is reached through SystemIO or HALT. If MWAIT is
available, it is preferred over the other methods, so the CPU never reaches
the deepest possible state.

Don't try to use MWAIT to play_dead() on AMD systems. Instead, use CPUIDLE
to enter the deepest state advertised by firmware. If CPUIDLE is not
available then fall back to HALT.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: stable@vger.kernel.org
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Link: https://lkml.kernel.org/r/20180403140228.58540-1-Yazen.Ghannam@amd.com

da6fa7ef

x86/mm: Make vmemmap and vmalloc base address constants unsigned long · 14d12bb8

Authored by Jiri Kosina on Apr 12, 2018

Commits 9b46a051 ("x86/mm: Initialize vmemmap_base at boot-time") and 
a7412546 ("x86/mm: Adjust vmalloc base and size at boot-time") lost the 
type information for __VMALLOC_BASE_L4, __VMALLOC_BASE_L5, 
__VMEMMAP_BASE_L4 and __VMEMMAP_BASE_L5 constants.

Declare them explicitly unsigned long again.

Fixes: 9b46a051 ("x86/mm: Initialize vmemmap_base at boot-time")
Fixes: a7412546 ("x86/mm: Adjust vmalloc base and size at boot-time")
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Link: https://lkml.kernel.org/r/nycvar.YFH.7.76.1804121437350.28129@cbobk.fhfr.pm

14d12bb8

x86/vector: Remove the unused macro FPU_IRQ · 7d878817

Authored by Dou Liyang on Apr 26, 2018

The macro FPU_IRQ has not been used since v3.10, so remove it.
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/20180426060832.27312-1-douly.fnst@cn.fujitsu.com

7d878817

x86/vector: Remove the macro VECTOR_OFFSET_START · e3072805

Authored by Dou Liyang on Apr 25, 2018

Linux now uses the matrix allocator for vector assignment, and the original
assignment code which used VECTOR_OFFSET_START has been removed.

So remove the stale macro as well.

Fixes: 69cde000 ("x86/vector: Use matrix allocator for vector assignment")
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David Rientjes <rientjes@google.com>
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/20180425020553.17210-1-douly.fnst@cn.fujitsu.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

e3072805

x86/cpufeatures: Enumerate cldemote instruction · 91241305

Authored by Fenghua Yu on Apr 23, 2018

cldemote is a new instruction in future x86 processors. It hints
to hardware that a specified cache line should be moved ("demoted")
from the cache(s) closest to the processor core to a level more
distant from the processor core. This instruction is faster than
snooping to make the cache line available for other cores.

The cldemote instruction is indicated by the presence of the CPUID
feature flag CLDEMOTE (CPUID.(EAX=0x7, ECX=0):ECX[bit25]).

More details on the cldemote instruction can be found in the latest
Intel Architecture Instruction Set Extensions and Future Features
Programming Reference.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
Cc: "H. Peter Anvin" <hpa@linux.intel.com>
Cc: "Ashok Raj" <ashok.raj@intel.com>
Link: https://lkml.kernel.org/r/1524508162-192587-1-git-send-email-fenghua.yu@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

91241305
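
The enumeration rule quoted in the cldemote commit above, CPUID.(EAX=0x7, ECX=0):ECX[bit25], comes down to testing one bit of a register value. A minimal sketch, assuming the caller has already obtained ECX from CPUID leaf 7, subleaf 0 (the function and macro names are hypothetical):

```c
#include <stdbool.h>
#include <stdint.h>

#define CLDEMOTE_ECX_BIT 25   /* bit position given in the commit message */

/* True when the supplied leaf-7/subleaf-0 ECX value advertises CLDEMOTE. */
static bool ecx_has_cldemote(uint32_t leaf7_ecx)
{
    return (leaf7_ecx >> CLDEMOTE_ECX_BIT) & 1u;
}
```

Keeping the check separate from the CPUID instruction itself lets the bit test be exercised on any machine, including ones that are not x86.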

perf/x86/intel: Don't enable freeze-on-smi for PerfMon V1 · 4e949e9b

Authored by Kan Liang on Apr 25, 2018

The SMM freeze feature was introduced in PerfMon V2, but the current
code unconditionally enables the feature for all platforms. It can
generate a #GP exception if the related FREEZE_WHILE_SMM bit is set on
a machine with PerfMon V1.

To disable the feature for PerfMon V1, perf needs to
- Remove the freeze_on_smi sysfs entry by moving intel_pmu_attrs to
  intel_pmu, which is only applied to PerfMon V2 and later.
- Check the PerfMon version before flipping the SMM bit when starting CPU
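
The version check in the second step can be sketched as follows. This is a minimal user-space model, not the patch itself: the function name `apply_freeze_on_smi`, its parameters, and the macro names are hypothetical, and the bit position (14) is an assumption about where the SMM-freeze control sits in the debug control register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position of the SMM-freeze control; treat as illustrative. */
#define FREEZE_IN_SMM_BIT 14
#define FREEZE_IN_SMM     (UINT64_C(1) << FREEZE_IN_SMM_BIT)

/*
 * Hypothetical helper: returns the debug-control value that should be
 * written for a CPU with the given PerfMon version. On PerfMon V1 the
 * value is returned unchanged, so the bit is never flipped there.
 */
static uint64_t apply_freeze_on_smi(uint64_t debugctl, int pmu_version,
				    bool freeze_on_smi)
{
	if (pmu_version < 2)
		return debugctl;              /* V1: feature not available */
	if (freeze_on_smi)
		return debugctl | FREEZE_IN_SMM;
	return debugctl & ~FREEZE_IN_SMM;
}
```

Returning the new value instead of writing a register keeps the decision logic separate from the privileged MSR access, which makes the check easy to exercise outside the kernel.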

Fixes: 6089327f ("perf/x86: Add sysfs entry to freeze counters on SMI")
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: ak@linux.intel.com
Cc: eranian@google.com
Cc: acme@redhat.com
Link: https://lkml.kernel.org/r/1524682637-63219-1-git-send-email-kan.liang@linux.intel.com

4e949e9b

25 Apr, 2018 3 commits

bpf, x64: fix JIT emission for dead code · 1612a981

Committed by Gianluca Borello on Apr 25, 2018

Commit 2a5418a1 ("bpf: improve dead code sanitizing") replaced dead
code with a series of ja-1 instructions, for safety. That made JIT
compilation much more complex for some BPF programs. One instance of such
programs is, for example:

bool flag = false;
...
/* A bunch of other code */
...
if (flag)
        do_something();

In some cases llvm is not able to remove at compile time the code for
do_something(), so the generated BPF program ends up with a large amount
of dead instructions. In one specific real life example, there are two
series of ~500 and ~1000 dead instructions in the program. When the
verifier replaces them with a series of ja-1 instructions, it causes an
interesting behavior at JIT time.

During the first pass, since all the instructions are estimated at 64
bytes, the ja-1 instructions end up being translated as 5 bytes JMP
instructions (0xE9), since the jump offsets become increasingly large (>
127) as each instruction gets discovered to be 5 bytes instead of the
estimated 64.

Starting from the second pass, the first N instructions of the ja-1
sequence get translated into 2 bytes JMPs (0xEB) because the jump offsets
become <= 127 this time. In particular, N is defined as roughly 127 / (5
- 2) ~= 42. So, each further pass will make the subsequent N JMP
instructions shrink from 5 to 2 bytes, making the image shrink every time.
This means that in order to have the entire program converge, there need
to be, in the real example above, at least ~1000 / 42 ~= 24 passes just
for translating the dead code. If we add this number to the passes needed
to translate the other non dead code, it brings such program to 40+
passes, and JIT doesn't complete. Ultimately the userspace loader fails
because such BPF program was supposed to be part of a prog array owner
being JITed.

While it is certainly possible to try to refactor such programs to help
the compiler remove dead code, the behavior is not really intuitive and it
puts further burden on the BPF developer who is not expecting such
behavior. To make things worse, such programs are working just fine in all
the kernel releases prior to the ja-1 fix.

A possible approach to mitigate this behavior consists in noticing that
for ja-1 instructions we don't really need to rely on the estimated size
of the previous and current instructions, we know that a -1 BPF jump
offset can be safely translated into a 0xEB instruction with a jump offset
of -2.

This fix allows the BPF program in the previous example to complete again
in ~9 passes.
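
The convergence behavior described above can be reproduced with a small simulation. The sketch below is a simplified model, not the kernel's JIT: it assumes a program made up only of ja-1 instructions, an initial estimate of 64 bytes per instruction, and jump offsets computed from the already-updated end of the previous instruction and the stale end of the current one. The names `jit_passes` and `special_case_ja_minus_1` are hypothetical.

```c
#include <stdbool.h>
#include <stdlib.h>

/* True if off fits in a signed 8-bit displacement (2-byte 0xEB jump). */
static bool is_imm8(long off)
{
	return off >= -128 && off <= 127;
}

/*
 * Simplified model of repeated JIT passes over n "ja -1" instructions.
 * addrs[i] holds the end offset of instruction i from the latest estimate.
 * Returns the number of passes run until the total length stops changing
 * (including the confirming pass), or -1 on bad input or allocation
 * failure. *final_len receives the converged length in bytes.
 */
static int jit_passes(int n, bool special_case_ja_minus_1, long *final_len)
{
	long *addrs, old_len = -1;
	int pass = 0;

	if (n <= 0)
		return -1;
	addrs = malloc(sizeof(*addrs) * (size_t)n);
	if (!addrs)
		return -1;
	for (int i = 0; i < n; i++)
		addrs[i] = 64L * (i + 1);   /* first estimate: 64 bytes each */

	for (;;) {
		long len = 0;

		pass++;
		for (int i = 0; i < n; i++) {
			/* Previous end is fresh, current end is still stale. */
			long prev_end = i ? addrs[i - 1] : 0;
			long off = prev_end - addrs[i];
			int size = (special_case_ja_minus_1 || is_imm8(off)) ? 2 : 5;

			len += size;
			addrs[i] = len;
		}
		if (len == old_len)
			break;
		old_len = len;
	}
	*final_len = old_len;
	free(addrs);
	return pass;
}
```

Calling `jit_passes(1000, false, &len)` shows the slow shrinking, with only a few dozen instructions converted from 5 to 2 bytes per pass, while `jit_passes(1000, true, &len)` settles immediately because every ja-1 is emitted as a 2-byte jump regardless of the size estimates.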

Fixes: 2a5418a1 ("bpf: improve dead code sanitizing")
Signed-off-by: Gianluca Borello <g.borello@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

1612a981

tracing/x86: Update syscall trace events to handle new prefixed syscall func names · 1c758a22

Committed by Steven Rostedt (VMware) on Apr 17, 2018

Arnaldo noticed that the latest kernel is missing the syscall event system
directory in x86. I bisected it down to d5a00528 ("syscalls/core,
syscalls/x86: Rename struct pt_regs-based sys_*() to __x64_sys_*()").

The system call trace events are special, as there is only one trace event
for all system calls (the raw_syscalls). But a macro that wraps the system
calls creates meta data for them that copies the name to find the system
call that maps to the system call table (the number). At boot up, it does a
kallsyms lookup of the system call table to find the function that maps to
the meta data of the system call. If it does not find a function, then that
system call is ignored.

Because the x86 system calls had "__x64_", or "__ia32_" prefixed to the
"sys" for the names, they do not match the default compare algorithm. As
this was a problem for PowerPC, the algorithm can be overridden by the
architecture. The solution is to have x86 have its own algorithm to do the
compare and this brings back the system call trace events.
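
To illustrate the kind of prefix-aware comparison described above, here is a minimal user-space sketch. It is not the kernel's implementation: the names `skip_arch_prefix` and `syscall_match_sym_name` are hypothetical, and the comparison is simplified to a plain string match after the "__x64_" or "__ia32_" prefix is stripped.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: skip a leading "__x64_" or "__ia32_" if present. */
static const char *skip_arch_prefix(const char *sym)
{
	if (strncmp(sym, "__x64_", 6) == 0)
		return sym + 6;
	if (strncmp(sym, "__ia32_", 7) == 0)
		return sym + 7;
	return sym;
}

/*
 * Simplified match: sym is a symbol name found via a symbol-table lookup,
 * name is the syscall name recorded in the metadata (e.g. "sys_read").
 */
static bool syscall_match_sym_name(const char *sym, const char *name)
{
	return strcmp(skip_arch_prefix(sym), name) == 0;
}
```

With this comparison, `__x64_sys_read` and `__ia32_sys_read` both match the metadata name `sys_read`, while an unprefixed `sys_read` continues to match as before.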

Link: http://lkml.kernel.org/r/20180417174128.0f3457f0@gandalf.local.home
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Dominik Brodowski <linux@dominikbrodowski.net>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Fixes: d5a00528 ("syscalls/core, syscalls/x86: Rename struct pt_regs-based sys_*() to __x64_sys_*()")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

1c758a22

powerpc: Fix smp_send_stop NMI IPI handling · ac61c115

Committed by Nicholas Piggin on Apr 25, 2018

The NMI IPI handler for a receiving CPU increments nmi_ipi_busy_count
over the handler function call, which causes later smp_send_nmi_ipi()
callers to spin until the call is finished.

The stop_this_cpu() function never returns, so the busy count is never
decremented, which can cause the system to hang in some cases. For
example panic() will call smp_send_stop() early on which calls
stop_this_cpu() on other CPUs, then later in the reboot path,
pnv_restart() will call smp_send_stop() again, which hangs.

Fix this by adding a special case to the stop_this_cpu() handler to
decrement the busy count, because it will never return.
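
The accounting problem and its fix can be modeled in user space. The sketch below is only an illustration under stated assumptions: `longjmp` stands in for a CPU that stops and never returns to the handler, the function names ending in `_buggy` and `_fixed` are hypothetical, and the counter is a plain integer with no atomics or per-CPU state.

```c
#include <setjmp.h>
#include <stdbool.h>

static int nmi_ipi_busy_count;   /* model of the shared busy counter */
static jmp_buf cpu_halted;       /* stands in for "this CPU never returns" */

/* Buggy variant: leaves without undoing the handler's increment. */
static void stop_this_cpu_buggy(void)
{
	longjmp(cpu_halted, 1);
}

/* Fixed variant: decrements first, because control never comes back. */
static void stop_this_cpu_fixed(void)
{
	nmi_ipi_busy_count--;
	longjmp(cpu_halted, 1);
}

/* The handler holds the busy count across the function call. */
static void nmi_ipi_handler(void (*fn)(void))
{
	nmi_ipi_busy_count++;
	fn();
	nmi_ipi_busy_count--;        /* never reached for the stop functions */
}

/* Returns the busy count left behind after a stop request is handled. */
static int busy_count_after_stop(bool fixed)
{
	nmi_ipi_busy_count = 0;
	if (setjmp(cpu_halted) == 0)
		nmi_ipi_handler(fixed ? stop_this_cpu_fixed : stop_this_cpu_buggy);
	return nmi_ipi_busy_count;
}
```

In this model, a nonzero result corresponds to the state in which a later sender would spin forever waiting for the count to drop.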

Now that the NMI/non-NMI versions of stop_this_cpu() are different,
split them out into separate functions rather than doing #ifdef tricks
to share the body between the two functions.

Fixes: 6bed3237 ("powerpc: use NMI IPI for smp_send_stop")
Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Split out the functions, tweak change log a bit]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

ac61c115

openanolis / cloud-kernel synced successfully more than 1 year ago
