提交 · 0967160ad615985c7c35443156ea9aecc60c37b8 · openanolis / cloud-kernel

01 2月, 2015 3 次提交

x86_64, entry: Remove the syscall exit audit and schedule optimizations · 96b6352c

由 Andy Lutomirski 提交于 7月 07, 2014

We used to optimize rescheduling and audit on syscall exit. Now
that the full slow path is reasonably fast, remove these
optimizations. Syscall exit auditing is now handled exclusively by
syscall_trace_leave.

This adds something like 10ns to the previously optimized paths on
my computer, presumably due mostly to SAVE_REST / RESTORE_REST.

I think that we should eventually replace both the syscall and
non-paranoid interrupt exit slow paths with a pair of C functions
along the lines of the syscall entry hooks.

Link: http://lkml.kernel.org/r/22f2aa4a0361707a5cfb1de9d45260b39965dead.1421453410.git.luto@amacapital.netAcked-by: NBorislav Petkov <bp@suse.de>
Signed-off-by: NAndy Lutomirski <luto@amacapital.net>

96b6352c

x86_64, entry: Use sysret to return to userspace when possible · 2a23c6b8

由 Andy Lutomirski 提交于 7月 22, 2014

The x86_64 entry code currently jumps through complex and
inconsistent hoops to try to minimize the impact of syscall exit
work.  For a true fast-path syscall, almost nothing needs to be
done, so returning is just a check for exit work and sysret.  For a
full slow-path return from a syscall, the C exit hook is invoked if
needed and we join the iret path.

Using iret to return to userspace is very slow, so the entry code
has accumulated various special cases to try to do certain forms of
exit work without invoking iret.  This is error-prone, since it
duplicates assembly code paths, and it's dangerous, since sysret
can malfunction in interesting ways if used carelessly.  It's
also inefficient, since a lot of useful cases aren't optimized
and therefore force an iret out of a combination of paranoia and
the fact that no one has bothered to write even more asm code
to avoid it.

I would argue that this approach is backwards.  Rather than trying
to avoid the iret path, we should instead try to make the iret path
fast.  Under a specific set of conditions, iret is unnecessary.  In
particular, if RIP==RCX, RFLAGS==R11, RIP is canonical, RF is not
set, and both SS and CS are as expected, then
movq 32(%rsp),%rsp;sysret does the same thing as iret.  This set of
conditions is nearly always satisfied on return from syscalls, and
it can even occasionally be satisfied on return from an irq.

Even with the careful checks for sysret applicability, this cuts
nearly 80ns off of the overhead from syscalls with unoptimized exit
work.  This includes tracing and context tracking, and any return
that invokes KVM's user return notifier.  For example, the cost of
getpid with CONFIG_CONTEXT_TRACKING_FORCE=y drops from ~360ns to
~280ns on my computer.

This may allow the removal and even eventual conversion to C
of a respectable amount of exit asm.

This may require further tweaking to give the full benefit on Xen.

It may be worthwhile to adjust signal delivery and exec to try hit
the sysret path.

This does not optimize returns to 32-bit userspace.  Making the same
optimization for CS == __USER32_CS is conceptually straightforward,
but it will require some tedious code to handle the differences
between sysretl and sysexitl.

Link: http://lkml.kernel.org/r/71428f63e681e1b4aa1a781e3ef7c27f027d1103.1421453410.git.luto@amacapital.netSigned-off-by: NAndy Lutomirski <luto@amacapital.net>

2a23c6b8

x86, traps: Fix ist_enter from userspace · b926e6f6

由 Andy Lutomirski 提交于 1月 31, 2015

context_tracking_user_exit() has no effect if in_interrupt() returns true,
so ist_enter() didn't work. Fix it by calling exception_enter(), and thus
context_tracking_user_exit(), before incrementing the preempt count.

This also adds an assertion that will catch the problem reliably if
CONFIG_PROVE_RCU=y to help prevent the bug from being reintroduced.

Link: http://lkml.kernel.org/r/261ebee6aee55a4724746d0d7024697013c40a08.1422709102.git.luto@amacapital.net
Fixes: 95927475 x86, traps: Track entry into and exit from IST context
Reported-and-tested-by: NSasha Levin <sasha.levin@oracle.com>
Signed-off-by: NAndy Lutomirski <luto@amacapital.net>

b926e6f6

31 1月, 2015 1 次提交

arc: mm: Fix build failure · e262eb93

由 Guenter Roeck 提交于 1月 29, 2015

Fix misspelled define.

Fixes: 33692f27 ("vm: add VM_FAULT_SIGSEGV handling support")
Signed-off-by: NGuenter Roeck <linux@roeck-us.net>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e262eb93

30 1月, 2015 6 次提交

KVM: x86: check LAPIC presence when building apic_map · df04d1d1

由 Radim Krčmář 提交于 1月 29, 2015

We forgot to re-check LAPIC after splitting the loop in commit
173beedc (KVM: x86: Software disabled APIC should still deliver
NMIs, 2014-11-02).
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
Fixes: 173beedcSigned-off-by: NPaolo Bonzini <pbonzini@redhat.com>

df04d1d1

arm/arm64: KVM: Use kernel mapping to perform invalidation on page fault · 0d3e4d4f

由 Marc Zyngier 提交于 1月 05, 2015

When handling a fault in stage-2, we need to resync I$ and D$, just
to be sure we don't leave any old cache line behind.

That's very good, except that we do so using the *user* address.
Under heavy load (swapping like crazy), we may end up in a situation
where the page gets mapped in stage-2 while being unmapped from
userspace by another CPU.

At that point, the DC/IC instructions can generate a fault, which
we handle with kvm->mmu_lock held. The box quickly deadlocks, user
is unhappy.

Instead, perform this invalidation through the kernel mapping,
which is guaranteed to be present. The box is much happier, and so
am I.
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

0d3e4d4f

arm/arm64: KVM: Invalidate data cache on unmap · 363ef89f

由 Marc Zyngier 提交于 12月 19, 2014

Let's assume a guest has created an uncached mapping, and written
to that page. Let's also assume that the host uses a cache-coherent
IO subsystem. Let's finally assume that the host is under memory
pressure and starts to swap things out.

Before this "uncached" page is evicted, we need to make sure
we invalidate potential speculated, clean cache lines that are
sitting there, or the IO subsystem is going to swap out the
cached view, loosing the data that has been written directly
into memory.
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

363ef89f

arm/arm64: KVM: Use set/way op trapping to track the state of the caches · 3c1e7165

由 Marc Zyngier 提交于 12月 19, 2014

Trying to emulate the behaviour of set/way cache ops is fairly
pointless, as there are too many ways we can end-up missing stuff.
Also, there is some system caches out there that simply ignore
set/way operations.

So instead of trying to implement them, let's convert it to VA ops,
and use them as a way to re-enable the trapping of VM ops. That way,
we can detect the point when the MMU/caches are turned off, and do
a full VM flush (which is what the guest was trying to do anyway).

This allows a 32bit zImage to boot on the APM thingy, and will
probably help bootloaders in general.
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

3c1e7165

arm: dma-mapping: Set DMA IOMMU ops in arm_iommu_attach_device() · eab8d653

由 Laurent Pinchart 提交于 1月 23, 2015

Commit 4bb25789 ("arm: dma-mapping: plumb our iommu mapping ops
into arch_setup_dma_ops") moved the setting of the DMA operations from
arm_iommu_attach_device() to arch_setup_dma_ops() where the DMA
operations to be used are selected based on whether the device is
connected to an IOMMU. However, the IOMMU detection scheme requires the
IOMMU driver to be ported to the new IOMMU of_xlate API. As no driver
has been ported yet, this effectively breaks all IOMMU ARM users that
depend on the IOMMU being handled transparently by the DMA mapping API.

Fix this by restoring the setting of DMA IOMMU ops in
arm_iommu_attach_device() and splitting the rest of the function into a
new internal __arm_iommu_attach_device() function, called by
arch_setup_dma_ops().
Signed-off-by: NLaurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Acked-by: NWill Deacon <will.deacon@arm.com>
Tested-by: NHeiko Stuebner <heiko@sntech.de>
Signed-off-by: NOlof Johansson <olof@lixom.net>

eab8d653

vm: add VM_FAULT_SIGSEGV handling support · 33692f27

由 Linus Torvalds 提交于 1月 29, 2015

The core VM already knows about VM_FAULT_SIGBUS, but cannot return a
"you should SIGSEGV" error, because the SIGSEGV case was generally
handled by the caller - usually the architecture fault handler.

That results in lots of duplication - all the architecture fault
handlers end up doing very similar "look up vma, check permissions, do
retries etc" - but it generally works.  However, there are cases where
the VM actually wants to SIGSEGV, and applications _expect_ SIGSEGV.

In particular, when accessing the stack guard page, libsigsegv expects a
SIGSEGV.  And it usually got one, because the stack growth is handled by
that duplicated architecture fault handler.

However, when the generic VM layer started propagating the error return
from the stack expansion in commit fee7e49d ("mm: propagate error
from stack expansion even for guard page"), that now exposed the
existing VM_FAULT_SIGBUS result to user space.  And user space really
expected SIGSEGV, not SIGBUS.

To fix that case, we need to add a VM_FAULT_SIGSEGV, and teach all those
duplicate architecture fault handlers about it.  They all already have
the code to handle SIGSEGV, so it's about just tying that new return
value to the existing code, but it's all a bit annoying.

This is the mindless minimal patch to do this.  A more extensive patch
would be to try to gather up the mostly shared fault handling logic into
one generic helper routine, and long-term we really should do that
cleanup.

Just from this patch, you can generally see that most architectures just
copied (directly or indirectly) the old x86 way of doing things, but in
the meantime that original x86 model has been improved to hold the VM
semaphore for shorter times etc and to handle VM_FAULT_RETRY and other
"newer" things, so it would be a good idea to bring all those
improvements to the generic case and teach other architectures about
them too.
Reported-and-tested-by: NTakashi Iwai <tiwai@suse.de>
Tested-by: NJan Engelhardt <jengelh@inai.de>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # "s390 still compiles and boots"
Cc: linux-arch@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

33692f27

29 1月, 2015 4 次提交

ARM: shmobile: r8a7790: Instantiate GIC from C board code in legacy builds · 77cf5166

由 Magnus Damm 提交于 1月 29, 2015

As of commit 9a1091ef ("irqchip: gic: Support hierarchy irq
domain."), the Lager legacy board support is known to be broken.

The IRQ numbers of the GIC are now virtual, and no longer match the
hardcoded hardware IRQ numbers in the legacy platform board code.

To fix this issue specific to non-multiplatform r8a7790 and Lager:
 1) Instantiate the GIC from platform board code and also
 2) Skip over the DT arch timer as well as
 3) Force delay setup based on DT CPU frequency

With these 3 fixes in place interrupts on Lager are now unbroken.

Partially based on legacy GIC fix by Geert Uytterhoeven, thanks to
him for the initial work.
Signed-off-by: NMagnus Damm <damm+renesas@opensource.se>
Acked-by: NGeert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: NSimon Horman <horms+renesas@verge.net.au>

77cf5166

x86, vdso: teach 'make clean' remove vdso64 binaries · 050835e9

由 Andrey Skvortsov 提交于 1月 28, 2015

After 'make clean' vdso64.so and vdso64.dbg.so were left in arch/x86/vdso/.

Link: http://lkml.kernel.org/r/1422453867-17326-1-git-send-email-andrej.skvortzov@gmail.comSigned-off-by: NAndrey Skvortsov <andrej.skvortzov@gmail.com>
Signed-off-by: NAndy Lutomirski <luto@amacapital.net>

050835e9

ARM: shmobile: r8a73a4: Instantiate GIC from C board code in legacy builds · 974b072f

由 Magnus Damm 提交于 1月 28, 2015

As of commit 9a1091ef ("irqchip: gic: Support hierarchy irq
domain."), the APE6EVM legacy board support is known to be broken.

The IRQ numbers of the GIC are now virtual, and no longer match the
hardcoded hardware IRQ numbers in the legacy platform board code.

To fix this issue specific to non-muliplatform r8a73a4 and APE6EVM:
 1) Instantiate the GIC from platform board code and also
 2) Skip over the DT arch timer as well as
 3) Force delay setup based on DT CPU frequency

With these 3 fixes in place interrupts on APE6EVM are now unbroken.

Partially based on legacy GIC fix by Geert Uytterhoeven, thanks to
him for the initial work.
Signed-off-by: NMagnus Damm <damm+renesas@opensource.se>
Acked-by: NGeert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: NSimon Horman <horms+renesas@verge.net.au>

974b072f

ARM: mvebu: don't set the PL310 in I/O coherency mode when I/O coherency is disabled · dcad6887

由 Thomas Petazzoni 提交于 1月 28, 2015

Since commit f2c3c67f (merge commit that adds commit "ARM: mvebu:
completely disable hardware I/O coherency"), we disable I/O coherency
on Armada EBU platforms.

However, we continue to initialize the coherency fabric, because this
coherency fabric is needed on Armada XP for inter-CPU
coherency. Unfortunately, due to this, we also continued to execute
the coherency fabric initialization code for Armada 375/38x, which
switched the PL310 into I/O coherent mode. This has the effect of
disabling the outer cache sync operation: this is needed when I/O
coherency is enabled to work around a PCIe/L2 deadlock. But obviously,
when I/O coherency is disabled, having the outer cache sync operation
is crucial.

Therefore, this commit fixes the armada_375_380_coherency_init() so
that the PL310 is switched to I/O coherent mode only if I/O coherency
is enabled.

Without this fix, all devices using DMA are broken on Armada 375/38x.
Signed-off-by: NThomas Petazzoni <thomas.petazzoni@free-electrons.com>
Acked-by: NGregory CLEMENT <gregory.clement@free-electrons.com>
Tested-by: NGregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: NAndrew Lunn <andrew@lunn.ch>
Cc: <stable@vger.kernel.org> # v3.8+

dcad6887

28 1月, 2015 3 次提交

perf/x86/intel: Add model number for Airmont · ef454cae

由 Kan Liang 提交于 1月 22, 2015

Intel Airmont supports the same architectural and non-architectural
performance monitoring events as Silvermont.
Signed-off-by: NKan Liang <kan.liang@intel.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1421913053-99803-1-git-send-email-kan.liang@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

ef454cae

perf/rapl: Fix crash in rapl_scale() · 98b008df

由 Stephane Eranian 提交于 1月 22, 2015

This patch fixes a systematic crash in rapl_scale()
due to an invalid pointer.

The bug was introduced by commit:

  89cbc767 ("x86: Replace __get_cpu_var uses")

The fix is simple. Just put the parenthesis where it needs
to be, i.e., around rapl_pmu. To my surprise, the compiler
was not complaining about passing an integer instead of a
pointer.
Reported-by: NVince Weaver <vincent.weaver@maine.edu>
Tested-by: NVince Weaver <vincent.weaver@maine.edu>
Fixes: 89cbc767 ("x86: Replace __get_cpu_var uses")
Signed-off-by: NStephane Eranian <eranian@google.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: cl@linux.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20150122203834.GA10228@thinkpadSigned-off-by: NIngo Molnar <mingo@kernel.org>

98b008df

perf/x86/intel/uncore: Move uncore_box_init() out of driver initialization · c05199e5

由 Kan Liang 提交于 1月 20, 2015

There were some issues about the uncore driver tried to access
non-existing boxes, which caused boot crashes. These issues have
been all fixed. But we should avoid boot failures if that ever
happens again.

This patch intends to prevent this kind of potential issues.
It moves uncore_box_init out of driver initialization. The box
will be initialized when it's first enabled.
Signed-off-by: NKan Liang <kan.liang@intel.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1421729665-5912-1-git-send-email-kan.liang@intel.com
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: NIngo Molnar <mingo@kernel.org>

c05199e5

27 1月, 2015 1 次提交

x86, build: replace Perl script with Shell script · d69911a6

由 Kees Cook 提交于 1月 26, 2015

Commit e6023367 ("x86, kaslr: Prevent .bss from overlaping initrd")
added Perl to the required build environment.  This reimplements in
shell the Perl script used to find the size of the kernel with bss and
brk added.
Signed-off-by: NKees Cook <keescook@chromium.org>
Reported-by: NRob Landley <rob@landley.net>
Acked-by: NRob Landley <rob@landley.net>
Cc: Anca Emanuel <anca.emanuel@gmail.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Junjie Mao <eternal.n08@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d69911a6

26 1月, 2015 1 次提交

ARM: sunxi: dt: Fix aliases · 117a2cc3

由 Maxime Ripard 提交于 1月 22, 2015

Commit f77d55a3 ("serial: 8250_dw: get index of serial line from DT
aliases") made the serial driver now use the serial aliases to get the tty
number, pointing out that our aliases have been wrong all along.

Remove them from the DTSI and add custom ones in the relevant boards.
Signed-off-by: NMaxime Ripard <maxime.ripard@free-electrons.com>

117a2cc3

25 1月, 2015 1 次提交

ARM: dts: imx6sx: correct i.MX6sx sdb board enet phy address · 9143e398

由 Nimrod Andy 提交于 1月 20, 2015

The commit (3d125f9c) cause i.MX6SX sdb enet cannot work. The cause is
the commit add mdio node with un-correct phy address.

The patch just correct i.MX6sx sdb board enet phy address.

V2:
* As Shawn's suggestion that unit-address should match 'reg' property, so
  update ethernet-phy unit-address.
Acked-by: NStefan Agner <stefan@agner.ch>
Signed-off-by: NFugang Duan <B38611@freescale.com>
Acked-by: NShawn Guo <shawn.guo@linaro.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9143e398

24 1月, 2015 1 次提交

arm64: dts: add baud rate to Juno stdout-path · e0b21800

由 Robin Murphy 提交于 1月 22, 2015

Without explicit command-line parameters, the Juno UART ends up running
at 57600 baud in the kernel, which is at odds with the 115200 baud used
by the rest of the firmware. Since commit 7914a7c5 now lets us
fix this by specifying default options in stdout-path, do so.
Acked-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NRobin Murphy <robin.murphy@arm.com>
Signed-off-by: NOlof Johansson <olof@lixom.net>

e0b21800

23 1月, 2015 10 次提交

KVM: x86: SYSENTER emulation is broken · f3747379

由 Nadav Amit 提交于 1月 01, 2015

SYSENTER emulation is broken in several ways:
1. It misses the case of 16-bit code segments completely (CVE-2015-0239).
2. MSR_IA32_SYSENTER_CS is checked in 64-bit mode incorrectly (bits 0 and 1 can
   still be set without causing #GP).
3. MSR_IA32_SYSENTER_EIP and MSR_IA32_SYSENTER_ESP are not masked in
   legacy-mode.
4. There is some unneeded code.

Fix it.

Cc: stable@vger.linux.org
Signed-off-by: NNadav Amit <namit@cs.technion.ac.il>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

f3747379

KVM: x86: Fix of previously incomplete fix for CVE-2014-8480 · 63ea0a49

由 Nadav Amit 提交于 1月 08, 2015

STR and SLDT with rip-relative operand can cause a host kernel oops.
Mark them as DstMem as well.

Cc: stable@vger.linux.org
Signed-off-by: NNadav Amit <namit@cs.technion.ac.il>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

63ea0a49

arm64: dump: Fix implicit inclusion of definition for PCI_IOBASE · 284be285

由 Mark Brown 提交于 1月 22, 2015

Since c9465b4e (arm64: add support to dump the kernel page tables)
allmodconfig has failed to build on arm64 as a result of:

../arch/arm64/mm/dump.c:55:20: error: 'PCI_IOBASE' undeclared here (not in a function)

Fix this by explicitly including io.h to ensure that a definition is
present.
Signed-off-by: NMark Brown <broonie@kernel.org>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

284be285

x86/tsc: Change Fast TSC calibration failed from error to info · 52045217

由 Alexandre Demers 提交于 12月 09, 2014

Many users see this message when booting without knowning that it is
of no importance and that TSC calibration may have succeeded by
another way.

As explained by Paul Bolle in
http://lkml.kernel.org/r/1348488259.1436.22.camel@x61.thuisdomein

  "Fast TSC calibration failed" should not be considered as an error
  since other calibration methods are being tried afterward. At most,
  those send a warning if they fail (not an error). So let's change
  the message from error to warning.

[ tglx: Make if pr_info. It's really not important at all ]

Fixes: c767a54b x86/debug: Add KERN_<LEVEL> to bare printks, convert printks to pr_<level>
Signed-off-by: NAlexandre Demers <alexandre.f.demers@gmail.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1418106470-6906-1-git-send-email-alexandre.f.demers@gmail.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

52045217

x86/apic: Re-enable PCI_MSI support for non-SMP X86_32 · 38a1dfda

由 Bryan O'Donoghue 提交于 1月 22, 2015

Commit 0dbc6078 ('x86, build, pci: Fix PCI_MSI build on !SMP')
introduced the dependency that X86_UP_APIC is only available when
PCI_MSI is false. This effectively prevents PCI_MSI support on 32bit
UP systems because it disables both APIC and IO-APIC. But APIC support
is architecturally required for PCI_MSI.

The intention of the patch was to enforce APIC support when PCI_MSI is
enabled, but failed to do so.

Remove the !PCI_MSI dependency from X86_UP_APIC and enforce
X86_UP_APIC when PCI_MSI support is enabled on 32bit UP systems.

[ tglx: Massaged changelog ]

Fixes 0dbc6078 'x86, build, pci: Fix PCI_MSI build on !SMP'
Signed-off-by: NBryan O'Donoghue <pure.logic@nexus-software.ie>
Suggested-by: NThomas Gleixner <tglx@linutronix.de>
Reviewed-by: NAndy Shevchenko <andy.shevchenko@gmail.com>
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1421967529-9037-1-git-send-email-pure.logic@nexus-software.ieSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

38a1dfda

x86, mm: Change cachemode exports to non-gpl · 31bb7723

由 Juergen Gross 提交于 1月 22, 2015

Commit 281d4078 ("x86: Make page cache mode a real type")
introduced the symbols __cachemode2pte_tbl and __pte2cachemode_tbl and
exported them via EXPORT_SYMBOL_GPL. The exports are part of a
replacement of code which has been EXPORT_SYMBOL before these changes
resulting in build breakage of out-of-tree non-gpl modules.

Change EXPORT_SYMBOL_GPL to EXPORT-SYMBOL for these two symbols.

Fixes: 281d4078 "x86: Make page cache mode a real type"
Reported-and-tested-by: NSteven Noonan <steven@uplinklabs.net>
Signed-off-by: NJuergen Gross <jgross@suse.com>
Reviewed-by: NToshi Kani <toshi.kani@hp.com>
Link: http://lkml.kernel.org/r/1421926997-28615-1-git-send-email-jgross@suse.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

31bb7723

x86, tls: Interpret an all-zero struct user_desc as "no segment" · 3669ef9f

由 Andy Lutomirski 提交于 1月 22, 2015

The Witcher 2 did something like this to allocate a TLS segment index:

        struct user_desc u_info;
        bzero(&u_info, sizeof(u_info));
        u_info.entry_number = (uint32_t)-1;

        syscall(SYS_set_thread_area, &u_info);

Strictly speaking, this code was never correct.  It should have set
read_exec_only and seg_not_present to 1 to indicate that it wanted
to find a free slot without putting anything there, or it should
have put something sensible in the TLS slot if it wanted to allocate
a TLS entry for real.  The actual effect of this code was to
allocate a bogus segment that could be used to exploit espfix.

The set_thread_area hardening patches changed the behavior, causing
set_thread_area to return -EINVAL and crashing the game.

This changes set_thread_area to interpret this as a request to find
a free slot and to leave it empty, which isn't *quite* what the game
expects but should be close enough to keep it working.  In
particular, using the code above to allocate two segments will
allocate the same segment both times.

According to FrostbittenKing on Github, this fixes The Witcher 2.

If this somehow still causes problems, we could instead allocate
a limit==0 32-bit data segment, but that seems rather ugly to me.

Fixes: 41bdc785 x86/tls: Validate TLS entries to protect espfix
Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
Cc: stable@vger.kernel.org
Cc: torvalds@linux-foundation.org
Link: http://lkml.kernel.org/r/0cb251abe1ff0958b8e468a9a9a905b80ae3a746.1421954363.git.luto@amacapital.netSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

3669ef9f

x86, tls, ldt: Stop checking lm in LDT_empty · e30ab185

由 Andy Lutomirski 提交于 1月 22, 2015

32-bit programs don't have an lm bit in their ABI, so they can't
reliably cause LDT_empty to return true without resorting to memset.
They shouldn't need to do this.

This should fix a longstanding, if minor, issue in all 64-bit kernels
as well as a potential regression in the TLS hardening code.

Fixes: 41bdc785 x86/tls: Validate TLS entries to protect espfix
Cc: stable@vger.kernel.org
Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
Cc: torvalds@linux-foundation.org
Link: http://lkml.kernel.org/r/72a059de55e86ad5e2935c80aa91880ddf19d07c.1421954363.git.luto@amacapital.netSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

e30ab185

x86, mpx: Fix potential performance issue on unmaps · c922228e

由 Dave Hansen 提交于 1月 08, 2015

The 3.19 merge window saw some TLB modifications merged which caused a
performance regression. They were fixed in commit 045bbb9fa.

Once that fix was applied, I also noticed that there was a small
but intermittent regression still present.  It was not present
consistently enough to bisect reliably, but I'm fairly confident
that it came from (my own) MPX patches.  The source was reading
a relatively unused field in the mm_struct via arch_unmap.

I also noted that this code was in the main instruction flow of
do_munmap() and probably had more icache impact than we want.

This patch does two things:
1. Adds a static (via Kconfig) and dynamic (via cpuid) check
   for MPX with cpu_feature_enabled().  This keeps us from
   reading that cacheline in the mm and trades it for a check
   of the global CPUID variables at least on CPUs without MPX.
2. Adds an unlikely() to ensure that the MPX call ends up out
   of the main instruction flow in do_munmap().  I've added
   a detailed comment about why this was done and why we want
   it even on systems where MPX is present.
Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
Cc: luto@amacapital.net
Cc: Dave Hansen <dave@sr71.net>
Link: http://lkml.kernel.org/r/20150108223021.AEEAB987@viggo.jf.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

c922228e

x86, mpx: Explicitly disable 32-bit MPX support on 64-bit kernels · 814564a0

由 Dave Hansen 提交于 1月 08, 2015

We had originally planned on submitting MPX support in one patch
set.  We eventually broke it up in to two pieces for easier
review.  One of the features that didn't make the first round
was supporting 32-bit binaries on 64-bit kernels.

Once we split the set up, we never added code to restrict 32-bit
binaries from _using_ MPX on 64-bit kernels.

The 32-bit bounds tables are a different format than the 64-bit
ones.  Without this patch, the kernel will try to read a 32-bit
binary's tables as if they were the 64-bit version.  They will
likely be noticed as being invalid rather quickly and the app
will get killed, but that's kinda mean.

This patch adds an explicit check, and will make a 64-bit kernel
essentially behave as if it has no MPX support when called from
a 32-bit binary.
Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave@sr71.net>
Link: http://lkml.kernel.org/r/20150108223020.9E9AA511@viggo.jf.intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

814564a0

22 1月, 2015 2 次提交

nios2: fix kuser trampoline address · d24c8163

由 Ley Foon Tan 提交于 1月 22, 2015

__kuser_sigtramp address should be 0x1044 instead of 0x1040.
Signed-off-by: NLey Foon Tan <lftan@altera.com>

d24c8163

powerpc/powernv: Restore LPCR with LPCR_PECE1 cleared · 0eb13208

由 Shreyas B. Prabhu 提交于 1月 14, 2015

LPCR_PECE1 bit controls whether decrementer interrupts are allowed to
cause exit from power-saving mode. While waking up from winkle, restoring
LPCR with LPCR_PECE1 set (i.e Decrementer interrupts allowed) can cause
issue in the following scenario:

- All the threads in a core are offlined. The core enters deep winkle.
- Spurious interrupt wakes up a thread in the core. Here LPCR is restored
  with LPCR_PECE1 bit set.
- Since it was a spurious interrupt on a offline thread, the thread clears
  the interrupt and goes back to winkle.
- Here before the thread executes winkle and puts the core into deep winkle,
  if a decrementer interrupt occurs on any of the sibling threads in the core
  that thread wakes up.
- Since in offline loop we are flushing interrupt only in case of external
  interrupt, the decrementer interrupt does not get flushed. So at this stage
  the thread is stuck in this is loop of waking up at 0x100 due to decrementer
  interrupt, not flushing the interrupt as only external interrupts get flushed,
  entering winkle, waking up at 0x100 again.

Fix this by programming PORE to restore LPCR with LPCR_PECE1 bit
cleared when waking up from winkle.
Signed-off-by: NShreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

0eb13208

21 1月, 2015 2 次提交

ARM: 8292/1: mm: fix size rounding-down of arm_add_memory() function · 909ba297

由 Masahiro Yamada 提交于 1月 20, 2015

The current rounding of "size" is wrong:

 - If "start" is sufficiently near the next page boundary, "size"
   is decremented by more than enough and the last page is lost.

 - If "size" is sufficiently small, it is wrapped around and gets
   a bogus value.
Signed-off-by: NMasahiro Yamada <yamada.m@jp.panasonic.com>
Acked-by: NGeert Uytterhoeven <geert@linux-m68k.org>
Acked-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>

909ba297

ARM: dts: sun4i: Add simplefb node with de_fe0-de_be0-lcd0-hdmi pipeline · 8cedd662

由 Hans de Goede 提交于 1月 19, 2015

Testing has shown that on sun4i the display backend engine does not have
deep enough fifo-s causing flickering / tearing in full-hd mode due to
fifo underruns. This can be avoided by letting the display frontend engine
do the dma from memory, and then letting it feed the data directly into
the backend unmodified, as the frontend does have deep enough fifo-s.

Note since u-boot-v2015.01 has been released using the de_be0-lcd0-hdmi
pipeline on sun4i, we need to keep that one around too (unfortunately).
Signed-off-by: NHans de Goede <hdegoede@redhat.com>
Signed-off-by: NMaxime Ripard <maxime.ripard@free-electrons.com>

8cedd662

20 1月, 2015 5 次提交

x86, hyperv: Mark the Hyper-V clocksource as being continuous · 32c6590d

由 K. Y. Srinivasan 提交于 1月 12, 2015

The Hyper-V clocksource is continuous; mark it accordingly.
Signed-off-by: NK. Y. Srinivasan <kys@microsoft.com>
Acked-by: jasowang@redhat.com
Cc: gregkh@linuxfoundation.org
Cc: devel@linuxdriverproject.org
Cc: olaf@aepfle.de
Cc: apw@canonical.com
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1421108762-3331-1-git-send-email-kys@microsoft.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

32c6590d

x86: Don't rely on VMWare emulating PAT MSR correctly · 9d34cfdf

由 Juergen Gross 提交于 1月 12, 2015

VMWare seems not to emulate the PAT MSR correctly: reaeding
MSR_IA32_CR_PAT returns 0 even after writing another value to it.

Commit bd809af1 triggers this VMWare bug when the kernel is
booted as a VMWare guest.

Detect this bug and don't use the read value if it is 0.

Fixes: bd809af1 "x86: Enable PAT to use cache mode translation tables"
Reported-and-tested-by: NJongman Heo <jongman.heo@samsung.com>
Acked-by: NAlok N Kataria <akataria@vmware.com>
Signed-off-by: NJuergen Gross <jgross@suse.com>
Link: http://lkml.kernel.org/r/1421039745-14335-1-git-send-email-jgross@suse.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

9d34cfdf

x86, irq: Properly tag virtualization entry in /proc/interrupts · 4a0d3107

由 Jan Beulich 提交于 1月 16, 2015

The mis-naming likely was a copy-and-paste effect.
Signed-off-by: NJan Beulich <jbeulich@suse.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/54B9408B0200007800055E8B@mail.emea.novell.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

4a0d3107

x86, boot: Skip relocs when load address unchanged · f285f4a2

由 Kees Cook 提交于 1月 15, 2015

On 64-bit, relocation is not required unless the load address gets
changed. Without this, relocations do unexpected things when the kernel
is above 4G.
Reported-by: NBaoquan He <bhe@redhat.com>
Signed-off-by: NKees Cook <keescook@chromium.org>
Tested-by: NThomas D. <whissi@whissi.de>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Jan Beulich <JBeulich@suse.com>
Cc: Junjie Mao <eternal.n08@gmail.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20150116005146.GA4212@www.outflux.netSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

f285f4a2

arm64: Add dtb files to archclean rule · c7c52e48

由 Jungseok Lee 提交于 1月 19, 2015

As dts files have been reorganised under vendor subdirs, dtb files
cannot be removed with "make distclean" now. Thus, this patch moves
dtb files under archclean rule and removes unnecessary entries.

Cc: Robert Richter <rrichter@cavium.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: NJungseok Lee <jungseoklee85@gmail.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

c7c52e48

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功