提交 · 40d3f02851577da27b5cbb1538888301245ef1e7 · openanolis / cloud-kernel

06 7月, 2015 2 次提交

x86/espfix: Init espfix on the boot CPU side · 20d5e4a9

由 Zhu Guihua 提交于 7月 03, 2015

As we alloc pages with GFP_KERNEL in init_espfix_ap() which is
called before we enable local irqs, so the lockdep sub-system
would (correctly) trigger a warning about the potentially
blocking API.

So we allocate them on the boot CPU side when the secondary CPU is
brought up by the boot CPU, and hand them over to the secondary
CPU.

And we use alloc_pages_node() with the secondary CPU's node, to
make sure the espfix stack is NUMA-local to the CPU that is
going to use it.
Signed-off-by: NZhu Guihua <zhugh.fnst@cn.fujitsu.com>
Cc: <bp@alien8.de>
Cc: <luto@amacapital.net>
Cc: <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/c97add2670e9abebb90095369f0cfc172373ac94.1435824469.git.zhugh.fnst@cn.fujitsu.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

20d5e4a9

x86/espfix: Add 'cpu' parameter to init_espfix_ap() · 1db87563

由 Zhu Guihua 提交于 7月 03, 2015

Add a CPU index parameter to init_espfix_ap(), so that the
parameter could be propagated to the function for espfix
page allocation.
Signed-off-by: NZhu Guihua <zhugh.fnst@cn.fujitsu.com>
Cc: <bp@alien8.de>
Cc: <luto@amacapital.net>
Cc: <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/cde3fcf1b3211f3f03feb1a995bce3fee850f0fc.1435824469.git.zhugh.fnst@cn.fujitsu.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

1db87563

12 11月, 2014 1 次提交

x86, espfix: Remove stale ptemask · 585e4777

由 Borislav Petkov 提交于 7月 14, 2014

Previous versions of the espfix had a single function which did setup
the pagetables. It was later split into BSP and AP version. Drop unused
leftovers after that split.
Signed-off-by: NBorislav Petkov <bp@suse.de>

585e4777

15 7月, 2014 1 次提交

x86/espfix/xen: Fix allocation of pages for paravirt page tables · 8762e509

由 Boris Ostrovsky 提交于 7月 09, 2014

init_espfix_ap() is currently off by one level when informing hypervisor
that allocated pages will be used for ministacks' page tables.

The most immediate effect of this on a PV guest is that if
'stack_page = __get_free_page()' returns a non-zeroed-out page the hypervisor
will refuse to use it for a page table (which it shouldn't be anyway). This will
result in warnings by both Xen and Linux.

More importantly, a subsequent write to that page (again, by a PV guest) is
likely to result in fatal page fault.
Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
Link: http://lkml.kernel.org/r/1404926298-5565-1-git-send-email-boris.ostrovsky@oracle.comReviewed-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
Cc: <stable@vger.kernel.org>

8762e509

02 5月, 2014 1 次提交

x86, espfix: Move espfix definitions into a separate header file · e1fe9ed8

由 H. Peter Anvin 提交于 5月 01, 2014

Sparse warns that the percpu variables aren't declared before they are
defined.  Rather than hacking around it, move espfix definitions into
a proper header file.
Reported-by: NFengguang Wu <fengguang.wu@intel.com>
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>

e1fe9ed8

01 5月, 2014 1 次提交

x86-64, espfix: Don't leak bits 31:16 of %esp returning to 16-bit stack · 3891a04a

由 H. Peter Anvin 提交于 4月 29, 2014

The IRET instruction, when returning to a 16-bit segment, only
restores the bottom 16 bits of the user space stack pointer.  This
causes some 16-bit software to break, but it also leaks kernel state
to user space.  We have a software workaround for that ("espfix") for
the 32-bit kernel, but it relies on a nonzero stack segment base which
is not available in 64-bit mode.

In checkin:

    b3b42ac2 x86-64, modify_ldt: Ban 16-bit segments on 64-bit kernels

we "solved" this by forbidding 16-bit segments on 64-bit kernels, with
the logic that 16-bit support is crippled on 64-bit kernels anyway (no
V86 support), but it turns out that people are doing stuff like
running old Win16 binaries under Wine and expect it to work.

This works around this by creating percpu "ministacks", each of which
is mapped 2^16 times 64K apart.  When we detect that the return SS is
on the LDT, we copy the IRET frame to the ministack and use the
relevant alias to return to userspace.  The ministacks are mapped
readonly, so if IRET faults we promote #GP to #DF which is an IST
vector and thus has its own stack; we then do the fixup in the #DF
handler.

(Making #GP an IST exception would make the msr_safe functions unsafe
in NMI/MC context, and quite possibly have other effects.)

Special thanks to:

- Andy Lutomirski, for the suggestion of using very small stack slots
  and copy (as opposed to map) the IRET frame there, and for the
  suggestion to mark them readonly and let the fault promote to #DF.
- Konrad Wilk for paravirt fixup and testing.
- Borislav Petkov for testing help and useful comments.
Reported-by: NBrian Gerst <brgerst@gmail.com>
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1398816946-3351-1-git-send-email-hpa@linux.intel.com
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andrew Lutomriski <amluto@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Dirk Hohndel <dirk@hohndel.org>
Cc: Arjan van de Ven <arjan.van.de.ven@intel.com>
Cc: comex <comexk@gmail.com>
Cc: Alexander van Heukelum <heukelum@fastmail.fm>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: <stable@vger.kernel.org> # consider after upstream merge

3891a04a

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功