提交 · e27b72df01109c689062caeba1defa013b759e0e · openeuler / Kernel

20 8月, 2015 18 次提交

xen/PMU: Describe vendor-specific PMU registers · e27b72df

由 Boris Ostrovsky 提交于 8月 10, 2015

AMD and Intel PMU register initialization and helpers that determine
whether a register belongs to PMU.

This and some of subsequent PMU emulation code is somewhat similar to
Xen's PMU implementation.
Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: NDavid Vrabel <david.vrabel@citrix.com>
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>

e27b72df

xen/PMU: Initialization code for Xen PMU · 65d0cf0b

由 Boris Ostrovsky 提交于 8月 10, 2015

Map shared data structure that will hold CPU registers, VPMU context,
V/PCPU IDs of the CPU interrupted by PMU interrupt. Hypervisor fills
this information in its handler and passes it to the guest for further
processing.

Set up PMU VIRQ.

Now that perf infrastructure will assume that PMU is available on a PV
guest we need to be careful and make sure that accesses via RDPMC
instruction don't cause fatal traps by the hypervisor. Provide a nop
RDPMC handler.

For the same reason avoid issuing a warning on a write to APIC's LVTPC.

Both of these will be made functional in later patches.
Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: NDavid Vrabel <david.vrabel@citrix.com>
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>

65d0cf0b

xen/PMU: Sysfs interface for setting Xen PMU mode · 5f141548

由 Boris Ostrovsky 提交于 8月 10, 2015

Set Xen's PMU mode via /sys/hypervisor/pmu/pmu_mode. Add XENPMU hypercall.
Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>

5f141548

xen: remove no longer needed p2m.h · cb3eb850

由 Juergen Gross 提交于 7月 17, 2015

Cleanup by removing arch/x86/xen/p2m.h as it isn't needed any more.

Most definitions in this file are used in p2m.c only. Move those into
p2m.c.

set_phys_range_identity() is already declared in
arch/x86/include/asm/xen/page.h, add __init annotation there.

MAX_REMAP_RANGES isn't used at all, just delete it.

The only define left is P2M_PER_PAGE which is moved to page.h as well.
Signed-off-by: NJuergen Gross <jgross@suse.com>
Acked-by: NKonrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>

cb3eb850

xen: allow more than 512 GB of RAM for 64 bit pv-domains · c70727a5

由 Juergen Gross 提交于 7月 17, 2015

64 bit pv-domains under Xen are limited to 512 GB of RAM today. The
main reason has been the 3 level p2m tree, which was replaced by the
virtual mapped linear p2m list. Parallel to the p2m list which is
being used by the kernel itself there is a 3 level mfn tree for usage
by the Xen tools and eventually for crash dump analysis. For this tree
the linear p2m list can serve as a replacement, too. As the kernel
can't know whether the tools are capable of dealing with the p2m list
instead of the mfn tree, the limit of 512 GB can't be dropped in all
cases.

This patch replaces the hard limit by a kernel parameter which tells
the kernel to obey the 512 GB limit or not. The default is selected by
a configuration parameter which specifies whether the 512 GB limit
should be active per default for domUs (domain save/restore/migration
and crash dump analysis are affected).

Memory above the domain limit is returned to the hypervisor instead of
being identity mapped, which was wrong anyway.

The kernel configuration parameter to specify the maximum size of a
domain can be deleted, as it is not relevant any more.
Signed-off-by: NJuergen Gross <jgross@suse.com>
Acked-by: NKonrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>

c70727a5

xen: move p2m list if conflicting with e820 map · 70e61199

由 Juergen Gross 提交于 7月 17, 2015

Check whether the hypervisor supplied p2m list is placed at a location
which is conflicting with the target E820 map. If this is the case
relocate it to a new area unused up to now and compliant to the E820
map.

As the p2m list might by huge (up to several GB) and is required to be
mapped virtually, set up a temporary mapping for the copied list.

For pvh domains just delete the p2m related information from start
info instead of reserving the p2m memory, as we don't need it at all.

For 32 bit kernels adjust the memblock_reserve() parameters in order
to cover the page tables only. This requires to memblock_reserve() the
start_info page on it's own.
Signed-off-by: NJuergen Gross <jgross@suse.com>
Acked-by: NKonrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>

70e61199

xen: add explicit memblock_reserve() calls for special pages · 6c2681c8

由 Juergen Gross 提交于 7月 17, 2015

Some special pages containing interfaces to xen are being reserved
implicitly only today. The memblock_reserve() call to reserve them is
meant to reserve the p2m list supplied by xen. It is just reserving
not only the p2m list itself, but some more pages up to the start of
the xen built page tables.

To be able to move the p2m list to another pfn range, which is needed
for support of huge RAM, this memblock_reserve() must be split up to
cover all affected reserved pages explicitly.

The affected pages are:
- start_info page
- xenstore ring (might be missing, mfn is 0 in this case)
- console ring (not for initial domain)
Signed-off-by: NJuergen Gross <jgross@suse.com>
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>

6c2681c8

xen: check for initrd conflicting with e820 map · 4b9c1537

由 Juergen Gross 提交于 7月 17, 2015

Check whether the initrd is placed at a location which is conflicting
with the target E820 map. If this is the case relocate it to a new
area unused up to now and compliant to the E820 map.
Signed-off-by: NJuergen Gross <jgross@suse.com>
Reviewed-by: NDavid Vrabel <david.vrabel@citrix.com>
Acked-by: NKonrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>

4b9c1537

xen: check pre-allocated page tables for conflict with memory map · 04414baa

由 Juergen Gross 提交于 7月 17, 2015

Check whether the page tables built by the domain builder are at
memory addresses which are in conflict with the target memory map.
If this is the case just panic instead of running into problems
later.
Signed-off-by: NJuergen Gross <jgross@suse.com>
Acked-by: NKonrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>

04414baa

xen: check for kernel memory conflicting with memory layout · 808fdb71

由 Juergen Gross 提交于 7月 17, 2015

Checks whether the pre-allocated memory of the loaded kernel is in
conflict with the target memory map. If this is the case, just panic
instead of run into problems later, as there is nothing we can do
to repair this situation.
Signed-off-by: NJuergen Gross <jgross@suse.com>
Reviewed-by: NDavid Vrabel <david.vrabel@citrix.com>
Acked-by: NKonrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>

808fdb71

xen: find unused contiguous memory area · 9ddac5b7

由 Juergen Gross 提交于 7月 17, 2015

For being able to relocate pre-allocated data areas like initrd or
p2m list it is mandatory to find a contiguous memory area which is
not yet in use and doesn't conflict with the memory map we want to
be in effect.

In case such an area is found reserve it at once as this will be
required to be done in any case.
Signed-off-by: NJuergen Gross <jgross@suse.com>
Reviewed-by: NDavid Vrabel <david.vrabel@citrix.com>
Acked-by: NKonrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>

9ddac5b7

xen: check memory area against e820 map · e612b4a7

由 Juergen Gross 提交于 7月 17, 2015

Provide a service routine to check a physical memory area against the
E820 map. The routine will return false if the complete area is RAM
according to the E820 map and true otherwise.
Signed-off-by: NJuergen Gross <jgross@suse.com>
Reviewed-by: NDavid Vrabel <david.vrabel@citrix.com>
Acked-by: NKonrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>

e612b4a7

xen: split counting of extra memory pages from remapping · 5097cdf6

由 Juergen Gross 提交于 7月 17, 2015

Memory pages in the initial memory setup done by the Xen hypervisor
conflicting with the target E820 map are remapped. In order to do this
those pages are counted and remapped in xen_set_identity_and_remap().

Split the counting from the remapping operation to be able to setup
the needed memory sizes in time but doing the remap operation at a
later time. This enables us to simplify the interface to
xen_set_identity_and_remap() as the number of remapped and released
pages is no longer needed here.

Finally move the remapping further down to prepare relocating
conflicting memory contents before the memory might be clobbered by
xen_set_identity_and_remap(). This requires to not destroy the Xen
E820 map when the one for the system is being constructed.
Signed-off-by: NJuergen Gross <jgross@suse.com>
Acked-by: NKonrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>

5097cdf6

xen: move static e820 map to global scope · 69632ecf

由 Juergen Gross 提交于 7月 17, 2015

Instead of using a function local static e820 map in xen_memory_setup()
and calling various functions in the same source with the map as a
parameter use a map directly accessible by all functions in the source.
Signed-off-by: NJuergen Gross <jgross@suse.com>
Reviewed-by: NDavid Vrabel <david.vrabel@citrix.com>
Acked-by: NKonrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>

69632ecf

xen: eliminate scalability issues from initial mapping setup · 8f5b0c63

由 Juergen Gross 提交于 7月 17, 2015

Direct Xen to place the initial P->M table outside of the initial
mapping, as otherwise the 1G (implementation) / 2G (theoretical)
restriction on the size of the initial mapping limits the amount
of memory a domain can be handed initially.

As the initial P->M table is copied rather early during boot to
domain private memory and it's initial virtual mapping is dropped,
the easiest way to avoid virtual address conflicts with other
addresses in the kernel is to use a user address area for the
virtual address of the initial P->M table. This allows us to just
throw away the page tables of the initial mapping after the copy
without having to care about address invalidation.

It should be noted that this patch won't enable a pv-domain to USE
more than 512 GB of RAM. It just enables it to be started with a
P->M table covering more memory. This is especially important for
being able to boot a Dom0 on a system with more than 512 GB memory.
Signed-off-by: NJuergen Gross <jgross@suse.com>
Based-on-patch-by: NJan Beulich <jbeulich@suse.com>
Acked-by: NKonrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>

8f5b0c63

xen: don't build mfn tree if tools don't need it · d51e8b3e

由 Juergen Gross 提交于 7月 17, 2015

In case the Xen tools indicate they don't need the p2m 3 level tree
as they support the virtual mapped linear p2m list, just omit building
the tree.
Signed-off-by: NJuergen Gross <jgross@suse.com>
Reviewed-by: NDavid Vrabel <david.vrabel@citrix.com>
Acked-by: NKonrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>

d51e8b3e

xen: save linear p2m list address in shared info structure · 4b9c9a11

由 Juergen Gross 提交于 7月 17, 2015

The virtual address of the linear p2m list should be stored in the
shared info structure read by the Xen tools to be able to support
64 bit pv-domains larger than 512 GB. Additionally the linear p2m
list interface includes a generation count which is changed prior
to and after each mapping change of the p2m list. Reading the
generation count the Xen tools can detect changes of the mappings
and re-read the p2m list eventually.
Signed-off-by: NJuergen Gross <jgross@suse.com>
Reviewed-by: NDavid Vrabel <david.vrabel@citrix.com>
Acked-by: NKonrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>

4b9c9a11

x86/xen: fix non-ANSI declaration of xen_has_pv_devices() · 772f95e3

由 Colin Ian King 提交于 7月 16, 2015

xen_has_pv_devices() has no parameters, so use the normal void
parameter convention to make it match the prototype in the header file
include/xen/platform_pci.h.
Signed-off-by: NColin Ian King <colin.king@canonical.com>
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>

772f95e3

10 8月, 2015 1 次提交

x86/xen: build "Xen PV" APIC driver for domU as well · fc5fee86

由 Jason A. Donenfeld 提交于 8月 10, 2015

It turns out that a PV domU also requires the "Xen PV" APIC
driver. Otherwise, the flat driver is used and we get stuck in busy
loops that never exit, such as in this stack trace:

(gdb) target remote localhost:9999
Remote debugging using localhost:9999
__xapic_wait_icr_idle () at ./arch/x86/include/asm/ipi.h:56
56              while (native_apic_mem_read(APIC_ICR) & APIC_ICR_BUSY)
(gdb) bt
 #0  __xapic_wait_icr_idle () at ./arch/x86/include/asm/ipi.h:56
 #1  __default_send_IPI_shortcut (shortcut=<optimized out>,
dest=<optimized out>, vector=<optimized out>) at
./arch/x86/include/asm/ipi.h:75
 #2  apic_send_IPI_self (vector=246) at arch/x86/kernel/apic/probe_64.c:54
 #3  0xffffffff81011336 in arch_irq_work_raise () at
arch/x86/kernel/irq_work.c:47
 #4  0xffffffff8114990c in irq_work_queue (work=0xffff88000fc0e400) at
kernel/irq_work.c:100
 #5  0xffffffff8110c29d in wake_up_klogd () at kernel/printk/printk.c:2633
 #6  0xffffffff8110ca60 in vprintk_emit (facility=0, level=<optimized
out>, dict=0x0 <irq_stack_union>, dictlen=<optimized out>,
fmt=<optimized out>, args=<optimized out>)
    at kernel/printk/printk.c:1778
 #7  0xffffffff816010c8 in printk (fmt=<optimized out>) at
kernel/printk/printk.c:1868
 #8  0xffffffffc00013ea in ?? ()
 #9  0x0000000000000000 in ?? ()

Mailing-list-thread: https://lkml.org/lkml/2015/8/4/755Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>

fc5fee86

31 7月, 2015 1 次提交

x86/xen: Probe target addresses in set_aliased_prot() before the hypercall · aa1acff3

由 Andy Lutomirski 提交于 7月 30, 2015

The update_va_mapping hypercall can fail if the VA isn't present
in the guest's page tables.  Under certain loads, this can
result in an OOPS when the target address is in unpopulated vmap
space.

While we're at it, add comments to help explain what's going on.

This isn't a great long-term fix.  This code should probably be
changed to use something like set_memory_ro.
Signed-off-by: NAndy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: David Vrabel <dvrabel@cantab.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: security@kernel.org <security@kernel.org>
Cc: <stable@vger.kernel.org>
Cc: xen-devel <xen-devel@lists.xen.org>
Link: http://lkml.kernel.org/r/0b0e55b995cda11e7829f140b833ef932fcabe3a.1438291540.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

aa1acff3

08 6月, 2015 3 次提交

x86/asm/entry: Untangle 'system_call' into two entry points: entry_SYSCALL_64 and entry_INT80_32 · b2502b41

由 Ingo Molnar 提交于 6月 08, 2015

The 'system_call' entry points differ starkly between native 32-bit and 64-bit
kernels: on 32-bit kernels it defines the INT 0x80 entry point, while on
64-bit it's the SYSCALL entry point.

This is pretty confusing when looking at generic code, and it also obscures
the nature of the entry point at the assembly level.

So unangle this by splitting the name into its two uses:

	system_call (32) -> entry_INT80_32
	system_call (64) -> entry_SYSCALL_64

As per the generic naming scheme for x86 system call entry points:

	entry_MNEMONIC_qualifier

where 'qualifier' is one of _32, _64 or _compat.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: NIngo Molnar <mingo@kernel.org>

b2502b41

x86/asm/entry: Untangle 'ia32_sysenter_target' into two entry points:... · 4c8cd0c5

由 Ingo Molnar 提交于 6月 08, 2015

x86/asm/entry: Untangle 'ia32_sysenter_target' into two entry points: entry_SYSENTER_32 and entry_SYSENTER_compat

So the SYSENTER instruction is pretty quirky and it has different behavior
depending on bitness and CPU maker.

Yet we create a false sense of coherency by naming it 'ia32_sysenter_target'
in both of the cases.

Split the name into its two uses:

	ia32_sysenter_target (32)    -> entry_SYSENTER_32
	ia32_sysenter_target (64)    -> entry_SYSENTER_compat

As per the generic naming scheme for x86 system call entry points:

	entry_MNEMONIC_qualifier

where 'qualifier' is one of _32, _64 or _compat.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: NIngo Molnar <mingo@kernel.org>

4c8cd0c5

x86/asm/entry: Rename compat syscall entry points · 2cd23553

由 Ingo Molnar 提交于 6月 08, 2015

Rename the following system call entry points:

	ia32_cstar_target       -> entry_SYSCALL_compat
	ia32_syscall            -> entry_INT80_compat

The generic naming scheme for x86 system call entry points is:

	entry_MNEMONIC_qualifier

where 'qualifier' is one of _32, _64 or _compat.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: NIngo Molnar <mingo@kernel.org>

2cd23553

07 6月, 2015 1 次提交

x86/mm/pat: Emulate PAT when it is disabled · 9cd25aac

由 Borislav Petkov 提交于 6月 04, 2015

In the case when PAT is disabled on the command line with
"nopat" or when virtualization doesn't support PAT (correctly) -
see

  9d34cfdf ("x86: Don't rely on VMWare emulating PAT MSR correctly").

we emulate it using the PWT and PCD cache attribute bits. Get
rid of boot_pat_state while at it.

Based on a conglomerate patch from Toshi Kani.
Signed-off-by: NBorislav Petkov <bp@suse.de>
Reviewed-by: NToshi Kani <toshi.kani@hp.com>
Acked-by: NJuergen Gross <jgross@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Elliott@hp.com
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: arnd@arndb.de
Cc: hch@lst.de
Cc: hmh@hmh.eng.br
Cc: konrad.wilk@oracle.com
Cc: linux-mm <linux-mm@kvack.org>
Cc: linux-nvdimm@lists.01.org
Cc: stefan.bader@canonical.com
Cc: yigal@plexistor.com
Link: http://lkml.kernel.org/r/1433436928-31903-3-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>

9cd25aac

03 6月, 2015 1 次提交

x86/mm: Decouple <linux/vmalloc.h> from <asm/io.h> · d6472302

由 Stephen Rothwell 提交于 6月 02, 2015

Nothing in <asm/io.h> uses anything from <linux/vmalloc.h>, so
remove it from there and fix up the resulting build problems
triggered on x86 {64|32}-bit {def|allmod|allno}configs.

The breakages were triggering in places where x86 builds relied
on vmalloc() facilities but did not include <linux/vmalloc.h>
explicitly and relied on the implicit inclusion via <asm/io.h>.

Also add:

  - <linux/init.h> to <linux/io.h>
  - <asm/pgtable_types> to <asm/io.h>

... which were two other implicit header file dependencies.
Suggested-by: NDavid Miller <davem@davemloft.net>
Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
[ Tidied up the changelog. ]
Acked-by: NDavid Miller <davem@davemloft.net>
Acked-by: NTakashi Iwai <tiwai@suse.de>
Acked-by: NViresh Kumar <viresh.kumar@linaro.org>
Acked-by: NVinod Koul <vinod.koul@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Anton Vorontsov <anton@enomsg.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Colin Cross <ccross@android.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: James E.J. Bottomley <JBottomley@odin.com>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Kristen Carlson Accardi <kristen@linux.intel.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Suma Ramars <sramars@cisco.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: NIngo Molnar <mingo@kernel.org>

d6472302

19 5月, 2015 2 次提交

x86/fpu: Simplify fpu__cpu_init() · 21c4cd10

由 Ingo Molnar 提交于 4月 26, 2015

After the latest round of cleanups, fpu__cpu_init() has become
a simple call to fpu__init_cpu().

Rename fpu__init_cpu() to fpu__cpu_init() and remove the
extra layer.
Reviewed-by: NBorislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: NIngo Molnar <mingo@kernel.org>

21c4cd10

x86/fpu: Rename fpu_init() to fpu__cpu_init() · 3a9c4b0d

由 Ingo Molnar 提交于 4月 03, 2015

fpu_init() is a bit of a misnomer in that it (falsely) creates the
impression that it's related to the (old) fpu_finit() function,
which initializes FPU ctx state.

Rename it to fpu__cpu_init() to make its boot time initialization
clear, and to move it to the fpu__*() namespace.

Also fix and extend its comment block to point out that it's
called not only on the boot CPU, but on secondary CPUs as well.
Reviewed-by: NBorislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: NIngo Molnar <mingo@kernel.org>

3a9c4b0d

11 5月, 2015 1 次提交

locking/pvqspinlock: Rename QUEUED_SPINLOCK to QUEUED_SPINLOCKS · 62c7a1e9

由 Ingo Molnar 提交于 5月 11, 2015

Valentin Rothberg reported that we use CONFIG_QUEUED_SPINLOCKS
in arch/x86/kernel/paravirt_patch_32.c, while the symbol is
called CONFIG_QUEUED_SPINLOCK. (Note the extra 'S')

But the typo was natural: the proper English term for such
a generic object would be 'queued spinlocks' - so rename
this and related symbols accordingly to the plural form.
Reported-by: NValentin Rothberg <valentinrothberg@gmail.com>
Cc: Douglas Hatch <doug.hatch@hp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Scott J Norton <scott.norton@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman Long <Waiman.Long@hp.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: NIngo Molnar <mingo@kernel.org>

62c7a1e9

08 5月, 2015 3 次提交

x86/entry: Define 'cpu_current_top_of_stack' for 64-bit code · 3a23208e

由 Denys Vlasenko 提交于 4月 24, 2015

32-bit code has PER_CPU_VAR(cpu_current_top_of_stack).
64-bit code uses somewhat more obscure: PER_CPU_VAR(cpu_tss + TSS_sp0).

Define the 'cpu_current_top_of_stack' macro on CONFIG_X86_64
as well so that the PER_CPU_VAR(cpu_current_top_of_stack)
expression can be used in both 32-bit and 64-bit code.
Signed-off-by: NDenys Vlasenko <dvlasenk@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1429889495-27850-3-git-send-email-dvlasenk@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

3a23208e

x86/entry: Stop using PER_CPU_VAR(kernel_stack) · 63332a84

由 Denys Vlasenko 提交于 4月 24, 2015

PER_CPU_VAR(kernel_stack) is redundant:

  - On the 64-bit build, we can use PER_CPU_VAR(cpu_tss + TSS_sp0).
  - On the 32-bit build, we can use PER_CPU_VAR(cpu_current_top_of_stack).

PER_CPU_VAR(kernel_stack) will be deleted by a separate change.
Signed-off-by: NDenys Vlasenko <dvlasenk@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1429889495-27850-1-git-send-email-dvlasenk@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

63332a84

locking/pvqspinlock, x86: Enable PV qspinlock for Xen · e95e6f17

由 David Vrabel 提交于 4月 24, 2015

This patch adds the necessary Xen specific code to allow Xen to
support the CPU halting and kicking operations needed by the queue
spinlock PV code.
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
Signed-off-by: NWaiman Long <Waiman.Long@hp.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Daniel J Blueman <daniel@numascale.com>
Cc: Douglas Hatch <doug.hatch@hp.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paolo Bonzini <paolo.bonzini@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Scott J Norton <scott.norton@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1429901803-29771-12-git-send-email-Waiman.Long@hp.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

e95e6f17

06 5月, 2015 1 次提交

hypervisor/x86/xen: Unset X86_BUG_SYSRET_SS_ATTRS on Xen PV guests · a71dbdaa

由 Boris Ostrovsky 提交于 5月 04, 2015

Commit 61f01dd9 ("x86_64, asm: Work around AMD SYSRET SS descriptor
attribute issue") makes AMD processors set SS to __KERNEL_DS in
__switch_to() to deal with cases when SS is NULL.

This breaks Xen PV guests who do not want to load SS with__KERNEL_DS.

Since the problem that the commit is trying to address would have to be
fixed in the hypervisor (if it in fact exists under Xen) there is no
reason to set X86_BUG_SYSRET_SS_ATTRS flag for PV VPCUs here.

This can be easily achieved by adding x86_hyper_xen_hvm.set_cpu_features
op which will clear this flag. (And since this structure is no longer
HVM-specific we should do some renaming).
Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
Reported-by: NSander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>

a71dbdaa

30 4月, 2015 1 次提交

xen: Suspend ticks on all CPUs during suspend · 2b953a5e

由 Boris Ostrovsky 提交于 4月 28, 2015

Commit 77e32c89 ("clockevents: Manage device's state separately for
the core") decouples clockevent device's modes from states. With this
change when a Xen guest tries to resume, it won't be calling its
set_mode op which needs to be done on each VCPU in order to make the
hypervisor aware that we are in oneshot mode.

This happens because clockevents_tick_resume() (which is an intermediate
step of resuming ticks on a processor) doesn't call clockevents_set_state()
anymore and because during suspend clockevent devices on all VCPUs (except
for the one doing the suspend) are left in ONESHOT state. As result, during
resume the clockevents state machine will assume that device is already
where it should be and doesn't need to be updated.

To avoid this problem we should suspend ticks on all VCPUs during
suspend.
Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>

2b953a5e

22 4月, 2015 1 次提交

x86, paravirt, xen: Remove the 64-bit ->irq_enable_sysexit() pvop · aac82d31

由 Andy Lutomirski 提交于 4月 03, 2015

We don't use irq_enable_sysexit on 64-bit kernels any more.
Remove all the paravirt and Xen machinery to support it on
64-bit kernels.
Tested-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: NAndy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/8a03355698fe5b94194e9e7360f19f91c1b2cf1f.1428100853.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

aac82d31

15 4月, 2015 1 次提交

x86: expose number of page table levels on Kconfig level · 98233368

由 Kirill A. Shutemov 提交于 4月 14, 2015

We would want to use number of page table level to define mm_struct.
Let's expose it as CONFIG_PGTABLE_LEVELS.
Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Tested-by: NGuenter Roeck <linux@roeck-us.net>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

98233368

02 4月, 2015 1 次提交

x86/cpu: Factor out common CPU initialization code, fix 32-bit Xen PV guests · 3f85483b

由 Boris Ostrovsky 提交于 4月 01, 2015

Some of x86 bare-metal and Xen CPU initialization code is common
between the two and therefore can be factored out to avoid code
duplication.

As a side effect, doing so will also extend the fix provided by
commit a7fcf28d ("x86/asm/entry: Replace this_cpu_sp0() with
current_top_of_stack() to x86_32") to 32-bit Xen PV guests.
Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: NDavid Vrabel <david.vrabel@citrix.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: konrad.wilk@oracle.com
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1427897534-5086-1-git-send-email-boris.ostrovsky@oracle.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

3f85483b

01 4月, 2015 2 次提交

tick/xen: Provide and use tick_suspend_local() and tick_resume_local() · f46481d0

由 Thomas Gleixner 提交于 3月 25, 2015

Xen calls on every cpu into tick_resume() which is just wrong.
tick_resume() is for the syscore global suspend/resume
invocation. What XEN really wants is a per cpu local resume
function.

Provide a tick_resume_local() function and use it in XEN.

Also provide a complementary tick_suspend_local() and modify
tick_unfreeze() and tick_freeze(), respectively, to use the
new local tick resume/suspend functions.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
[ Combined two patches, rebased, modified subject/changelog. ]
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1698741.eezk9tnXtG@vostro.rjw.lan
[ Merged to latest timers/core. ]
Signed-off-by: NIngo Molnar <mingo@kernel.org>

f46481d0

clockevents: Make suspend/resume calls explicit · 4ffee521

由 Thomas Gleixner 提交于 3月 25, 2015

clockevents_notify() is a leftover from the early design of the
clockevents facility. It's really not a notification mechanism,
it's a multiplex call.

We are way better off to have explicit calls instead of this
monstrosity. Split out the suspend/resume() calls and invoke
them directly from the call sites.

No locking required at this point because these calls happen
with interrupts disabled and a single cpu online.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
[ Rebased on top of 4.0-rc5. ]
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/713674030.jVm1qaHuPf@vostro.rjw.lan
[ Rebased on top of latest timers/core. ]
Signed-off-by: NIngo Molnar <mingo@kernel.org>

4ffee521

25 3月, 2015 1 次提交

x86/asm/entry: Get rid of KERNEL_STACK_OFFSET · ef593260

由 Denys Vlasenko 提交于 3月 19, 2015

PER_CPU_VAR(kernel_stack) was set up in a way where it points
five stack slots below the top of stack.

Presumably, it was done to avoid one "sub $5*8,%rsp"
in syscall/sysenter code paths, where iret frame needs to be
created by hand.

Ironically, none of them benefits from this optimization,
since all of them need to allocate additional data on stack
(struct pt_regs), so they still have to perform subtraction.

This patch eliminates KERNEL_STACK_OFFSET.

PER_CPU_VAR(kernel_stack) now points directly to top of stack.
pt_regs allocations are adjusted to allocate iret frame as well.
Hopefully we can merge it later with 32-bit specific
PER_CPU_VAR(cpu_current_top_of_stack) variable...

Net result in generated code is that constants in several insns
are changed.

This change is necessary for changing struct pt_regs creation
in SYSCALL64 code path from MOV to PUSH instructions.
Signed-off-by: NDenys Vlasenko <dvlasenk@redhat.com>
Acked-by: NBorislav Petkov <bp@suse.de>
Acked-by: NAndy Lutomirski <luto@kernel.org>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1426785469-15125-2-git-send-email-dvlasenk@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

ef593260

23 3月, 2015 1 次提交

x86/xen: prepare p2m list for memory hotplug · 633d6f17

由 Juergen Gross 提交于 3月 20, 2015

Commit 054954eb ("xen: switch to linear
virtual mapped sparse p2m list") introduced a regression regarding to
memory hotplug for a pv-domain: as the virtual space for the p2m list
is allocated for the to be expected memory size of the domain only,
hotplugged memory above that size will not be usable by the domain.

Correct this by using a configurable size for the p2m list in case of
memory hotplug enabled (default supported memory size is 512 GB for
64 bit domains and 4 GB for 32 bit domains).
Signed-off-by: NJuergen Gross <jgross@suse.com>
Cc: <stable@vger.kernel.org> # 3.19+
Reviewed-by: NDaniel Kiper <daniel.kiper@oracle.com>
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>

633d6f17

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功