Commits · af6fc858a35b90e89ea7a7ee58e66628c55c776b · openeuler / Kernel

27 Feb, 2015 1 commit

x86/xen: correct bug in p2m list initialization · b8f05c88

Committed by Juergen Gross on Feb 27, 2015

Commit 054954eb ("xen: switch to
linear virtual mapped sparse p2m list") introduced an error.

During initialization of the p2m list a p2m identity area mapped by
a complete identity pmd entry has to be split up into smaller chunks
sometimes, if a non-identity pfn is introduced in this area.

If this non-identity pfn is not at index 0 of a p2m page the new
p2m page needed is initialized with wrong identity entries, as the
identity pfns don't start with the value corresponding to index 0,
but with the initial non-identity pfn. This results in weird wrong
mappings.

Correct the wrong initialization by starting with the correct pfn.

Cc: stable@vger.kernel.org # 3.19
Reported-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Tested-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
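
The corrected initialization can be illustrated with a minimal user-space sketch. It is not the kernel code: P2M_PER_PAGE is given an assumed value, the IDENTITY_FRAME encoding is an assumption, and p2m_page_init_identity() is a hypothetical helper name. The point is only that the fill starts from the pfn belonging to index 0 of the page, not from the non-identity pfn that triggered the split.

```c
#include <stddef.h>

/* Assumed values for illustration only. */
#define P2M_PER_PAGE        512UL
#define IDENTITY_FRAME_BIT  (1UL << (sizeof(unsigned long) * 8 - 2))
#define IDENTITY_FRAME(pfn) ((pfn) | IDENTITY_FRAME_BIT)

/*
 * Hypothetical helper: fill a new p2m page with identity entries for
 * the page that contains 'pfn'.
 */
static void p2m_page_init_identity(unsigned long *page, unsigned long pfn)
{
    /* Round down to the pfn that corresponds to index 0 of this page. */
    unsigned long base = pfn & ~(P2M_PER_PAGE - 1);
    size_t i;

    for (i = 0; i < P2M_PER_PAGE; i++)
        page[i] = IDENTITY_FRAME(base + i);
}
```

Starting the loop at 'pfn' instead of 'base' would shift every entry by the offset of 'pfn' within the page, which is the kind of wrong mapping the commit message describes.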

b8f05c88

24 Feb, 2015 2 commits

x86/xen: Initialize cr4 shadow for 64-bit PV(H) guests · 5054daa2

Committed by Boris Ostrovsky on Feb 23, 2015

Commit 1e02ce4c ("x86: Store a per-cpu shadow copy of CR4")
introduced CR4 shadows.

These shadows are initialized in early boot code. The commit missed
initialization for 64-bit PV(H) guests that this patch adds.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

5054daa2

x86/xen: Make sure X2APIC_ENABLE bit of MSR_IA32_APICBASE is not set · 31795b47

Committed by Boris Ostrovsky on Feb 11, 2015

Commit d524165c ("x86/apic: Check x2apic early") tests X2APIC_ENABLE
bit of MSR_IA32_APICBASE when CONFIG_X86_X2APIC is off and panics
the kernel when this bit is set.

Xen's PV guests will pass this MSR read to the hypervisor which will
return its version of the MSR, where this bit might be set. Make sure
we clear it before returning MSR value to the caller.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
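
As a rough sketch of the idea (not the actual patch), the value returned by the hypervisor can be filtered before it reaches the caller. pv_filter_msr() is a hypothetical helper name, and the sketch ignores any configuration or CPUID conditions the real code may apply.

```c
#include <stdint.h>

#define MSR_IA32_APICBASE  0x0000001bU
#define X2APIC_ENABLE      (1ULL << 10)

/* Hypothetical helper: post-process the value the hypervisor returned for 'msr'. */
static uint64_t pv_filter_msr(uint32_t msr, uint64_t val)
{
    if (msr == MSR_IA32_APICBASE)
        val &= ~X2APIC_ENABLE;   /* hide the x2APIC enable bit from the caller */

    return val;
}
```

All other MSRs pass through unchanged, so only the early x2apic check sees a different value.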

31795b47

18 Feb, 2015 1 commit

x86/spinlocks/paravirt: Fix memory corruption on unlock · d6abfdb2

Committed by Raghavendra K T on Feb 06, 2015

Paravirt spinlock clears slowpath flag after doing unlock.
As explained by Linus currently it does:

                prev = *lock;
                add_smp(&lock->tickets.head, TICKET_LOCK_INC);

                /* add_smp() is a full mb() */

                if (unlikely(lock->tickets.tail & TICKET_SLOWPATH_FLAG))
                        __ticket_unlock_slowpath(lock, prev);

which is *exactly* the kind of things you cannot do with spinlocks,
because after you've done the "add_smp()" and released the spinlock
for the fast-path, you can't access the spinlock any more.  Exactly
because a fast-path lock might come in, and release the whole data
structure.

Linus suggested that we should not do any writes to lock after unlock(),
and we can move slowpath clearing to fastpath lock.

So this patch implements the fix with:

 1. Moving slowpath flag to head (Oleg):
    Unlocked locks don't care about the slowpath flag; therefore we can keep
    it set after the last unlock, and clear it again on the first (try)lock.
    -- this removes the write after unlock. note that keeping slowpath flag would
    result in unnecessary kicks.
    By moving the slowpath flag from the tail to the head ticket we also avoid
    the need to access both the head and tail tickets on unlock.

 2. use xadd to avoid read/write after unlock that checks the need for
    unlock_kick (Linus):
    We further avoid the need for a read-after-release by using xadd;
    the prev head value will include the slowpath flag and indicate if we
    need to do PV kicking of suspended spinners -- on modern chips xadd
    isn't (much) more expensive than an add + load.

Result:
 setup: 16core (32 cpu +ht sandy bridge 8GB 16vcpu guest)
 benchmark overcommit %improve
 kernbench  1x           -0.13
 kernbench  2x            0.02
 dbench     1x           -1.77
 dbench     2x           -0.63

[Jeremy: Hinted missing TICKET_LOCK_INC for kick]
[Oleg: Moved slowpath flag to head, ticket_equals idea]
[PeterZ: Added detailed changelog]
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Tested-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Jones <drjones@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Fernando Luis Vázquez Cao <fernando_b1@lab.ntt.co.jp>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Cc: Waiman Long <Waiman.Long@hp.com>
Cc: a.ryabinin@samsung.com
Cc: dave@stgolabs.net
Cc: hpa@zytor.com
Cc: jasowang@redhat.com
Cc: jeremy@goop.org
Cc: paul.gortmaker@windriver.com
Cc: riel@redhat.com
Cc: tglx@linutronix.de
Cc: waiman.long@hp.com
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/20150215173043.GA7471@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
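
The xadd-based unlock described in this commit can be sketched in portable C11 atomics. This is an illustration under assumptions, not the kernel implementation: the struct layout, the 16-bit ticket width, and the flag encoding (bit 0 of head, tickets advancing by 2) are all assumed here, and ticket_unlock() is a stand-in name.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed encoding: bit 0 of head is the slowpath flag, tickets advance by 2. */
#define TICKET_SLOWPATH_FLAG 1u
#define TICKET_LOCK_INC      2u

struct ticket_lock {
    _Atomic uint16_t head;
    _Atomic uint16_t tail;
};

/*
 * Release the lock with one atomic fetch-and-add (xadd on x86).
 * Returns true if a suspended waiter may need a kick. The decision uses
 * only the value returned by the add; the lock is not read or written
 * again after the release.
 */
static bool ticket_unlock(struct ticket_lock *lock)
{
    uint16_t prev = atomic_fetch_add(&lock->head, TICKET_LOCK_INC);

    return (prev & TICKET_SLOWPATH_FLAG) != 0;
}
```

Because the flag travels with the previous head value, the unlocker never has to look at the lock memory after the add, which is the property the original code violated.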

d6abfdb2

04 Feb, 2015 1 commit

x86: Clean up cr4 manipulation · 375074cc

Committed by Andy Lutomirski on Oct 24, 2014

CR4 manipulation was split, seemingly at random, between direct
(write_cr4) and using a helper (set/clear_in_cr4).  Unfortunately,
the set_in_cr4 and clear_in_cr4 helpers also poke at the boot code,
which only a small subset of users actually wanted.

This patch replaces all cr4 access in functions that don't leave cr4
exactly the way they found it with new helpers cr4_set_bits,
cr4_clear_bits, and cr4_set_bits_and_update_boot.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Vince Weaver <vince@deater.net>
Cc: "hillf.zj" <hillf.zj@alibaba-inc.com>
Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/495a10bdc9e67016b8fd3945700d46cfd5c12c2f.1414190806.git.luto@amacapital.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
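
A minimal sketch of what helpers like cr4_set_bits() and cr4_clear_bits() might look like follows. The helper names come from the commit message, but the bodies are an assumption, and the register is simulated with a plain variable (fake_cr4, sketch_read_cr4() and sketch_write_cr4() are hypothetical stand-ins) so the example runs in user space.

```c
/* Stand-in for the hardware register and its accessors. */
static unsigned long fake_cr4;
static unsigned int cr4_writes;

static unsigned long sketch_read_cr4(void)
{
    return fake_cr4;
}

static void sketch_write_cr4(unsigned long val)
{
    fake_cr4 = val;
    cr4_writes++;
}

/* Set the bits in 'mask'; skip the register write if nothing changes. */
static void cr4_set_bits(unsigned long mask)
{
    unsigned long cr4 = sketch_read_cr4();

    if ((cr4 | mask) != cr4)
        sketch_write_cr4(cr4 | mask);
}

/* Clear the bits in 'mask'; skip the register write if nothing changes. */
static void cr4_clear_bits(unsigned long mask)
{
    unsigned long cr4 = sketch_read_cr4();

    if ((cr4 & ~mask) != cr4)
        sketch_write_cr4(cr4 & ~mask);
}
```

Neither helper touches boot-time state, which matches the split the commit describes: only cr4_set_bits_and_update_boot is meant for callers that also want the boot code updated.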

375074cc

28 Jan, 2015 9 commits

xen: mark grant mapped pages as foreign · 8da7633f

Committed by Jennifer Herbert on Dec 24, 2014

Use the "foreign" page flag to mark pages that have a grant map.  Use
page->private to store information of the grant (the granting domain
and the grant reference).
Signed-off-by: Jennifer Herbert <jennifer.herbert@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
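
One way to picture storing the granting domain and the grant reference in a single private word is the packing sketch below. The layout (domain ID in bits 32..47, grant reference in bits 0..31 of a 64-bit word) and the foreign_* helper names are assumptions for illustration; the commit message does not specify the encoding.

```c
#include <stdint.h>

/* Hypothetical packing: domid in bits 32..47, grant reference in bits 0..31. */
static uint64_t foreign_pack(uint16_t domid, uint32_t gref)
{
    return ((uint64_t)domid << 32) | gref;
}

static uint16_t foreign_domid(uint64_t priv)
{
    return (uint16_t)(priv >> 32);
}

static uint32_t foreign_gref(uint64_t priv)
{
    return (uint32_t)priv;
}
```

A word narrower than 64 bits cannot hold both fields this way, so a 32-bit build would need a different scheme, such as pointing at separately allocated storage.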

8da7633f

x86/xen: require ballooned pages for grant maps · 0ae65f49

Committed by Jennifer Herbert on Dec 24, 2014

Ballooned pages are always used for grant maps which means the
original frame does not need to be saved in page->index nor restored
after the grant unmap.

This allows the workaround in netback for the conflicting use of the
(unionized) page->index and page->pfmemalloc to be removed.
Signed-off-by: Jennifer Herbert <jennifer.herbert@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

0ae65f49

xen: remove scratch frames for ballooned pages and m2p override · 0bb599fd

Committed by David Vrabel on Jan 05, 2015

The scratch frame mappings for ballooned pages and the m2p override
are broken.  Remove them in preparation for replacing them with
simpler mechanisms that work.

The scratch pages did not ensure that the page was not in use.  In
particular, the foreign page could still be in use by hardware.  If
the guest reused the frame the hardware could read or write that
frame.

The m2p override did not handle the same frame being granted by two
different grant references.  Trying an M2P override lookup in this
case is impossible.

With the m2p override removed, the grant map/unmap for the kernel
mappings (for x86 PV) can be easily batched in
set_foreign_p2m_mapping() and clear_foreign_p2m_mapping().
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

0bb599fd

xen/grant-table: pre-populate kernel unmap ops for xen_gnttab_unmap_refs() · 853d0289

Committed by David Vrabel on Jan 05, 2015

When unmapping grants, instead of converting the kernel map ops to
unmap ops on the fly, pre-populate the set of unmap ops.

This allows the grant unmap for the kernel mappings to be trivially
batched in the future.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

853d0289

x86/xen: cleanup arch/x86/xen/mmu.c · 270b7933

Committed by Juergen Gross on Jan 28, 2015

Remove a nested ifdef.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

270b7933

x86/xen: add some __init annotations in arch/x86/xen/mmu.c · bf9d834a

Committed by Juergen Gross on Jan 28, 2015

The file arch/x86/xen/mmu.c has some functions that can be annotated
with "__init".
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

bf9d834a

x86/xen: add some __init and static annotations in arch/x86/xen/setup.c · a3f52396

Committed by Juergen Gross on Jan 28, 2015

Some more functions in arch/x86/xen/setup.c can be made "__init".
xen_ignore_unusable() can be made "static".
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

a3f52396

x86/xen: use correct types for addresses in arch/x86/xen/setup.c · 3ba5c867

Committed by Juergen Gross on Jan 28, 2015

In many places in arch/x86/xen/setup.c wrong types are used for
physical addresses (u64 or unsigned long long). Use phys_addr_t
instead.

Use macros already defined instead of open coding them.

Correct some other type mismatches.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

3ba5c867

x86/xen: cleanup arch/x86/xen/setup.c · f0feed10

Committed by Juergen Gross on Jan 28, 2015

Remove extern declarations in arch/x86/xen/setup.c which are either
not used or redundant. Move needed other extern declarations to
xen-ops.h
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

f0feed10

26 Jan, 2015 1 commit

x86,xen: use current->state helpers · 57b6b99b

Committed by Davidlohr Bueso on Jan 26, 2015

Call __set_current_state() instead of assigning the new state directly.
These interfaces also aid CONFIG_DEBUG_ATOMIC_SLEEP environments,
keeping track of who changed the state.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

57b6b99b

21 Jan, 2015 1 commit

x86/xen: prefer TSC over xen clocksource for dom0 · 94dd85f6

Committed by Palik, Imre on Jan 13, 2015

In Dom0s the use of the TSC clocksource (whenever it is stable enough to
be used) instead of the Xen clocksource should not cause any issues, as
Dom0 VMs are never live-migrated.  The TSC clocksource is somewhat more
efficient than the Xen paravirtualised clocksource, thus it should have
a higher rating.

This patch decreases the rating of the Xen clocksource in Dom0s to 275,
which is half-way between the rating of the TSC clocksource (300) and the
hpet clocksource (250).

Cc: Anthony Liguori <aliguori@amazon.com>
Signed-off-by: Imre Palik <imrep@amazon.de>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
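
The effect of the new rating can be shown with a simplified selection sketch. It assumes the highest rating wins and ignores every other criterion real clocksource selection may apply (such as stability checks); struct clocksource_sketch and pick_best() are hypothetical names.

```c
#include <stddef.h>

struct clocksource_sketch {
    const char *name;
    int rating;
};

/* Return the entry with the highest rating, or NULL if the list is empty. */
static const struct clocksource_sketch *
pick_best(const struct clocksource_sketch *cs, size_t n)
{
    const struct clocksource_sketch *best = NULL;
    size_t i;

    for (i = 0; i < n; i++) {
        if (best == NULL || cs[i].rating > best->rating)
            best = &cs[i];
    }
    return best;
}
```

With ratings of 300 (TSC), 275 (Xen) and 250 (hpet), the TSC is chosen when it is registered, and the Xen clocksource still ranks above hpet when it is not.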

94dd85f6

19 Jan, 2015 1 commit

x86/xen/p2m: Replace ACCESS_ONCE with READ_ONCE · 1760f1eb

Committed by Christian Borntraeger on Dec 07, 2014

ACCESS_ONCE does not work reliably on non-scalar types. For
example gcc 4.6 and 4.7 might remove the volatile tag for such
accesses during the SRA (scalar replacement of aggregates) step
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145)

Change the p2m code to replace ACCESS_ONCE with READ_ONCE.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Acked-by: David Vrabel <david.vrabel@citrix.com>

1760f1eb

13 Jan, 2015 1 commit

x86/xen: properly retrieve NMI reason · f221b04f

Committed by Jan Beulich on Jan 13, 2015

Using the native code here can't work properly, as the hypervisor would
normally have cleared the two reason bits by the time Dom0 gets to see
the NMI (if passed to it at all). There's a shared info field for this,
and there's an existing hook to use - just fit the two together. This
is particularly relevant so that NMIs intended to be handled by APEI /
GHES actually make it to the respective handler.

Note that the hook can (and should) be used irrespective of whether
being in Dom0, as accessing port 0x61 in a DomU would be even worse,
while the shared info field would just hold zero all the time. Note
further that hardware NMI handling for PVH doesn't currently work
anyway due to missing code in the hypervisor (but it is expected to
work the native rather than the PV way).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

f221b04f

12 Jan, 2015 4 commits

xen: check for zero sized area when invalidating memory · 9a17ad7f

Committed by Juergen Gross on Jan 12, 2015

With the introduction of the linear mapped p2m list setting memory
areas to "invalid" had to be delayed. When doing the invalidation
make sure no zero sized areas are processed.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

9a17ad7f

xen: use correct type for physical addresses · e86f9496

Committed by Juergen Gross on Jan 12, 2015

When converting a pfn to a physical address be sure to use 64 bit
wide types or convert the physical address to a pfn if possible.
Signed-off-by: Juergen Gross <jgross@suse.com>
Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
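
The hazard can be reproduced in a small sketch that models a 32-bit unsigned long with uint32_t. SKETCH_PAGE_SHIFT and both function names are hypothetical; the sketch only shows why the pfn has to be widened before it is shifted.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12

/* Buggy pattern: the shift is done in a 32-bit type and drops the high bits. */
static uint64_t pfn_to_phys_narrow(uint32_t pfn)
{
    return (uint32_t)(pfn << SKETCH_PAGE_SHIFT);
}

/* Corrected pattern: widen to 64 bits first, then shift. */
static uint64_t pfn_to_phys_wide(uint32_t pfn)
{
    return (uint64_t)pfn << SKETCH_PAGE_SHIFT;
}
```

For pfn 0x100000, which is the frame at the 4 GiB boundary, the narrow version yields 0 while the wide version yields 0x100000000.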

e86f9496

xen: correct race in alloc_p2m_pmd() · f241b0b8

Committed by Juergen Gross on Jan 12, 2015

When allocating a new pmd for the linear mapped p2m list a check is
done for not introducing another pmd when this just happened on
another cpu. In this case the old pte pointer was returned which
points to the p2m_missing or p2m_identity page. The correct value
would be the pointer to the found new page.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
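
The general shape of this bug, and of the fix, can be sketched with a compare-and-swap. Note that this swaps in a different synchronization technique for illustration (C11 atomics in user space); it does not reproduce how the kernel serializes the update, and install_page() is a hypothetical name. What matters is the return value on the losing path.

```c
#include <stdatomic.h>
#include <stdlib.h>

/*
 * Try to replace the shared 'placeholder' pointer in 'slot' with a
 * privately allocated 'new_page'. Returns the page that ends up in use.
 */
static unsigned long *install_page(_Atomic(unsigned long *) *slot,
                                   unsigned long *placeholder,
                                   unsigned long *new_page)
{
    unsigned long *expected = placeholder;

    if (atomic_compare_exchange_strong(slot, &expected, new_page))
        return new_page;   /* we won: our page is installed */

    /* Another CPU installed a page first; 'expected' now holds it. */
    free(new_page);
    return expected;       /* the found page, not the stale placeholder */
}
```

Returning 'placeholder' on the losing path would hand the caller a pointer to the shared page again, which corresponds to the error the commit message describes.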

f241b0b8

xen: correct error for building p2m list on 32 bits · 82c92ed1

Committed by Juergen Gross on Jan 12, 2015

In xen_rebuild_p2m_list() for large areas of invalid or identity
mapped memory the pmd entries on 32 bit systems are initialized
wrong. Correct this error.
Suggested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

82c92ed1

08 Jan, 2015 4 commits

x86/xen: avoid freeing static 'name' when kasprintf() fails · 7be0772d

Committed by Vitaly Kuznetsov on Jan 05, 2015

In case kasprintf() fails in xen_setup_timer() we assign name to the
static string "<timer kasprintf failed>". We, however, don't check
that fact before issuing kfree() in xen_teardown_timer(), kernel is
supposed to crash with 'kernel BUG at mm/slub.c:3341!'

Solve the issue by making name a fixed length string inside struct
xen_clock_event_device. 16 bytes should be enough.
Suggested-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
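
The fixed-length approach can be sketched in user space as follows. struct clock_event_sketch, timer_set_name() and the "timer%d" format are assumptions for illustration; only the 16-byte embedded buffer comes from the commit message.

```c
#include <stdio.h>

struct clock_event_sketch {
    int cpu;
    char name[16];   /* embedded buffer: nothing to allocate or free */
};

static void timer_set_name(struct clock_event_sketch *evt, int cpu)
{
    evt->cpu = cpu;
    /* snprintf() NUL-terminates and truncates if the text does not fit. */
    snprintf(evt->name, sizeof(evt->name), "timer%d", cpu);
}
```

With the name stored inside the structure there is no allocation that can fail and no pointer that could later be passed to a free routine by mistake.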

7be0772d

x86/xen: add extra memory for remapped frames during setup · a97dae1a

Committed by David Vrabel on Jan 07, 2015

If the non-RAM regions in the e820 memory map are larger than the size
of the initial balloon, a BUG was triggered as the frames are remapped
beyond the limit of the linear p2m.  The frames are remapped into the
initial balloon area (xen_extra_mem) but not enough of this is
available.

Ensure enough extra memory regions are added for these remapped
frames.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>

a97dae1a

x86/xen: don't count how many PFNs are identity mapped · bc7142cf

Committed by David Vrabel on Jan 07, 2015

This accounting is just used to print a diagnostic message that isn't
very useful.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>

bc7142cf

x86/xen: Free bootmem in free_p2m_page() during early boot · 701a261a

Committed by Boris Ostrovsky on Jan 07, 2015

With recent changes in p2m we now have legitimate cases when
p2m memory needs to be freed during early boot (i.e. before
slab is initialized).
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

701a261a

23 Dec, 2014 1 commit

x86/xen: Remove unnecessary BUG_ON(preemptible()) in xen_setup_timer() · 8b8cd8a3

Committed by Boris Ostrovsky on Dec 22, 2014

There is no reason for having it and, with commit 250a1ac6 ("x86,
smpboot: Remove pointless preempt_disable() in
native_smp_prepare_cpus()"), it prevents HVM guests from booting.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
8b8cd8a3

11 Dec, 2014 1 commit

xen: switch to post-init routines in xen mmu.c earlier · cdfa0bad

Committed by Juergen Gross on Dec 10, 2014

With the virtual mapped linear p2m list the post-init mmu operations
must be used for setting up the p2m mappings, as in case of
CONFIG_FLATMEM the init routines may trigger BUGs.

paging_init() sets up all infrastructure needed to switch to the
post-init mmu ops done by xen_post_allocator_init(). With the virtual
mapped linear p2m list we need some mmu ops during setup of this list,
so we have to switch to the correct mmu ops as soon as possible.

The p2m list is usable from the beginning, just expansion requires to
have established the new linear mapping. So the call of
xen_remap_memory() had to be introduced, but this is not due to the
mmu ops requiring this.

Summing it up: calling xen_post_allocator_init() not directly after
paging_init() was conceptually wrong in the beginning, it just didn't
matter up to now as no functions used between the two calls needed
some critical mmu ops (e.g. alloc_pte). This has changed now, so I
corrected it.
Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

cdfa0bad

08 Dec, 2014 2 commits

xen: annotate xen_set_identity_and_remap_chunk() with __init · 76f0a486

Committed by Juergen Gross on Dec 08, 2014

Commit 5b8e7d80 removed the __init
annotation from xen_set_identity_and_remap_chunk(). Add it again.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

76f0a486

xen: introduce helper functions to do safe read and write accesses · 90fff3ea

Committed by Juergen Gross on Dec 05, 2014

Introduce two helper functions to safely read and write unsigned long
values from or to memory when the access may fault because the mapping
is non-present or read-only.

These helpers can be used instead of open coded uses of __get_user()
and __put_user() avoiding the need to do casts to fix sparse warnings.

Use the helpers in page.h and p2m.c. This will fix the sparse
warnings when doing "make C=1".
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

90fff3ea

04 Dec, 2014 9 commits

xen: Speed up set_phys_to_machine() by using read-only mappings · 2e917175