- 05 December 2009, 1 commit
-
-
Committed by Chris Wright

Commit ae21ee65 ("PCI: acs p2p upstream forwarding enabling") doesn't actually enable ACS. Add a function to the PCI core that allows an IOMMU to request that ACS be enabled. The existing mechanism of using iommu_found() in the PCI core to know when ACS should be enabled doesn't actually work due to initialization order; at that point the IOMMU has only been detected, not initialized. Have the Intel and AMD IOMMUs request ACS, and Xen does so as well during early init of dom0.

Cc: Allen Kay <allen.m.kay@intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
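For illustration, a minimal sketch of the request-based approach the message describes (names and placement are assumptions, not the literal upstream patch):

    /* Sketch only: an IOMMU driver asks the PCI core to turn on ACS later,
     * instead of the core probing iommu_found() too early.  Identifiers here
     * are illustrative. */
    #include <linux/pci.h>

    static bool pci_acs_enable;

    void pci_request_acs(void)
    {
            pci_acs_enable = true;
    }

    /* Called by the PCI core for each device it enumerates. */
    static void pci_enable_acs(struct pci_dev *dev)
    {
            if (!pci_acs_enable)
                    return;
            /* ...locate the ACS capability and set the p2p upstream
             * forwarding control bits on this device... */
    }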
-
- 04 December 2009, 8 commits
-
-
Committed by Ian Campbell

tick_resume() is never called on secondary processors. Presumably this is because they are offlined for suspend on native, so this is normally taken care of in the CPU onlining path. Under Xen we keep all CPUs online over a suspend. This patch papers over the issue for me, but I will investigate a more generic, less hacky way of doing the same. tick_suspend is also only called on the boot CPU, which I presume should be fixed too.

Signed-off-by: Ian Campbell <Ian.Campbell@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
-
Committed by Jeremy Fitzhardinge

If Xen wants to return to 32-bit usermode with sysret, it must use the right form. When using VGCF_in_syscall to trigger this, it looks at the code segment and does a 32-bit sysret if it is FLAT_USER_CS32. However, this is different from __USER32_CS, so it fails to return properly if we use the normal Linux segment. So avoid the whole mess by dropping VGCF_in_syscall and simply using plain iret to return to usermode.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: Jan Beulich <jbeulich@novell.com>
Cc: Stable Kernel <stable@kernel.org>
-
Committed by Jeremy Fitzhardinge

printk timestamping uses sched_clock, which in turn relies on runstate info under Xen. So make sure we set it up before any printks can be called.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
-
Committed by Ian Campbell

The commit "xen: re-register runstate area earlier on resume" caused us to never try to set up the runstate area for secondary CPUs. Ensure that we do this...

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
-
Committed by Ian Campbell

Otherwise the timer is disabled by dpm_suspend_noirq(), which in turn prevents correct operation of stop_machine on multi-processor systems and breaks suspend.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
-
Committed by Ian Campbell

pvops kernels >= 2.6.30 can currently only be saved and restored once. The second attempt to save results in:

    ERROR Internal error: Frame# in pfn-to-mfn frame list is not in pseudophys
    ERROR Internal error: entry 0: p2m_frame_list[0] is 0xf2c2c2c2, max 0x120000
    ERROR Internal error: Failed to map/save the p2m frame list

I finally narrowed it down to:

    commit cdaead6b
    Author: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
    Date:   Fri Feb 27 15:34:59 2009 -0800

        xen: split construction of p2m mfn tables from registration

        Build the p2m_mfn_list_list early with the rest of the p2m table, but
        register it later when the real shared_info structure is in place.

        Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

The unforeseen side effect of this change was to cause the mfn list list to not be rebuilt on resume. Prior to this change it would have been rebuilt via xen_post_suspend() -> xen_setup_shared_info() -> xen_setup_mfn_list_list(). Fix by explicitly calling xen_build_mfn_list_list() from xen_post_suspend().

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
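A minimal sketch of the shape of the fix described above (illustrative only; the real xen_post_suspend() does more work around these calls):

    /* Sketch: rebuild the mfn list list explicitly on resume, since
     * registration alone no longer reconstructs it. */
    void xen_post_suspend(int suspend_cancelled)
    {
            xen_build_mfn_list_list();      /* rebuild p2m_mfn_list_list */
            xen_setup_shared_info();        /* re-register it with the hypervisor */

            /* ...remaining resume work (vcpu info, runstate, etc.)... */
    }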
-
Committed by Jeremy Fitzhardinge

Even if have_vcpu_info_placement is not set, we still need to set up the runstate area on each resumed vcpu.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
-
Committed by Ian Campbell

This is necessary to ensure the runstate area is available to xen_sched_clock before any calls to printk, which will require it in order to provide a timestamp. I chose to pull xen_setup_runstate_info out of xen_time_init into the caller in order to maintain parity with calling xen_setup_runstate_info separately from calling xen_time_resume.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
-
- 17 November 2009, 1 commit
-
-
Committed by H. Peter Anvin

The 32- and 64-bit code used very different mechanisms for enabling NX, but even the 32-bit code was enabling NX in head_32.S if it was available. Furthermore, we had a bewildering collection of tests for the availability of NX.

This patch:
a) merges the 32-bit set_nx() and the 64-bit check_efer() functions into a single x86_configure_nx() function. EFER control is left to the head code.
b) eliminates the nx_enabled variable entirely. Things that need to test for NX enablement can verify __supported_pte_mask directly, and cpu_has_nx gives the supported status of NX.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Vegard Nossum <vegardno@ifi.uio.no>
Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Chris Wright <chrisw@sous-sol.org>
LKML-Reference: <1258154897-6770-5-git-send-email-hpa@zytor.com>
Acked-by: Kees Cook <kees.cook@canonical.com>
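A rough sketch of what a merged x86_configure_nx() can look like, per the description above (simplified; the disable_nx override shown here is an assumption):

    /* Sketch: one function decides whether _PAGE_NX may be used, based on
     * CPU support (cpu_has_nx, as the message says) and an assumed
     * noexec= override flag. */
    void x86_configure_nx(void)
    {
            if (cpu_has_nx && !disable_nx)
                    __supported_pte_mask |= _PAGE_NX;
            else
                    __supported_pte_mask &= ~_PAGE_NX;
    }

    /* Consumers no longer read nx_enabled; they check the mask directly:
     *     if (__supported_pte_mask & _PAGE_NX) ...
     */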
-
- 05 November 2009, 1 commit
-
-
Committed by Jeremy Fitzhardinge

Move xen_domain and related tests out of asm-x86 to xen/xen.h so they can be included wherever they are needed.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
-
- 04 November 2009, 2 commits
-
-
Committed by Rusty Russell

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
LKML-Reference: <200911031458.38406.rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Committed by Jeremy Fitzhardinge

A Xen guest never needs to know about extended topology, and knowing would just confuse it. This patch just zeroes ebx in leaf 0xb, which indicates no topology info, preventing a crash under Xen on CPUs which support this leaf.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
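A simplified sketch of the kind of CPUID filtering described here, in the style of Xen's pv cpuid hook (the surrounding code is elided and the exact structure is an assumption):

    /* Sketch: mask ebx for the extended-topology leaf so the guest sees
     * "no topology information" and doesn't try to parse it. */
    static void xen_cpuid(unsigned int *ax, unsigned int *bx,
                          unsigned int *cx, unsigned int *dx)
    {
            unsigned int maskebx = ~0;

            if (*ax == 0xb)
                    maskebx = 0;    /* leaf 0xb: suppress extended topology */

            /* the real hook issues Xen's emulated cpuid; native shown here */
            native_cpuid(ax, bx, cx, dx);

            *bx &= maskebx;
    }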
-
- 29 October 2009, 1 commit
-
-
Committed by Tejun Heo

This patch updates percpu-related symbols in xen such that percpu symbols are unique and don't clash with local symbols. This serves the two purposes of decreasing the possibility of global percpu symbol collisions and allowing the per_cpu__ prefix to be dropped from percpu symbols.

* arch/x86/xen/smp.c, arch/x86/xen/time.c, arch/ia64/xen/irq_xen.c: add xen_ prefix to percpu variables
* arch/ia64/xen/time.c: add xen_ prefix to percpu variables, drop processed_ prefix and make them static

Partly based on Rusty Russell's "alloc_percpu: rename percpu vars which cause name clashes" patch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Chris Wright <chrisw@sous-sol.org>
-
- 28 October 2009, 1 commit
-
-
Committed by Jeremy Fitzhardinge

xen_setup_stackprotector() ends up trying to set page protections, so we need to have pv_mmu_ops set up before trying to do so. Failing to do so causes an early boot crash.

[ Impact: Fix early crash under Xen. ]

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
-
- 02 October 2009, 1 commit
-
-
Committed by Alexey Dobriyan

[akpm@linux-foundation.org: fix KVM]
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 24 September 2009, 1 commit
-
-
Committed by Rusty Russell

Makes the code futureproof against the impending change to mm->cpu_vm_mask (to become a pointer). It's also a chance to use the new cpumask_ ops which take a pointer (the older ones are deprecated, but there's no hurry for arch code).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
-
- 22 September 2009, 1 commit
-
-
Committed by Jeremy Fitzhardinge

x86-64 assumes NX is available by default, so we need to explicitly check for it before using NX. Some first-generation Intel x86-64 processors didn't support NX, and even recent systems allow it to be disabled in the BIOS.

[ Impact: prevent Xen crash on NX-less 64-bit machines ]

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
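A hedged sketch of one way to make the check explicit, using the NX feature bit in CPUID leaf 0x80000001 (the function name and where it would be called from are assumptions, not the literal patch):

    #include <asm/processor.h>

    /* Sketch: don't assume NX just because we're 64-bit; ask the CPU. */
    static bool __init xen_cpu_has_nx(void)
    {
            unsigned int ax = 0x80000001, bx, cx = 0, dx;

            native_cpuid(&ax, &bx, &cx, &dx);
            return dx & (1 << 20);          /* EDX bit 20 = NX/XD */
    }

    /* Callers would clear _PAGE_NX from __supported_pte_mask when this
     * returns false, instead of unconditionally setting NX in page tables. */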
-
- 16 September 2009, 1 commit
-
-
Committed by Feng Tang

get/set_wallclock() already have a set of platform-dependent implementations (default, EFI, paravirt). MRST will add another variant. Moving them to platform ops simplifies the existing code and minimizes the effort to integrate new variants.

Signed-off-by: Feng Tang <feng.tang@intel.com>
LKML-Reference: <new-submission>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
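To illustrate the direction, a sketch of the platform-ops shape this moves toward (field names follow the x86_platform_ops style; the Xen override shown is an assumption about how a paravirt variant plugs in):

    /* Sketch: wallclock access becomes an overridable platform op. */
    struct x86_platform_ops {
            unsigned long (*get_wallclock)(void);
            int (*set_wallclock)(unsigned long nowtime);
    };

    extern struct x86_platform_ops x86_platform;

    /* A paravirt or MRST variant then just replaces the defaults, e.g.: */
    static void __init xen_wallclock_init(void)
    {
            x86_platform.get_wallclock = xen_get_wallclock;
            x86_platform.set_wallclock = xen_set_wallclock;
    }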
-
- 10 September 2009, 3 commits
-
-
Committed by Yang Xiaowei

We need a stronger barrier between releasing the lock and checking for any waiting spinners. A compiler barrier is not sufficient because the CPU's ordering rules do not prevent the read of xl->spinners from happening before the unlock assignment, as they are different memory locations. We need an explicit barrier to enforce the write-read ordering to different memory locations. Without it, I can't bring up more than 4 HVM guests on one SMP machine.

[ Code and commit comments expanded -J ]

[ Impact: avoid deadlock when using Xen PV spinlocks ]

Signed-off-by: Yang Xiaowei <xiaowei.yang@intel.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
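A reduced sketch of the unlock path being described (the struct layout and helper names are assumptions; the point is the full memory barrier between the store and the load):

    /* Sketch: release the byte lock, then make sure the store is visible
     * before reading the spinner count at a different address. */
    struct xen_spinlock {
            unsigned char lock;             /* 0 -> free, 1 -> held */
            unsigned short spinners;        /* count of waiting CPUs */
    };

    static void xen_spin_unlock(struct xen_spinlock *xl)
    {
            smp_wmb();              /* no writes escape past the unlock */
            xl->lock = 0;           /* release lock */

            mb();                   /* full barrier: order the store to lock
                                     * against the following load of spinners;
                                     * barrier() alone only constrains the
                                     * compiler, not the CPU */

            if (unlikely(xl->spinners))
                    xen_spin_unlock_slow(xl);       /* kick a waiter (assumed helper) */
    }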
-
Committed by Jeremy Fitzhardinge

Where possible we enable interrupts while waiting for a spinlock to become free, in order to reduce big latency spikes in interrupt handling. However, at present if we manage to pick up the spinlock just before blocking, we'll end up holding the lock with interrupts enabled for a while. This will cause a deadlock if we receive an interrupt in that window, and the interrupt handler tries to take the lock too. Solve this by shrinking the interrupt-enabled region to just around the blocking call.

[ Impact: avoid race/deadlock when using Xen PV spinlocks ]

Reported-by: "Yang, Xiaowei" <xiaowei.yang@intel.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
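The slow path roughly changes shape as sketched below (the loop structure and per-cpu kick IRQ handling are simplifications, not the literal diff):

    /* Sketch: interrupts are enabled only around the blocking poll itself,
     * so the lock is never held with interrupts on. */
    static void xen_spin_lock_slow(struct xen_spinlock *xl, bool irq_enable)
    {
            int irq = this_cpu_read(lock_kicker_irq);   /* assumed per-cpu kick IRQ */

            do {
                    /* ...register ourselves as a spinner, re-check the lock... */

                    if (irq_enable)
                            local_irq_enable();     /* only for the blocking wait */

                    xen_poll_irq(irq);              /* block until kicked */

                    local_irq_disable();            /* narrow window ends here */
            } while (!xen_spin_trylock(xl));        /* assumed trylock helper */
    }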
-
Committed by Jeremy Fitzhardinge

-fstack-protector uses a special per-cpu "stack canary" value. gcc generates special code in each function to test the canary to make sure that the function's stack hasn't been overrun.

On x86-64, this is simply an offset of %gs, which is the usual per-cpu base segment register, so setting it up simply requires loading %gs's base as normal.

On i386, the stack protector segment is %gs (rather than the usual kernel percpu %fs segment register). This requires setting up the full kernel GDT and then loading %gs accordingly. We also need to make sure %gs is initialized when bringing up secondary CPUs.

To keep things consistent, we do the full GDT/segment register setup on both architectures.

Because we need to avoid -fstack-protected code before setting up the GDT, and because there's no way to disable it on a per-function basis, several files need to have the stack-protector inhibited.

[ Impact: allow Xen booting with stack-protector enabled ]

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
-
- 01 September 2009, 1 commit
-
-
Committed by H. Peter Anvin

For some reason, the _safe MSR functions returned -EFAULT, not -EIO. However, the only user which cares about the return code as anything other than a boolean is the MSR driver, which wants -EIO. Change it to -EIO across the board.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
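For context, a small usage sketch of the _safe accessors whose error value this changes (the MSR constant is chosen arbitrarily; treat the exact error propagation as an assumption):

    #include <linux/printk.h>
    #include <asm/msr.h>

    /* Sketch: the _safe accessors trap the fault on a bad MSR and report it
     * through their return value, now -EIO instead of -EFAULT. */
    static void demo_safe_msr_read(void)
    {
            unsigned long long val;
            int err = rdmsrl_safe(MSR_IA32_MISC_ENABLE, &val);

            if (err)        /* err == -EIO after this change */
                    pr_warn("rdmsr failed: %d\n", err);
    }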
-
- 31 August 2009, 8 commits
-
-
Committed by Thomas Gleixner

TSC calibration is modified by the vmware hypervisor and by paravirt through separate means. Moorestown wants to add its own calibration routine as well. So make calibrate_tsc a proper x86_init_ops function and override it from paravirt or from the early setup of the vmware hypervisor.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Committed by Thomas Gleixner

The timer init code is convoluted with several quirks and the paravirt timer chooser. Figuring out which code path is actually taken is not for the faint-hearted. Move the numaq TSC quirk to the tsc_pre_init x86_init_ops function and replace the paravirt timer chooser and the remaining x86 quirk with a simple x86_init_ops function.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Committed by Thomas Gleixner

Paravirt overrides the setup of the default apic timers as per-cpu timers. Moorestown needs to override that as well. Move it to x86_init_ops setup and create a separate x86_cpuinit struct which holds the function for the secondary, possibly hotpluggable CPUs.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Committed by Thomas Gleixner

We really do not need two paravirt/x86_init_ops functions which are called in two consecutive source lines. Move the only user of post_allocator_init into the already existing pagetable_setup_done function.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Committed by Thomas Gleixner

Replace more paravirt hackery by proper x86_init_ops.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Committed by Thomas Gleixner

Replace another piece of obscure paravirt magic and move it to x86_init_ops. Such a hook is also useful for embedded and special hardware.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Committed by Thomas Gleixner

ARCH_SETUP is a horrible leftover from the old arch/i386 mach support code. It still has a lonely user in xen. Move it to x86_init_ops.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Committed by Thomas Gleixner

irq_init is overridden by x86_quirks and by paravirts. Unify the whole mess and make it an unconditional x86_init_ops function which defaults to the standard function and can be overridden by the early platform code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
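The pattern used throughout this series looks roughly like the sketch below (a simplified illustration of the x86_init_ops idea, not the exact upstream struct layout):

    /* Sketch: one table of init hooks, pre-filled with native defaults. */
    struct x86_init_irqs {
            void (*pre_vector_init)(void);
            void (*intr_init)(void);
    };

    struct x86_init_ops {
            struct x86_init_irqs irqs;
            /* ...memory setup, timers, paging hooks, etc... */
    };

    extern struct x86_init_ops x86_init;

    /* Early platform code (Xen, Moorestown, ...) overrides only what it needs: */
    static void __init xen_override_irq_init(void)
    {
            x86_init.irqs.intr_init = xen_init_IRQ;     /* assumed Xen hook name */
    }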
-
- 27 August 2009, 1 commit
-
-
Committed by Thomas Gleixner

memory_setup is overridden by x86_quirks and by paravirts with weak functions and quirks. Unify the whole mess and make it an unconditional x86_init_ops function which defaults to the standard function and can be overridden by the early platform code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
- 26 August 2009, 2 commits
-
-
Committed by H. Peter Anvin

Initialize cx before calling xen_cpuid(), in order to suppress the "may be used uninitialized in this function" warning.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
-
Committed by Jeremy Fitzhardinge

Xen always runs on CPUs which properly support WP enforcement in privileged mode, so there's no need to test for it. This also works around a crash reported by Arnd Hannemann, though I think it's just a band-aid for that case.

Reported-by: Arnd Hannemann <hannemann@nets.rwth-aachen.de>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
- 20 August 2009, 1 commit
-
-
Committed by Jeremy Fitzhardinge

Make sure the stack-protector segment registers are properly set up before calling any functions which may have stack-protection compiled into them.

[ Impact: prevent Xen early-boot crash when stack-protector is enabled ]

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
-
- 16 May 2009, 1 commit
-
-
Committed by Jeremy Fitzhardinge

Xiaohui Xin and some other folks at Intel have been looking into what's behind the performance hit of paravirt_ops when running native. It appears that the hit is entirely due to the paravirtualized spinlocks introduced by:

    | commit 8efcbab6
    | Date: Mon Jul 7 12:07:51 2008 -0700
    |
    | paravirt: introduce a "lock-byte" spinlock implementation

The extra call/return in the spinlock path is somehow causing an increase in the cycles/instruction of somewhere around 2-7% (it seems to vary quite a lot from test to test). The working theory is that the CPU's pipeline is getting upset about the call->call->locked-op->return->return, and seems to be failing to speculate (though I haven't seen anything definitive about the precise reasons). This doesn't entirely make sense, because the performance hit is also visible on unlock and other operations which don't involve locked instructions. But spinlock operations clearly swamp all the other pvops operations, even though I can't imagine that they're nearly as common (there's only a .05% increase in instructions executed).

If I disable just the pv-spinlock calls, my tests show that pvops is identical to non-pvops performance on native (my measurements show that it is actually about .1% faster, but Xiaohui shows a .05% slowdown).

Summary of results, averaging 10 runs of the "mmperf" test, using a no-pvops build as baseline:

                        nopv       Pv-nospin   Pv-spin
    CPU cycles          100.00%    99.89%      102.18%
    instructions        100.00%    100.10%     100.15%
    CPI                 100.00%    99.79%      102.03%
    cache ref           100.00%    100.84%     100.28%
    cache miss          100.00%    90.47%      88.56%
    cache miss rate     100.00%    89.72%      88.31%
    branches            100.00%    99.93%      100.04%
    branch miss         100.00%    103.66%     107.72%
    branch miss rt      100.00%    103.73%     107.67%
    wallclock           100.00%    99.90%      102.20%

The clear effect here is that the 2% increase in CPI is directly reflected in the final wallclock time. (The other interesting effect is that the more ops are out-of-line calls via pvops, the lower the cache access and miss rates. Not too surprising, but it suggests that the non-pvops kernel is over-inlined. On the flipside, the branch misses go up correspondingly...)

So, what's the fix?

Paravirt patching turns all the pvops calls into direct calls, so _spin_lock etc do end up having direct calls. For example, the compiler-generated code for paravirtualized _spin_lock is:

    <_spin_lock+0>:  mov    %gs:0xb4c8,%rax
    <_spin_lock+9>:  incl   0xffffffffffffe044(%rax)
    <_spin_lock+15>: callq  *0xffffffff805a5b30
    <_spin_lock+22>: retq

The indirect call will get patched to:

    <_spin_lock+0>:  mov    %gs:0xb4c8,%rax
    <_spin_lock+9>:  incl   0xffffffffffffe044(%rax)
    <_spin_lock+15>: callq  <__ticket_spin_lock>
    <_spin_lock+20>: nop; nop    /* or whatever 2-byte nop */
    <_spin_lock+22>: retq

One possibility is to inline _spin_lock, etc, when building an optimised kernel (ie, when there's no spinlock/preempt instrumentation/debugging enabled). That will remove the outer call/return pair, returning the instruction stream to a single call/return, which will presumably execute the same as the non-pvops case. The downsides are: 1) it will replicate the preempt_disable/enable code at each lock/unlock callsite; this code is fairly small, but not nothing; and 2) the spinlock definitions are already a very heavily tangled mass of #ifdefs and other preprocessor magic, and making any changes will be non-trivial.

The other obvious answer is to disable pv-spinlocks. Making them a separate config option is fairly easy, and it would be trivial to enable them only when Xen is enabled (as the only non-default user). But it doesn't really address the common case of a distro build which is going to have Xen support enabled, and it leaves open the question of whether the native performance cost of pv-spinlocks is worth the performance improvement on a loaded Xen system (10% saving of overall system CPU when guests block rather than spin). Still, it is a reasonable short-term workaround.

[ Impact: fix pvops performance regression when running native ]

Analysed-by: "Xin Xiaohui" <xiaohui.xin@intel.com>
Analysed-by: "Li Xin" <xin.li@intel.com>
Analysed-by: "Nakajima Jun" <jun.nakajima@intel.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Xen-devel <xen-devel@lists.xensource.com>
LKML-Reference: <4A0B62F7.5030802@goop.org>
[ fixed the help text ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
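At the code level, the resulting config split looks roughly like the sketch below (the option name CONFIG_PARAVIRT_SPINLOCKS matches what this change introduces; the ops struct, lock type, and macro names are simplified illustrations):

    /* Sketch: pv spinlock hooks only exist when the new option is enabled,
     * so a native kernel without it pays no pvops cost in the lock path. */
    #ifdef CONFIG_PARAVIRT_SPINLOCKS

    struct pv_lock_ops {
            void (*spin_lock)(struct arch_spinlock *lock);
            int  (*spin_trylock)(struct arch_spinlock *lock);
            void (*spin_unlock)(struct arch_spinlock *lock);
    };
    extern struct pv_lock_ops pv_lock_ops;

    #define arch_spin_lock(l)       pv_lock_ops.spin_lock(l)
    #define arch_spin_unlock(l)     pv_lock_ops.spin_unlock(l)

    #else   /* !CONFIG_PARAVIRT_SPINLOCKS: plain ticket locks, no indirection */

    #define arch_spin_lock(l)       __ticket_spin_lock(l)
    #define arch_spin_unlock(l)     __ticket_spin_unlock(l)

    #endif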
-
- 13 May 2009, 1 commit
-
-
Committed by Randy Dunlap

mmu.c needs to #include module.h to prevent these warnings:

    arch/x86/xen/mmu.c:239: warning: data definition has no type or storage class
    arch/x86/xen/mmu.c:239: warning: type defaults to 'int' in declaration of 'EXPORT_SYMBOL_GPL'
    arch/x86/xen/mmu.c:239: warning: parameter names (without types) in function declaration

[ Impact: cleanup ]

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 09 May 2009, 3 commits
-
-
Committed by Jeremy Fitzhardinge

stts() is implemented in terms of read_cr0/write_cr0 to update the state of the TS bit. This happens during context switch, and so is fairly performance critical. Rather than falling back to a trap-and-emulate native read_cr0, implement our own by caching the last-written value from write_cr0 (the TS bit is the only one we really care about).

Impact: optimise Xen context switches

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
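A condensed sketch of the caching idea (the per-cpu variable name and the write path are simplified assumptions; the real write_cr0 under Xen goes through a hypercall/multicall):

    #include <linux/percpu.h>

    /* Sketch: remember what we last wrote to cr0 so reads never trap. */
    static DEFINE_PER_CPU(unsigned long, xen_cr0_value);

    static unsigned long xen_read_cr0(void)
    {
            unsigned long cr0 = this_cpu_read(xen_cr0_value);

            if (unlikely(cr0 == 0)) {               /* not cached yet on this cpu */
                    cr0 = native_read_cr0();
                    this_cpu_write(xen_cr0_value, cr0);
            }
            return cr0;
    }

    static void xen_write_cr0(unsigned long cr0)
    {
            this_cpu_write(xen_cr0_value, cr0);     /* cache first... */
            /* ...then tell the hypervisor, e.g. via a fpu_taskswitch
             * hypercall when only the TS bit changed (assumed detail). */
    }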
-
Committed by Jeremy Fitzhardinge

Ignore known IST-using traps. Aside from the debugger traps, they're low-level faults which Xen will handle for us, so the kernel needn't worry about them. Keep the warning in case an unknown trap starts using IST.

Impact: suppress spurious warnings

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
-
Committed by Jeremy Fitzhardinge

Native x86-64 uses the IST mechanism to run int3 and debug traps on an alternative stack. Xen does not do this, and so the frames were being misinterpreted by the ptrace code. This change special-cases these two exceptions by using Xen variants which run on the normal kernel stack properly.

Impact: avoid crash or bad data when IST trap is invoked under Xen

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
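A sketch of how such special-casing can be wired up where Xen converts IDT gates into its trap table (the helper name and substitution style follow Xen's enlighten code, but the details here are simplified and should be treated as assumptions):

    /* Sketch: when handing the trap table to Xen, swap in stubs that expect
     * the normal kernel stack instead of the IST one. */
    static int cvt_gate_to_trap(int vector, const gate_desc *val,
                                struct trap_info *info)
    {
            unsigned long addr = gate_offset(*val);

            if (addr == (unsigned long)debug)
                    addr = (unsigned long)xen_debug;        /* non-IST variant */
            else if (addr == (unsigned long)int3)
                    addr = (unsigned long)xen_int3;         /* non-IST variant */

            info->vector  = vector;
            info->address = addr;
            /* ...cs, flags, privilege level... */
            return 1;
    }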
-